
Train and deploy PyTorch models



Update 2021-10-05: NetHarn is no longer maintained as a training loop. Please use pytorch-lightning instead. This library may slowly evolve into a set of extensions for lightning.

The official website for this project is: https://gitlab.kitware.com/computer-vision/netharn

If you want a PyTorch training-loop framework that (1) chooses directory names based on a hash of the hyperparameters, (2) can deploy your model by statically extracting the code that defines the model topology and zipping it together with its weights into a single file, (3) has concise terminal output alongside rich logfile output, (4) has rule-based monitoring of the validation loss that can lower the learning rate or stop early, (5) visualizes training statistics with tensorboard and/or matplotlib, and (6) is designed to be extended, then you may be interested in NetHarn.

Name

NetHarn (pronounced "net-harn")

Framework

PyTorch

Features
  • Hyperparameter tracking

  • Training directory management

  • A callback-based public API

  • XPU - code abstraction over [cpu, gpu, multi-gpu] (see the sketch after this list).

  • Single-file deployments (new in version 0.1.0).

  • Reasonable test coverage using pytest and xdoctest

  • CI testing on appveyor and travis (note: a few tests are failing due to minor issues)

  • A rich set of utilities

  • Extensions of PyTorch objects (e.g. criterions, initializers, layers, optimizers, schedulers)
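
For instance, a minimal sketch of the XPU abstraction (XPU.coerce appears in the toy example below; mount is assumed here to be the method that moves a model onto the selected device):

>>> import netharn
>>> # 'auto' picks a GPU if one is idle with enough VRAM, else the CPU
>>> xpu = netharn.XPU.coerce('auto')
>>> # mount the model on the selected device (assumed API; compare the
>>> # "Mounting ToyNet2d model on GPU(0)" line in the log output below)
>>> model = xpu.mount(netharn.models.ToyNet2d())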

BUILTINS
  • Training loop boilerplate

  • Snapshots / checkpoints

  • Progress bars (backend choice: [progiter, tqdm])

  • Data provenance of training history in train_info.json

  • Tensorboard metric visualization (optional)

Design Philosophy

Avoid boilerplate; write it yourself when you need it, and don't repeat yourself. Experiments should be strongly tied to their choice of hyperparameters, and the framework should construct its directory hierarchy from those hyperparameters.
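
As a minimal sketch of the idea, assuming ubelt's ub.hash_data as the hasher (illustrative only; the layout netharn actually uses is described under "Features (continued)" below):

>>> import ubelt as ub
>>> # illustrative hyperparameters; in netharn these live in a HyperParams object
>>> hyper = {'model': 'ToyNet2d', 'optimizer': ('SGD', {'lr': 0.01})}
>>> # hash the hyperparameters to get a stable run identifier
>>> hashid = ub.hash_data(hyper)[0:8]
>>> # runs with identical hyperparameters map to the same directory
>>> train_dpath = 'workdir/fit/runs/demo/' + hashid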

SLOGAN

Harness and train.

Usage Patterns
  1. Write your torch objects (i.e. Dataset, Model, Criterion, Initializer, and Scheduler) as you normally would.

  2. Inherit from the netharn.FitHarn object and define run_batch, on_batch, on_epoch, etc...

  3. Create an instance of netharn.HyperParams to specify your dataset, model, criterion, etc...

  4. Create an instance of your FitHarn object with those hyperparameters.

  5. Then execute its run method (a minimal sketch follows this list).

  6. ???

  7. Profit
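
Putting those steps together, a minimal sketch of the pattern (abbreviated; see the Toy Example later in this README for a complete, runnable version):

>>> import netharn
>>> import ubelt as ub
>>> class MyHarn(netharn.FitHarn):
>>>     def on_batch(harn, batch, outputs, loss):
>>>         pass  # compute and log custom batch metrics here
>>> hyper = netharn.HyperParams(**{
>>>     'name'      : 'my_demo',
>>>     'workdir'   : ub.ensure_app_cache_dir('netharn/my_demo'),
>>>     'datasets'  : {'train': netharn.data.ToyData2d(size=3, n=256, rng=0)},
>>>     'model'     : (netharn.models.ToyNet2d, {}),
>>>     'optimizer' : (netharn.optimizers.SGD, {'lr': 0.01}),
>>>     'criterion' : (netharn.criterions.FocalLoss, {}),
>>> })
>>> harn = MyHarn(hyper)
>>> harn.run()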

Examples
  • ToyData2d classification with netharn.models.ToyNet2d (see the doctest in netharn/fit_harn.py:__DOC__:0)

  • MNIST digit classification with MnistNet (netharn/examples/mnist.py)

  • Cifar10 category classification with ResNet50 / dpn92 (netharn/examples/cifar.py)

  • Voc2007+2012 object detection with YOLOv2 (netharn/examples/yolo_voc.py)

  • IBEIS metric learning with SiameseLP (netharn/examples/siam_ibeis.py)

Stability

Mostly harmless. Most tests pass, and the current failures are probably not critical. It works on my machine (tm). At this early stage of development there are still a few pain points. Issues and PRs are welcome.

Known Bugs
  • The metrics for computing detection mAP / AP may not be correct.

  • The YOLO example gets to about 70% mAP (using Girshick's mAP code), whereas we should be hitting 74-76%.

Author Comments
  • The MNIST, CIFAR, and VOC examples will download the data they need.

  • ResNet50 in the CIFAR example gets 95.72% accuracy, outperforming the best DPN92 result (95.16%) that I'm aware of. This result seems real, and I don't believe I've made a mistake in the measurement (but this has not been peer reviewed, so take it with a grain of salt). I have reproduced this result a few times. You can use the code in examples/cifar.py to see if you can too (and please let me know if you can't).

  • The YOLO example is based on EAVise's excellent lightnet (https://gitlab.com/EAVISE/lightnet/) package.

  • I reimplemented the CocoAPI (see netharn.data.coco_api) because I had some (probably minor) issues with the original implementation. I have extended it quite a bit and would recommend using it.

  • The metric-learning example requires the ibeis software: https://github.com/Erotemic/ibeis

Dependencies
  • torch

  • numpy

  • Cython

  • ubelt

  • xdoctest

  • … (see requirements.txt)

Features (continued)

  • Hyperparameter tracking: A hash of your hyperparameters determines the directory that data will be written to. We also allow a "nicer" way of managing the directory structure. Given a HyperParams object, we create the symlink {workdir}/fit/name/{name}, which points to {workdir}/fit/runs/{name}/{hashid}.

  • Automatic restarts: By default, two consecutive calls to FitHarn.run will restart training from where it left off (as long as the hyperparameters were not changed).

  • "Smart" snapshot cleanup: Keeping around model weight files can consume a lot of disk space. Depending on the settings in harn.preferences, netharn.FitHarn will periodically remove older or lower-scoring snapshots.

  • Deployment files: Model weights and architecture are written together to a relatively portable zip file. We also package training metadata to maintain data provenance and simplify the reproduction of experiments.

  • Restarting from any pretrained state: Use netharn.initializers.PretrainedInitializer.

  • Utilities for building networks in torch: Layers like netharn.layers.ConvNormNd make it easy to construct networks for n=1, 2, or 3 dimensional data (a sketch follows this list).

  • Analytic output shapes and receptive fields: Netharn defines a netharn.layers.AnalyticModule that can automatically define forward, output_shape_for, and receptive_field_for if the user defines a special _output_for method. That method is written using the special callables netharn.analytic_for.Output, netharn.analytic_for.Hidden, and netharn.analytic_for.OutputFor.

  • Example tasks: Baseline code for standard tasks (such as object segmentation, classification, and detection) is defined in netharn.examples. The examples also provide example use cases for ndsampler, kwimage, kwannot, and kwplot.
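
A rough sketch of the ConvNormNd idea (the argument order here is an assumption based on the description above; check the netharn.layers docstrings for the actual signature):

>>> import torch
>>> import netharn
>>> # a convolution + normalization + activation block for 2-dimensional data
>>> # (assumed signature: dim, in_channels, out_channels, kernel_size)
>>> layer = netharn.layers.ConvNormNd(2, 3, 8, kernel_size=3)
>>> inputs = torch.rand(4, 3, 32, 32)
>>> outputs = layer(inputs)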

Installation

In the future these instructions may differ from the developer setup instructions, but for now they are the same.

mkdir -p ~/code
git clone git@github.com:Erotemic/netharn.git ~/code/netharn
cd ~/code/netharn
./run_developer_setup.sh

Although all netharn dependencies should be available on pypi (as binary packages with manylinux2010 wheels), there are other packages that are developed alongside netharn. To install the development versions of these dependencies, run python super_setup.py ensure to check out the repos and ensure they are on the correct branch, run python super_setup.py develop to build everything in development mode, and run python super_setup.py pull to update to the latest version on the branch.

Description

A parameterized training harness for PyTorch.

Train models and track your hyperparameters.

This is a clean port of the good parts developed in my research repo: clab.

See the netharn/examples folder for example usage. The doctests are also a good resource. It would be nice if we had better docs.

NetHarn is a research framework for training and deploying arbitrary PyTorch models. It was designed to minimize training-loop boilerplate and to track hyperparameters in order to encourage reproducible research. NetHarn separates the problem of training a model into the following core hyperparameter components: the dataset, model, criterion, initializer, optimizer, and learning rate scheduler. Runs with different hyperparameters are automatically logged to separate directories, which makes it simple to compare the results of two experiments. NetHarn also has the ability to create a single-file deployment of a trained model that is independent of the system it was trained on. This makes it fast and simple for research results to be externally verified and moved into production.
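
For example, a hypothetical sketch of loading such a deployment back into memory (the DeployedModel name and its load_model method are assumptions here; see netharn.export for the actual interface):

>>> from netharn.export import DeployedModel  # assumed location
>>> # path to a deploy zipfile written at the end of a training run,
>>> # e.g. the deploy_ToyNet2d_*.zip files in the log output below
>>> deploy_fpath = 'deploy_ToyNet2d_lnejaaum_009_GAEYQT.zip'
>>> deployed = DeployedModel(deploy_fpath)
>>> model = deployed.load_model()
>>> model.eval()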

Core Callback Structure

Netharn is designed to be used by inheriting from the netharn.FitHarn class, overloading several methods, and then creating a custom FitHarn instance with specific hyperparameters.

FitHarn lets you customize the execution of the training loop through its callback system. You write a callback simply by overloading one of these methods. There are callbacks with default behavior and callbacks without default behavior.

The ones with default behavior directly influence the learning process. While they do not need to be overwritten, they usually should be, because different tasks require different ways of moving data through the training pipeline.

The ones without default behavior allow the developer to execute custom code at special places in the training loop. These are usually used for logging custom metrics and outputting visualizations.

The following listing shows the callback functions, roughly in the order in which they are invoked by the FitHarn.run method. The tree structure denotes loop nesting.

├─ after_initialize (no default) - runs after FitHarn is initialized
│  │
│  ├─ before_epochs (no default) - runs once before all train/vali/test
│  │  │    epochs on each iteration
│  │  │
│  │  ├─ prepare_epoch (no default) - runs before each train, vali,
│  │  │  │    and test epoch
│  │  │  │
│  │  │  ├─ prepare_batch (has default behavior) - transfer data from
│  │  │  │    CPU to the XPU
│  │  │  │
│  │  │  ├─ run_batch (has default behavior) - execute the forward pass
│  │  │  │    and compute the loss
│  │  │  │
│  │  │  ├─ backpropogate (has default behavior) - accumulate gradients
│  │  │  │    and take an optimization step
│  │  │  │
│  │  │  └─ on_batch (no default) - runs after `run_batch` and
│  │  │       `backpropogate` on every batch
│  │  │
│  │  └─ on_epoch (no default) - runs after each train, vali, and test
│  │         epoch finishes.  Any custom scalar metrics returned in a
│  │         dictionary will be recorded by the FitHarn loggers.
│  │
│  └─ after_epochs (no default) - runs after all data splits are
│         finished with the current epoch.
│
└─ on_complete (no default) - runs after the main loop is complete

Given a custom FitHarn class, see the "Toy Example" section for details on how to construct hyperparameters and execute the training loop (i.e. FitHarn.run).
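
For instance, a minimal sketch of overloading a few of these callbacks (illustrative; the batch unpacking in run_batch is an assumption about the dataset's format):

>>> import netharn
>>> class MyHarn(netharn.FitHarn):
>>>     # netharn code conventionally names the self-argument `harn`
>>>     def run_batch(harn, batch):
>>>         # custom forward pass: return (outputs, loss) for one batch
>>>         inputs, labels = batch  # assumes (inputs, labels) pairs
>>>         outputs = harn.model(inputs)
>>>         loss = harn.criterion(outputs, labels)
>>>         return outputs, loss
>>>     def on_batch(harn, batch, outputs, loss):
>>>         pass  # log custom batch metrics here
>>>     def on_epoch(harn):
>>>         # returned scalars are recorded by the FitHarn loggers
>>>         return {'custom_metric': 0.0}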

Developer Setup

In the future these instructions may differ from the installation instructions, but for now they are the same.

sudo apt-get install python3 python-dev python3-dev \
 build-essential libssl-dev libffi-dev \
 libxml2-dev libxslt1-dev zlib1g-dev \
 python-pip

mkdir -p ~/code
git clone git@github.com:Erotemic/netharn.git ~/code/netharn
cd ~/code/netharn

./run_developer_setup.sh

Documentation

Netharn's documentation is currently somewhat sparse. I typically write documentation in docstrings in the code itself. In the future this will likely be consolidated into a read-the-docs style documentation page, but for now you will have to look at the code to read the docs.

The main concept netharn provides is the "FitHarn", which has a reasonably decent module-level docstring, as well as many good class/method-level docstrings: https://gitlab.kitware.com/computer-vision/netharn/-/blob/master/netharn/fit_harn.py

The docstrings in the examples folder contain task-level documentation.

The simplest is the mnist example: https://gitlab.kitware.com/computer-vision/netharn/-/blob/master/netharn/examples/mnist.py

The CIFAR example builds on the mnist example: https://gitlab.kitware.com/computer-vision/netharn/-/blob/master/netharn/examples/cifar.py

I would recommend looking at these two examples, as they are the best documented.

The segmentation example: https://gitlab.kitware.com/computer-vision/netharn/-/blob/master/netharn/examples/segmentation.py

and the object detection example: https://gitlab.kitware.com/computer-vision/netharn/-/blob/master/netharn/examples/object_detection.py

are documented less well, but provide more real-world examples of how to use netharn.

There is an application-specific segmentation example for the CAMVID dataset: https://gitlab.kitware.com/computer-vision/netharn/-/blob/master/netharn/examples/sseg_camvid.py

And an applied example for VOC detection: https://gitlab.kitware.com/computer-vision/netharn/-/blob/master/netharn/examples/yolo_voc.py

This README also contains a Toy Example.

Toy Example

The following example is a doctest in netharn/fit_harn.py. It demonstrates how to use NetHarn to train a model to solve a toy problem.

In this toy problem, we do not extend the netharn.FitHarn object, so we use the default behavior of run_batch. The default on_batch and on_epoch do nothing, so the loss will be the only measure of performance.

For further examples, see the examples directory. These examples show how to extend netharn.FitHarn to measure performance for specific problems. The MNIST and CIFAR examples are the simplest. The YOLO example is more complex. The IBEIS example depends on non-public data / software, but can still be useful to look at. Its complexity is more than CIFAR but less than YOLO.

>>> import netharn
>>> import ubelt as ub
>>> hyper = netharn.HyperParams(**{
>>>     # ================
>>>     # Environment Components
>>>     'name'        : 'demo',
>>>     'workdir'     : ub.ensure_app_cache_dir('netharn/demo'),
>>>     'xpu'         : netharn.XPU.coerce('auto'),
>>>     # workdir is a directory where intermediate results can be saved
>>>     # "name" symlinks <workdir>/fit/name/<name> -> ../runs/<hashid>
>>>     # XPU auto-selects a gpu if one is idle and has VRAM>6GB, else a cpu
>>>     # ================
>>>     # Data Components
>>>     'datasets'    : {  # dict of plain ol torch.data.Dataset instances
>>>         'train': netharn.data.ToyData2d(size=3, border=1, n=256, rng=0),
>>>         'vali': netharn.data.ToyData2d(size=3, border=1, n=64, rng=1),
>>>         'test': netharn.data.ToyData2d(size=3, border=1, n=64, rng=2),
>>>     },
>>>     'loaders'     : {'batch_size': 4}, # DataLoader instances or kw
>>>     # ================
>>>     # Algorithm Components
>>>     # Note the (cls, kw) tuple formatting
>>>     'model'       : (netharn.models.ToyNet2d, {}),
>>>     'optimizer'   : (netharn.optimizers.SGD, {
>>>         'lr': 0.01
>>>     }),
>>>     # focal loss is usually better than netharn.criterions.CrossEntropyLoss
>>>     'criterion'   : (netharn.criterions.FocalLoss, {}),
>>>     'initializer' : (netharn.initializers.KaimingNormal, {
>>>         'param': 0,
>>>     }),
>>>     # The scheduler adjusts learning rate over the training run
>>>     'scheduler'   : (netharn.schedulers.ListedScheduler, {
>>>         'points': {'lr': {0: 0.1, 2: 10.0, 4: .15, 6: .05, 9: .01}},
>>>         'interpolation': 'linear',
>>>     }),
>>>     'monitor'     : (netharn.Monitor, {
>>>         'max_epoch': 10,
>>>         'patience': 7,
>>>     }),
>>>     # dynamics are a config option that modifies the behavior of the main
>>>     # training loop. These parameters affect the learned model.
>>>     'dynamics'   : {'batch_step': 2},
>>> })
>>> harn = netharn.FitHarn(hyper)
>>> # non-algorithmic behavior preferences (do not change learned models)
>>> harn.preferences['num_keep'] = 10
>>> harn.preferences['auto_prepare_batch'] = True
>>> # start training.
>>> harn.initialize(reset='delete')  # delete removes an existing run
>>> harn.run()  # note: run calls initialize if it hasn't already been called.
>>> # xdoc: +IGNORE_WANT

Running this code produces the following output:

RESET HARNESS BY DELETING EVERYTHING IN TRAINING DIR
Symlink: /home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum -> /home/joncrall/.cache/netharn/demo/_mru
... already exists
Symlink: /home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum -> /home/joncrall/.cache/netharn/demo/fit/name/demo
... already exists
... and points to the right place
INFO: Initializing tensorboard (dont forget to start the tensorboard server)
INFO: Model has 824 parameters
INFO: Mounting ToyNet2d model on GPU(0)
INFO: Exported model topology to /home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum/ToyNet2d_2a3f49.py
INFO: Initializing model weights with: <netharn.initializers.nninit_core.KaimingNormal object at 0x7fc67eff0278>
INFO:  * harn.train_dpath = '/home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum'
INFO:  * harn.name_dpath  = '/home/joncrall/.cache/netharn/demo/fit/name/demo'
INFO: Snapshots will save to harn.snapshot_dpath = '/home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum/torch_snapshots'
INFO: ARGV:
    /home/joncrall/.local/conda/envs/py36/bin/python /home/joncrall/.local/conda/envs/py36/bin/ipython
INFO: dont forget to start:
    tensorboard --logdir ~/.cache/netharn/demo/fit/name
INFO: === begin training 0 / 10 : demo ===
epoch lr:0.0001 │ vloss is unevaluated  0/10... rate=0 Hz, eta=?, total=0:00:00, wall=19:36 EST
train loss:0.173 │ 100.00% of 64x8... rate=11762.01 Hz, eta=0:00:00, total=0:00:00, wall=19:36 EST
vali loss:0.170 │ 100.00% of 64x4... rate=9991.94 Hz, eta=0:00:00, total=0:00:00, wall=19:36 EST
test loss:0.170 │ 100.00% of 64x4... rate=24809.37 Hz, eta=0:00:00, total=0:00:00, wall=19:36 EST
INFO: === finish epoch 0 / 10 : demo ===
epoch lr:0.00505 │ vloss: 0.1696 (n_bad=00, best=0.1696)  1/10... rate=1.24 Hz, eta=0:00:07, total=0:00:00, wall=19:36 EST
train loss:0.175 │ 100.00% of 64x8... rate=13522.14 Hz, eta=0:00:00, total=0:00:00, wall=19:36 EST
vali loss:0.167 │ 100.00% of 64x4... rate=23598.31 Hz, eta=0:00:00, total=0:00:00, wall=19:36 EST
test loss:0.167 │ 100.00% of 64x4... rate=20354.22 Hz, eta=0:00:00, total=0:00:00, wall=19:36 EST
INFO: === finish epoch 1 / 10 : demo ===
epoch lr:0.01 │ vloss: 0.1685 (n_bad=00, best=0.1685)  2/10... rate=1.28 Hz, eta=0:00:06, total=0:00:01, wall=19:36 EST
train loss:0.177 │ 100.00% of 64x8... rate=15723.99 Hz, eta=0:00:00, total=0:00:00, wall=19:36 EST
vali loss:0.163 │ 100.00% of 64x4... rate=29375.56 Hz, eta=0:00:00, total=0:00:00, wall=19:36 EST
test loss:0.163 │ 100.00% of 64x4... rate=29664.69 Hz, eta=0:00:00, total=0:00:00, wall=19:36 EST
INFO: === finish epoch 2 / 10 : demo ===

<JUST MORE OF THE SAME; REMOVED FOR BREVITY>

epoch lr:0.001 │ vloss: 0.1552 (n_bad=00, best=0.1552)  9/10... rate=1.11 Hz, eta=0:00:00, total=0:00:08, wall=19:36 EST
train loss:0.164 │ 100.00% of 64x8... rate=13795.93 Hz, eta=0:00:00, total=0:00:00, wall=19:36 EST
vali loss:0.154 │ 100.00% of 64x4... rate=19796.72 Hz, eta=0:00:00, total=0:00:00, wall=19:36 EST
test loss:0.154 │ 100.00% of 64x4... rate=21396.73 Hz, eta=0:00:00, total=0:00:00, wall=19:36 EST
INFO: === finish epoch 9 / 10 : demo ===
epoch lr:0.001 │ vloss: 0.1547 (n_bad=00, best=0.1547) 10/10... rate=1.13 Hz, eta=0:00:00, total=0:00:08, wall=19:36 EST




INFO: Maximum harn.epoch reached, terminating ...
INFO:



INFO: training completed
INFO: harn.train_dpath = '/home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum'
INFO: harn.name_dpath  = '/home/joncrall/.cache/netharn/demo/fit/name/demo'
INFO: view tensorboard results for this run via:
    tensorboard --logdir ~/.cache/netharn/demo/fit/name
[DEPLOYER] Deployed zipfpath=/home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum/deploy_ToyNet2d_lnejaaum_009_GAEYQT.zip
INFO: wrote single-file deployment to: '/home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum/deploy_ToyNet2d_lnejaaum_009_GAEYQT.zip'
INFO: exiting fit harness.

Furthermore, if you run that code with '--verbose' in sys.argv, it produces this more detailed description of its operation:

RESET HARNESS BY DELETING EVERYTHING IN TRAINING DIR
Symlink: /home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum -> /home/joncrall/.cache/netharn/demo/_mru
... already exists
Symlink: /home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum -> /home/joncrall/.cache/netharn/demo/fit/name/demo
... already exists
... and points to the right place
DEBUG: Initialized logging
INFO: Initializing tensorboard (dont forget to start the tensorboard server)
DEBUG: harn.train_info[hyper] = {
    'model': (
        'netharn.models.toynet.ToyNet2d',
        {
            'input_channels': 1,
            'num_classes': 2,
        },
    ),
    'initializer': (
        'netharn.initializers.nninit_core.KaimingNormal',
        {
            'mode': 'fan_in',
            'param': 0,
        },
    ),
    'optimizer': (
        'torch.optim.sgd.SGD',
        {
            'dampening': 0,
            'lr': 0.0001,
            'momentum': 0,
            'nesterov': False,
            'weight_decay': 0,
        },
    ),
    'scheduler': (
        'netharn.schedulers.scheduler_redesign.ListedScheduler',
        {
            'interpolation': 'linear',
            'optimizer': None,
            'points': {'lr': {0: 0.0001, 2: 0.01, 5: 0.015, 6: 0.005, 9: 0.001}},
        },
    ),
    'criterion': (
        'netharn.criterions.focal.FocalLoss',
        {
            'focus': 2,
            'ignore_index': -100,
            'reduce': None,
            'reduction': 'mean',
            'size_average': None,
            'weight': None,
        },
    ),
    'loader': (
        'torch.utils.data.dataloader.DataLoader',
        {
            'batch_size': 64,
        },
    ),
    'dynamics': (
        'Dynamics',
        {
            'batch_step': 4,
            'grad_norm_max': None,
        },
    ),
}
DEBUG: harn.hyper = <netharn.hyperparams.HyperParams object at 0x7fb19b4b8748>
DEBUG: make XPU
DEBUG: harn.xpu = <XPU(GPU(0)) at 0x7fb12af24668>
DEBUG: Criterion: FocalLoss
DEBUG: Optimizer: SGD
DEBUG: Scheduler: ListedScheduler
DEBUG: Making loaders
DEBUG: Making model
DEBUG: ToyNet2d(
  (layers): Sequential(
    (0): Conv2d(1, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (1): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (2): ReLU(inplace)
    (3): Conv2d(8, 8, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
    (4): BatchNorm2d(8, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
    (5): ReLU(inplace)
    (6): Conv2d(8, 2, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  )
  (softmax): Softmax()
)
INFO: Model has 824 parameters
INFO: Mounting ToyNet2d model on GPU(0)
DEBUG: Making initializer
DEBUG: Move FocalLoss() model to GPU(0)
DEBUG: Make optimizer
DEBUG: Make scheduler
DEBUG: Make monitor
DEBUG: Make dynamics
INFO: Exported model topology to /home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum/ToyNet2d_2a3f49.py
INFO: Initializing model weights with: <netharn.initializers.nninit_core.KaimingNormal object at 0x7fb129e732b0>
DEBUG: calling harn.initializer=<netharn.initializers.nninit_core.KaimingNormal object at 0x7fb129e732b0>
INFO:  * harn.train_dpath = '/home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum'
INFO:  * harn.name_dpath  = '/home/joncrall/.cache/netharn/demo/fit/name/demo'
INFO: Snapshots will save to harn.snapshot_dpath = '/home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum/torch_snapshots'
INFO: ARGV:
    /home/joncrall/.local/conda/envs/py36/bin/python /home/joncrall/.local/conda/envs/py36/bin/ipython --verbose
INFO: dont forget to start:
    tensorboard --logdir ~/.cache/netharn/demo/fit/name
INFO: === begin training 0 / 10 : demo ===
DEBUG: epoch lr:0.0001 │ vloss is unevaluated
epoch lr:0.0001 │ vloss is unevaluated  0/10... rate=0 Hz, eta=?, total=0:00:00, wall=19:56 EST
DEBUG: === start epoch 0 ===
DEBUG: log_value(epoch lr, 0.0001, 0
DEBUG: log_value(epoch momentum, 0, 0
DEBUG: _run_epoch 0, tag=train, learn=True
DEBUG:  * len(loader) = 8
DEBUG:  * loader.batch_size = 64
train loss:-1.000 │ 0.00% of 64x8... rate=0 Hz, eta=?, total=0:00:00, wall=19:56 ESTDEBUG: Making batch iterator
DEBUG: Starting batch iteration for tag=train, epoch=0
train loss:0.224 │ 100.00% of 64x8... rate=12052.25 Hz, eta=0:00:00, total=0:00:00, wall=19:56 EST
DEBUG: log_value(train epoch loss, 0.22378234565258026, 0
DEBUG: Finished batch iteration for tag=train, epoch=0
DEBUG: _run_epoch 0, tag=vali, learn=False
DEBUG:  * len(loader) = 4
DEBUG:  * loader.batch_size = 64
vali loss:-1.000 │ 0.00% of 64x4... rate=0 Hz, eta=?, total=0:00:00, wall=19:56 ESTDEBUG: Making batch iterator
DEBUG: Starting batch iteration for tag=vali, epoch=0
vali loss:0.175 │ 100.00% of 64x4... rate=23830.75 Hz, eta=0:00:00, total=0:00:00, wall=19:56 EST
DEBUG: log_value(vali epoch loss, 0.1749105490744114, 0
DEBUG: Finished batch iteration for tag=vali, epoch=0
DEBUG: epoch lr:0.0001 │ vloss: 0.1749 (n_bad=00, best=0.1749)
DEBUG: _run_epoch 0, tag=test, learn=False
DEBUG:  * len(loader) = 4
DEBUG:  * loader.batch_size = 64
test loss:-1.000 │ 0.00% of 64x4... rate=0 Hz, eta=?, total=0:00:00, wall=19:56 ESTDEBUG: Making batch iterator
DEBUG: Starting batch iteration for tag=test, epoch=0
test loss:0.176 │ 100.00% of 64x4... rate=28606.65 Hz, eta=0:00:00, total=0:00:00, wall=19:56 EST
DEBUG: log_value(test epoch loss, 0.17605290189385414, 0
DEBUG: Finished batch iteration for tag=test, epoch=0
DEBUG: Saving snapshot to /home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum/torch_snapshots/_epoch_00000000.pt
DEBUG: Snapshot saved to /home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum/torch_snapshots/_epoch_00000000.pt
DEBUG: new best_snapshot /home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum/torch_snapshots/_epoch_00000000.pt
DEBUG: Plotting tensorboard data
Populating the interactive namespace from numpy and matplotlib
INFO: === finish epoch 0 / 10 : demo ===

<JUST MORE OF THE SAME; REMOVED FOR BREVITY>

INFO: === finish epoch 8 / 10 : demo ===
DEBUG: epoch lr:0.001 │ vloss: 0.2146 (n_bad=08, best=0.1749)
epoch lr:0.001 │ vloss: 0.2146 (n_bad=08, best=0.1749)  9/10... rate=1.20 Hz, eta=0:00:00, total=0:00:07, wall=19:56 EST
DEBUG: === start epoch 9 ===
DEBUG: log_value(epoch lr, 0.001, 9
DEBUG: log_value(epoch momentum, 0, 9
DEBUG: _run_epoch 9, tag=train, learn=True
DEBUG:  * len(loader) = 8
DEBUG:  * loader.batch_size = 64
train loss:-1.000 │ 0.00% of 64x8... rate=0 Hz, eta=?, total=0:00:00, wall=19:56 ESTDEBUG: Making batch iterator
DEBUG: Starting batch iteration for tag=train, epoch=9
train loss:0.207 │ 100.00% of 64x8... rate=13580.13 Hz, eta=0:00:00, total=0:00:00, wall=19:56 EST
DEBUG: log_value(train epoch loss, 0.2070118673145771, 9
DEBUG: Finished batch iteration for tag=train, epoch=9
DEBUG: _run_epoch 9, tag=vali, learn=False
DEBUG:  * len(loader) = 4
DEBUG:  * loader.batch_size = 64
vali loss:-1.000 │ 0.00% of 64x4... rate=0 Hz, eta=?, total=0:00:00, wall=19:56 ESTDEBUG: Making batch iterator
DEBUG: Starting batch iteration for tag=vali, epoch=9
vali loss:0.215 │ 100.00% of 64x4... rate=29412.91 Hz, eta=0:00:00, total=0:00:00, wall=19:56 EST
DEBUG: log_value(vali epoch loss, 0.21514184772968292, 9
DEBUG: Finished batch iteration for tag=vali, epoch=9
DEBUG: epoch lr:0.001 │ vloss: 0.2148 (n_bad=09, best=0.1749)
DEBUG: _run_epoch 9, tag=test, learn=False
DEBUG:  * len(loader) = 4
DEBUG:  * loader.batch_size = 64
test loss:-1.000 │ 0.00% of 64x4... rate=0 Hz, eta=?, total=0:00:00, wall=19:56 ESTDEBUG: Making batch iterator
DEBUG: Starting batch iteration for tag=test, epoch=9
test loss:0.216 │ 100.00% of 64x4... rate=25906.58 Hz, eta=0:00:00, total=0:00:00, wall=19:56 EST
DEBUG: log_value(test epoch loss, 0.21618007868528366, 9
DEBUG: Finished batch iteration for tag=test, epoch=9
DEBUG: Saving snapshot to /home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum/torch_snapshots/_epoch_00000009.pt
DEBUG: Snapshot saved to /home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum/torch_snapshots/_epoch_00000009.pt
DEBUG: Plotting tensorboard data
INFO: === finish epoch 9 / 10 : demo ===
DEBUG: epoch lr:0.001 │ vloss: 0.2148 (n_bad=09, best=0.1749)
epoch lr:0.001 │ vloss: 0.2148 (n_bad=09, best=0.1749) 10/10... rate=1.21 Hz, eta=0:00:00, total=0:00:08, wall=19:56 EST




INFO: Maximum harn.epoch reached, terminating ...
INFO:



INFO: training completed
INFO: harn.train_dpath = '/home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum'
INFO: harn.name_dpath  = '/home/joncrall/.cache/netharn/demo/fit/name/demo'
INFO: view tensorboard results for this run via:
    tensorboard --logdir ~/.cache/netharn/demo/fit/name
[DEPLOYER] Deployed zipfpath=/home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum/deploy_ToyNet2d_lnejaaum_000_JWPNDC.zip
INFO: wrote single-file deployment to: '/home/joncrall/.cache/netharn/demo/fit/runs/demo/lnejaaum/deploy_ToyNet2d_lnejaaum_000_JWPNDC.zip'
INFO: exiting fit harness.

Consumer Packages

The bioharn package (https://gitlab.kitware.com/jon.crall/bioharn) implements extensions of the classifier and detector examples in the netharn/examples folder, along with prediction and evaluation scripts.
