CLANA is a toolkit for classifier analysis.
Project description
clana
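clana is a library and command-line application for visualizing confusion matrices of classifiers with a large number of classes. The two main contributions of clana are Confusion Matrix Ordering (CMO), as explained in chapter 5 of "Analysis and Optimization of Convolutional Neural Network Architectures", and an optimization algorithm to achieve it. The CMO technique can be applied to any multi-class classifier and helps to understand which groups of classes are most similar.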
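As a rough illustration of what ordering a confusion matrix means, the sketch below scores an ordering by weighting every entry by its distance from the diagonal, then tries every class permutation of a tiny 3-class matrix. The function names (`cmo_score`, `reorder`, `brute_force_order`) and the distance-weighted objective are assumptions made for this example; they are not taken from clana's API, and exhaustive search is only feasible for a handful of classes.

```python
from itertools import permutations

import numpy as np


def cmo_score(cm: np.ndarray) -> int:
    """Sum of all entries, each weighted by |row - column| (assumed objective)."""
    idx = np.arange(cm.shape[0])
    weights = np.abs(idx[:, None] - idx[None, :])
    return int((cm * weights).sum())


def reorder(cm: np.ndarray, perm) -> np.ndarray:
    """Apply the same class permutation to rows and columns."""
    perm = np.asarray(perm)
    return cm[np.ix_(perm, perm)]


def brute_force_order(cm: np.ndarray):
    """Try every permutation and keep the one with the lowest score."""
    n = cm.shape[0]
    return min(permutations(range(n)), key=lambda p: cmo_score(reorder(cm, p)))


# Classes 0 and 2 are confused with each other; class 1 is never confused.
cm = np.array([[10, 0, 5],
               [0, 10, 0],
               [5, 0, 10]])
best = brute_force_order(cm)
print("score before:", cmo_score(cm))
print("best order:", best, "score after:", cmo_score(reorder(cm, best)))
```

In the best ordering, classes 0 and 2 sit next to each other, which is the kind of grouping the reordered matrix is meant to expose.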
Installation
The recommended way to install clana is:
$ pip install clana --user --upgrade
If you want to install the latest version:
$ git clone https://github.com/MartinThoma/clana.git; cd clana
$ pip install -e . --user
Usage
$ clana --help
Usage: clana [OPTIONS] COMMAND [ARGS]...
Clana is a toolkit for classifier analysis.
See https://arxiv.org/abs/1707.09725, Chapter 4.
Options:
--version Show the version and exit.
--help Show this message and exit.
Commands:
distribution Get the distribution of classes in a dataset.
get-cm Generate a confusion matrix from predictions and ground...
get-cm-simple Generate a confusion matrix.
visualize Optimize and visualize a confusion matrix.
The visualize command gives you images like this:
MNIST example
$ cd docs/
$ python mnist_example.py # creates `train-pred.csv` and `test-pred.csv`
$ clana get-cm --gt gt-train.csv --predictions train-pred.csv --n 10
2019-09-14 09:47:30,655 - root - INFO - cm was written to 'cm.json'
$ clana visualize --cm cm.json --zero_diagonal
Score: 13475
2019-09-14 09:49:41,593 - root - INFO - n=10
2019-09-14 09:49:41,593 - root - INFO - ## Starting Score: 13475.00
2019-09-14 09:49:41,594 - root - INFO - Current: 13060.00 (best: 13060.00, hot_prob_thresh=100.0000%, step=0, swap=False)
[...]
2019-09-14 09:49:41,606 - root - INFO - Current: 9339.00 (best: 9339.00, hot_prob_thresh=100.0000%, step=238, swap=False)
Score: 9339
Perm: [0, 6, 5, 8, 3, 2, 1, 7, 9, 4]
2019-09-14 09:49:41,639 - root - INFO - Classes: [0, 6, 5, 8, 3, 2, 1, 7, 9, 4]
Accuracy: 93.99%
2019-09-14 09:49:41,725 - root - INFO - Save figure at '/home/moose/confusion_matrix.tmp.pdf'
2019-09-14 09:49:41,876 - root - INFO - Found threshold for local connection: 398
2019-09-14 09:49:41,876 - root - INFO - Found 9 clusters
2019-09-14 09:49:41,877 - root - INFO - silhouette_score=-0.012313948323292875
1: [0]
1: [6]
1: [5]
1: [8]
1: [3]
1: [2]
1: [1]
2: [7, 9]
1: [4]
This produces a figure of the reordered confusion matrix.
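To make the quantities in the session above concrete, here is a minimal NumPy sketch of how a confusion matrix can be built from ground truth and predictions, how an accuracy figure follows from it, and what zeroing the diagonal does. The helper names and the convention that rows are true classes and columns are predicted classes are assumptions for this example; clana's own file formats and internals are not shown here.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, n_classes: int) -> np.ndarray:
    """Count (true class, predicted class) pairs; rows = truth (assumed convention)."""
    cm = np.zeros((n_classes, n_classes), dtype=int)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm


def accuracy(cm: np.ndarray) -> float:
    """Fraction of samples on the diagonal, i.e. correctly classified."""
    return float(np.trace(cm) / cm.sum())


def zero_diagonal(cm: np.ndarray) -> np.ndarray:
    """Return a copy with correct predictions removed, leaving only confusions."""
    out = cm.copy()
    np.fill_diagonal(out, 0)
    return out


# Hypothetical toy data: 7 samples, 3 classes.
y_true = [0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0, 2]

cm = confusion_matrix(y_true, y_pred, n_classes=3)
print(cm)
print(f"Accuracy: {accuracy(cm):.2%}")
print(zero_diagonal(cm))
```

Zeroing the diagonal keeps the large counts of correct predictions from dominating the color scale, so the off-diagonal confusions become visible.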
Label Manipulation
Prepare a labels.csv file; it must have a header row.
$ clana visualize --cm cm.json --zero_diagonal --labels mnist/labels.csv
Data distribution
$ clana distribution --gt gt.csv --labels labels.csv [--out out/] [--long]
It prints one line per label, e.g.:
60% cat (56789 elements)
20% dog (12345 elements)
5% mouse (1337 elements)
1% tux (314 elements)
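If --out is specified, it creates a horizontal bar chart. The first bar is the most common class, the second bar is the second most common class, and so on.
It uses the short labels unless --long is added to the command.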
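The per-label lines above can be reproduced in spirit with a few lines of standard-library Python. This is a sketch under assumptions: `class_distribution` and `format_lines` are hypothetical helpers, the input is an in-memory list rather than clana's gt.csv, and the rounding may differ from what the command prints.

```python
from collections import Counter


def class_distribution(labels):
    """Return (label, count, percentage) tuples, most common class first."""
    counts = Counter(labels)
    total = sum(counts.values())
    return [(label, count, 100.0 * count / total)
            for label, count in counts.most_common()]


def format_lines(distribution):
    """Format each class similarly to the output shown above."""
    return [f"{pct:.0f}% {label} ({count} elements)"
            for label, count, pct in distribution]


# Hypothetical ground-truth labels.
labels = ["cat"] * 6 + ["dog"] * 3 + ["mouse"]
for line in format_lines(class_distribution(labels)):
    print(line)
```

Sorting by frequency mirrors the bar chart order described above, where the first bar is the most common class.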
Visualizations
See visualizations.
Usage as a library
>>> import numpy as np
>>> arr = np.array([[9, 4, 7, 3, 8, 5, 2, 8, 7, 6],
...                 [4, 9, 2, 8, 5, 8, 7, 3, 6, 7],
...                 [7, 2, 9, 1, 6, 3, 0, 8, 5, 4],
...                 [3, 8, 1, 9, 4, 7, 8, 2, 5, 6],
...                 [8, 5, 6, 4, 9, 6, 3, 7, 8, 7],
...                 [5, 8, 3, 7, 6, 9, 6, 4, 7, 8],
...                 [2, 7, 0, 8, 3, 6, 9, 1, 4, 5],
...                 [8, 3, 8, 2, 7, 4, 1, 9, 6, 5],
...                 [7, 6, 5, 5, 8, 7, 4, 6, 9, 8],
...                 [6, 7, 4, 6, 7, 8, 5, 5, 8, 9]])
>>> from clana.optimize import simulated_annealing
>>> result = simulated_annealing(arr)
>>> result.cm
array([[9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
[8, 9, 8, 7, 6, 5, 4, 3, 2, 1],
[7, 8, 9, 8, 7, 6, 5, 4, 3, 2],
[6, 7, 8, 9, 8, 7, 6, 5, 4, 3],
[5, 6, 7, 8, 9, 8, 7, 6, 5, 4],
[4, 5, 6, 7, 8, 9, 8, 7, 6, 5],
[3, 4, 5, 6, 7, 8, 9, 8, 7, 6],
[2, 3, 4, 5, 6, 7, 8, 9, 8, 7],
[1, 2, 3, 4, 5, 6, 7, 8, 9, 8],
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]])
>>> result.perm
array([2, 7, 0, 4, 8, 9, 5, 1, 3, 6])
You can visualize result.cm and use result.perm to get your labels in the same order:
# Just some example labels
# ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
>>> labels = [str(el) for el in range(11)]
>>> np.array(labels)[result.perm]
array(['2', '7', '0', '4', '8', '9', '5', '1', '3', '6'], dtype='<U2')
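The relationship between the input matrix, the permutation, and the reordered matrix can be checked with plain NumPy. The sketch below does not call clana; it takes the `arr` and `result.perm` values shown above as given and applies the permutation to both rows and columns by hand.

```python
import numpy as np

# Input matrix and permutation copied from the example above.
arr = np.array([[9, 4, 7, 3, 8, 5, 2, 8, 7, 6],
                [4, 9, 2, 8, 5, 8, 7, 3, 6, 7],
                [7, 2, 9, 1, 6, 3, 0, 8, 5, 4],
                [3, 8, 1, 9, 4, 7, 8, 2, 5, 6],
                [8, 5, 6, 4, 9, 6, 3, 7, 8, 7],
                [5, 8, 3, 7, 6, 9, 6, 4, 7, 8],
                [2, 7, 0, 8, 3, 6, 9, 1, 4, 5],
                [8, 3, 8, 2, 7, 4, 1, 9, 6, 5],
                [7, 6, 5, 5, 8, 7, 4, 6, 9, 8],
                [6, 7, 4, 6, 7, 8, 5, 5, 8, 9]])
perm = np.array([2, 7, 0, 4, 8, 9, 5, 1, 3, 6])

# Reorder rows and columns with the same permutation.
reordered = arr[np.ix_(perm, perm)]

# The reordered matrix is banded: entry (i, j) equals 9 - |i - j|.
idx = np.arange(10)
banded = 9 - np.abs(idx[:, None] - idx[None, :])

print(reordered[0])  # [9 8 7 6 5 4 3 2 1 0]
print(np.array_equal(reordered, banded))  # True
```

The same indexing applies to one-dimensional label arrays, which is why `np.array(labels)[result.perm]` yields the labels in the order of the reordered matrix.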
Download files
Source distribution
clana-0.4.1.tar.gz (21.5 kB)
Built distribution
clana-0.4.1-py3-none-any.whl (23.9 kB)
Hashes for clana-0.4.1.tar.gz
Algorithm | Hash digest
---|---
SHA256 | ae17fe68c210ff2234ebf067e37087204f0230af386c00c43e316c1362d40f5c
MD5 | aa278f9ef30c7aeba2c3b335ef480cba
BLAKE2b-256 | 7ed605952905917360df922718a0b15bcc123565ee21c46bf42493115070fd45
Hashes for clana-0.4.1-py3-none-any.whl
Algorithm | Hash digest
---|---
SHA256 | e0d81d2c7eb054527f1a36ae0a66e91713f6eada7fc06a5581d6e5bd6e4ba2de
MD5 | 3585cef585c8a21d251a9c785e7a397f
BLAKE2b-256 | 75681d9b34e4085da8595108dc994864cd66f0b1285bcdaa3bf3bf1ae9a30a94