跳转到主要内容

Python 接口到 KPartiteKClique

项目描述

PyKPartiteKClique

https://github.com/kliem/KPartiteKClique 的 Python 封装。

遍历 k-分割图中所有 k-完全子图。

需求

  • setuptools
  • Cython

快速入门

一个简单的例子

>>> from kpkc import KCliqueIterator
>>> edges = [[1, 2]]
>>> parts = [[1], [2]]
>>> it = KCliqueIterator(edges, parts)
>>> list(it)
[[1, 2]]

默认算法是 kpkc,它首先选择具有较少边的节点

>>> parts = [[1, 2, 3, 4], [5, 6, 7, 8, 9]]
>>> edges = [[1, 6], [5, 2], [5, 3]]
>>> edges += [[i, j] for i in range(2, 5) for j in range(6, 10)]
>>> it = KCliqueIterator(edges, parts)
>>> list(it)[:3]
[[1, 6], [3, 5], [2, 5]]

算法 FindClique 首先选择具有较少节点的部分

>>> parts = [[1, 2, 3, 4], [5, 6, 7, 8, 9]]
>>> edges = [[1, 6], [5, 2], [5, 3]]
>>> edges += [[i, j] for i in range(2, 5) for j in range(6, 10)]
>>> it = KCliqueIterator(edges, parts, algorithm='FindClique')
>>> list(it)
[[1, 6], [2, 5], [2, 6], [2, 7], [2, 8], [2, 9], [3, 5], [3, 6], [3, 7], [3, 8], [3, 9], [4, 6], [4, 7], [4, 8], [4, 9]]

基准测试

我们对以下算法/实现进行了基准测试

  • kpkc(我们的实现)
  • FindClique(我们的实现)
  • Cliquer(通过 SageMath 暴露)
  • networkx
  • mcqd(通过 SageMath 暴露)

为此,我们使用了三种类型的图

  • 位于 sample_graphs/ 中的图,可以使用 kpkc.test.load_tester 进行测试。

  • 我们还对具有参数 (k, min_s, max_s, a_1, a_2) 的随机图进行了基准测试,其中 k 是部分的数量,每个部分的大小在 [min_s, max_s] 范围内均匀分布。每个顶点 v 被分配一个从 [a_1, a_2] 均匀分布中选择的随机浮点数 p(v)。对于来自不同部分的任意一对 vw,边以概率 (p(v) + p(w))/2 生成。

    这种方法在以下文献中描述

    • Grünert, Tore & Irnich, Stefan & Zimmermann, Hans-Jürgen & Schneider, Markus & Wulfhorst, Burkhard. (2001). K-分割图中的团及其在纺织工程中的应用

    可以使用 kpkc.test.get_random_k_partite_graph(k, min_s, max_s, a_1, a_2) 获取此类随机图。

  • 此外,我们使用参数 (k, max_s, a) 对示例进行了基准测试,其中 k 表示部分数量。对于 i 在 1, ..., k 的范围内,部分的大小为 1 + ((max_s -1) * i) // k

    f 是由 f(1) = 1f(max_s) = a 确定的仿射函数。对于来自不同部分且大小分别为 st 的所有成对 vw,边以概率 f(min(s, t)) 生成。

    这意味着大小为 1 的部分将具有所有邻居,而部分中的顶点越多,其边的密度就越低。

    可以通过 kpkc.test.get_random_k_partite_graph_2(k, max_s, a) 获取这样的随机图。

    在许多情况下,这比上面的随机图更自然。如果 k-完全图对应于某些匹配,那么这对应于选择越少,人们就越不挑剔的事实。例如

    假设该地区只有一个水泥厂,两个混凝土泵,二十辆混凝土搅拌车和二十个混凝土施工队。没有人会质疑水泥厂的质量,因为没有替代品。由于只有两个混凝土泵,卡车司机通常会愿意与两者都合作。同样,混凝土施工队通常会忍受两个泵操作员。然而,混凝土施工队可能会拒绝与一些卡车司机(总是迟到)合作,或者卡车司机可能会拒绝与一些施工队合作(总是订购比需要的更多的卡车)。

    特别是,sample_graphs/ 中的图表现得有点像这样:较小部分中的顶点比较大部分中的顶点有更多的邻居。有超过 1400 万个这样的图是我们想要检查的。这可以通过 kpkc 实现,而其他实现则似乎不可行。

结果是在 Intel i7-7700 CPU @3.60GHz 上获得的。

检查 k-完全图

我们测量确定完全图数量或找到第一个 k-完全图(如果有的话)所需的时间。

请注意,sample_graphs 中的图没有 k-完全图。

nan 表示计算在 1000 秒后中断(未确定)。

kpkc FindClique networkx Cliquer mcqd
0 1.73e+01 nan nan nan nan
1 1.73e+01 nan nan nan nan
2 1.72e+01 nan nan nan nan
20 5.07e+00 nan nan nan nan
100 1.74e+01 nan nan nan nan
1000 4.89e-01 2.15e+01 nan nan nan
10000 1.78e-01 3.62e+00 nan 6.26e+02 nan
1000000 1.32e+00 nan nan nan nan
2000000 1.58e-01 7.63e-01 nan 9.44e+00 nan
5000000 1.87e-01 7.95e+00 nan nan nan
10000000 2.25e-02 6.11e-03 nan 1.24e+00 nan
(5, 50, 50, 0.14, 0.14) 5.61e-04 2.31e-05 1.08e-02 1.28e-03 1.20e-03
(5, 50, 50, 0.15, 0.15) 2.78e-04 2.00e-05 3.57e-03 1.39e-03 1.27e-03
(5, 50, 50, 0.2, 0.2) 4.17e-05 5.01e-06 1.77e-03 1.67e-03 1.60e-03
(5, 50, 50, 0.25, 0.25) 5.44e-05 3.34e-06 1.36e-03 2.21e-03 2.06e-03
(5, 50, 50, 0.0, 0.3) 1.93e-04 4.77e-06 1.22e-03 1.27e-03 1.26e-03
(5, 50, 50, 0.0, 0.4) 3.12e-05 3.34e-06 1.15e-03 1.57e-03 1.58e-03
(5, 50, 50, 0.0, 0.45) 1.00e-05 3.34e-06 1.34e-03 2.13e-03 1.97e-03
(5, 50, 50, 0.0, 0.5) 1.22e-05 3.34e-06 1.42e-03 2.12e-03 2.00e-03
(10, 26, 37, 0.49, 0.49) 3.48e-03 5.84e-05 2.59e-01 4.55e-02 4.47e-02
(10, 26, 37, 0.5, 0.5) 9.58e-05 5.25e-06 3.51e-02 5.70e-02 3.83e-02
(10, 26, 37, 0.51, 0.51) 1.39e-03 1.14e-05 7.50e-01 1.10e-01 6.55e-02
(10, 26, 37, 0.4, 0.6) 1.88e-03 5.96e-06 2.17e-01 7.19e-02 4.76e-02
(10, 26, 37, 0.3, 0.7) 6.03e-04 1.03e-05 1.28e-02 1.52e-01 5.06e-02
(10, 50, 50, 0.42, 0.42) 1.33e-02 2.52e-04 1.09e+00 1.07e-01 1.13e-01
(10, 50, 50, 0.43, 0.43) 4.42e-03 1.56e-04 1.70e+00 1.80e-01 1.31e-01
(10, 50, 50, 0.44, 0.44) 2.25e-02 8.18e-05 3.16e-01 2.14e-01 1.70e-01
(10, 50, 50, 0.46, 0.46) 5.23e-03 2.17e-05 2.67e-02 3.24e-01 2.75e-01
(10, 50, 50, 0.48, 0.48) 9.79e-04 3.22e-05 4.92e-02 4.04e-01 3.95e-01
(10, 50, 50, 0.5, 0.5) 7.41e-04 9.78e-06 1.58e-02 1.16e+00 6.94e-01
(50, 5, 15, 0.91, 0.91) nan 6.35e-02 nan nan nan
(50, 5, 15, 0.918, 0.918) nan 2.81e-02 nan nan nan
(50, 5, 15, 0.92, 0.92) nan 7.89e-03 nan nan nan
(20, 23, 39, 0.7, 0.7) 1.87e+02 1.06e-01 nan nan nan
(20, 23, 39, 0.71, 0.71) 4.07e+01 5.18e-02 nan nan nan
(20, 23, 39, 0.72, 0.72) 5.12e+00 1.81e-03 nan nan nan
(20, 23, 39, 0.7, 0.73) 2.57e+00 3.55e-03 nan nan nan
(20, 23, 39, 0.65, 0.78) 1.04e+00 7.39e-05 nan nan nan
(30, 11, 30, 0.6, 0.6) 4.29e-01 1.29e-04 nan 7.96e+02 1.21e+02
(30, 11, 30, 0.7, 0.7) 1.63e+01 1.40e-03 nan nan nan
(30, 11, 30, 0.8, 0.8) nan 1.16e+00 nan nan nan
(30, 11, 30, 0.81, 0.81) nan 1.33e+00 nan nan nan
(30, 11, 30, 0.82, 0.82) nan 4.36e-01 nan nan nan
(30, 11, 30, 0.84, 0.84) nan 5.73e-03 nan nan nan
(30, 11, 30, 0.88, 0.88) 1.44e+00 1.22e-05 nan nan nan
(100, 10, 10, 0.7, 0.7) 5.20e-02 4.67e-05 nan nan nan
(100, 10, 10, 0.8, 0.8) 4.76e+00 3.47e-04 nan nan nan
(100, 10, 10, 0.85, 0.85) 2.21e+02 1.94e-03 nan nan nan
(100, 10, 10, 0.9, 0.9) nan 1.49e-01 nan nan nan
(100, 10, 10, 0.92, 0.92) nan 4.19e+00 nan nan nan
(100, 10, 10, 0.94, 0.94) nan nan nan nan nan
(100, 10, 10, 0.95, 0.95) nan nan nan nan nan
(100, 10, 10, 0.97, 0.97) nan 2.61e-03 nan nan nan
(3, 100, 100, 0.1, 0.1) 1.00e-05 2.62e-06 9.16e-04 1.29e-03 1.20e-03
(4, 100, 100, 0.15, 0.15) 1.45e-05 3.81e-06 2.01e-03 3.61e-03 3.47e-03
(5, 100, 100, 0.2, 0.2) 3.41e-05 3.34e-06 4.94e-03 8.59e-03 8.80e-03
(6, 100, 100, 0.25, 0.25) 1.02e-04 6.44e-06 8.21e-03 2.33e-02 2.31e-02
(7, 50, 50, 0.35, 0.35) 3.46e-05 6.91e-06 7.86e-03 1.26e-02 1.15e-02
(8, 50, 50, 0.4, 0.4) 1.40e-04 6.68e-06 1.17e-02 3.85e-02 3.41e-02
(9, 50, 50, 0.45, 0.45) 1.50e-04 5.98e-05 8.81e-03 1.71e-01 1.25e-01
(10, 50, 50, 0.5, 0.5) 8.56e-04 1.24e-05 1.43e-01 1.48e+00 6.43e-01
(5, 10, 0.1) 6.91e-06 2.86e-06 1.08e-04 8.94e-05 1.07e-04
(5, 10, 0.2) 5.48e-06 2.62e-06 9.66e-05 7.39e-05 9.49e-05
(5, 20, 0.05) 6.91e-06 2.86e-06 2.21e-04 2.05e-04 2.45e-04
(5, 20, 0.1) 6.44e-06 2.86e-06 2.56e-04 2.08e-04 2.54e-04
(5, 50, 0.01) 1.22e-05 3.10e-06 9.59e-04 1.09e-03 1.17e-03
(5, 50, 0.02) 1.10e-05 3.10e-06 9.73e-04 1.06e-03 1.20e-03
(10, 10, 0.4) 1.19e-05 3.34e-06 4.68e-04 2.12e-04 2.93e-04
(10, 10, 0.6) 1.53e-05 3.58e-06 3.96e-04 3.01e-04 3.04e-04
(10, 20, 0.3) 2.00e-05 4.29e-06 2.03e-03 8.79e-04 1.06e-03
(10, 20, 0.5) 2.31e-05 3.81e-06 1.33e-03 8.62e-04 3.01e-03
(10, 50, 0.05) 4.84e-05 3.81e-06 1.91e-02 3.87e-03 4.93e-03
(10, 50, 0.1) 6.82e-05 7.15e-06 4.94e-03 4.09e-03 5.39e-03
(10, 100, 0.01) 6.79e-05 1.12e-05 1.90e-02 1.74e-02 7.28e-02
(10, 100, 0.02) 7.37e-05 5.25e-06 1.35e-02 1.78e-02 7.60e-02
(20, 10, 0.6) 2.16e-04 8.54e-05 1.62e+00 3.42e-03 2.46e-03
(20, 10, 0.7) 6.68e-05 5.48e-06 4.99e-01 1.90e-02 8.59e-03
(20, 20, 0.5) 6.92e-04 5.39e-05 6.10e+00 3.48e-03 6.79e-02
(20, 20, 0.6) 1.01e-04 7.63e-06 1.37e-02 3.11e-03 9.25e+00
(20, 50, 0.3) 8.63e+00 1.55e+01 nan 4.66e-02 8.06e+00
(20, 50, 0.35) 2.53e-02 3.24e-02 nan 2.27e-02 9.18e+00
(20, 100, 0.2) 1.40e-01 1.40e+02 nan 1.21e-01 6.80e+01
(20, 100, 0.25) 1.03e-01 8.80e-01 nan 8.89e-02 5.48e+02
(50, 10, 0.83) 1.04e+00 2.67e-03 nan 1.36e+02 2.13e+02
(50, 10, 0.85) 2.09e-01 2.95e-03 nan nan 9.45e+02
(50, 20, 0.5) 2.08e-01 2.00e-02 nan nan nan
(50, 20, 0.6) 2.52e+00 4.22e-01 nan nan nan
(50, 20, 0.7) 1.58e+02 2.38e+01 nan nan nan
(50, 20, 0.71) 3.04e+02 2.48e+02 nan nan nan
(50, 20, 0.72) nan 7.12e+02 nan nan nan
(50, 20, 0.73) nan 7.61e+02 nan nan nan
(50, 20, 0.75) nan nan nan nan nan
(50, 20, 0.76) nan nan nan nan nan
(50, 20, 0.77) nan 1.56e+01 nan nan nan
(50, 20, 0.78) nan 1.50e+00 nan nan nan
(50, 20, 0.79) nan 1.89e-01 nan nan nan
(50, 20, 0.8) nan 6.43e-03 nan nan nan
(50, 50, 0.1) 3.63e-02 4.46e+00 nan nan nan
(50, 50, 0.2) 1.28e-01 3.33e+01 nan nan nan
(50, 50, 0.3) 2.09e+00 1.74e+02 nan nan nan
(50, 50, 0.4) 2.46e+01 nan nan nan nan
(50, 50, 0.5) 8.03e+02 nan nan nan nan
(50, 50, 0.6) nan nan nan nan nan
(50, 50, 0.71) nan nan nan nan nan
(50, 50, 0.72) nan 2.99e+01 nan 1.83e+02 nan
(50, 50, 0.73) nan 1.17e+01 nan nan nan
(50, 50, 0.74) nan 1.55e+00 nan nan nan
(50, 100, 0.1) 2.43e-01 nan nan nan nan
(50, 100, 0.2) 1.11e+01 nan nan nan nan
(50, 100, 0.3) 2.35e+02 nan nan nan nan
(50, 100, 0.4) nan nan nan nan nan
(50, 100, 0.64) nan nan nan nan nan
(50, 100, 0.65) nan nan nan 1.19e+02 nan
(50, 100, 0.66) nan nan nan nan nan
(50, 100, 0.67) nan nan nan nan nan
(50, 100, 0.68) nan nan nan nan nan
(50, 100, 0.69) nan 2.28e+01 nan nan nan
(50, 100, 0.7) nan 2.27e+00 nan 4.29e+01 nan
(100, 10, 0.6) 6.79e-03 9.54e-06 nan nan nan
(100, 10, 0.7) 2.35e-02 4.43e-05 nan nan nan
(100, 10, 0.8) 6.69e-01 1.71e-03 nan nan nan
(100, 20, 0.4) 4.81e-02 1.65e-03 nan nan nan
(100, 20, 0.5) 4.35e-01 1.33e-02 nan nan nan
(100, 20, 0.6) 4.65e+00 9.83e-02 nan nan nan
(100, 20, 0.7) 4.75e+02 9.96e+00 nan nan nan
(100, 20, 0.8) nan nan nan nan nan
(100, 20, 0.89) nan nan nan nan nan
(100, 20, 0.9) nan 2.43e+00 nan nan nan
(100, 50, 0.1) 1.78e-01 7.15e+00 nan nan nan
(100, 50, 0.2) 2.92e-01 8.19e+01 nan nan nan
(100, 50, 0.3) 8.12e+00 496 nan nan nan
(100, 50, 0.4) 46.1 nan nan nan nan
(100, 50, 0.5) nan nan nan nan nan
(100, 50, 0.87) nan nan nan nan nan
(100, 50, 0.88) nan 4.24 nan nan nan
(100, 50, 0.89) nan 0.00185 nan nan nan
(100, 50, 0.9) nan 0.000231 nan nan nan
(100, 100, 0.1) 0.892 nan nan nan nan
(100, 100, 0.2) 416 nan nan nan nan
(100, 100, 0.3) 260 nan nan nan nan
(100, 100, 0.4) nan nan nan nan nan
(100, 100, 0.85) nan nan nan nan nan
(100, 100, 0.86) nan 7.75 nan nan nan
(100, 100, 0.87) nan 0.00372 nan nan nan
(100, 100, 0.88) nan 0.0000973 nan nan nan
(100, 100, 0.89) nan 0.0000906 nan nan nan
(100, 100, 0.9) nan 0.0000777 nan nan nan

找到所有k-完全子图

我们测量找到所有k-完全子图所需的时间,如果这个时间与上述不同。

kpkc FindClique networkx
(5, 50, 50, 0.15, 0.15) 0.0000586 0.0000334 0.0127
(5, 50, 50, 0.2, 0.2) 0.000809 0.000106 0.0239
(5, 50, 50, 0.25, 0.25) 0.00229 0.000621 0.00434
(5, 50, 50, 0.0, 0.3) 0.000628 0.0000596 1.33e-02
(5, 50, 50, 0.0, 0.4) 1.09e-03 0.000258 0.00248
(5, 50, 50, 0.0, 0.45) 0.0032 0.00132 0.00434
(5, 50, 50, 0.0, 0.5) 0.00415 2.01e-03 0.00448
(10, 26, 37, 0.49, 0.49) 0.00404 0.000825 9.43
(10, 26, 37, 0.5, 0.5) 0.0348 0.000726 8.26
(10, 26, 37, 0.51, 0.51) 0.069 1.17e-03 14.2
(10, 26, 37, 0.4, 0.6) 0.00468 0.0000955 13.9
(10, 26, 37, 0.3, 0.7) 0.0658 0.0000412 12.7
(10, 50, 50, 0.42, 0.42) 0.0931 0.0000138 21.4
(10, 50, 50, 0.43, 0.43) 1.21e-01 1.94e-03 27.8
(10, 50, 50, 0.44, 0.44) 0.151 0.00248 34.6
(10, 50, 50, 0.46, 0.46) 0.304 0.000425 55.2
(10, 50, 50, 0.48, 0.48) 0.573 0.000796 83.9
(10, 50, 50, 0.5, 0.5) 0.997 0.0197 172
(50, 5, 15, 0.918, 0.918) nan 10.6 nan
(50, 5, 15, 0.92, 0.92) nan 2.91 nan
(20, 23, 39, 0.7, 0.7) 320 0.193 nan
(20, 23, 39, 0.71, 0.71) 535 0.345 nan
(20, 23, 39, 0.72, 0.72) 955 0.642 nan
(20, 23, 39, 0.7, 0.73) nan 0.867 nan
(20, 23, 39, 0.65, 0.78) 454 4.36e-01 nan
(30, 11, 30, 0.82, 0.82) nan 3.45 nan
(30, 11, 30, 0.84, 0.84) nan 513 nan
(3, 100, 100, 0.1, 0.1) 0.00257 0.00164 0.000651
(4, 100, 100, 0.15, 0.15) 0.000471 0.000208 0.00401
(5, 100, 100, 0.2, 0.2) 1.17e-02 0.000223 0.19
(6, 100, 100, 0.25, 0.25) 0.00356 0.0028 0.996
(7, 50, 50, 0.35, 0.35) 0.0156 0.0000815 0.738
(8, 50, 50, 0.4, 0.4) 0.0442 0.00191 3.89
(9, 50, 50, 0.45, 0.45) 0.174 0.0045 20.5
(10, 50, 50, 0.5, 0.5) 9.25e+00 0.00167 131
(5, 10, 0.1) 0.0000842 5.44e-05 0.0000451
(5, 10, 0.2) 0.0000161 1.08e-04 0.0000516
(5, 20, 0.05) 1.26e-03 0.0000948 0.00376
(5, 20, 0.1) 0.00338 0.0027 0.000613
(5, 50, 0.01) 0.00916 0.00771 0.0155
(5, 50, 0.02) 0.0111 0.0942 0.0166
(10, 10, 0.4) 0.000448 0.00199 0.00219
(10, 10, 0.6) 0.00222 1.15e-02 0.00511
(10, 20, 0.3) 0.00496 0.00272 0.79
(10, 20, 0.5) 1.16e+00 0.78 3.28
(10, 50, 0.05) 4.92e-02 6.55e-02 368
(10, 50, 0.1) 0.0237 1.89e-01 476
(10, 100, 0.01) 5.87 7.79 nan
(10, 100, 0.02) 8.49 9.93 nan
(20, 10, 0.6) 0.000358 0.000823 23.6
(20, 10, 0.7) 1.23 0.468 111
(20, 20, 0.5) 0.0657 0.0446 nan
(20, 20, 0.6) 28.2 161 nan
(20, 50, 0.35) 4.09 697 nan
(20, 100, 0.2) 958 nan nan
(20, 100, 0.25) 470 nan nan
(50, 10, 0.83) 13 4.04e-01 nan
(50, 10, 0.85) nan 350 nan

结论

根据上述时间,kpkcFindClique似乎是寻找k-部分图中的k-完全子图的最佳选择。

  • 如果所有顶点的预期邻居数量大致相同,那么FindClique是最佳选择。
  • 如果有许多边并且预期的k-完全子图数量很大,那么FindClique是获取一些k-完全子图的最佳选择。
  • 如果只预期有少数k-完全子图(如果有),且较大部分的顶点比较小部分的顶点有更少的邻居,那么kpkc是获取所有k-完全子图的最佳选择。

请注意,我们的FindClique实现似乎在查找所有k-完全子图方面比原始实现(未发表)更快

  • Grünert, Tore & Irnich, Stefan & Zimmermann, Hans-Jürgen & Schneider, Markus & Wulfhorst, Burkhard. (2001). K-分割图中的团及其在纺织工程中的应用

  • Mirghorbani, M. & Krokhmal, P.. (2013). 在k-部分图中寻找k-完全子图. 优化信件. 7. 10.1007/s11590-012-0536-y

对于具有参数(k, min_s, max_s, a_1, a_2)k = 100的随机图,我们通过一个因子100提高了Grünert等人的基准。请注意,我们还将3.6 GHz用于100 MHz,从而提高了约28个改进因子。

Mirghorbani等人将FindClique的实现提高了至9倍,与Grünert等人相比。

项目详情


下载文件

下载适用于您平台的文件。如果您不确定选择哪个,请了解有关安装包的更多信息。

源代码分布

kpkc-0.1.1.tar.gz (84.7 kB 查看哈希值)

上传时间 源代码

由以下组织支持

AWS AWS 云计算和安全赞助商 Datadog Datadog 监控 Fastly Fastly CDN Google Google 下载分析 Microsoft Microsoft PSF赞助商 Pingdom Pingdom 监控 Sentry Sentry 错误日志 StatusPage StatusPage 状态页面