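Fast N-dimensional aggregation functions with Numba
Project description
Numbagg: Fast, flexible N-dimensional array functions written with Numba and NumPy's generalized ufuncs.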
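Why use numbagg?
Performance
- Outperforms pandas
  - On a single core, 2-10x faster for moving window functions, 1-2x faster for aggregation and grouping functions
  - When parallelizing with multiple cores, 4-30x faster
- Outperforms bottleneck on multiple cores
  - On a single core, matches bottleneck
  - When parallelizing with multiple cores, 3-7x faster
- Outperforms numpy on multiple cores
  - On a single core, matches numpy
  - When parallelizing with multiple cores, 5-15x faster
- ...though numbagg's functions are JIT compiled, so the first run is slower
Versatility
- More functions (though bottleneck has some functions we don't have, and pandas' functions have many more parameters)
- Functions work for >3 dimensions. All functions take an arbitrary axis or tuple of axes to calculate over
- Written in numba: less code, simple to inspect, simple to improve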
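To make the axis behaviour concrete, the sketch below uses plain NumPy to show what reducing over a single axis versus a tuple of axes means on a 4-dimensional array. It uses `np.nansum` rather than numbagg so that it runs without Numba installed; that numbagg's functions accept the same kinds of `axis` values is taken from the claim above, not demonstrated here.

```python
import numpy as np

# A small 4-D array with one missing value.
a = np.arange(2 * 3 * 4 * 5, dtype=float).reshape(2, 3, 4, 5)
a[0, 0, 0, 0] = np.nan

# Reduce over the final axis only: the result drops that axis.
last_axis = np.nansum(a, axis=-1)

# Reduce over a tuple of axes: both listed axes are dropped.
two_axes = np.nansum(a, axis=(0, 2))

print(last_axis.shape)  # (2, 3, 4)
print(two_axes.shape)   # (3, 5)
```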
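Functions & benchmarks
Summary benchmarks
Two benchmarks summarize numbagg's performance: the first on a 1D array of 10M elements without parallelization, the second on a 2D array of 100x100K elements with parallelization. Numbagg's relative performance is much higher where parallelization is possible. The full set of benchmarks below covers a wider range of arrays.
The values in the table are numbagg's performance as a multiple of each other library's, for an array of the given shape, calculated over the final axis. (So 1.00x means numbagg is equal to the library; higher means numbagg is faster.)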
func | 1D pandas | 1D bottleneck | 1D numpy | 2D pandas | 2D bottleneck | 2D numpy |
---|---|---|---|---|---|---|
bfill | 1.17x | 1.18x | n/a | 12.24x | 4.36x | n/a |
ffill | 1.17x | 1.12x | n/a | 12.76x | 4.34x | n/a |
group_nanall | 1.44x | n/a | n/a | 10.84x | n/a | n/a |
group_nanany | 1.20x | n/a | n/a | 5.25x | n/a | n/a |
group_nanargmax | 2.88x | n/a | n/a | 9.89x | n/a | n/a |
group_nanargmin | 2.82x | n/a | n/a | 9.96x | n/a | n/a |
group_nancount | 1.01x | n/a | n/a | 4.70x | n/a | n/a |
group_nanfirst | 1.39x | n/a | n/a | 11.80x | n/a | n/a |
group_nanlast | 1.16x | n/a | n/a | 5.36x | n/a | n/a |
group_nanmax | 1.14x | n/a | n/a | 5.22x | n/a | n/a |
group_nanmean | 1.19x | n/a | n/a | 5.64x | n/a | n/a |
group_nanmin | 1.13x | n/a | n/a | 5.26x | n/a | n/a |
group_nanprod | 1.15x | n/a | n/a | 4.95x | n/a | n/a |
group_nanstd | 1.18x | n/a | n/a | 5.03x | n/a | n/a |
group_nansum_of_squares | 1.35x | n/a | n/a | 8.11x | n/a | n/a |
group_nansum | 1.21x | n/a | n/a | 5.95x | n/a | n/a |
group_nanvar | 1.19x | n/a | n/a | 5.65x | n/a | n/a |
move_corr | 19.04x | n/a | n/a | 92.48x | n/a | n/a |
move_cov | 14.58x | n/a | n/a | 71.61x | n/a | n/a |
move_exp_nancorr | 6.73x | n/a | n/a | 35.30x | n/a | n/a |
move_exp_nancount | 2.35x | n/a | n/a | 10.56x | n/a | n/a |
move_exp_nancov | 5.77x | n/a | n/a | 31.75x | n/a | n/a |
move_exp_nanmean | 2.03x | n/a | n/a | 11.07x | n/a | n/a |
move_exp_nanstd | 1.89x | n/a | n/a | 10.07x | n/a | n/a |
move_exp_nansum | 1.88x | n/a | n/a | 9.70x | n/a | n/a |
move_exp_nanvar | 1.82x | n/a | n/a | 9.71x | n/a | n/a |
move_mean | 3.82x | 0.87x | n/a | 16.61x | 4.01x | n/a |
move_std | 5.96x | 1.29x | n/a | 24.52x | 6.04x | n/a |
move_sum | 3.80x | 0.83x | n/a | 15.95x | 3.70x | n/a |
move_var | 5.78x | 1.27x | n/a | 25.41x | 5.85x | n/a |
nanargmax [^5] | 2.45x | 1.00x | n/a | 2.16x | 1.00x | n/a |
nanargmin [^5] | 2.19x | 1.01x | n/a | 2.05x | 1.02x | n/a |
nancount | 1.40x | n/a | 1.06x | 11.00x | n/a | 4.16x |
nanmax [^5] | 3.26x | 1.00x | 0.11x | 3.62x | 3.24x | 0.11x |
nanmean | 2.42x | 0.98x | 2.83x | 13.58x | 4.54x | 13.13x |
nanmin [^5] | 3.27x | 1.00x | 0.11x | 3.62x | 3.24x | 0.11x |
nanquantile | 0.94x | n/a | 0.78x | 5.45x | n/a | 5.01x |
nanstd | 1.50x | 1.51x | 2.75x | 8.29x | 7.35x | 13.27x |
nansum | 2.28x | 0.97x | 2.52x | 17.71x | 6.24x | 16.05x |
nanvar | 1.50x | 1.49x | 2.81x | 8.18x | 6.97x | 13.32x |
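As a reading aid, each ratio in the table can be thought of as the other library's runtime divided by numbagg's runtime. The sketch below assumes that definition (it is inferred from the description of the table, not taken from the benchmark script) and uses made-up timings; `speedup` is a hypothetical helper, not part of numbagg.

```python
def speedup(other_ms: float, numbagg_ms: float) -> float:
    """Return how many times faster numbagg is than the other library.

    Assumed definition: other library's time / numbagg's time.
    """
    if numbagg_ms <= 0:
        raise ValueError("numbagg_ms must be positive")
    return other_ms / numbagg_ms


# Illustrative timings only, not measured results.
print(f"{speedup(60.0, 5.0):.2f}x")   # 12.00x: numbagg is faster
print(f"{speedup(10.0, 10.0):.2f}x")  # 1.00x: equal
print(f"{speedup(2.0, 8.0):.2f}x")    # 0.25x: numbagg is slower
```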
Full benchmarks
func | shape | size | pandas | bottleneck | numpy | numbagg | pandas_ratio | bottleneck_ratio | numpy_ratio | numbagg_ratio |
---|---|---|---|---|---|---|---|---|---|---|
bfill | (1000,) | 1000 | 0ms | 0ms | n/a | 0ms | 1.59x | 0.03x | n/a | 1.00x |
bfill | (10000000,) | 10000000 | 20ms | 20ms | n/a | 17ms | 1.17x | 1.18x | n/a | 1.00x |
bfill | (100, 100000) | 10000000 | 57ms | 20ms | n/a | 5ms | 12.24x | 4.36x | n/a | 1.00x |
bfill | (10, 10, 10, 10, 1000) | 10000000 | n/a | 21ms | n/a | 5ms | n/a | 4.40x | n/a | 1.00x |
bfill | (100, 1000, 1000) | 100000000 | n/a | 248ms | n/a | 44ms | n/a | 5.70x | n/a | 1.00x |
ffill | (1000,) | 1000 | 0ms | 0ms | n/a | 0ms | 1.53x | 0.02x | n/a | 1.00x |
ffill | (10000000,) | 10000000 | 20ms | 19ms | n/a | 17ms | 1.17x | 1.12x | n/a | 1.00x |
ffill | (100, 100000) | 10000000 | 56ms | 19ms | n/a | 4ms | 12.76x | 4.34x | n/a | 1.00x |
ffill | (10, 10, 10, 10, 1000) | 10000000 | n/a | 19ms | n/a | 4ms | n/a | 4.33x | n/a | 1.00x |
ffill | (100, 1000, 1000) | 100000000 | n/a | 219ms | n/a | 42ms | n/a | 5.25x | n/a | 1.00x |
group_nanall | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.79x | n/a | n/a | 1.00x |
group_nanall | (10000000,) | 10000000 | 68ms | n/a | n/a | 47ms | 1.44x | n/a | n/a | 1.00x |
group_nanall | (100, 100000) | 10000000 | 17ms | n/a | n/a | 2ms | 10.84x | n/a | n/a | 1.00x |
group_nanall | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 1ms | n/a | n/a | n/a | 1.00x |
group_nanany | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.78x | n/a | n/a | 1.00x |
group_nanany | (10000000,) | 10000000 | 68ms | n/a | n/a | 56ms | 1.20x | n/a | n/a | 1.00x |
group_nanany | (100, 100000) | 10000000 | 18ms | n/a | n/a | 3ms | 5.25x | n/a | n/a | 1.00x |
group_nanany | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 3ms | n/a | n/a | n/a | 1.00x |
group_nanargmax | (1000,) | 1000 | 1ms | n/a | n/a | 0ms | 17.60x | n/a | n/a | 1.00x |
group_nanargmax | (10000000,) | 10000000 | 171ms | n/a | n/a | 59ms | 2.88x | n/a | n/a | 1.00x |
group_nanargmax | (100, 100000) | 10000000 | 40ms | n/a | n/a | 4ms | 9.89x | n/a | n/a | 1.00x |
group_nanargmax | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 4ms | n/a | n/a | n/a | 1.00x |
group_nanargmin | (1000,) | 1000 | 1ms | n/a | n/a | 0ms | 17.56x | n/a | n/a | 1.00x |
group_nanargmin | (10000000,) | 10000000 | 166ms | n/a | n/a | 59ms | 2.82x | n/a | n/a | 1.00x |
group_nanargmin | (100, 100000) | 10000000 | 41ms | n/a | n/a | 4ms | 9.96x | n/a | n/a | 1.00x |
group_nanargmin | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 4ms | n/a | n/a | n/a | 1.00x |
group_nancount | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.68x | n/a | n/a | 1.00x |
group_nancount | (10000000,) | 10000000 | 56ms | n/a | n/a | 55ms | 1.01x | n/a | n/a | 1.00x |
group_nancount | (100, 100000) | 10000000 | 15ms | n/a | n/a | 3ms | 4.70x | n/a | n/a | 1.00x |
group_nancount | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 3ms | n/a | n/a | n/a | 1.00x |
group_nanfirst | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.88x | n/a | n/a | 1.00x |
group_nanfirst | (10000000,) | 10000000 | 63ms | n/a | n/a | 45ms | 1.39x | n/a | n/a | 1.00x |
group_nanfirst | (100, 100000) | 10000000 | 15ms | n/a | n/a | 1ms | 11.80x | n/a | n/a | 1.00x |
group_nanfirst | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 1ms | n/a | n/a | n/a | 1.00x |
group_nanlast | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.87x | n/a | n/a | 1.00x |
group_nanlast | (10000000,) | 10000000 | 62ms | n/a | n/a | 53ms | 1.16x | n/a | n/a | 1.00x |
group_nanlast | (100, 100000) | 10000000 | 15ms | n/a | n/a | 3ms | 5.36x | n/a | n/a | 1.00x |
group_nanlast | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 2ms | n/a | n/a | n/a | 1.00x |
group_nanmax | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.89x | n/a | n/a | 1.00x |
group_nanmax | (10000000,) | 10000000 | 66ms | n/a | n/a | 57ms | 1.14x | n/a | n/a | 1.00x |
group_nanmax | (100, 100000) | 10000000 | 17ms | n/a | n/a | 3ms | 5.22x | n/a | n/a | 1.00x |
group_nanmax | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 3ms | n/a | n/a | n/a | 1.00x |
group_nanmean | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.81x | n/a | n/a | 1.00x |
group_nanmean | (10000000,) | 10000000 | 67ms | n/a | n/a | 57ms | 1.19x | n/a | n/a | 1.00x |
group_nanmean | (100, 100000) | 10000000 | 19ms | n/a | n/a | 3ms | 5.64x | n/a | n/a | 1.00x |
group_nanmean | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 3ms | n/a | n/a | n/a | 1.00x |
group_nanmin | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.84x | n/a | n/a | 1.00x |
group_nanmin | (10000000,) | 10000000 | 66ms | n/a | n/a | 58ms | 1.13x | n/a | n/a | 1.00x |
group_nanmin | (100, 100000) | 10000000 | 17ms | n/a | n/a | 3ms | 5.26x | n/a | n/a | 1.00x |
group_nanmin | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 3ms | n/a | n/a | n/a | 1.00x |
group_nanprod | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.86x | n/a | n/a | 1.00x |
group_nanprod | (10000000,) | 10000000 | 63ms | n/a | n/a | 55ms | 1.15x | n/a | n/a | 1.00x |
group_nanprod | (100, 100000) | 10000000 | 16ms | n/a | n/a | 3ms | 4.95x | n/a | n/a | 1.00x |
group_nanprod | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 3ms | n/a | n/a | n/a | 1.00x |
group_nanstd | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.73x | n/a | n/a | 1.00x |
group_nanstd | (10000000,) | 10000000 | 70ms | n/a | n/a | 59ms | 1.18x | n/a | n/a | 1.00x |
group_nanstd | (100, 100000) | 10000000 | 20ms | n/a | n/a | 4ms | 5.03x | n/a | n/a | 1.00x |
group_nanstd | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 4ms | n/a | n/a | n/a | 1.00x |
group_nansum | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.89x | n/a | n/a | 1.00x |
group_nansum | (10000000,) | 10000000 | 67ms | n/a | n/a | 56ms | 1.21x | n/a | n/a | 1.00x |
group_nansum | (100, 100000) | 10000000 | 19ms | n/a | n/a | 3ms | 5.95x | n/a | n/a | 1.00x |
group_nansum | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 3ms | n/a | n/a | n/a | 1.00x |
group_nanvar | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.71x | n/a | n/a | 1.00x |
group_nanvar | (10000000,) | 10000000 | 69ms | n/a | n/a | 58ms | 1.19x | n/a | n/a | 1.00x |
group_nanvar | (100, 100000) | 10000000 | 20ms | n/a | n/a | 4ms | 5.65x | n/a | n/a | 1.00x |
group_nanvar | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 3ms | n/a | n/a | n/a | 1.00x |
group_nansum_of_squares | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 2.36x | n/a | n/a | 1.00x |
group_nansum_of_squares | (10000000,) | 10000000 | 75ms | n/a | n/a | 55ms | 1.35x | n/a | n/a | 1.00x |
group_nansum_of_squares | (100, 100000) | 10000000 | 26ms | n/a | n/a | 3ms | 8.11x | n/a | n/a | 1.00x |
group_nansum_of_squares | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 3ms | n/a | n/a | n/a | 1.00x |
move_corr | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 10.85x | n/a | n/a | 1.00x |
move_corr | (10000000,) | 10000000 | 909ms | n/a | n/a | 48ms | 19.04x | n/a | n/a | 1.00x |
move_corr | (100, 100000) | 10000000 | 869ms | n/a | n/a | 9ms | 92.48x | n/a | n/a | 1.00x |
move_corr | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 9ms | n/a | n/a | n/a | 1.00x |
move_corr | (100, 1000, 1000) | 100000000 | n/a | n/a | n/a | 79ms | n/a | n/a | n/a | 1.00x |
move_cov | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 10.05x | n/a | n/a | 1.00x |
move_cov | (10000000,) | 10000000 | 623ms | n/a | n/a | 43ms | 14.58x | n/a | n/a | 1.00x |
move_cov | (100, 100000) | 10000000 | 603ms | n/a | n/a | 8ms | 71.61x | n/a | n/a | 1.00x |
move_cov | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 8ms | n/a | n/a | n/a | 1.00x |
move_cov | (100, 1000, 1000) | 100000000 | n/a | n/a | n/a | 72ms | n/a | n/a | n/a | 1.00x |
move_mean | (1000,) | 1000 | 0ms | 0ms | n/a | 0ms | 1.84x | 0.03x | n/a | 1.00x |
move_mean | (10000000,) | 10000000 | 120ms | 27ms | n/a | 31ms | 3.82x | 0.87x | n/a | 1.00x |
move_mean | (100, 100000) | 10000000 | 113ms | 27ms | n/a | 7ms | 16.61x | 4.01x | n/a | 1.00x |
move_mean | (10, 10, 10, 10, 1000) | 10000000 | n/a | 27ms | n/a | 7ms | n/a | 3.96x | n/a | 1.00x |
move_mean | (100, 1000, 1000) | 100000000 | n/a | 296ms | n/a | 58ms | n/a | 5.08x | n/a | 1.00x |
move_std | (1000,) | 1000 | 0ms | 0ms | n/a | 0ms | 2.21x | 0.08x | n/a | 1.00x |
move_std | (10000000,) | 10000000 | 178ms | 39ms | n/a | 30ms | 5.96x | 1.29x | n/a | 1.00x |
move_std | (100, 100000) | 10000000 | 157ms | 39ms | n/a | 6ms | 24.52x | 6.04x | n/a | 1.00x |
move_std | (10, 10, 10, 10, 1000) | 10000000 | n/a | 39ms | n/a | 7ms | n/a | 5.88x | n/a | 1.00x |
move_std | (100, 1000, 1000) | 100000000 | n/a | 411ms | n/a | 58ms | n/a | 7.13x | n/a | 1.00x |
move_sum | (1000,) | 1000 | 0ms | 0ms | n/a | 0ms | 1.81x | 0.02x | n/a | 1.00x |
move_sum | (10000000,) | 10000000 | 121ms | 26ms | n/a | 32ms | 3.80x | 0.83x | n/a | 1.00x |
move_sum | (100, 100000) | 10000000 | 113ms | 26ms | n/a | 7ms | 15.95x | 3.70x | n/a | 1.00x |
move_sum | (10, 10, 10, 10, 1000) | 10000000 | n/a | 26ms | n/a | 7ms | n/a | 3.59x | n/a | 1.00x |
move_sum | (100, 1000, 1000) | 100000000 | n/a | 281ms | n/a | 59ms | n/a | 4.77x | n/a | 1.00x |
move_var | (1000,) | 1000 | 0ms | 0ms | n/a | 0ms | 2.04x | 0.08x | n/a | 1.00x |
move_var | (10000000,) | 10000000 | 168ms | 37ms | n/a | 29ms | 5.78x | 1.27x | n/a | 1.00x |
move_var | (100, 100000) | 10000000 | 161ms | 37ms | n/a | 6ms | 25.41x | 5.85x | n/a | 1.00x |
move_var | (10, 10, 10, 10, 1000) | 10000000 | n/a | 37ms | n/a | 6ms | n/a | 5.85x | n/a | 1.00x |
move_var | (100, 1000, 1000) | 100000000 | n/a | 398ms | n/a | 56ms | n/a | 7.07x | n/a | 1.00x |
move_exp_nancorr | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 7.27x | n/a | n/a | 1.00x |
move_exp_nancorr | (10000000,) | 10000000 | 464ms | n/a | n/a | 69ms | 6.73x | n/a | n/a | 1.00x |
move_exp_nancorr | (100, 100000) | 10000000 | 471ms | n/a | n/a | 13ms | 35.30x | n/a | n/a | 1.00x |
move_exp_nancorr | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 13ms | n/a | n/a | n/a | 1.00x |
move_exp_nancorr | (100, 1000, 1000) | 100000000 | n/a | n/a | n/a | 111ms | n/a | n/a | n/a | 1.00x |
move_exp_nancount | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 2.04x | n/a | n/a | 1.00x |
move_exp_nancount | (10000000,) | 10000000 | 77ms | n/a | n/a | 33ms | 2.35x | n/a | n/a | 1.00x |
move_exp_nancount | (100, 100000) | 10000000 | 69ms | n/a | n/a | 7ms | 10.56x | n/a | n/a | 1.00x |
move_exp_nancount | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 6ms | n/a | n/a | n/a | 1.00x |
move_exp_nancount | (100, 1000, 1000) | 100000000 | n/a | n/a | n/a | 59ms | n/a | n/a | n/a | 1.00x |
move_exp_nancov | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 7.07x | n/a | n/a | 1.00x |
move_exp_nancov | (10000000,) | 10000000 | 298ms | n/a | n/a | 52ms | 5.77x | n/a | n/a | 1.00x |
move_exp_nancov | (100, 100000) | 10000000 | 333ms | n/a | n/a | 10ms | 31.75x | n/a | n/a | 1.00x |
move_exp_nancov | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 10ms | n/a | n/a | n/a | 1.00x |
move_exp_nancov | (100, 1000, 1000) | 100000000 | n/a | n/a | n/a | 87ms | n/a | n/a | n/a | 1.00x |
move_exp_nanmean | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.40x | n/a | n/a | 1.00x |
move_exp_nanmean | (10000000,) | 10000000 | 67ms | n/a | n/a | 33ms | 2.03x | n/a | n/a | 1.00x |
move_exp_nanmean | (100, 100000) | 10000000 | 74ms | n/a | n/a | 7ms | 11.07x | n/a | n/a | 1.00x |
move_exp_nanmean | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 7ms | n/a | n/a | n/a | 1.00x |
move_exp_nanmean | (100, 1000, 1000) | 100000000 | n/a | n/a | n/a | 60ms | n/a | n/a | n/a | 1.00x |
move_exp_nanstd | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 2.33x | n/a | n/a | 1.00x |
move_exp_nanstd | (10000000,) | 10000000 | 88ms | n/a | n/a | 46ms | 1.89x | n/a | n/a | 1.00x |
move_exp_nanstd | (100, 100000) | 10000000 | 95ms | n/a | n/a | 9ms | 10.07x | n/a | n/a | 1.00x |
move_exp_nanstd | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 9ms | n/a | n/a | n/a | 1.00x |
move_exp_nanstd | (100, 1000, 1000) | 100000000 | n/a | n/a | n/a | 78ms | n/a | n/a | n/a | 1.00x |
move_exp_nansum | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.36x | n/a | n/a | 1.00x |
move_exp_nansum | (10000000,) | 10000000 | 62ms | n/a | n/a | 33ms | 1.88x | n/a | n/a | 1.00x |
move_exp_nansum | (100, 100000) | 10000000 | 71ms | n/a | n/a | 7ms | 9.70x | n/a | n/a | 1.00x |
move_exp_nansum | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 6ms | n/a | n/a | n/a | 1.00x |
move_exp_nansum | (100, 1000, 1000) | 100000000 | n/a | n/a | n/a | 60ms | n/a | n/a | n/a | 1.00x |
move_exp_nanvar | (1000,) | 1000 | 0ms | n/a | n/a | 0ms | 1.40x | n/a | n/a | 1.00x |
move_exp_nanvar | (10000000,) | 10000000 | 77ms | n/a | n/a | 42ms | 1.82x | n/a | n/a | 1.00x |
move_exp_nanvar | (100, 100000) | 10000000 | 84ms | n/a | n/a | 9ms | 9.71x | n/a | n/a | 1.00x |
move_exp_nanvar | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | n/a | 9ms | n/a | n/a | n/a | 1.00x |
move_exp_nanvar | (100, 1000, 1000) | 100000000 | n/a | n/a | n/a | 73ms | n/a | n/a | n/a | 1.00x |
nanargmax [^5] | (1000,) | 1000 | 0ms | 0ms | n/a | 0ms | 13.07x | 0.21x | n/a | 1.00x |
nanargmax [^5] | (10000000,) | 10000000 | 31ms | 12ms | n/a | 12ms | 2.45x | 1.00x | n/a | 1.00x |
nanargmax [^5] | (100, 100000) | 10000000 | 28ms | 13ms | n/a | 13ms | 2.16x | 1.00x | n/a | 1.00x |
nanargmax [^5] | (10, 10, 10, 10, 1000) | 10000000 | n/a | 13ms | n/a | 13ms | n/a | 1.05x | n/a | 1.00x |
nanargmax [^5] | (100, 1000, 1000) | 100000000 | n/a | 133ms | n/a | 127ms | n/a | 1.05x | n/a | 1.00x |
nanargmin [^5] | (1000,) | 1000 | 0ms | 0ms | n/a | 0ms | 12.72x | 0.21x | n/a | 1.00x |
nanargmin [^5] | (10000000,) | 10000000 | 27ms | 13ms | n/a | 12ms | 2.19x | 1.01x | n/a | 1.00x |
nanargmin [^5] | (100, 100000) | 10000000 | 26ms | 13ms | n/a | 12ms | 2.05x | 1.02x | n/a | 1.00x |
nanargmin [^5] | (10, 10, 10, 10, 1000) | 10000000 | n/a | 13ms | n/a | 13ms | n/a | 1.05x | n/a | 1.00x |
nanargmin [^5] | (100, 1000, 1000) | 100000000 | n/a | 135ms | n/a | 129ms | n/a | 1.05x | n/a | 1.00x |
nancount | (1000,) | 1000 | 0ms | n/a | 0ms | 0ms | 2.24x | n/a | 0.05x | 1.00x |
nancount | (10000000,) | 10000000 | 5ms | n/a | 4ms | 3ms | 1.40x | n/a | 1.06x | 1.00x |
nancount | (100, 100000) | 10000000 | 9ms | n/a | 3ms | 1ms | 11.00x | n/a | 4.16x | 1.00x |
nancount | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | 4ms | 1ms | n/a | n/a | 3.58x | 1.00x |
nancount | (100, 1000, 1000) | 100000000 | n/a | n/a | 45ms | 7ms | n/a | n/a | 6.74x | 1.00x |
nanmax [^5] | (1000,) | 1000 | 0ms | 0ms | 0ms | 0ms | 8.21x | 0.21x | 0.38x | 1.00x |
nanmax [^5] | (10000000,) | 10000000 | 41ms | 12ms | 1ms | 13ms | 3.26x | 1.00x | 0.11x | 1.00x |
nanmax [^5] | (100, 100000) | 10000000 | 45ms | 41ms | 1ms | 13ms | 3.62x | 3.24x | 0.11x | 1.00x |
nanmax [^5] | (10, 10, 10, 10, 1000) | 10000000 | n/a | 40ms | 1ms | 12ms | n/a | 3.31x | 0.12x | 1.00x |
nanmax [^5] | (100, 1000, 1000) | 100000000 | n/a | 402ms | 15ms | 121ms | n/a | 3.31x | 0.12x | 1.00x |
nanmean | (1000,) | 1000 | 0ms | 0ms | 0ms | 0ms | 1.32x | 0.02x | 0.20x | 1.00x |
nanmean | (10000000,) | 10000000 | 23ms | 9ms | 27ms | 10ms | 2.42x | 0.98x | 2.83x | 1.00x |
nanmean | (100, 100000) | 10000000 | 28ms | 9ms | 27ms | 2ms | 13.58x | 4.54x | 13.13x | 1.00x |
nanmean | (10, 10, 10, 10, 1000) | 10000000 | n/a | 9ms | 27ms | 2ms | n/a | 4.56x | 13.69x | 1.00x |
nanmean | (100, 1000, 1000) | 100000000 | n/a | 91ms | 310ms | 17ms | n/a | 5.39x | 18.39x | 1.00x |
nanmin [^5] | (1000,) | 1000 | 0ms | 0ms | 0ms | 0ms | 8.09x | 0.21x | 0.38x | 1.00x |
nanmin [^5] | (10000000,) | 10000000 | 41ms | 12ms | 1ms | 13ms | 3.27x | 1.00x | 0.11x | 1.00x |
nanmin [^5] | (100, 100000) | 10000000 | 45ms | 41ms | 1ms | 13ms | 3.62x | 3.24x | 0.11x | 1.00x |
nanmin [^5] | (10, 10, 10, 10, 1000) | 10000000 | n/a | 40ms | 1ms | 12ms | n/a | 3.28x | 0.12x | 1.00x |
nanmin [^5] | (100, 1000, 1000) | 100000000 | n/a | 401ms | 15ms | 122ms | n/a | 3.30x | 0.12x | 1.00x |
nanquantile | (1000,) | 1000 | 0ms | n/a | 0ms | 0ms | 1.46x | n/a | 0.57x | 1.00x |
nanquantile | (10000000,) | 10000000 | 186ms | n/a | 155ms | 198ms | 0.94x | n/a | 0.78x | 1.00x |
nanquantile | (100, 100000) | 10000000 | 197ms | n/a | 181ms | 36ms | 5.45x | n/a | 5.01x | 1.00x |
nanquantile | (10, 10, 10, 10, 1000) | 10000000 | n/a | n/a | 425ms | 34ms | n/a | n/a | 12.50x | 1.00x |
nanquantile | (100, 1000, 1000) | 100000000 | n/a | n/a | 4254ms | 331ms | n/a | n/a | 12.85x | 1.00x |
nanstd | (1000,) | 1000 | 0ms | 0ms | 0ms | 0ms | 1.06x | 0.06x | 0.46x | 1.00x |
nanstd | (10000000,) | 10000000 | 29ms | 29ms | 53ms | 19ms | 1.50x | 1.51x | 2.75x | 1.00x |
nanstd | (100, 100000) | 10000000 | 33ms | 29ms | 53ms | 4ms | 8.29x | 7.35x | 13.27x | 1.00x |
nanstd | (10, 10, 10, 10, 1000) | 10000000 | n/a | 28ms | 55ms | 4ms | n/a | 7.25x | 14.43x | 1.00x |
nanstd | (100, 1000, 1000) | 100000000 | n/a | 294ms | 600ms | 37ms | n/a | 8.02x | 16.35x | 1.00x |
nansum | (1000,) | 1000 | 0ms | 0ms | 0ms | 0ms | 1.28x | 0.02x | 0.08x | 1.00x |
nansum | (10000000,) | 10000000 | 22ms | 9ms | 24ms | 10ms | 2.28x | 0.97x | 2.52x | 1.00x |
nansum | (100, 100000) | 10000000 | 27ms | 9ms | 24ms | 2ms | 17.71x | 6.24x | 16.05x | 1.00x |
nansum | (10, 10, 10, 10, 1000) | 10000000 | n/a | 9ms | 25ms | 1ms | n/a | 6.05x | 16.66x | 1.00x |
nansum | (100, 1000, 1000) | 100000000 | n/a | 90ms | 282ms | 13ms | n/a | 6.71x | 21.07x | 1.00x |
nanvar | (1000,) | 1000 | 0ms | 0ms | 0ms | 0ms | 1.08x | 0.06x | 0.45x | 1.00x |
nanvar | (10000000,) | 10000000 | 28ms | 28ms | 53ms | 19ms | 1.50x | 1.49x | 2.81x | 1.00x |
nanvar | (100, 100000) | 10000000 | 33ms | 28ms | 54ms | 4ms | 8.18x | 6.97x | 13.32x | 1.00x |
nanvar | (10, 10, 10, 10, 1000) | 10000000 | n/a | 28ms | 56ms | 4ms | n/a | 7.13x | 14.28x | 1.00x |
nanvar | (100, 1000, 1000) | 100000000 | n/a | 281ms | 601ms | 32ms | n/a | 8.71x | 18.65x | 1.00x |
[^1][^2][^3][^4]
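[^1]: Benchmarks were run on a Mac M1 laptop in December 2023 on numbagg's HEAD, with pandas 2.1.1, bottleneck 1.3.7, and numpy 1.25.2, using `python numbagg/test/run_benchmarks.py -- --benchmark-max-time=10`. They are also run in CI, though GHA's low CPU count means we don't see the full benefits of parallelization there.
[^2]: While we separate the setup from the running of the functions, pandas still needs to do some work to create its result dataframe, and numbagg does some checks in Python that bottleneck does in C or doesn't do. So the summary uses benchmarks on larger arrays, where this fixed overhead matters less and the numbers reflect computational speed. Any contributions to improve the benchmarks are welcome.
[^3]: In some instances a library doesn't have the exact function. For example, pandas has no equivalent of `move_exp_nancount`, so we use its `sum` function on an array of `1`s. Similarly, for `group_nansum_of_squares` we use two separate operations.
[^4]: `anynan` and `allnan` are also functions in numbagg, but are not listed here because they require a different benchmark setup.
[^5]: This function is not currently parallelized, so it performs worse on parallelizable arrays.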
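
The "two separate operations" approach mentioned in the third note can be sketched as follows: square the values, then take a grouped sum that skips NaNs. This is a NumPy-only illustration; `group_nansum_of_squares_reference` is a hypothetical name, and the benchmark script itself may do this differently (for example with pandas).

```python
import numpy as np


def group_nansum_of_squares_reference(values, labels, num_labels):
    """Grouped sum of squares that skips NaNs, done as two steps."""
    values = np.asarray(values, dtype=float)
    labels = np.asarray(labels)
    # Step 1: square the values; NaNs become 0 so they add nothing.
    squares = np.where(np.isnan(values), 0.0, values**2)
    # Step 2: sum the squares within each non-negative integer label.
    return np.bincount(labels, weights=squares, minlength=num_labels)


values = [1.0, 2.0, np.nan, 3.0, 4.0]
labels = [0, 0, 1, 1, 2]
result = group_nansum_of_squares_reference(values, labels, 3)
print(result.tolist())  # [5.0, 9.0, 16.0]
```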
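Example implementation
Numbagg makes it easy to write flexible aggregation functions in pure Python/NumPy that are accelerated by Numba. All the hard work is done by Numba's JIT compiler and NumPy's gufunc machinery (as wrapped by Numba).
For example, here is how we wrote `nansum`: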
```python
import numpy as np
from numbagg.decorators import ndreduce


@ndreduce.wrap()
def nansum(a):
    # Accumulate every non-NaN element of the array.
    asum = 0.0
    for ai in a.flat:
        if not np.isnan(ai):
            asum += ai
    return asum
```
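
To see what this kernel computes without installing Numba, the sketch below runs the same loop as a plain Python function and checks it against `np.nansum`. The `ndreduce.wrap()` decorator is deliberately left out, so there is no JIT compilation and no `axis` handling here; `nansum_reference` is a hypothetical name used only for this illustration.

```python
import numpy as np


def nansum_reference(a):
    """Same loop as the kernel, minus the decorator: sum all non-NaN elements."""
    asum = 0.0
    for ai in a.flat:
        if not np.isnan(ai):
            asum += ai
    return asum


a = np.array([[1.0, np.nan, 2.0], [np.nan, 3.0, 4.0]])
print(nansum_reference(a))  # 10.0
print(np.nansum(a))         # 10.0
```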
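Implementation details
Numbagg includes somewhat awkward workarounds for features missing from NumPy/Numba:
- It implements its own cache for functions wrapped by Numba's `guvectorize`, because that decorator is rather slow.
- It does its own handling of array transposes to handle the `axis` argument in reduction functions.
- It rewrites plain functions into gufuncs, so that you can write a traditional function while keeping the multidimensional advantages of gufuncs.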
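
The second point can be illustrated with a small NumPy-only sketch: write a kernel that only knows how to reduce over the last axis, then support an arbitrary `axis` argument by moving the requested axes to the end and flattening them into one. This shows the general idea only and is not numbagg's actual code; `reduce_last_axis` and `reduce_over` are hypothetical names.

```python
import numpy as np


def reduce_last_axis(a):
    """Toy kernel: NaN-skipping sum over the last axis only."""
    return np.where(np.isnan(a), 0.0, a).sum(axis=-1)


def reduce_over(a, axis):
    """Apply the last-axis kernel over one axis or a tuple of axes."""
    a = np.asarray(a, dtype=float)
    axes = (axis,) if isinstance(axis, int) else tuple(axis)
    # Move the axes being reduced to the end, keeping the others in order.
    moved = np.moveaxis(a, axes, tuple(range(-len(axes), 0)))
    # Collapse the moved axes into one so the kernel sees a single last axis.
    kept_shape = moved.shape[: moved.ndim - len(axes)]
    return reduce_last_axis(moved.reshape(kept_shape + (-1,)))


a = np.arange(24, dtype=float).reshape(2, 3, 4)
a[0, 0, 0] = np.nan
print(reduce_over(a, axis=(0, 2)).shape)  # (3,)
```

The result matches `np.nansum(a, axis=(0, 2))`; the reshape may copy data when the moved array is not contiguous, which a real implementation might want to avoid.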
Some of the ideas here have already flowed upstream to numba (for example, an axis argument), and we hope that others will follow.
License
3-clause BSD. Includes portions of Bottleneck, which is distributed under a Simplified BSD license.