Closed
On my computer,
use ndarray::{Array3, Axis};
let a = Array3::<f64>::zeros((500, 600, 700));
let b = a.sum_axis(Axis(2));
is about 9 times slower (500 ms vs 55 ms) than
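use ndarray::{Array2, Array3, Axis, Zip};
let a = Array3::<f64>::zeros((500, 600, 700));
let mut c = Array2::<f64>::zeros((500, 600));
// Sum each lane along the last axis into the matching element of c.
Zip::from(&mut c).and(a.lanes(Axis(2))).for_each(|c, a| {
    *c = a.sum();
});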
On the other hand, when summing over Axis(0),
sum_axis is about twice as fast as Zip.
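To make the difference between the two access patterns concrete, here is a minimal std-only sketch on a flat row-major buffer. It is not ndarray's code: the function names are hypothetical, and the second function only assumes that a slice-by-slice accumulation resembles what sum_axis does. Both compute the same sums over the last axis; only the order in which memory is read differs.

```rust
/// Sum over the last axis of a row-major (rows x cols) buffer, lane by lane.
/// Each lane is a contiguous slice, so memory is read sequentially.
/// (Hypothetical helper for illustration, not part of ndarray.)
fn sum_last_axis_by_lanes(data: &[f64], rows: usize, cols: usize) -> Vec<f64> {
    assert!(cols > 0, "cols must be non-zero");
    assert_eq!(data.len(), rows * cols);
    data.chunks_exact(cols)
        .map(|lane| lane.iter().sum::<f64>())
        .collect()
}

/// Same reduction, but accumulating one index of the last axis at a time.
/// Every pass reads elements that are `cols` apart and rewrites all of `out`.
/// (Hypothetical helper; assumed to mimic a slice-by-slice accumulation.)
fn sum_last_axis_by_slices(data: &[f64], rows: usize, cols: usize) -> Vec<f64> {
    assert_eq!(data.len(), rows * cols);
    let mut out = vec![0.0; rows];
    for j in 0..cols {
        for i in 0..rows {
            out[i] += data[i * cols + j];
        }
    }
    out
}
```

For a 500x600x700 array summed over Axis(2), rows would be 500 * 600 and cols would be 700. Timing the two functions on such a buffer might show a gap similar to the one reported above, but that depends on the machine.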
The gain seems big enough to me to warrant an optimization, but I'm not sure how to deal with intermediate cases (summing over a middle axis). Or could we just add the optimization for summation over the last axis, since it's a very common case?
(Of course, it's the other way around for a column-major array.)
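For the intermediate case, one possible approach (a sketch only, not a proposal for ndarray's actual implementation) is to treat a row-major array as an outer x n x inner block layout, where n is the length of the summed axis. The function name and parameters below are hypothetical. Each inner row is contiguous, so it can be accumulated into a contiguous output row.

```rust
/// Sum over the middle axis of a row-major (outer x n x inner) buffer.
/// Returns a row-major (outer x inner) buffer.
/// (Hypothetical helper for illustration, not part of ndarray.)
fn sum_middle_axis(data: &[f64], outer: usize, n: usize, inner: usize) -> Vec<f64> {
    assert_eq!(data.len(), outer * n * inner);
    let mut out = vec![0.0; outer * inner];
    if n == 0 || inner == 0 {
        // Nothing to accumulate; also avoids a zero chunk size below.
        return out;
    }
    // One block of n * inner elements per outer index,
    // paired with its output row of `inner` elements.
    for (block, acc) in data.chunks_exact(n * inner).zip(out.chunks_exact_mut(inner)) {
        // Each row is contiguous and is added element-wise into the output row.
        for row in block.chunks_exact(inner) {
            for (a, x) in acc.iter_mut().zip(row) {
                *a += *x;
            }
        }
    }
    out
}
```

With outer = 1 this reduces to summing over Axis(0), and with inner = 1 it reduces to summing over the last axis, although in that last case a per-lane sum is probably the better choice because the accumulator is a single element.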