Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Padding performance #4

Open
nilgoyette opened this issue May 31, 2022 · 1 comment
Open

Padding performance #4

nilgoyette opened this issue May 31, 2022 · 1 comment

Comments

@nilgoyette
Copy link
Contributor

nilgoyette commented May 31, 2022

Compared to NumPy, pad is slow. correlate1d is less slow but we still need to make it faster. Those functions are particularly important because they are the basis for several other functions:

  • correlate1d: convolve1d, gaussian_filter/1d, prewitt, sobel
  • pad: correlate1d, min/max filters

I'm not sure what NumPy does to be so fast. Of course, Cython is used, so it's compiled, optimized and there's no bound checks. but still, we're around 2-6x slower, depending in the mode. IMO, we should be able to be as fast, if not faster. It's a mystery to me how NumPy manages to be so fast (around 10x) on the reflect, symmetric and wrap modes.

To replicate the results, please use

use std::time::Instant;

use ndarray::{Array1, Array3, ShapeBuilder};

use ndarray_ndimage::{pad, PadMode};

fn main() {
    println!("C order");
    let data_c = (0..200 * 200 * 200)
        .map(|v| v as u32)
        .collect::<Array1<_>>()
        .into_shape((200, 200, 200))
        .unwrap();
    test(&data_c);

    println!("\nF order");
    let mut data_f = Array3::zeros((200, 200, 200).f());
    data_f.assign(&data_c);
    test(&data_f);
}

fn test(data: &Array3<u32>) {
    let now = Instant::now();
    let padded = pad(&data, &[[1, 1]], PadMode::Constant(0));
    let elapsed = now.elapsed();
    println!(
        "Constant 0 {}s {}ms  {}",
        elapsed.as_secs(),
        elapsed.subsec_millis(),
        padded[(0, 0, 0)]
    );

    let now = Instant::now();
    let padded = pad(&data, &[[1, 1]], PadMode::Constant(42));
    let elapsed = now.elapsed();
    println!(
        "Constant 42 {}s {}ms  {}",
        elapsed.as_secs(),
        elapsed.subsec_millis(),
        padded[(0, 0, 0)]
    );

    let now = Instant::now();
    let padded = pad(&data, &[[1, 1]], PadMode::Edge);
    let elapsed = now.elapsed();
    println!("Edge      {}s {}ms  {:?}", elapsed.as_secs(), elapsed.subsec_millis(), padded.dim());

    let now = Instant::now();
    let padded = pad(&data, &[[1, 1]], PadMode::Maximum);
    let elapsed = now.elapsed();
    println!("Maximum   {}s {}ms  {:?}", elapsed.as_secs(), elapsed.subsec_millis(), padded.dim());

    let now = Instant::now();
    let padded = pad(&data, &[[1, 1]], PadMode::Mean);
    let elapsed = now.elapsed();
    println!("Mean      {}s {}ms  {:?}", elapsed.as_secs(), elapsed.subsec_millis(), padded.dim());

    let now = Instant::now();
    let padded = pad(&data, &[[1, 1]], PadMode::Median);
    let elapsed = now.elapsed();
    println!("Median    {}s {}ms  {:?}", elapsed.as_secs(), elapsed.subsec_millis(), padded.dim());

    let now = Instant::now();
    let padded = pad(&data, &[[1, 1]], PadMode::Minimum);
    let elapsed = now.elapsed();
    println!("Minimum   {}s {}ms  {:?}", elapsed.as_secs(), elapsed.subsec_millis(), padded.dim());

    let now = Instant::now();
    let padded = pad(&data, &[[1, 1]], PadMode::Reflect);
    let elapsed = now.elapsed();
    println!("Reflect   {}s {}ms  {:?}", elapsed.as_secs(), elapsed.subsec_millis(), padded.dim());

    let now = Instant::now();
    let padded = pad(&data, &[[1, 1]], PadMode::Symmetric);
    let elapsed = now.elapsed();
    println!("Symmetric {}s {}ms  {:?}", elapsed.as_secs(), elapsed.subsec_millis(), padded.dim());

    let now = Instant::now();
    let padded = pad(&data, &[[1, 1]], PadMode::Wrap);
    let elapsed = now.elapsed();
    println!("Wrap      {}s {}ms  {:?}", elapsed.as_secs(), elapsed.subsec_millis(), padded.dim());
}

And for Python

from datetime import datetime
import time

import numpy as np


def test(a):
    start = datetime.now()
    np.pad(a, 1, mode='constant', constant_values=42)
    print(' constant', datetime.now() - start)

    start = datetime.now()
    np.pad(a, 1, mode='edge')
    print('     edge', datetime.now() - start)

    start = datetime.now()
    np.pad(a, 1, mode='maximum')
    print('  maximum', datetime.now() - start)

    start = datetime.now()
    np.pad(a, 1, mode='mean')
    print('     mean', datetime.now() - start)

    start = datetime.now()
    np.pad(a, 1, mode='median')
    print('   median', datetime.now() - start)

    start = datetime.now()
    np.pad(a, 1, mode='minimum')
    print('  minimum', datetime.now() - start)

    start = datetime.now()
    np.pad(a, 1, mode='reflect')
    print('  reflect', datetime.now() - start)

    start = datetime.now()
    np.pad(a, 1, mode='symmetric')
    print('symmetric', datetime.now() - start)

    start = datetime.now()
    np.pad(a, 1, mode='wrap')
    print('     wrap', datetime.now() - start)
    

print('C order')
a = np.arange(200*200*200).reshape((200, 200, 200)).astype(np.uint32)
test(a)

print('\nF order')
a = np.asfortranarray(a)
test(a)
@nilgoyette nilgoyette mentioned this issue May 31, 2022
@nilgoyette
Copy link
Contributor Author

I did some tests with Constant because it's the simpler

  • Constant(0.0) is always faster than Constant(1.0) I don't get it.
  • I modified Constant(1) to simply fill the array (not copy the input into the smaller region) and NumPy is still faster than us. Of course I can't beat that!
  • I did the same with Constant(0), it takes 0ms. Some stuff might be optimized away.

I read NumPy code in numpy/lib/arraypad.py. There's nothing special, no magic. It's doing the same thing I am doing. I don't think it's possible to be faster than NumPy with ndarray 0.15. I think NumPy has simply been optimized by a lot of excellent programmers over the years, while ndarray hasn't.

@nilgoyette nilgoyette changed the title pad is slow pad and correlate1d are slow Sep 14, 2022
@nilgoyette nilgoyette changed the title pad and correlate1d are slow Performance Sep 14, 2022
@nilgoyette nilgoyette changed the title Performance Padding performance Nov 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant