Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Equivalent of numpy.where or numpy.nonzero #466

Open
tsoernes opened this issue Jun 23, 2018 · 8 comments
Open

Equivalent of numpy.where or numpy.nonzero #466

tsoernes opened this issue Jun 23, 2018 · 8 comments

Comments

@tsoernes
Copy link

Is there an efficient way to get the indecies of non-zero or true elements?

@jturner314
Copy link
Member

This is an example that provides solutions for both problems:

#[macro_use]
extern crate ndarray;

fn main() {
    let numbers = array![[0, 1, 0], [2, 0, 3]];
    let nonzero: Vec<_> = numbers
        .indexed_iter()
        .filter_map(|(index, &item)| if item != 0 { Some(index) } else { None })
        .collect();
    assert_eq!(nonzero, vec![(0, 1), (1, 0), (1, 2)]);

    let bools = array![[false, true, false], [true, false, true]];
    let nonzero: Vec<_> = bools
        .indexed_iter()
        .filter_map(|(index, &item)| if item { Some(index) } else { None })
        .collect();
    assert_eq!(nonzero, vec![(0, 1), (1, 0), (1, 2)]);
}

It's probably not particularly fast, especially for arrays that don't have a row-major memory layout, but it works.

Out of curiosity, what's your use-case for this? There may be a better way to solve your problem than using a list of indices.

@tsoernes
Copy link
Author

The use is (partially) to select a non-zero entry at random and set it to zero, as part of a simulator. where and nonzero are pretty commonly used functions in numpy, it would be nice with a fast, built in function. Thanks anyhow

@jturner314
Copy link
Member

Yeah, I'd like to add something like where or nonzero to ndarray. I was curious about your use-case because where and nonzero didn't seem very useful without support for slicing with index arrays (which I'd also like to add to ndarray), but using them for random selection make sense.

@bluss
Copy link
Member

bluss commented Nov 22, 2018

I wonder if this problem would be more efficient if we had something like a bitmap. Storing indices just sounds so inefficient and an ndimensional bitmap could be a faster (just a wild idea?) way to represent a set of locations in an array.

@tsoernes
Copy link
Author

I have also found that masks can often substitute for lists of indexes. But there are certainly use cases where this is not so. For example, I often have the need to get the values at a set of non-continuous indexes in a large matrix. If the matrix is large compared to the number of values you want to extract; e.g. you don't want to multiply a million zeroes to leave only 10 entries as-is. See e.g. the latter half of this file: https://github.com/tsoernes/rustdca/blob/master/src/gridfuncs.rs where any reference to neigh(bor) or ch(annel) is an array of indexes. Most of these algorithms cannot be feasibly be converted to the use of bitmasks. np.where is used all over in machine learning.

@bluss
Copy link
Member

bluss commented Nov 23, 2018

So, a sparse bitmask maybe?

@tsoernes
Copy link
Author

tsoernes commented Nov 23, 2018 via email

@bluss
Copy link
Member

bluss commented Nov 24, 2018

I guess a good bitmask can handle both scenarios. There's an alternative to list of indices here https://github.com/brettwooldridge/SparseBitSet/blob/master/SparseBitSet.pdf and then there's roaring, and I'm not sure exactly what data structure it is using. https://github.com/RoaringBitmap/RoaringBitmap

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants