Support array of &str / unicode_ / string #141
Comments
Arrays containing |
The main problem with the |
This would be really helpful to us if anyone is attempting to do it. I can give it a shot as well.
In our case (writing a parser) we know the value (e.g. it's 2 or 12) at compile time. And we now have const generics. |
From my point of view, this is basically blocked on #186, which is blocked on more versatile buffer protocol support in PyO3; cf. #321.
Are you sure that NumPy will not dynamically use a size between 2 and 12 if the strings in the array are all shorter than 12 code points? Do you set the dtype explicitly when constructing the arrays?
I would still advise waiting a bit until we bump our MSRV from 1.48 to most likely 1.56 when Debian Bookworm is released as you will need to provide some Of course, creating a draft PR which works except for the MSRV build might still be a good thing to start discussing the work. I think impl<const N: usize> Element for [Py_UCS4; N] { .. } and a test of your use case would be the minimum required? |
Hmm, I can't see a relation between supporting e.g. U2 arrays and record types (which are, in my opinion, not very standard practice in the numpy world), but I could easily be missing the context.
NumPy doesn't appear to automatically compact the dtype of arrays, i.e. it supports the case where the dtype is U<M and all strings are of max size N < M:
In [9]: x = np.array(['a', 'b'], dtype='<U12')
In [10]: x
Out[10]: array(['a', 'b'], dtype='<U12')
As to explicitly setting the dtype: I'm not entirely sure what you mean, but we can; our explicit use case would be writing (in Rust using PyO3/maturin) a function that takes a file and returns a numpy array of type
Makes sense, thanks for the context. Do you have a vague guess as to when that'll be?
Sounds good. This isn't a priority for us right now so I can't promise this will be done with any haste, but thanks for pointing out where to start. |
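To make the fixed-width case above concrete, here is a minimal sketch in plain Python (deliberately without importing NumPy) of how a `<U12`-style element can hold a shorter string. It assumes each element occupies exactly `width` little-endian UCS4 code units, zero-padded at the end, which is also the layout a `[Py_UCS4; N]` element on the Rust side would have to mirror. The helper names `encode_fixed` and `decode_fixed` are hypothetical and not part of NumPy or rust-numpy.

```python
import struct


def encode_fixed(text: str, width: int) -> bytes:
    """Pack `text` into `width` little-endian UCS4 code units, zero-padded.

    Hypothetical helper: it only emulates the assumed in-memory layout.
    """
    code_points = [ord(ch) for ch in text]
    if len(code_points) > width:
        raise ValueError(f"{text!r} needs more than {width} code points")
    # Pad with NUL code units so every element has the same byte length.
    padded = code_points + [0] * (width - len(code_points))
    return struct.pack(f"<{width}I", *padded)


def decode_fixed(raw: bytes) -> str:
    """Unpack one fixed-width element and drop the trailing NUL padding."""
    width = len(raw) // 4
    code_points = struct.unpack(f"<{width}I", raw)
    end = width
    while end > 0 and code_points[end - 1] == 0:
        end -= 1
    return "".join(chr(cp) for cp in code_points[:end])


if __name__ == "__main__":
    element = encode_fixed("a", 12)
    # Every element is 12 * 4 bytes long, however short the string is.
    print(len(element), repr(decode_fixed(element)))
```

Because the width is part of the element type rather than of the individual value, a short string in a wide slot round-trips unchanged, which matches the `dtype='<U12'` session shown above.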
The main thing is that we need a more general integration with Python's buffer protocol where we do not assume that the element type
I meant you did, pass
>>> np.array(['a', 'b'])
array(['a', 'b'], dtype='<U1')
where NumPy automatically chooses the smallest possible
Debian Bookworm is expected to be released on 2023-06-10 which is what we are waiting for to bump our MSRV. |
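The width inference shown in the `np.array(['a', 'b'])` example above can be sketched as follows. This is only an illustration under stated assumptions: a non-empty list of `str`, a little-endian platform (hence the `<` prefix), and a minimum width of one code point. The function name `infer_unicode_dtype` is hypothetical and is not a NumPy API.

```python
from typing import Sequence


def infer_unicode_dtype(strings: Sequence[str]) -> str:
    """Return a '<U{n}' dtype string wide enough for the longest input.

    Hypothetical helper: it mimics the behaviour described in the thread,
    where no explicit dtype is passed and the narrowest width is chosen.
    """
    if not strings:
        raise ValueError("this sketch only covers non-empty inputs")
    # len() on a Python str counts code points, the unit used by 'U' dtypes.
    width = max(1, max(len(s) for s in strings))
    return f"<U{width}"


if __name__ == "__main__":
    print(infer_unicode_dtype(["a", "b"]))
    print(infer_unicode_dtype(["a", "hello"]))
```

A Rust function that returns `[Py_UCS4; N]` elements with a compile-time `N` would correspond to passing the dtype explicitly instead of relying on this kind of inference.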
@rachtsingh Maybe #378 is already useful for you? Merging will have to wait for our MSRV bump though as discussed above. |
As described, it would allow storing numpy native operations.