Feature Request: extract up to a full (underlying) word type from BitFieldVec #52
Comments
For reference, I also added this to my own 'packed' wrapper type, which I think I created specifically because https://github.com/RagnarGrootKoerkamp/minimizers/blob/master/src/par/packed.rs#L134 In the future, I would also like to have a
What about getting an iterator, rather than a slice? Then you could collect it. I realize that the method you suggest is more convenient, though. I think there's a way to get an unchecked iterator, so get_slice should be trivial to implement using that.
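As a rough illustration of the iterator route, here is a minimal sketch that folds an iterator of fixed-width fields back into a single word. The function name `pack_from_iter` is hypothetical and not part of any library; it assumes each field is already smaller than `2^width` and that fields are stored least-significant first.

```rust
/// Hypothetical helper: fold fixed-width fields into one u64,
/// placing the first field in the lowest bits.
/// Assumes every field value is < 2^width.
fn pack_from_iter<I: IntoIterator<Item = u64>>(fields: I, width: usize) -> u64 {
    let mut word = 0u64;
    let mut shift = 0usize;
    for f in fields {
        // Refuse to overflow the single output word.
        assert!(shift + width <= 64, "fields do not fit in one u64");
        word |= f << shift;
        shift += width;
    }
    word
}
```

This still touches every element individually, so it is a convenience rather than a speedup over building the k-mer by hand.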
Hi Sebastiano, Indeed, the solution I am currently implementing to obtain what I need given the existing interface is to call --Rob
Ah ok, I mistakenly understood you wanted them unpacked. I already have this code in Java (the sublist code of the list view of a bit vector). I think it shouldn't be too difficult to port it. And yes, it can be significantly optimized. Or, we can adapt Ragnar's code.
Hey, is there still interest for this?
I think I worked around it for now in my own code, but for the future, yes this is definitely still useful. I think the API that Rob suggested, taking a start index and number of elements to cover, makes sense. This is easy when the value fits in up to 7 bytes (or up to 58 bits, to be precise, or better: not 59, 61, 62, or 63 bits), since then a single unaligned read suffices. After that you may need two reads.
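The single-unaligned-read idea can be sketched as follows. This is only an illustration under stated assumptions: the data is viewed as little-endian bytes with at least 8 bytes of padding at the end, and the names `get_unaligned` and `pack_bytes` are hypothetical, not an existing API. The read works whenever the bit offset within the first byte plus the width is at most 64. That offset is at most 7 in general, but only even offsets occur for even widths and only offsets 0 or 4 for widths divisible by 4, which is why 58 and 60 bits still work.

```rust
/// Hypothetical helper: pack `values` (each < 2^width) LSB-first into bytes,
/// with 8 extra padding bytes so an 8-byte read never runs past the end.
fn pack_bytes(values: &[u64], width: usize) -> Vec<u8> {
    let total_bits = values.len() * width;
    let mut bytes = vec![0u8; (total_bits + 7) / 8 + 8];
    for (i, &v) in values.iter().enumerate() {
        for b in 0..width {
            if (v >> b) & 1 == 1 {
                let bit = i * width + b;
                bytes[bit / 8] |= 1 << (bit % 8);
            }
        }
    }
    bytes
}

/// Hypothetical helper: read field `index` of `width` bits with one 8-byte load.
fn get_unaligned(bytes: &[u8], width: usize, index: usize) -> u64 {
    assert!(width >= 1 && width <= 64);
    let bit = index * width;
    let (byte, shift) = (bit / 8, bit % 8);
    // A single 64-bit load covers the field only if it ends within 64 bits.
    assert!(shift + width <= 64, "field would need a second read");
    let mut buf = [0u8; 8];
    buf.copy_from_slice(&bytes[byte..byte + 8]); // panics without padding
    let mask = if width == 64 { u64::MAX } else { (1u64 << width) - 1 };
    (u64::from_le_bytes(buf) >> shift) & mask
}
```

For widths where the assertion can fail (for example 59), a fallback that combines two reads would be needed.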
Yes, I would still be interested in this feature and think it would be valuable in several contexts.
I was wondering if it would be possible to add a function to `BitFieldVec` that would allow extracting multiple elements and returning them in the underlying word type. My specific use case is this (but I think it's a common pattern).

I have a long string of DNA (alphabet of size 4) that I am packing into a `BitFieldVec` of width 2. Later, however, I want to extract entire k-mers (substrings of fixed length `k`). It would be great to be able to extract an entire `k`-mer at once, which for most common values of `k` (i.e. k <= 32) would either be entirely in one underlying `u64`, or would span at most 2. Right now, I have to extract the individual 2-bit characters one-by-one and build up the `k`-mer, which is quite inefficient.

I would propose something like `get_slice(start, num_elem)` or `get_slice_in_word(start, num_elem)`, but am happy to leave the design aspects to your preference. Let me know if you have any questions or feedback about the request.

Thanks!
Rob
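To make the request concrete, here is a minimal sketch of what such an extraction could look like over a plain `&[u64]` backing store. It does not use the actual `BitFieldVec` internals; the free functions `pack_2bit` and `get_slice_in_word` are hypothetical, and the layout (fields stored least-significant first, contiguous across word boundaries) is an assumption made for illustration.

```rust
/// Hypothetical helper: pack 2-bit symbols LSB-first into u64 words.
fn pack_2bit(bases: &[u8]) -> Vec<u64> {
    let mut words = vec![0u64; (bases.len() + 31) / 32];
    for (i, &b) in bases.iter().enumerate() {
        words[i / 32] |= ((b & 3) as u64) << (2 * (i % 32));
    }
    words
}

/// Hypothetical sketch: return `num_elem` consecutive fields of `width` bits,
/// starting at field `start`, packed into one u64 (first field in the low bits).
fn get_slice_in_word(words: &[u64], width: usize, start: usize, num_elem: usize) -> u64 {
    let len = width * num_elem;
    assert!(len >= 1 && len <= 64, "result must fit in one u64");
    let bit = start * width;
    assert!(bit + len <= words.len() * 64, "range out of bounds");
    let (word, offset) = (bit / 64, bit % 64);
    // Low part comes from the first word.
    let mut value = words[word] >> offset;
    if offset + len > 64 {
        // The range spills into the next word; offset is nonzero here,
        // so the shift amount is in 1..=63.
        value |= words[word + 1] << (64 - offset);
    }
    if len < 64 {
        value &= (1u64 << len) - 1;
    }
    value
}
```

With this layout, a k-mer with k <= 32 costs at most two word reads, a shift, and a mask, instead of k separate element accesses.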