Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add feature to interop with bytes crate, add encode_buf and decode_buf? #322

Open
shahn opened this issue Sep 13, 2024 · 7 comments
Open

Comments

@shahn
Copy link
Contributor

shahn commented Sep 13, 2024

I guess a lot of people are working with the bytes crate when using tokio and network protocols. It would be nice if rasn could work directly with the relevant types, with an API similar to https://docs.rs/tokio/latest/tokio/io/trait.AsyncReadExt.html#method.read_buf (so instead of encode you'd call encode_buf), to allow re-using buffers.

@XAMPPRocky
Copy link
Collaborator

Thank you for your issue! However we already do support Bytes, we just have it aliased as OctetString.

@shahn
Copy link
Contributor Author

shahn commented Sep 14, 2024

Hi XAMPPRocky, thanks for your quick reply! I am not sure if maybe there's a misunderstanding: It seems you're using the OctetString as an alias for bytes::Bytes, but not to decode from/encode into, but rather to support the ASN1 data type. As far as I see, right now one would decode from a &[u8] and encode into a Vec, rather than using the bytes crate for that.

@Nicceboy
Copy link
Contributor

I have a working branch which introduces new lifetime type for Encoder and allows reusing the same buffer recursively quite much with the new AnyEncoder associated type, without Rc or RefCell. It works with plain &mut Vec<u8> type, for OER and maybe for PER too with .view_bits_mut::<Msb0>(); (haven't tested yet).

From performance wise, it would also make sense to introduce encode_buf method since it can more than half the required allocations if the same buffer can be reused. However, I am not sure if this type could be Bytes, at least internally, since it is much slower. Generic type internally would also make things complicated. Would there be some other approach?

@shahn
Copy link
Contributor Author

shahn commented Oct 30, 2024

Interesting, do you know why Bytes is so much slower? Perhaps that is something that also would be of interest to the tokio folks, since I assume they care about Bytes performance a lot

@Nicceboy
Copy link
Contributor

This might a bit unfair comparision since Vec might be as fast as it gets, if you don't need some features of Bytes type. Bytes type has an advantage that you can split or slice the buffer without any new allocations (just by getting new "view" to the same buffer), so if you need to do that often, it might be faster in the end. It uses reference counters, and internal structure is more complicated, so just adding data to the buffer is slower in general, and the tradeoff comes here. It also allows shared ownership and zero-copy cloning.

There are actually few cases where I haven't tested yet where that splitting could be valuable and maybe the performance difference is not that clear. In general, there are many cases where you need to store the encoded values in separate buffer before you can calculate the length, and splitting can be useful if it saves some allocations or moving/cloning. But even if it would be possible to use Bytes in OER, for example , PER has a high dependency of BitVec so that change is unlikely and would require major rework and encode_buf should work somewhat similarly for all codecs.

Of course encode_buf could be also implemented "inefficiently" so that the caller provides the buffer but the codec does not use it internally and just moves the data in there in the end.

@shahn
Copy link
Contributor Author

shahn commented Oct 30, 2024

I mean, I think the last option is not really what I was after, the goal was to reduce allocations for sure :) I use Bytes a lot for receiving data from the network and then working with it, to avoid allocations (I had some trouble with memory fragmentation in a different project, so kind of defaulting to it). But even providing a Vec which can be re-used would be helpful for that.

Thanks for educating me on the intricacies of different encodings, I am pretty new to ASN.1.

@Nicceboy
Copy link
Contributor

Just a note, OER has now encode_buf for &mut Vec<u8> type for starters. There is internally second buffer for length calculations but other than that, allocations are now almost at theoretical minimum. Traits should support adding the same for other codecs too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants