-
Notifications
You must be signed in to change notification settings - Fork 29
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: Add new efficient API read_to_vec #246
Conversation
|
fn read_to_vec(&self, offset: u64, len: usize, dst: &mut std::vec::Vec<u8>) { | ||
dst.resize(len, 0); | ||
self.read(offset, &mut dst[..]); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure if we want to have this be part of the Memory
trait. It depends on how religious we want to be here.
The Memory
trait is supposed to model the operations that are available for stable memory which in turn models the operations available for heap memory in Wasm. There's no read_to_vec
so following that logic, maybe we avoid adding it to the trait.
We can probably have it as a free function or you think there's a significant advantage in having it on the trait?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The benefit of having it on the trait is that implementors of the trait can decide to override it in an efficient way. For example, Ic0StableMemory
can read directly read into a Vec
skipping initialization.
is supposed to model the operations that are available for stable memory
That's a valid point. Another alternative could be changing the read method to read into dst: &mut [MaybeUninit<u8>]
which is technically the truth. It's unfortunately an unsolved problem in Rust to create uninitialized [u8]
slices (something totally acceptable in C or C++). The documentation is very explicit about warning people not to create slices with uninitialized values even if those values will never be read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One more alternative would be changing the read method to take a raw pointer and a length. In such cases, Rust does not require any initialization. Callers could then do:
let mut v = Vec::with_capacity(len);
memory.read(offset, len, v.as_mut_ptr());
v.set_len(x);
However, read
would likely need to be marked as unsafe since clients would need to guarantee that len
number of bytes after the passed pointer are writable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested what if we replace read_to_vec with unsafe fn read_unsafe(&self, offset: u64, dst: *mut u8, count: usize)
It allows for even better optimizations than read_to_vec
leading to up to 40% instruction reduction. This function is unsafe because clients must promise that count
number of bytes after dst are writeable.
(We could still keep read_to_vec
but make it into an associated method like you mentioned. It would provide a safe and efficient way for clients to read and would use read_unsafe
under the hood.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, to make sure I understand your suggestion correctly, it would be:
- Add
unsafe fn read_unsafe
for maximum performance if clients know what they are doing -- not super excited because I think it can be a big trap but as long as we provide safe alternatives I'm not strongly opposed. - Add
read_to_vec
as an associated method, not on the trait. - We would also leave the original
read
but it's the slowest of them all?
Thanks for all the exploration around possible options.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's correct. I'd keep the original read
because sometimes you have a fixed-size array that you want to fill. If the array is small, initializing it is fast and using a vector is unnecessary (and slower). read
is the best option for this use-case.
When reading a larger blob, it's better not to allocate on the stack, so reading into a Vec
is the best approach. read_to_vec
can be used for that.
And Memory
implementations must be able to read into uninitialized buffers (pointer + length) which is trivial for the mostly used Ic0Memory implementation. The default implementation will just initialize the buffer with 0-s first and then call the existing read()
. If we make read_to_vec
into an associated function, we don't have to pollute the Memory interface with Vec
.
if clients know what they are doing
The API will be the same as for example std::ptr::copy
with the same requirements. And we can provide safe wrappers while staying efficient.
I'll send a new PR with this new approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bit of a random question, but we have several canisters that use stable-structures, and we'd love to benefit from these improvements - would we be able to just by upgrading the crate once this change is out?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rikonor Yeah, we'll need to cut a new release because there's a few optimizations already included that are unreleased. Maybe we can target to make a release after this change with read_to_vec
is merged.
Superseded by #248 |
I found that a source of significant performance loss is the
read
method ofMemory
. Theread
method takes a mutable buffer which it fills with values read from the stable memory. According to Rust rules, the buffer passed toread
must be initialized before it's passed toread
(buffers containing uninitialized values are unsound and can cause UB).The usual pattern is to create a properly sized
Vec
, eg. by usingvec![0; size]
orvec.resize(size, 0)
and pass that toread
. However, initializing the bytes with values that get overwritten byread
is only necessary in order to be sound and requires significant number of instructions.This PR introduces a new method
read_to_vec
which allows passing in aVec
with any size (most of the time an emptyVec
will be used). Implementations can be more efficient by reading directly into theVec
and skipping initialization. This can lead to instruction reductions of up to 20%.