Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid double memory allocation when encoding/decoding state witness #11064

Closed
Tracked by #46
pugachAG opened this issue Apr 12, 2024 · 3 comments
Closed
Tracked by #46

Avoid double memory allocation when encoding/decoding state witness #11064

pugachAG opened this issue Apr 12, 2024 · 3 comments
Assignees
Labels
A-stateless-validation Area: stateless validation C-good-first-issue Category: issues that are self-contained and easy for newcomers to work on.

Comments

@pugachAG
Copy link
Contributor

Currently we allocate memory for borsh bytes before compressing state witness.
This can be avoided with streaming, see this comment
The implementation would look something like this:

        let mut buf = Vec::new().writer();
        let mut encoder = zstd::stream::Encoder::new(&mut buf, STATE_WITNESS_COMPRESSION_LEVEL)?;
        borsh::to_writer(&mut encoder, witness)?;
        encoder.finish()?;

The only missing bit is counting bytes in borsh representation of state witness. This can be implemented with wrapper on top of Write to count bytes and then write it downstream. Maybe bytes crate already has something like this.

Decoding is similar, see this comment.

@pugachAG pugachAG added C-good-first-issue Category: issues that are self-contained and easy for newcomers to work on. A-stateless-validation Area: stateless validation labels Apr 12, 2024
@nagisa
Copy link
Collaborator

nagisa commented Apr 13, 2024

A sketch of an implementation:

struct LimitedWrite<W: Write> {
    limit: usize,
    written: usize,
    inner: W,
}

impl<W: Write> Write for LimitedWrite {
    fn write(&mut self, mut buf: &[u8]) -> io::Result<usize> {
        let mut to_write = buf.len();
        if self.written.saturating_add(to_write) >= self.limit {
            to_write = self.limit - self.written;
        }
        let written = self.inner.write(&buf[..to_write])?;
        self.written = self.written.saturating_add(written);
        Ok(written)
    }
}

@tayfunelmas
Copy link
Contributor

Looks like a good starting task and already has the solution, so claiming it :P

@tayfunelmas tayfunelmas self-assigned this Apr 15, 2024
github-merge-queue bot pushed a commit that referenced this issue May 6, 2024
…g state witness (#11232)

Use streaming reads and writes for encoding and decoding the state
witness.

Introduce CountingRead and CountingWrite as wrappers for Read and Write,
respectively, to count the number of bytes read/written. These are used
to pipe the state witness through compression and Borsh serialization,
while counting the number of bytes of Borsh serialized version of the
witness.

For encoding, CountingWrite connects `borsh::to_writer` to
`zstd::stream::Encoder`, while counting the bytes from Borsh
serialization in between.

For decoding, CountingRead connects `zstd::stream::Decoder` to
`borsh::from_reader`, while counting the bytes from Borsh
deserialization in between. It also applies a limit, when exceeded fails
the decoding.


Issue: #11064
@tayfunelmas
Copy link
Contributor

tayfunelmas commented May 8, 2024

Addressed in #11232.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-stateless-validation Area: stateless validation C-good-first-issue Category: issues that are self-contained and easy for newcomers to work on.
Projects
None yet
Development

No branches or pull requests

3 participants