Avoid double memory allocation when encoding/decoding state witness #11064

pugachAG · 2024-04-12T20:03:50Z

Currently we allocate memory for the Borsh-serialized bytes before compressing the state witness, so the witness is held in memory twice.
This can be avoided with streaming, see this comment.
The implementation would look something like this:

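        use bytes::BufMut; // provides `.writer()` on `Vec<u8>`

        // Borsh writes straight into the zstd encoder, so no intermediate buffer is needed.
        let mut buf = Vec::new().writer();
        let mut encoder = zstd::stream::Encoder::new(&mut buf, STATE_WITNESS_COMPRESSION_LEVEL)?;
        borsh::to_writer(&mut encoder, witness)?;
        encoder.finish()?;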

The only missing bit is counting the bytes in the Borsh representation of the state witness. This can be implemented with a wrapper around Write that counts the bytes and then passes them downstream. Maybe the bytes crate already has something like this.
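A minimal sketch of such a counting wrapper, using only the standard library. The name `CountingWrite`, its fields, and the helper `count_example` are illustrative assumptions here, not necessarily what ends up in the codebase; in the real pipeline the inner writer would be the zstd encoder rather than a `Vec<u8>`.

```rust
use std::io::{self, Write};

/// Hypothetical wrapper: forwards every write to `inner` and counts
/// the bytes the inner writer actually accepted.
struct CountingWrite<W: Write> {
    inner: W,
    written: usize,
}

impl<W: Write> CountingWrite<W> {
    fn new(inner: W) -> Self {
        Self { inner, written: 0 }
    }

    /// Total number of bytes passed downstream so far.
    fn bytes_written(&self) -> usize {
        self.written
    }

    /// Give back the wrapped writer.
    fn into_inner(self) -> W {
        self.inner
    }
}

impl<W: Write> Write for CountingWrite<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Count only what the inner writer reports as written.
        let n = self.inner.write(buf)?;
        self.written = self.written.saturating_add(n);
        Ok(n)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}

/// Illustrative usage: a `Vec<u8>` stands in for the downstream encoder.
fn count_example() -> io::Result<(usize, Vec<u8>)> {
    let mut out = CountingWrite::new(Vec::new());
    out.write_all(b"hello ")?;
    out.write_all(b"world")?;
    let count = out.bytes_written();
    Ok((count, out.into_inner()))
}
```

Because the wrapper sits between the serializer and the compressor, the count reflects the uncompressed Borsh size rather than the compressed output size.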

Decoding is similar, see this comment.


nagisa · 2024-04-13T09:02:56Z

A sketch of an implementation:

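use std::io::{self, Write};

struct LimitedWrite<W: Write> {
    limit: usize,
    written: usize,
    inner: W,
}

impl<W: Write> Write for LimitedWrite<W> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Truncate the write so that the total never exceeds `limit`.
        // Once the limit is reached this returns Ok(0), which `write_all`
        // reports as a `WriteZero` error.
        let remaining = self.limit.saturating_sub(self.written);
        let to_write = buf.len().min(remaining);
        let written = self.inner.write(&buf[..to_write])?;
        self.written = self.written.saturating_add(written);
        Ok(written)
    }

    fn flush(&mut self) -> io::Result<()> {
        self.inner.flush()
    }
}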

tayfunelmas · 2024-04-15T20:29:58Z

Looks like a good starting task and already has the solution, so claiming it :P

…g state witness (#11232)

Use streaming reads and writes for encoding and decoding the state witness.

Introduce CountingRead and CountingWrite as wrappers for Read and Write, respectively, to count the number of bytes read/written. These are used to pipe the state witness through compression and Borsh serialization, while counting the number of bytes in the Borsh-serialized version of the witness.

For encoding, CountingWrite connects `borsh::to_writer` to `zstd::stream::Encoder`, counting the bytes produced by Borsh serialization in between.

For decoding, CountingRead connects `zstd::stream::Decoder` to `borsh::from_reader`, counting the bytes consumed by Borsh deserialization in between. It also applies a limit and fails the decoding when that limit is exceeded.

Issue: #11064
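The decoding-side wrapper described above could look roughly like the following. This is a standard-library-only sketch, not the code from the PR: the field names, the `InvalidData` error kind, and the helper `read_all_limited` are assumptions made for illustration, and a byte slice stands in for the zstd decoder.

```rust
use std::io::{self, Read};

/// Hypothetical wrapper: counts the bytes read from `inner` and fails
/// once the running total exceeds `limit`.
struct CountingRead<R: Read> {
    inner: R,
    count: usize,
    limit: usize,
}

impl<R: Read> CountingRead<R> {
    fn new(inner: R, limit: usize) -> Self {
        Self { inner, count: 0, limit }
    }

    /// Total number of bytes read from the inner reader so far.
    fn bytes_read(&self) -> usize {
        self.count
    }
}

impl<R: Read> Read for CountingRead<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let n = self.inner.read(buf)?;
        self.count = self.count.saturating_add(n);
        if self.count > self.limit {
            // Assumed error kind; the real implementation may differ.
            return Err(io::Error::new(
                io::ErrorKind::InvalidData,
                "read limit exceeded",
            ));
        }
        Ok(n)
    }
}

/// Illustrative usage: read everything from `data`, enforcing `limit`.
fn read_all_limited(data: &[u8], limit: usize) -> io::Result<(Vec<u8>, usize)> {
    let mut reader = CountingRead::new(data, limit);
    let mut out = Vec::new();
    reader.read_to_end(&mut out)?;
    let count = reader.bytes_read();
    Ok((out, count))
}
```

Checking the limit on the decompressed side bounds how much memory a small compressed payload can expand into during decoding.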

tayfunelmas · 2024-05-08T19:24:16Z

Addressed in #11232.

pugachAG added the C-good-first-issue (Category: issues that are self-contained and easy for newcomers to work on) and A-stateless-validation (Area: stateless validation) labels Apr 12, 2024

pugachAG mentioned this issue Apr 12, 2024

feat: compress state witness #10715

Merged

walnut-the-cat mentioned this issue Apr 15, 2024

[ProjectTracking]: Stateless validation Mainnet Release near/near-one-project-tracking#46

Open

52 tasks

tayfunelmas self-assigned this Apr 15, 2024

github-actions bot mentioned this issue May 1, 2024

Monthly issue metrics report #11194

Open

tayfunelmas mentioned this issue May 3, 2024

feat: Stream Borsh serialization and compression for encoding/decoding state witness #11232

Merged

tayfunelmas closed this as completed May 8, 2024
