
A better way to resize the buffer for the snappy encode/decode #6276

Closed
ShiKaiWi opened this issue Aug 20, 2024 · 3 comments
Labels
enhancement: Any new improvement worthy of an entry in the changelog
parquet: Changes to the parquet crate

Comments

ShiKaiWi (Member) commented Aug 20, 2024

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

In my use case, reading data from a large parquet file on disk, snappy decoding consumes a disproportionate share of CPU time in my flamegraph. Digging into the codebase, I found that resize is used to initialize the destination buffer for decoding, but resize performs poorly when the target size is large.

Here is the code location of the resize call:

fn decompress(
    &mut self,
    input_buf: &[u8],
    output_buf: &mut Vec<u8>,
    uncompress_size: Option<usize>,
) -> Result<usize> {
    let len = match uncompress_size {
        Some(size) => size,
        None => decompress_len(input_buf)?,
    };
    let offset = output_buf.len();
    output_buf.resize(offset + len, 0);
    self.decoder
        .decompress(input_buf, &mut output_buf[offset..])
        .map_err(|e| e.into())
}

And here is an example demonstrating the poor performance of resizing a vector:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=5321aaeda859ea9f664ae7952ade2fb6

And the example's output says:

init with macro, num_elems:10000, cost:7.74µs
init with resize, num_elems:20000, cost:196.694µs
init with set_len, num_elems:30000, cost:5.171µs

It shows that growing a vector with resize costs far more time than initializing with the vec! macro or extending with set_len.
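Since playground links can rot, here is a minimal sketch of the kind of comparison that example makes; the exact code and element counts in the playground may differ:

use std::time::Instant;

fn main() {
    let num_elems = 10_000;

    // vec! macro: allocates and zero-fills in one step (this path can
    // use pre-zeroed memory from the allocator).
    let start = Instant::now();
    let v = vec![0u8; num_elems];
    println!("init with macro, num_elems:{}, cost:{:?}", v.len(), start.elapsed());

    // resize: grows the Vec and writes the fill value into the new tail,
    // element by element in unoptimized builds.
    let start = Instant::now();
    let mut v: Vec<u8> = Vec::new();
    v.resize(num_elems, 0u8);
    println!("init with resize, num_elems:{}, cost:{:?}", v.len(), start.elapsed());

    // set_len: with_capacity allocates, set_len only bumps the length;
    // nothing is written, so the contents stay uninitialized.
    let start = Instant::now();
    let mut v: Vec<u8> = Vec::with_capacity(num_elems);
    unsafe { v.set_len(num_elems) };
    println!("init with set_len, num_elems:{}, cost:{:?}", v.len(), start.elapsed());
}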

Describe the solution you'd like

In the example above, set_len performs well, so I suggest replacing resize with set_len whenever the capacity is already sufficient:

let new_len = offset + len;
if output_buf.capacity() >= new_len {
    // The allocation is already large enough: skip the zero-fill and
    // just extend the length.
    unsafe {
        output_buf.set_len(new_len);
    }
} else {
    output_buf.resize(new_len, 0);
}

Although an unsafe block is introduced here, it is actually safe considering:

  1. the elements of the vector are bytes (no destructors to worry about);
  2. the uninitialized content will be overwritten immediately.
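For concreteness, here is a sketch of how the decompress function quoted above would look with this guard in place; it reuses the same surrounding names and is an illustration, not the merged patch:

fn decompress(
    &mut self,
    input_buf: &[u8],
    output_buf: &mut Vec<u8>,
    uncompress_size: Option<usize>,
) -> Result<usize> {
    let len = match uncompress_size {
        Some(size) => size,
        None => decompress_len(input_buf)?,
    };
    let offset = output_buf.len();
    let new_len = offset + len;
    if output_buf.capacity() >= new_len {
        // Reuse the existing allocation without zero-filling; the
        // decoder overwrites output_buf[offset..] before it is read.
        unsafe { output_buf.set_len(new_len) };
    } else {
        output_buf.resize(new_len, 0);
    }
    self.decoder
        .decompress(input_buf, &mut output_buf[offset..])
        .map_err(|e| e.into())
}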

Describe alternatives you've considered

Additional context

ShiKaiWi added the enhancement label Aug 20, 2024
ShiKaiWi (Member Author) commented:

In the linked PR, I propose a new way to resize without initialization:

fn resize_without_init(buf: &mut Vec<u8>, n: usize) {
    if n > buf.capacity() {
        // reserve takes the additional count relative to len, not
        // capacity, so this guarantees capacity >= n.
        buf.reserve(n - buf.len());
    }
    unsafe { buf.set_len(n) };
}
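The payoff comes when one buffer is reused across many decode calls: after the first allocation, later calls only bump the length. A hypothetical reuse pattern (buffer sizes are illustrative):

// First call allocates but skips zero-filling.
let mut buf = Vec::new();
resize_without_init(&mut buf, 64 * 1024);
// ... decoder fills buf[..64 * 1024] ...
buf.clear();
// Later call reuses the allocation: a pure set_len, no writes at all.
resize_without_init(&mut buf, 32 * 1024);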

ShiKaiWi (Member Author) commented:

Closing. The conclusion for this ticket is here.

alamb added the parquet label Aug 31, 2024
alamb (Contributor) commented Aug 31, 2024

label_issue.py automatically added labels {'parquet'} from #6281
