
A better way to resize the buffer for the snappy encode/decode #6276

Closed
ShiKaiWi opened this issue Aug 20, 2024 · 3 comments
Labels
enhancement: Any new improvement worthy of an entry in the changelog
parquet: Changes to the parquet crate

Comments

ShiKaiWi (Member) commented Aug 20, 2024

Is your feature request related to a problem or challenge? Please describe what you are trying to do.

In my use case, reading data from a large parquet file on disk, snappy decoding consumes a disproportionate share of CPU time in my flamegraph. Digging into the codebase, I found that resize is used to initialize the destination buffer for decoding, but resize performs poorly when the target size is large.

Here is the code location of the resize call:

fn decompress(
    &mut self,
    input_buf: &[u8],
    output_buf: &mut Vec<u8>,
    uncompress_size: Option<usize>,
) -> Result<usize> {
    let len = match uncompress_size {
        Some(size) => size,
        None => decompress_len(input_buf)?,
    };
    let offset = output_buf.len();
    output_buf.resize(offset + len, 0);
    self.decoder
        .decompress(input_buf, &mut output_buf[offset..])
        .map_err(|e| e.into())
}

And here is an example demonstrating the poor performance of resizing a vector:
https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=5321aaeda859ea9f664ae7952ade2fb6

And the example's output says:

init with macro, num_elems:10000, cost:7.74µs
init with resize, num_elems:20000, cost:196.694µs
init with set_len, num_elems:30000, cost:5.171µs

It shows that growing a vector with resize costs far more time than initializing with the vec! macro or extending with set_len.
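Since playground links can rot, here is a minimal sketch of the kind of comparison that example makes; the exact code and element counts in the playground may differ:

use std::time::Instant;

fn main() {
    let num_elems = 10_000;

    // vec! macro: allocates and zero-fills in one step (this path can
    // use pre-zeroed memory from the allocator).
    let start = Instant::now();
    let v = vec![0u8; num_elems];
    println!("init with macro, num_elems:{}, cost:{:?}", v.len(), start.elapsed());

    // resize: grows the Vec and writes the fill value into the new tail,
    // element by element in unoptimized builds.
    let start = Instant::now();
    let mut v: Vec<u8> = Vec::new();
    v.resize(num_elems, 0u8);
    println!("init with resize, num_elems:{}, cost:{:?}", v.len(), start.elapsed());

    // set_len: with_capacity allocates, set_len only bumps the length;
    // nothing is written, so the contents stay uninitialized.
    let start = Instant::now();
    let mut v: Vec<u8> = Vec::with_capacity(num_elems);
    unsafe { v.set_len(num_elems) };
    println!("init with set_len, num_elems:{}, cost:{:?}", v.len(), start.elapsed());
}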

Describe the solution you'd like

In the example above, set_len performs well, so I suggest replacing resize with set_len whenever the capacity is already sufficient:

let new_len = offset + len;
if output_buf.capacity() >= new_len {
    // The allocation is already large enough: skip the zero-fill and
    // just extend the length.
    unsafe {
        output_buf.set_len(new_len);
    }
} else {
    output_buf.resize(new_len, 0);
}

Although an unsafe block is introduced here, it is actually safe considering:

  1. the elements of the vector are bytes (no destructors to worry about);
  2. the uninitialized content will be overwritten immediately.
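For concreteness, here is a sketch of how the decompress function quoted above would look with this guard in place; it reuses the same surrounding names and is an illustration, not the merged patch:

fn decompress(
    &mut self,
    input_buf: &[u8],
    output_buf: &mut Vec<u8>,
    uncompress_size: Option<usize>,
) -> Result<usize> {
    let len = match uncompress_size {
        Some(size) => size,
        None => decompress_len(input_buf)?,
    };
    let offset = output_buf.len();
    let new_len = offset + len;
    if output_buf.capacity() >= new_len {
        // Reuse the existing allocation without zero-filling; the
        // decoder overwrites output_buf[offset..] before it is read.
        unsafe { output_buf.set_len(new_len) };
    } else {
        output_buf.resize(new_len, 0);
    }
    self.decoder
        .decompress(input_buf, &mut output_buf[offset..])
        .map_err(|e| e.into())
}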

Describe alternatives you've considered

Additional context

ShiKaiWi added the enhancement label Aug 20, 2024
ShiKaiWi (Member Author) commented:

In the linked PR, I propose a new way to resize without initialization:

fn resize_without_init(buf: &mut Vec<u8>, n: usize) {
    if n > buf.capacity() {
        // reserve takes the additional count relative to len, not
        // capacity, so this guarantees capacity >= n.
        buf.reserve(n - buf.len());
    }
    unsafe { buf.set_len(n) };
}
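The payoff comes when one buffer is reused across many decode calls: after the first allocation, later calls only bump the length. A hypothetical reuse pattern (buffer sizes are illustrative):

// First call allocates but skips zero-filling.
let mut buf = Vec::new();
resize_without_init(&mut buf, 64 * 1024);
// ... decoder fills buf[..64 * 1024] ...
buf.clear();
// Later call reuses the allocation: a pure set_len, no writes at all.
resize_without_init(&mut buf, 32 * 1024);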

ShiKaiWi (Member Author) commented:

Closing. The conclusion for this ticket is here.

alamb added the parquet label Aug 31, 2024
alamb (Contributor) commented Aug 31, 2024

label_issue.py automatically added labels {'parquet'} from #6281
