
Support de/compress_into(ptr, len) #23

Closed · milesgranger opened this issue Feb 15, 2021 · 4 comments · Fixed by #26

Comments

@milesgranger (Owner)
de/compress directly into a Python buffer

@milesgranger (Owner, Author) commented Feb 15, 2021

@martindurant will something like this work for you?

>>> import numpy as np
>>> from cramjam import snappy
>>> values = np.zeros(100, dtype=np.uint8)
>>> snappy.compress_into(b"bytes", values)
>>> values
array([255,   6,   0,   0, 115,  78,  97,  80, 112,  89,   1,   9,   0,
         0, 181, 139, 168, 219,  98, 121, 116, 101, 115,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
         0,   0,   0,   0,   0,   0,   0,   0,   0], dtype=uint8)
>>> values.tobytes()
b'\xff\x06\x00\x00sNaPpY\x01\t\x00\x00\xb5\x8b\xa8\xdbbytes\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'

then decompress_into(bytes, array) as well?

@martindurant

Yes, perfect. In practice I expect to use only decompress_into, although compress_into might have an interesting application with memory-mapped files. I would have either function return the number of bytes written.

Am I right in assuming that the zero padding left over when the output length isn't exactly right would cause the decompress to panic?

@milesgranger (Owner, Author) commented Feb 15, 2021

Ok. When prototyping this, it felt like one might want compress_into(source_array, dst_array), but if the typical call is compress_into(bytes, array) then I'm good with that as well. 👍

And yes, I just noticed that myself; it should return the number of bytes written. Good catch.

Most of the de/compression APIs implement std::io::Read, so when passed a slice we read the input and de/encode until we reach the end of the output buffer, whereas a Vec<u8> would keep growing until all bytes are de/encoded via read_to_end. The point being, with a reference to the Python array we can only get a slice, so we can only write up to the end of that buffer.

So long as it successfully gets the references, it will not panic, regardless of whether the output buffer is too short or too long.
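As an aside, the slice-vs-Vec<u8> distinction above has a rough Python stdlib analogy (this is illustration only, not cramjam code): io.BytesIO.readinto plays the role of reading into a fixed-size slice and stops at the end of the output buffer, while read() plays the role of read_to_end into a growable Vec<u8>:

```python
import io

src = io.BytesIO(b"hello world")

# Fixed-size buffer: like a Rust slice, the read stops when the end of
# the output buffer is reached, whether or not the input is exhausted.
buf = bytearray(5)
n = src.readinto(buf)   # n == 5, buf == bytearray(b"hello")

# Growable read: like read_to_end into a Vec<u8>, this consumes the
# remaining input in full.
rest = src.read()       # b" world"
```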

@martindurant

> The point being, with a reference to the array, we can only have a slice, thus only have the ability to write to the end of the buffer.

There's no particular reason to handle this on the Rust side, since making a slice/view/memoryview on the Python side is copy-free and almost zero-cost. This was more idle curiosity. It probably just needs a couple of comments in the eventual test suite - i.e., we get the number of bytes back at compression, then slice the resulting data before decompression and get the original back.
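The slice-then-decompress test pattern described above can be sketched with stdlib zlib standing in for cramjam's codecs (zlib has no real *_into variant, so the buffer write and byte count are simulated here; all names are illustrative, not the eventual API):

```python
import zlib

data = b"bytes to round-trip"

# Preallocate an oversized output buffer, as with the numpy array above.
out = bytearray(100)

# Simulate the proposed compress_into: write into the caller-owned
# buffer and report the number of bytes written.
compressed = zlib.compress(data)
nbytes = len(compressed)
out[:nbytes] = compressed

# The test-suite pattern: slice off the zero padding using the returned
# byte count before decompressing, and get the original data back.
assert zlib.decompress(bytes(out[:nbytes])) == data
```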
