
[WIP] Efficient movement of Vec<u8> to PyBytes #1427

Closed

Conversation

milesgranger
Contributor

Hi!

This is related to #1385

I have a de/compression lib which takes some bytes in from Python, then passes bytes back.
I really like how PyBytes::new_with works; unfortunately, I don't know the exact size of the output slice until after the de/compression operation, so using this approach I end up with trailing null bytes.

>>> compress(b'bytes')
b'\xff\x06\x00\x00sNaPpY\x01\t\x00\x00\xb5\x8b\xa8\xdbbytes\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
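The trailing zeros are the unwritten tail of the over-allocated buffer: the buffer is sized to a worst-case bound up front, and the compressor only fills the front of it. A minimal Python sketch of the same effect (the payload and buffer size here are made up for illustration):

```python
# Hypothetical illustration: write "compressed" data into a buffer that was
# sized to a worst-case bound before the real output length was known.
payload = b"bytes-compressed"      # pretend this is the compressor's output
buf = bytearray(24)                # over-allocated worst-case buffer
buf[: len(payload)] = payload      # only the front is actually written
result = bytes(buf)                # the unwritten tail stays b"\x00"
print(result)
```
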

Using the standard PyBytes::new from a slice forces an extra allocation and copy, which leaves the lib slower than the existing Python/C options. The approach above, however, is reliably 1.5x-2x faster, so I'm really hoping something like this is possible. 🤞

As you see in the PR, I naively attempted to use ffi::_PyBytes_Resize but it seems to have no effect?

Thanks for the help!

(related to benchmarks: milesgranger/cramjam#12 (comment))

@birkenfeld
Member

The fact that _PyBytes_Resize gets a double pointer is a hint to how it works: it does a realloc, so the object's address can change, and the old address is invalid.

This is definitely not safe to use with the Py reference, I think you'd have to do everything in PyObject* form.

However, this (overallocating and then truncating) sounds more like a task for bytearray, which natively supports truncation?
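For reference, a bytearray can be shrunk in place after over-allocating, without constructing a new object for the retained prefix; a small sketch (buffer size and payload are hypothetical):

```python
# Over-allocate, write the real output, then truncate in place.
buf = bytearray(24)                # worst-case sized buffer
payload = b"actual-output"         # pretend this is the compressor's output
buf[: len(payload)] = payload
del buf[len(payload):]             # truncate; same bytearray object, no tail
print(bytes(buf))
```
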

@milesgranger
Contributor Author

Thanks for the pointers! 👍

@milesgranger milesgranger deleted the optimize-rust-to-python-bytes branch February 23, 2021 07:46
@indygreg
Contributor

I was going to file an issue requesting a higher-level API built on _PyBytes_Resize, because I want it for https://github.com/indygreg/python-zstandard so that (de)compression can be zero-copy, and I stumbled upon this issue. (The existing C extension code uses _PyBytes_Resize heavily.)

While I agree bytearray is a better type for mutable binary buffers, the reason I don't want to use it is that it is a separate type from bytes. python-zstandard today uses bytes as its output type, and switching to bytearray would likely break existing code. Yes, you can coerce a bytearray to bytes easily; however, that coercion isn't zero-copy, so it reintroduces the performance overhead.
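To make that cost concrete, the bytes(bytearray) coercion always allocates and copies the whole buffer; the two objects do not share storage:

```python
buf = bytearray(b"zstd-frame")     # mutable output buffer (example data)
frozen = bytes(buf)                # allocates a new bytes object and copies
buf[0:4] = b"ZSTD"                 # mutating the bytearray afterwards...
assert frozen == b"zstd-frame"     # ...does not affect the bytes copy
```
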

I could certainly call the Python C APIs directly. But I was hoping there would be room in PyO3 to support some (possibly unsafe) methods on its PyBytes wrapper to support resizing buffers. But from the looks of this conversation it appears the mutable object address (via a potential realloc) is a deal-breaker due to how PyO3 tracks objects?

@davidhewitt
Member

@indygreg we actually once considered an extension to the PyBytes::new_with API to allow safely resizing at the end of the initialization step - see #1074 (comment).

You could potentially roll something yourself which does that - or add it as a PR to pyo3. There's a long discussion in that thread why we decided not to do it at the time.

In addition, we could support unsafe fn resize(&mut self, new_size: usize) on PyBytes<'py> after the owning object model in #1308.
