Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

specialize collections holding bytes to turn into PyBytes rather than PyList #4417

Closed
wants to merge 7 commits into from

Conversation

Icxolu
Copy link
Contributor

@Icxolu Icxolu commented Aug 4, 2024

Followup to #4060
XRef #4182

This implements specialization for collections holding bytes (in the new IntoPyObject trait) by a provided, sealed iter_into_pyobject method. The method is intended to be overwritten by the element type (for now u8 only) and turns an iterator of elements into an appropriate Python type in type erased form. The default implementation creates a PyList while the u8 override returns PyBytes.

This method is then used by the generic IntoPyObject::into_pyobject implementation of collection types. I updated the implementations of Vec<T>, &[T], [T; N] and SmallVec<A> accordingly. (I hope did not forget something)

From a quick test everything seems to work out. If this seems sensible to work with, I will add a few tests and the newsfragment.

Big thanks to @diliop for the previous work in #4182!

@Icxolu Icxolu force-pushed the pybytes-specialization branch from bec3c73 to 16120ec Compare August 6, 2024 17:31
Copy link
Member

@davidhewitt davidhewitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'm very excited to see this and fix the longstanding footgun! We should probably mention this in the migration guide, as this will be a breaking change for return values of types like Vec<u8>.

Comment on lines +228 to +236
PyBytes::new_with(py, len, |buf| {
let mut counter = 0;
for (slot, byte) in buf.iter_mut().zip(&mut iter) {
*slot = byte;
counter += 1;
}

assert!(iter.next().is_none(), "Attempted to create PyBytes but `iter` was larger than reported by its `ExactSizeIterator` implementation.");
assert_eq!(len, counter, "Attempted to create PyBytes but `iter` was smaller than reported by its `ExactSizeIterator` implementation.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So this is interesting, I wonder, does this have a performance impact over just using PyBytes::new? As far as I can see, the problem is that we can't trivially get a slice, hence this approach.

I think the way to use PyBytes::new would be to make a trait like SliceableIterator: IntoIterator<Item = Self> which can either return a slice of all the elements in one go, or just call .into_iterator() to do normal iteration which transfers ownership. Then we use this trait instead of I: IntoIterator as the input.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good question, I haven't tested whether this has any performance impact. I initially tried to do this with slices rather than iterators, but ran into problems with the generics (there were always T and &T involved which caused all sorts of problems)

I'm not sure if I get the idea of SliceableIterator. How would we specialize between slices and other IntoIterators?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm yeah playing around with this locally I see exactly what you mean by the interaction between T and &T. I feel like there should be a solution so I'm going to continue to mull this over for a day or two.

So far I got to something like this:

pub trait SliceableIntoPyObjectIterator<'py>: Sized {
    type SliceItem;
    type IntoPyObjectItem: IntoPyObject<'py>;
    type IntoIter: ExactSizeIterator<Item = Self::IntoPyObjectItem>;
    fn into_py_object_iterator(self) -> Self::IntoIter;
    fn as_slice(&self) -> &[Self::SliceItem];
}

where the bytes case then becomes

fn iter_into_pyobject<I>(
    iter: I,
    py: Python<'py>,
    _: crate::conversion::private::Token,
) -> Result<Bound<'py, PyAny>, PyErr>
where
    I: crate::conversion::SliceableIntoPyObjectIterator<'py, SliceItem = Self>,
{
    Ok(PyBytes::new(py, iter.as_slice()).into_any())
}

... and the default implementation calls the .into_py_object_iterator() method instead to consume the elements.

But as you found, I'm having trouble with T and &T. 😂

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so I gave this a really deep investigation and found that yes, new_with is a bit slower than PyBytes::new, but it's not so much slower that it's a deal breaker.

I opened #4423 and #4424 which are two attempts to try to optimise the conversion a little bit. Both branches include two commits; the first is a set of benchmarks which I was using to study performance.

Overall, I think I'm more comfortable with the approach taken in #4424 which doesn't attempt to use unsafe to be able to then use PyBytes::new, and instead is essentially a small tidy up on top of this branch. If users report performance issues we can always revisit the unsafe option later. But this is already so much faster than the current Vec<u8> conversion which creates a list of integers that I doubt users will really notice the overhead of us using PyBytes::new_with here.

Would be interested to hear what you think of those options.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the throughout investigation! I'm also way more comfortable with the iterator approach in #4424. Unsafely casting all these types seems like at footgun to me at best, even if this would be fully contained within PyO3.

For the general case I'm unsure about whether it matters for soundness to have an instantiation of impl or whether it needs to be valid for all T in general. Having a safety guarantee about the instantiation of a generic type sounds a bit sketchy to me. If we're on the cautious side and say it has to be valid for all T I think this is at least invalid for zero sized types, since these would only be valid for zero sized reads on their dangling pointer and not any following address.

And also for this specific case I think this is unsound for something like a Vec<&u8> which would use <&u8>::iter_into_pyobject, which itself would use Vec<&u8>::as_bytes_slice and thus cast &u8 to u8

But this is already so much faster than the current Vec<u8> conversion which creates a list of integers that I doubt users will really notice the overhead of us using PyBytes::new_with here.

I agree here. This was also a reason why I initially did not put too much thought into it, since it looked like I would have at worst a similar performance characteristic to the list case we have currently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we want to proceed here @davidhewitt? Should we merge this as is and then followup in #4424 or should we just cleanup #4424 and merge that directly?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay, I just had a potentially better idea in #4442 which uses PyBytes::new and allows &[u8] target type to be PyBytes. If you like it, maybe let's merge that one directly?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, I just wasn't sure whether we were still thinking here of we reached consent 🙃

src/conversion.rs Outdated Show resolved Hide resolved
@Icxolu
Copy link
Contributor Author

Icxolu commented Aug 6, 2024

You're right, this has an immediate effect, because of the macro specialization. I totally forgot that 😅 , I will add a migration entry 👍

@davidhewitt
Copy link
Member

I mean I guess we can probably add another layer to the autoderef specialization to prefer IntoPy only for these types, and emit a deprecation warning unless they wrap in a PyBytes::new or similar 😂

@Icxolu
Copy link
Contributor Author

Icxolu commented Aug 6, 2024

I kind of hope that it won't be necessary 🙃 , since it's also fairly easy to work around, if that conversion is not desired.

@Icxolu Icxolu force-pushed the pybytes-specialization branch from b8978fc to e1fe5ec Compare August 6, 2024 19:01
@Icxolu
Copy link
Contributor Author

Icxolu commented Aug 15, 2024

Superseded by #4442

@Icxolu Icxolu closed this Aug 15, 2024
@Icxolu Icxolu deleted the pybytes-specialization branch August 15, 2024 20:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants