Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Stream.readuntil with non-bytes buffer objects #117653

Closed
wants to merge 1 commit into from

Conversation

bmerry
Copy link
Contributor

@bmerry bmerry commented Apr 8, 2024

PR #16429 introduced support for an iterable of separators in Stream.readuntil. Since bytes-like types are themselves iterable, this can introduce ambiguities in deciding whether the argument is an iterator of separators or a singleton separator. In #16429, only 'bytes' was considered a singleton, but this will break code that passes other buffer object types.

The Python library docs don't indicate what separator types were permitted in Python <=3.12, but comments in typeshed indicate that it would work with types that implement the buffer protocol and provide a len(). To keep those cases working the way they did before, I've changed the detection logic to consider any instance of collections.abc.Buffer as a singleton separator.

There may still be corner cases where this doesn't do what the user wants e.g. a numpy array of byte strings will implement the buffer protocol and hence be treated as a singleton; but at least those corner cases should behave the same in 3.13 as they did in 3.12.

Relates to #81322.

PR python#16429 introduced support for an iterable of separators in
Stream.readuntil. Since bytes-like types are themselves iterable, this
can introduce ambiguities in deciding whether the argument is an
iterator of separators or a singleton separator. In python#16429, only 'bytes'
was considered a singleton, but this will break code that passes other
buffer object types.

The Python library docs don't indicate what separator types were
permitted in Python <=3.12, but comments in typeshed indicate that it
would work with types that implement the buffer protocol and provide a
len(). To keep those cases working the way they did before, I've changed
the detection logic to consider any instance of collections.abc.Buffer
as a singleton separator.

There may still be corner cases where this doesn't do what the user
wants e.g. a numpy array of byte strings will implement the buffer
protocol and hence be treated as a singleton; but at least those corner
cases should behave the same in 3.13 as they did in 3.12.

Relates to python#81322.
@gvanrossum
Copy link
Member

Let's hold off until we've decided that we really need a general iterable, rather than just a tuple (see my comment on the issue).

@bmerry
Copy link
Contributor Author

bmerry commented Apr 8, 2024

Let's hold off until we've decided that we really need a general iterable, rather than just a tuple (see my comment on the issue).

Agreed. I'll close this and raise a new PR to only support tuples.

@bmerry bmerry closed this Apr 8, 2024
@gvanrossum
Copy link
Member

Also, I'm sorry -- when I first saw the PR I immediately had the thought "why allow arbitrary iterators" but suppressed it because there was no reason in the code itself to disallow it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants