tecki
Contributor

@tecki tecki commented Sep 10, 2025

Within asyncio, several write, send and sendto methods contained a check whether the data parameter is one of the right types, complaining that the data should be bytes-like.

This error message is very misleading if one hands over a bytes-like object that happens not to be one of the right types.

Usually, these kinds of tests are not necessary at all, as the code the data is handed to will complain itself if the type is wrong. But in the case of the mentioned methods, the data may be put into a buffer if the data cannot be sent immediately. Any type errors will only be raised much later, when the buffer is finally sent.
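
To illustrate the delayed failure, here is a minimal sketch. `ToyTransport` and its `flush` method are hypothetical names made up for this example; this is not asyncio's implementation, only a model of "queue now, send later".

```python
class ToyTransport:
    """Hypothetical stand-in for a transport that buffers unsent data."""

    def __init__(self):
        self._buffer = []

    def write(self, data):
        # No type check: the data is only queued for later.
        self._buffer.append(data)

    def flush(self):
        # The buffered chunks are combined only now, so a wrong type
        # is detected here, far away from the faulty write() call.
        return b"".join(self._buffer)


transport = ToyTransport()
transport.write(b"ok")
transport.write("not bytes")  # accepted silently

try:
    transport.flush()
except TypeError as exc:
    late_error = exc
else:
    late_error = None
```

The traceback of `late_error` points at `flush()`, not at the `write()` call that passed the wrong type, which is why an early check is useful in the buffering case.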

Interestingly, the test is incorrect as well: any memoryview is considered right, although it may be passed to functions that cannot deal with non-contiguous memoryviews, in which case all the problems that the test should protect against reappear.
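
A small sketch of that case. The tuple `ACCEPTED_TYPES` is my assumption about the shape of the check, and the socket pair is only there to have something that consumes the data.

```python
import socket

# Assumed shape of the type check; the exact tuple is an assumption.
ACCEPTED_TYPES = (bytes, bytearray, memoryview)

# Every second byte of the source: a strided, non-contiguous view.
view = memoryview(b"abcdef")[::2]

passes_check = isinstance(view, ACCEPTED_TYPES)
is_contiguous = view.contiguous

left, right = socket.socketpair()
try:
    # The socket layer needs a contiguous buffer and rejects the view.
    left.send(view)
except (BufferError, TypeError) as exc:
    send_error = exc
else:
    send_error = None
finally:
    left.close()
    right.close()
```

So the view passes the `isinstance` test but is still rejected once it reaches the socket.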

There are several ways to go forward:

  • Update the documentation to reflect the fact that not all bytes-like objects can be passed, but only specific ones. This would be unfortunate, as the underlying code can actually deal with all bytes-like data.
  • Actually test whether the data is bytes-like. This is unfortunately not easy. The correct test would be to check whether the data can be parsed by PyArg_Parse with a y* format. I am not aware of a simple test like this.
  • Remove the test. In this case one should ensure that only bytes-like data is put into the buffers if the data cannot be sent immediately.
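
For the second option, a rough approximation in pure Python could look like the sketch below. The helper name `looks_bytes_like` is hypothetical, and I am assuming that "exports a C-contiguous buffer" is close enough to what the y* format accepts; it is not guaranteed to be an exact equivalent.

```python
import array


def looks_bytes_like(obj):
    """Approximate test: does obj export a C-contiguous buffer?"""
    try:
        # memoryview() raises TypeError for objects without buffer support.
        with memoryview(obj) as view:
            return view.c_contiguous
    except TypeError:
        return False


checks = {
    "bytes": looks_bytes_like(b"abc"),
    "bytearray": looks_bytes_like(bytearray(b"abc")),
    "array": looks_bytes_like(array.array("B", [1, 2, 3])),
    "str": looks_bytes_like("abc"),
    "strided view": looks_bytes_like(memoryview(b"abcdef")[::2]),
}
```

This creates a temporary memoryview on every call, so it is not free, which is part of why I did not go for this option.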

For simplicity I opted for the last option. One should note another problem here: if we accept objects like memoryview or bytearray (which we currently do), the user may modify the data after the fact, which will lead to weird, unexpected behavior. This could be mitigated by always copying the data. This is done in some of the modified methods, but not all, most likely for performance reasons. I think it would be beneficial to deal with this problem in a systematic way, but this is beyond the scope of this patch.
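
A minimal sketch of the mutation problem, using a plain list as a stand-in for a transport's internal buffer (the names `pending` and `payload` are only for illustration):

```python
# Without a copy: the buffer holds a reference to the caller's object.
pending = []
payload = bytearray(b"hello")
pending.append(payload)
payload[:] = b"HELLO"  # the caller reuses the bytearray before the send
sent_without_copy = b"".join(pending)

# With a copy: bytes() takes a snapshot when the data is queued.
pending = []
payload = bytearray(b"hello")
pending.append(bytes(payload))
payload[:] = b"HELLO"
sent_with_copy = b"".join(pending)
```

Without the copy the modified data is sent; with the copy the data is sent as it was at the time of the call, at the price of one extra copy per write.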

@picnixz
Member

picnixz commented Sep 10, 2025

Please, avoid opening a PR if no decision has been reached. Kumar said that the docs indicated what should be done. IMO, it's a garbage-in / garbage-out case, although the garbage-out happens later.

@tecki
Contributor Author

tecki commented Sep 10, 2025

@picnixz I followed the instructions at https://devguide.python.org/, I am currently at step 7 of that guide. Should I have acted differently, please add that to the developer guide.

@picnixz
Member

picnixz commented Sep 10, 2025

> Should I have acted differently, please add that to the developer guide.

I don't think we should jump to a PR before a decision has been discussed and reached. We were still in the discussion phase. It's actually written in https://devguide.python.org/getting-started/fixing-issues/:

> It could also still be open because no consensus has been reached on how to fix the issue, although having a pull request that proposes a fix can turn the tides of the discussion to help bring it to a close. Regardless of why the issue is open, you can also always provide useful comments if you do attempt a fix, successful or not.

Since the discussion was only at an early stage, I wouldn't have considered a PR yet.

Note: the first page of the devguide is more for "reminders" & quick refs, but the exact workflow requires reading many sections (which we could mention in the introduction).

@tecki
Contributor Author

tecki commented Sep 10, 2025

@picnixz Well, the docs that you cited are for a different situation, where there is an existing issue that somebody else wants to fix. The situation here is completely different: I found the problem and fixed it. Then I wanted to present the fix as a PR, so I followed the guideline, which told me I have to write an issue, which I did.

I consider writing the issue a complete waste of time. It just doubled my work: I had to describe in an abstract way the stuff that I wanted to do in a concrete PR. This is not easy for me. I am a programmer, English is not my native language, and discussing code is so much easier than discussing in a void.

This issue stage is also uncommon in the open source programming world; in all other projects I have worked on, you just submit a PR and discuss the code. But at least it is much better than it used to be: for CPython one had to jump through many hoops to get anything done.

So given the course of events, how should I have acted?

@picnixz
Member

picnixz commented Sep 10, 2025

In general, when there is a behavior change, we should open an issue and ask how it should be fixed. It doesn't matter whether one has a fix or not for the specific issue because it might not be considered an issue. Then wait for the maintainer to acknowledge the issue. As you saw on the issue, we decided to do something else.
