Passing a bytes name to uuid.uuid3() or uuid.uuid5() raises "TypeError: encoding without a string argument" due to the call to bytes(name, "utf-8"). This is a regression from Python 2, which handles this case correctly.
The current implementation makes it impossible to use these functions with a name that cannot be decoded as a valid UTF-8 string. RFC 4122 makes it clear in section 4.3 that this restriction should not be imposed:
"The concept of name and name space should be broadly construed, and not limited to textual names."
It goes on to state that the name space may define how the name is converted to bytes, leaving the developer completely out of luck if the name space has been defined by someone else:
"Convert the name to a canonical sequence of octets (as defined by the standards or conventions of its name space)"
This is reinforced by the reference implementation, which takes a void * and a length as arguments rather than any string type, and by the definition of an X.500 DN name space that allows DER-encoded names, which cannot be guaranteed to be representable as UTF-8. The uuid module documentation also states that the X.500 DN name space allows DER-encoded names.
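Until the name argument accepts bytes, one possible workaround is to apply the RFC 4122 section 4.3 steps to the raw octets directly. The sketch below is only an illustration: the helper name uuid5_from_bytes is hypothetical and not part of the uuid module, and it assumes the name space has already defined how the name maps to octets.

```python
import hashlib
import uuid


def uuid5_from_bytes(namespace: uuid.UUID, name: bytes) -> uuid.UUID:
    """Hypothetical helper: SHA-1 name-based UUID computed from raw octets."""
    # RFC 4122 section 4.3: hash the name space ID concatenated with the name.
    digest = hashlib.sha1(namespace.bytes + name).digest()
    # Passing version=5 makes uuid.UUID set the version and variant bits.
    return uuid.UUID(bytes=digest[:16], version=5)


# A name that is not valid UTF-8 is accepted, because no decoding takes place.
example = uuid5_from_bytes(uuid.NAMESPACE_X500, b"\xff\xfe\x00\x01")
print(example.version)
```

For a textual name, encoding it as UTF-8 before calling the helper should produce the same UUID as uuid.uuid5().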
Your environment
I have encountered this bug in the following Python versions:
Python 3.10.5 (tags/v3.10.5:f377153, Jun 6 2022, 16:14:13) [MSC v.1929 64 bit (AMD64)] on win32
Python 3.7.2 (tags/v3.7.2:9a3ffc0492, Dec 23 2018, 23:09:28) [MSC v.1916 64 bit (AMD64)] on win32
Python 3.6.6 (v3.6.6:4cf1f54eb7, Jun 27 2018, 03:37:03) [MSC v.1900 64 bit (AMD64)] on win32
Python 3.5.3 (default, Apr 5 2021, 09:00:41) [GCC 6.3.0 20170516] on linux
And it appears to be present in the python/cpython GitHub repository as of 2022-10-04.
These did not appear in my initial search for related issues. Those issues also do not cite the RFC text that explicitly calls out the functionality in question, so they do not appear to have been classified as a bug. The commit attached to #94709 appears to resolve it.
Bug report
Consider a name space fc48656f-2196-4866-ad70-0cf68bf80146 which defines a name as the concatenation of the byte representation of two or more UUIDs.
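The snippet that originally accompanied this description is not preserved here. A minimal reproduction sketch follows; the two UUIDs that make up the name are arbitrary values chosen for illustration, and only the name space ID comes from the report.

```python
import uuid

# Name space ID taken from the report.
NAMESPACE = uuid.UUID("fc48656f-2196-4866-ad70-0cf68bf80146")

# Two arbitrary UUIDs (illustrative assumptions, not from the report).
first = uuid.UUID("ffffffff-ffff-4fff-8fff-ffffffffffff")
second = uuid.UUID("00000000-0000-4000-8000-000000000001")

# The name is the concatenation of the two 16-byte representations.
name = first.bytes + second.bytes

try:
    result = uuid.uuid5(NAMESPACE, name)
except TypeError as exc:
    # Affected interpreters fail inside bytes(name, "utf-8").
    result = exc

print(repr(result))
```

On the affected versions the call to uuid.uuid5() fails with the TypeError. On an interpreter that already includes the change referenced in #94709, the same call should return a UUID instead.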