Description
Bug report
Bug description:
The implementation of _pyrepl.input.KeymapTranslator has a check for input with Unicode category "C". This code has been present since the initial commit of the new REPL in #111567. However, no such category is ever returned by unicodedata.category() because there is no "C" entry in the list of category names¹:
>>> import sys, unicodedata
>>> any(unicodedata.category(chr(n)) == "C" for n in range(sys.maxunicode + 1)) # Python 3.12
False
I'm not familiar enough with _pyrepl to know what the implications of this always-false predicate are, but I do know that the block in question is effectively dead code because of it.
I think this is meant to be a .startswith() check for the Other category identified by UAX #44, i.e. the union of Cc | Cf | Cs | Co | Cn, in line with other usage in _pyrepl.reader. I'll open a PR for that.
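To illustrate the difference between the two checks, here is a minimal sketch. The helper names `is_other_exact` and `is_other_prefix` and the sample code points are hypothetical, chosen only for this example; this is not the actual _pyrepl code, just the two predicates in isolation.

```python
import unicodedata

def is_other_exact(ch: str) -> bool:
    # Hypothetical helper: the equality check described above.
    # category() returns two-letter codes, so this never matches.
    return unicodedata.category(ch) == "C"

def is_other_prefix(ch: str) -> bool:
    # Hypothetical helper: the proposed prefix check, which matches
    # every subcategory of Other (Cc, Cf, Cs, Co, Cn).
    return unicodedata.category(ch).startswith("C")

# One sample code point per subcategory of Other.
samples = {
    "Cc": "\x1b",        # ESCAPE (control)
    "Cf": "\u200b",      # ZERO WIDTH SPACE (format)
    "Cs": "\ud800",      # lone high surrogate
    "Co": "\ue000",      # first private-use code point
    "Cn": "\U0010ffff",  # noncharacter (unassigned)
}

for expected, ch in samples.items():
    print(
        f"U+{ord(ch):04X}",
        unicodedata.category(ch),
        is_other_exact(ch),
        is_other_prefix(ch),
    )
```

For each sample, the equality check reports False and the prefix check reports True, while an ordinary letter such as "a" (category Ll) fails both.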
¹ the list of category names is hardcoded in makeunicodedata.py rather than derived from UCD, which does define C = Cc | Cf | Cs | Co | Cn in PropertyValueAliases.txt. I don't think there's any version of the unicodedata API that would support returning "C" here, though. Just being a little obsessive.
CPython versions tested on:
CPython main branch
Operating systems tested on:
Linux