Format the hex codes in Unicode/hex escape sequences (\U, \u, \x) in string literals
#2067
Comments
I agree that Black should make this consistent but have no real view on which direction the consistency should go. Some things that may help a decision:
None that I know of.
Most occurrences seem to use the lowercased representation, though there's not that much usage of it within the Python documentation. As for CPython's Python code, it seems to use a lowercased version more often, but I don't think it's that consistent there.
Python's repr() output:
>>> "\x1B"
'\x1b'
>>> "\u200B"
'\u200b'
>>> "\U0001F977"
'\U0001f977'
Thanks! That would make me lean towards using all lowercase in Black too.
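To make the lowercase option concrete, here is a minimal sketch of how the hex digits of \x, \u and \U escapes could be normalized. It is not Black's implementation: the helper name `lowercase_hex_escapes` is hypothetical, and it assumes its input is the source text between the quotes of a non-raw str literal. Raw strings must be skipped entirely, and bytes literals would need separate handling because \u and \U are not escapes there.

```python
import re

# Matches one backslash escape at a time. Group 1 is set only for \xHH,
# \uHHHH and \UHHHHHHHH; any other escaped character (including an escaped
# backslash) is consumed by the trailing "." and left untouched.
_ESCAPE_RE = re.compile(
    r"\\(?:(x[0-9a-fA-F]{2}|u[0-9a-fA-F]{4}|U[0-9a-fA-F]{8})|.)",
    re.DOTALL,
)


def lowercase_hex_escapes(body: str) -> str:
    """Lowercase the hex digits of x, u and U escapes in `body`.

    Hypothetical helper for illustration: `body` is assumed to be the
    source text of a non-raw str literal, without its quotes or prefix.
    """

    def fix(match: re.Match) -> str:
        escape = match.group(1)
        if escape is None:
            # Some other escape (\n, \\, \N{...}, ...): keep it as written.
            return match.group(0)
        # Keep the marker letter (x, u or U); lowercase only the digits.
        return "\\" + escape[0] + escape[1:].lower()

    return _ESCAPE_RE.sub(fix, body)


print(lowercase_hex_escapes(r"\U0001F977 \u200B \x1B"))  # \U0001f977 \u200b \x1b
print(lowercase_hex_escapes(r"\\x1B"))  # \\x1B (escaped backslash, not a hex escape)
```

Consuming every backslash pair in a single left-to-right pass is what keeps an escaped backslash followed by x1B from being mistaken for a hex escape. Switching to uppercase digits would only require replacing `.lower()` with `.upper()`.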
I think it may be relevant that Black currently formats hex literals
This also affects I'm leaning towards keeping
Also see https://docs.python.org/3/reference/lexical_analysis.html#string-and-bytes-literals for reference on how these work.
Is your feature request related to a problem? Please describe.
Currently, one can write either "\U0001f977" or "\U0001F977", which are equivalent but look different. Similarly, one can write "\u200b" or "\u200B", which are also equivalent but look different. (Do mind that \U and \u are NOT equivalent, though, unless we also want to talk about shortening "\U0000200b" to "\u200b", which I guess would make sense but is probably a separate issue.)
Right now, I'm forced to think about whether I should use uppercase or lowercase letters, as Black doesn't enforce either.
Describe the solution you'd like
I think it would make sense to have these be consistent in some way, possibly in the same way as the numeric literals, although personally I think it would make more sense to use all uppercase in the case of \U0001F977 and all lowercase in the case of \u200b. I definitely think that \u200b should not be \u200B, but I don't have a strong opinion on \U0001f977 vs \U0001F977.
Describe alternatives you've considered
Alternatives were not considered.
Additional context
None.