bpo-43950: handle wide unicode characters in tracebacks #28150
Someone should try this on Windows with "Courier New", as Terry mentioned in the issue. It would be good to check with a bunch of fonts to see what we are up against.
This PR is stale because it has been open for 30 days with no activity. |
Wow, this is an old PR but @pablogsal @ammaraskar are we interested in reviving it? I think I can rebase and get it ready for 3.12. |
I am. I was thinking of reviving it just last week, so you managed to read my mind! 😁
Hahahaha, perfection! I'll try to get it up to date over this week, and will ping you again for the review 💯 |
Sounds good, happy to re-review when you update :) Sorry for the slow follow-ups around PEP657 stuff, I've been a little busy and inactive recently :( |
I've pushed the initial revision where we handle everything using the
I guess looking at what other compilers do by default might also help us gain some insight (I recall @pablogsal mentioning rustc; it might be worth a shot to check what they do to decide when to show carets).
CC: @cfbolz (would also love to hear your feedback on the unicode related parts) |
I'll take a look at the code! "Amusingly" the width doesn't line up in my browser's font: (Looks fantastic in my editor and my terminal though) Personal opinions on some of your questions:
The code looks reasonable to me. I've been thinking about it a bit more, and it would be certainly more annoying to implement, but I am wondering whether it wouldn't be an option to use unicode chars 0x3000 (IDEOGRAPHIC SPACE) and 0xFF3E (FULLWIDTH CIRCUMFLEX ACCENT) to do the spaces/underlines under wide chars. Because even if in a font the width of two ascii spaces is not the same as a fullwidth char, the font should at least be consistent with itself and have the fullwidth space be the same width. Example: Here are the chars:
Screenshot in my Firefox: (in my terminal all the 'a' line up, so it doesn't matter there). |
(Unfortunately it already breaks down in Chrome on my laptop, where the book emoji is even wider) |
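For anyone following along, here is a rough sketch of the fullwidth idea above: emit U+3000 / U+FF3E under wide characters and plain ASCII space / caret under everything else. The helper names (`is_wide`, `fullwidth_caret_line`) are made up for illustration, and treating only the "W" and "F" classes as wide is an assumption; this is not the code in the PR.

```python
import unicodedata

IDEOGRAPHIC_SPACE = "\u3000"  # fullwidth space, goes under unhighlighted wide chars
FULLWIDTH_CARET = "\uff3e"    # FULLWIDTH CIRCUMFLEX ACCENT, goes under highlighted wide chars


def is_wide(ch):
    # Assumption: only East Asian "Wide" and "Fullwidth" characters take two columns.
    return unicodedata.east_asian_width(ch) in ("W", "F")


def fullwidth_caret_line(line, start, end):
    """Build an underline for line[start:end] (character offsets, hypothetical helper).

    Each source character gets exactly one underline character of the same
    width class, so the font only has to be consistent with itself.
    """
    out = []
    for i, ch in enumerate(line[:end]):
        highlighted = i >= start
        if is_wide(ch):
            out.append(FULLWIDTH_CARET if highlighted else IDEOGRAPHIC_SPACE)
        else:
            out.append("^" if highlighted else " ")
    return "".join(out)


if __name__ == "__main__":
    source = "日本 + foo"
    print(source)
    print(fullwidth_caret_line(source, 5, 8))
```

As noted in the comment above, this still depends on the renderer: emoji and other glyphs that a font draws wider than a fullwidth cell would still be misaligned.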
@isidentical let's push this forward. Could you rebase the PR? |
yay, i'm excited for this to land :-) |
…nGH-28150) (cherry picked from commit 78e6d72) Co-authored-by: Batuhan Taskaya <isidentical@gmail.com>
GH-111345 is a backport of this pull request to the 3.11 branch. |
GH-111346 is a backport of this pull request to the 3.12 branch. |
…nGH-28150) (cherry picked from commit 78e6d72) Co-authored-by: Batuhan Taskaya <isidentical@gmail.com> Signed-off-by: Pablo Galindo <pablogsal@gmail.com>
GH-111373 is a backport of this pull request to the 3.11 branch. |
Do not merge yet, only for discussion.
This PR adds support for the existing traceback machinery to work with wide unicode characters when dumping to the terminal. It uses unicodedata.east_asian_width to classify individual unicode characters.

https://bugs.python.org/issue43950
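To make the approach concrete, below is a minimal sketch of how unicodedata.east_asian_width can be used to line carets up under wide characters. It is not the code in this PR: `display_width` and `caret_line` are hypothetical helpers, and counting only the "W" and "F" classes as two columns (everything else as one) is a simplifying assumption that ignores combining characters and zero-width code points.

```python
import unicodedata


def display_width(text):
    """Approximate the number of terminal columns `text` occupies.

    Assumption: East Asian "Wide" (W) and "Fullwidth" (F) characters take
    two columns; every other character takes one.
    """
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def caret_line(line, start, end):
    """Return a line of carets underlining line[start:end] (character offsets)."""
    padding = " " * display_width(line[:start])
    carets = "^" * display_width(line[start:end])
    return padding + carets


if __name__ == "__main__":
    source = 'x = "日本" + foo'
    print(source)
    print(caret_line(source, 11, 14))  # underline `foo`
```

The padding is computed from the display width of the prefix rather than its length, which is what keeps the carets under `foo` even though the two wide characters before it each take two columns.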