Consider making ConPTY and Windows Terminal treat all ambiguous-width characters as 1 cell instead of asking the font #2066
Comments
From @egmontkob's note above, and from seeing how some other terminal emulators do this, it looks like this might be the correct choice. There are some affordances in certain projects for supporting "legacy" ambiguous character widths, but by and large terminals have agreed that they should be a single cell wide.
@DHowett-MSFT how does this play with emoji? Aren't they usually ambiguous, but actually double wide? |
This is a good approach. It seems to solve part of the Unicode rendering issue, which might fix Chinese/double-width character issues and quite a lot of emoji issues. But I wonder if it only solves some of the issues, as Unicode 9 will soon become a headache.
VS Code and hyper.js use xterm.js as their terminal engine, and they are working on similar Unicode issues. Also, iTerm2, a popular terminal app on macOS, made a lot of changes years back to support Unicode. Since terminal/console/WSL is a system app, I hope a more mature, comprehensive solution is discussed, proposed, reviewed and implemented to allow further extension. Current Unicode support is partial and amounts to bug fixes only.
@DHowett-MSFT
|
No. |
@DHowett-MSFT |
Codepages have proven, almost without exception, to be an unmitigable disaster. They complicate the text buffer, they complicate the handling of DBCS characters, they provide little to no value in modern UTF-8-aware applications. The codepage stuff will stay on the far side of ConPTY and be rendered to the terminal in nice good and clean UTF-8. 😄 |
@DHowett-MSFT Well, what I mean is that some East Asian console applications may assume that a character's width follows its code page byte count, so making these characters single-width may break those applications (though you can still throw them into ConHost V1). Another issue may include:
|
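To make the byte-count concern above concrete, here is a minimal sketch. It assumes a legacy application equates cell width with the number of bytes a character occupies in a DBCS code page; Python's `gbk` codec stands in for such a code page, and `legacy_dbcs_width` is a hypothetical helper, not anything from conhost.

```python
import unicodedata


def legacy_dbcs_width(ch: str, codepage: str = "gbk") -> int:
    """Cells a legacy DBCS app might assume: one cell per encoded byte.

    Hypothetical helper for illustration only.
    """
    return len(ch.encode(codepage))


# Compare the Unicode East_Asian_Width class with the byte-count guess.
for ch in "a±中":
    eaw = unicodedata.east_asian_width(ch)
    print(repr(ch), eaw, legacy_dbcs_width(ch))
```

Under this assumption, a character such as `±` is ambiguous-width in Unicode but takes two bytes in the code page, so an application that counts bytes would expect two cells where a narrow-ambiguous terminal draws one.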
I get that, but to quote the initial post that spawned this issue:
|
@DHowett-MSFT
This is somehow like how UAX #50 works: Analyze runs first, then apply |
From Egmont Koblinger:

> In terminal emulation, apps have to be able to print something and keep track of the cursor, whereas they by design have no idea of the font being used. In many terminals the font can also be changed runtime and it's absolutely not feasible to then rearrange the cells. In some other cases there is no font at all (e.g. the libvterm headless terminal emulation library, or a detached screen/tmux), or there are multiple fonts at once (a screen/tmux attached from multiple graphical emulators).
>
> The only way to do that is via some external agreement on the number of cells, which is typically the Unicode EastAsianWidth, often accessed via wcwidth(). It's not perfect (changes through Unicode versions, has ambiguous characters, etc.) but is still the best we have.
>
> glibc's wcwidth() reports 1 for ambiguous width characters, so the de facto standard is that in terminals they are narrow.
>
> If the glyph is wider then the terminal has to figure out what to do. It could crop it (newer versions of Konsole, as far as I know), overflow to the right (VTE), shrink it (Kitty I believe does this), etc.

See Also:
https://bugzilla.gnome.org/show_bug.cgi?id=767529
https://gitlab.freedesktop.org/terminal-wg/specifications/issues/9
https://www.unicode.org/reports/tr11/tr11-34.html

Salient point from proposed update to Unicode Standard Annex #11:

> Note: The East_Asian_Width property is not intended for use by modern terminal emulators without appropriate tailoring on a case-by-case basis.

Fixes #2066
Fixes #2375
Related to #900
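As a rough illustration of the "external agreement" described above, the following sketch derives a cell count from the Unicode East_Asian_Width property and treats ambiguous characters as narrow by default. It is not the Windows Terminal or glibc implementation: `cell_width` and `text_width` are hypothetical names, and the sketch deliberately ignores combining marks, control characters and emoji sequences, which a real `wcwidth()` has to handle.

```python
import unicodedata


def cell_width(ch: str, ambiguous_width: int = 1) -> int:
    """Return the number of terminal cells for one code point.

    Simplified sketch: only the East_Asian_Width class is consulted.
    """
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):   # Wide and Fullwidth take two cells
        return 2
    if eaw == "A":          # Ambiguous: narrow unless the caller overrides
        return ambiguous_width
    return 1                # Na, H, N: one cell


def text_width(text: str, ambiguous_width: int = 1) -> int:
    """Sum the per-code-point cell widths, independent of any font."""
    return sum(cell_width(ch, ambiguous_width) for ch in text)


print(text_width("a中±"))                     # ambiguous treated as narrow
print(text_width("a中±", ambiguous_width=2))  # "legacy" wide-ambiguous mode
```

The `ambiguous_width` parameter is one way to model the "legacy" affordance mentioned earlier in the thread. Because the width depends only on the code point, the application and the terminal can agree on it without knowing the font.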
🎉 This issue was addressed in #2928, which has now been successfully released.
Originally posted by @egmontkob in #2049 (comment)