This emoji is measured as having a width of 1, but it actually occupies a width of 2, which causes rjust() to misalign it. Python also miscounts when zero-width characters, ZWJ sequences, and variation selectors are used. Python fails to get this measurement "right" for any kind of display device at all, but I think it goes without saying that the only purpose of this function is for monospace character displays such as terminals.
I believe the built-in format string alignment functions, str.rjust, str.ljust, str.center, and textwrap.wrap, should measure these Unicode characters by their printable width, and not just by the "number of code points".
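To make the request concrete, here is a minimal sketch of what a width-aware right-justify could look like using only the standard library's unicodedata module. The names `cell_width` and `rjust_cells` are hypothetical and are not part of Python or of the wcwidth library. The sketch makes simplifying assumptions: wide and fullwidth characters take two cells, combining marks and format characters take none, and the code point following a ZWJ is treated as merged into the previous glyph. It does not widen a narrow emoji followed by a variation selector, and it is not a replacement for a real wcswidth().

```python
import unicodedata

ZWJ = "\u200d"  # ZERO WIDTH JOINER


def cell_width(text: str) -> int:
    """Approximate the number of terminal cells `text` occupies.

    Hypothetical helper for illustration only; real terminals differ.
    """
    width = 0
    joined = False  # True when the previous code point was a ZWJ
    for ch in text:
        if ch == ZWJ:
            joined = True
            continue
        if joined:
            # Assumption: a code point joined by ZWJ is drawn as part of
            # the preceding glyph and adds no cells of its own.
            joined = False
            continue
        if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
            # Combining marks, variation selectors, and format characters
            # are treated as zero width.
            continue
        if unicodedata.east_asian_width(ch) in ("W", "F"):
            width += 2  # wide or fullwidth: two cells
        else:
            width += 1
    return width


def rjust_cells(text: str, width: int, fillchar: str = " ") -> str:
    """Right-justify `text` in `width` terminal cells rather than code points."""
    padding = max(0, width - cell_width(text))
    return fillchar * padding + text
```

With this sketch, `rjust_cells("\U0001F40D", 5)` pads with three spaces instead of the four that `str.rjust` would use, so the result fills five cells on a terminal that draws the emoji two cells wide.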
The built-in REPL also gets this wrong in its readline-like input handling. It becomes impossible to edit strings containing these characters: the cursor position and the resulting input are unpredictable and disorienting.
IPython, which uses wcwidth, does a better job and should fare better with #91 closed, but nobody should be required to adopt a large project like IPython as their REPL just to solve this.
It would be good to experiment with the source code of Python to see which parts of the codebase need changing. See #93 for the basic high-level functions.
Better still would be to draft and submit a PEP.
I have since found a few Python bug reports, patches, and proposals for wcwidth in the standard library, linked below with a small number of choice quotes. The last issue (56777) got the closest, but it shows a lot of disagreement about how to interpret the Unicode specification, and it exposes the fundamental problem that wrapping any OS-provided wcwidth(3) or wcswidth(3) would produce inconsistent results. Some people fundamentally misunderstand the difference between fixed-width and variable-width fonts, and others misunderstand the need for wcswidth() instead of, or in addition to, wcwidth().
Anyway, this wcwidth library is now used in many applications, and we have authored a clear specification and a terminal compliance assessment utility, neither of which was previously available. I think these offerings would overcome any of the contrary arguments given previously.
There is no need to be perfectly correct for all terminals, but being mostly correct for most languages in the most popular terminals is preferable!
Other functions I miss a lot are wcwidth() and wcswidth(). These functions return the real width (read, cells length in screen) for unicode strings. [..] I think Python could benefit from having these functions in the standard library.
Judging by your post your English probably is good enough to write a PEP [..] However, I doubt a PEP would be necessary.
Can't we expose wcswidth() as locale.strwidth() with a recipe explaining how to use unicodedata to get a "correct" result? At least until everyone implements correctly Unicode and Unicode stops evolving? :-)
I think this function would be very useful in many parts of interpreter core and standard library. From displaying tracebacks to formatting helps. Otherwise we are doomed to implement imperfect variants in multiple places.
Since we failed to agree on this feature, I close the issue.
I close the issue as WONTFIX.
Like P1868R2, "🦄 width: clarifying units of width and precision in std::format", Published Proposal, 2020-02-11 https://fmt.dev/papers/p1868.html
Why can't Python just do the right thing? For example, here it gets it wrong,
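The mismatch can be reproduced with a short snippet. The emoji from the original report is not preserved in this text, so U+1F40D SNAKE and a woman + ZWJ + rocket sequence are used here as stand-ins; the choice of characters is an assumption, not the author's example.

```python
import unicodedata

snake = "\U0001F40D"  # stand-in emoji (assumption, not from the report)

# Python measures strings in code points, so the emoji has length 1 ...
print(len(snake))  # 1

# ... even though Unicode classifies it as East Asian Wide, which
# terminals render in two cells.
print(unicodedata.east_asian_width(snake))  # W

# rjust(5) therefore adds 4 spaces. On a terminal that draws the emoji
# two cells wide, the result spans 4 + 2 = 6 cells instead of 5.
padded = snake.rjust(5)
print(repr(padded))

# A ZWJ sequence is several code points, yet a terminal that supports it
# draws a single glyph, so the code point count overstates the width.
astronaut = "\U0001F469\u200D\U0001F680"  # woman + ZWJ + rocket
print(len(astronaut))  # 3
```

Printing `padded` next to a right-justified ASCII string of the same requested width shows the emoji line sticking out by one column on such a terminal.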