Unexpected line breaks occur when Completions contains Chinese characters. #713

Closed
kagurazakayashi opened this issue Dec 25, 2024 · 5 comments
Labels
bug Something isn't working

@kagurazakayashi

Unexpected line breaks occur when completions contain Chinese characters.

It works properly when the text contains only English:

[Screenshot 1]

If the text contains Chinese, unexpected line breaks appear:

[Screenshot 2]

Misalignment occurs after pressing Tab to complete, or after continuing to type:

[Screenshot 3]

Pressing Backspace deletes many characters at once, or leaves half a Chinese character behind:

[Screenshot 4]

Repeated editing on the line can cause the command that is actually submitted to differ from what is displayed:

[Screenshot 5]

The problem started to appear in versions with the Right=Insert Suggestion prompt on the right side.

@chrisant996
Owner

> The problem started to appear in versions with the Right=Insert Suggestion prompt on the right side.

Thank you for reporting the problem. I would like to fix it, but I'll need your help.



First, it is necessary to clarify some things:

That hint text was added in v1.5.14, more than one year ago.

But in v1.7.0, the hint text changed from "Accept Suggestion" to "Insert Suggestion".

And in v1.7.0, big changes were made to the algorithm that tries to predict the width of characters in the terminal.

I think what you mean is that starting in v1.7.0, the width of Chinese characters is being predicted incorrectly.

Unfortunately, the Unicode standard has a large range of Chinese characters which have different widths depending on what system code page is used. They are called "Ambiguous Width" characters in the Unicode standard.

When the system code page is not a CJK code page, then the new algorithm is very accurate. But I can't test how the CJK code pages behave for the large range of "Ambiguous Width" Chinese characters, because my operating system language does not use those CJK code pages (CJK refers to Chinese, Japanese, Korean).
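
As a rough illustration of the ambiguity (not Clink's actual algorithm), Python's `unicodedata.east_asian_width` exposes the Unicode East Asian Width property. The sketch below assumes a simple rule: Wide (`W`) and Fullwidth (`F`) characters take 2 cells, and Ambiguous (`A`) characters take 2 cells only when a CJK code page is active. The names `predicted_width` and `text_width` are hypothetical.

```python
import unicodedata


def predicted_width(ch: str, cjk_code_page: bool) -> int:
    """Rough cell width of one character (illustrative rule, not Clink's)."""
    if unicodedata.combining(ch):
        return 0  # combining marks occupy no cell of their own
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):
        return 2  # Wide / Fullwidth: always two cells
    if eaw == "A":
        # Ambiguous: assumed to depend on whether a CJK code page is active
        return 2 if cjk_code_page else 1
    return 1


def text_width(text: str, cjk_code_page: bool) -> int:
    """Sum of the predicted cell widths of every character in text."""
    return sum(predicted_width(ch, cjk_code_page) for ch in text)


if __name__ == "__main__":
    for ch in ("a", "\u00a7", "支"):
        print(repr(ch), unicodedata.east_asian_width(ch),
              predicted_width(ch, False), predicted_width(ch, True))
```

If a terminal draws a character wider than the width predicted for it, every later cursor position on the line is off by the difference, which matches the misalignment and Backspace symptoms described above.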



So, I need to ask some questions:

  • What is your system code page? Can you please run chcp and share the output from running it?
  • Does the problem go away if you run chcp 65001?
  • Would you be willing to run a special program to test the character widths? If yes, then I will share a program and instructions for how to run it.
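
For anyone collecting the same information from several machines, here is a minimal sketch of interpreting the `chcp` output. It assumes the code page number is the last run of digits in the output (the surrounding text is localized), and it treats 932, 936, 949 and 950 as the CJK code pages. The helper names are hypothetical.

```python
import re

# Windows ANSI/OEM code pages commonly used for CJK system locales.
CJK_CODE_PAGES = {
    932: "Japanese (Shift-JIS)",
    936: "Simplified Chinese (GBK)",
    949: "Korean",
    950: "Traditional Chinese (Big5)",
}


def parse_chcp(output: str) -> int:
    """Extract the code page number from localized `chcp` output."""
    match = re.search(r"(\d+)\D*$", output.strip())
    if match is None:
        raise ValueError(f"no code page number found in {output!r}")
    return int(match.group(1))


def is_cjk_code_page(code_page: int) -> bool:
    """True when the code page is one of the CJK code pages listed above."""
    return code_page in CJK_CODE_PAGES


if __name__ == "__main__":
    for sample in ("Active code page: 936", "Active code page: 65001"):
        cp = parse_chcp(sample)
        print(cp, "CJK" if is_cjk_code_page(cp) else "not CJK")
```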

@kagurazakayashi
Author

kagurazakayashi commented Dec 25, 2024

Sorry, I made an error in the description of the version. I retested it:

Test setup:

  • Windows 10 and 11, system language: Simplified Chinese, active code page: 936
  • The command used for testing is: git commit --date %GIT_COMMITTER_DATE% -m "支持实时回显当前操作结果,提供进度条;多处优化和修复"
  1. Type: git c
  2. Press Complete, then press Backspace

Test results (Active code page: 936):

  • The issue occurs on my Windows 10 computer (Clink version 1.7.6).
  • Clink on my Windows 11 computer has not been updated yet (Clink version 1.7.5), and the issue does not occur there.

[Screenshot 1]

[Screenshot 2]

Test results (Active code page: 65001):

  • Same as above.

[Screenshot 3]

I tried upgrading Clink on the Windows 11 computer (1.7.5 -> 1.7.6):

  • No issues after the upgrade.
  • However, the problem starts to occur once a new command window is opened.

[Screenshot 4]

I am willing to run a special program to test character width.

@chrisant996
Owner

chrisant996 commented Dec 25, 2024

@kagurazakayashi Thank you very much for the updated information -- it is extremely useful, and I'm able to reproduce the problem on my English computer!

The problem was introduced by this commit:
33880df8 -- Pick up Win8.1 wcwidth updates from wcwidth-verifier. (2024-11-19).

I see the mistake. Fortunately, it is a simple logic mistake and should be easy to correct. When running on Win 10 or 11 in a terminal other than Windows Terminal, it accidentally skips the measurement logic for codepoints U+2E80 through U+A4CF and for codepoints U+AC00 through U+D7A3.
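
To check whether a given command line is affected, here is a small sketch that tests characters against the two codepoint ranges named above. It only reproduces the range check; it is not the code from the commit, and the helper names are hypothetical.

```python
# Codepoint ranges for which the measurement logic was skipped (inclusive).
AFFECTED_RANGES = (
    (0x2E80, 0xA4CF),
    (0xAC00, 0xD7A3),
)


def in_affected_range(codepoint: int) -> bool:
    """True when the codepoint lies inside one of the affected ranges."""
    return any(lo <= codepoint <= hi for lo, hi in AFFECTED_RANGES)


def affected_chars(text: str) -> list:
    """Return the characters of text that fall inside the affected ranges."""
    return [ch for ch in text if in_affected_range(ord(ch))]


if __name__ == "__main__":
    sample = 'git commit -m "支持实时回显"'
    for ch in affected_chars(sample):
        print(f"U+{ord(ch):04X} {ch}")
```

Every Chinese character in the test command from the report falls inside the first range, which is consistent with the line breaking as soon as the completion contains Chinese text.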

I'll make a fix today or tomorrow. It may take some time because I intend to do extensive testing to confirm the fix on several different terminal programs and OS versions (and also track down how my original testing missed the case).

In the meantime, you can avoid the problem in either of two ways:

  1. Downgrade to v1.7.5 -- Download the v1.7.5 .zip file, extract the *.exe and *.dll files into your Clink program directory, and don't update again until v1.7.7 is available.
  2. Or use Windows Terminal, which avoids the problem.

@chrisant996
Owner

> and also track down how my original testing missed the case

I tracked down how my testing missed it:

My wcwidth-verifier project did not fully automatically configure itself. And so when running in terminals other than Windows Terminal, it accidentally always used the full emoji-enabled width measurement function, unless specifically told to use the function optimized for other terminals.

So if I missed adding the right flag, then it tested the wrong function.

I'm updating the wcwidth-verifier program to fully auto-configure itself by default, to avoid similar problems in the future.
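
A minimal sketch of the "auto-configure by default, explicit flag only as an override" idea. This is not the wcwidth-verifier code: the mode names `emoji` and `classic` and the function `pick_width_mode` are hypothetical, and detecting Windows Terminal through the `WT_SESSION` environment variable is an assumption about how detection could work.

```python
import os
from typing import Mapping, Optional

VALID_MODES = ("emoji", "classic")  # hypothetical mode names


def pick_width_mode(env: Mapping[str, str],
                    override: Optional[str] = None) -> str:
    """Choose which width measurement function to exercise.

    An explicit override wins; otherwise the mode is detected from the
    environment, so forgetting a flag cannot silently test the wrong function.
    """
    if override is not None:
        if override not in VALID_MODES:
            raise ValueError(f"unknown mode: {override!r}")
        return override
    # Assumption: Windows Terminal sets WT_SESSION for its child processes.
    return "emoji" if env.get("WT_SESSION") else "classic"


if __name__ == "__main__":
    print(pick_width_mode(os.environ))
```

The point of the design is that the default path does the detection, and the flag exists only to force a specific mode on purpose.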

chrisant996 added the bug label on Dec 26, 2024
@chrisant996
Owner

@kagurazakayashi v1.7.7 has been published with the fix.

Thank you again for reporting the issue, and for providing such thorough and useful troubleshooting information!!
