Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ICH no longer works correctly on lines containing wide characters #14626

Closed
j4james opened this issue Jan 3, 2023 · 5 comments · Fixed by #14650
Closed

ICH no longer works correctly on lines containing wide characters #14626

j4james opened this issue Jan 3, 2023 · 5 comments · Fixed by #14650
Labels
Area-Output Related to output processing (inserting text into buffer, retrieving buffer text, etc.) Area-VT Virtual Terminal sequence support Issue-Bug It either shouldn't be doing this or needs an investigation. Product-Conhost For issues in the Console codebase Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release. Severity-Blocking We won't ship a release like this! No-siree.

Comments

@j4james
Copy link
Collaborator

j4james commented Jan 3, 2023

Windows Terminal version

Commit 547349a

Windows build number

10.0.19044.2364

Other Software

No response

Steps to reproduce

  1. Create a build from the latest commit.
  2. Start Windows Terminal and open a WSL bash shell.
  3. Execute the following command: printf "\u3053\u3093\u306B\u3061\u306F World\r\e[@\n"

Expected Behavior

This should write out the text こんにちは World, and then insert a space at the start of the line via an ICH sequence. This is what it looks like in Preview version 1.16.2641.0:

image

Actual Behavior

When using the latest commit, the Japanese characters are erased by the ICH sequence, so all that's left is World.

image

So this looks to me like a regression, although I'm not sure when exactly it stopped working.

And note that this fails in OpenConsole as well. It's just easier to confirm that it's a recent regression in Windows Terminals because I know 1.16 was working OK.

@j4james j4james added Issue-Bug It either shouldn't be doing this or needs an investigation. Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting labels Jan 3, 2023
@j4james
Copy link
Collaborator Author

j4james commented Jan 3, 2023

As far as I can tell, the regression came from the new ROW code in a01500f.

The other thing I've noticed is that it only appears to be a problem when inserting a single space. An ICH with a parameter of 2 or more seems to work fine. I wonder if the problem is that it's trying to copy a 2-cell character on top of itself when moving one cell to the right, and in the process it destroys the source data before it can finish writing it to the target.

@zadjii-msft zadjii-msft added Product-Conhost For issues in the Console codebase Area-Output Related to output processing (inserting text into buffer, retrieving buffer text, etc.) Area-VT Virtual Terminal sequence support and removed Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting labels Jan 4, 2023
@zadjii-msft zadjii-msft added this to the Terminal v1.17 milestone Jan 4, 2023
@zadjii-msft zadjii-msft added the Severity-Blocking We won't ship a release like this! No-siree. label Jan 4, 2023
@DHowett
Copy link
Member

DHowett commented Jan 6, 2023

Thanks for digging into this @j4james! Leonard is out for a couple more weeks -- is this an area you'd be comfortable poking at in advance of the 1.17 release?

@j4james
Copy link
Collaborator Author

j4james commented Jan 6, 2023

The buffer code is a bit of a black box to me, so I don't want to promise anything, but I can take a look and see if there's an easy fix. I suspect this is out of my league though.

@j4james
Copy link
Collaborator Author

j4james commented Jan 7, 2023

First off, a few more test cases.

  1. Deleting a character with DCH. This hits the same code path as ICH (the _ScrollRectHorizontally method in AdaptDispatch), but this shows it fails when scrolling in both directions.

     printf " \u3053\u3093\u306B\u3061\u306F World\r\e[P\n"
    
  2. Copying the screen one column to the right with DECCRA. This goes through the CopyRectangularArea method of AdaptDispatch, but it's a similar pattern of code to _ScrollRectHorizontally.

     printf "\u3053\u3093\u306B\u3061\u306F World\n\e[;;;;;;2\$v"
    
  3. Scrolling the screen one column to the right with the ScrollConsoleScreenBuffer API. This goes through conhost's _CopyRectangle function, which again is using a similar pattern of code.

    SetConsoleOutputCP(65001);
    std::cout << "こんにちは World\n";
    HANDLE h = GetStdHandle(STD_OUTPUT_HANDLE);
    SMALL_RECT scrollRect = { 0, 0, 32767, 32767 };
    CHAR_INFO fill = {L' '};
    ScrollConsoleScreenBuffer(h, &scrollRect, &scrollRect, {1,0}, &fill);
    

In all these cases, what we're doing is iterating over a range of cells in the buffer, and copying them from one location to another like this:

do
{
    const auto data = OutputCell(*textBuffer.GetCellDataAt(sourcePos));
    textBuffer.Write(OutputCellIterator({ &data, 1 }), targetPos);
    source.WalkInBounds(sourcePos, walkDirection);
} while (target.WalkInBounds(targetPos, walkDirection));

But when the source and target are only one cell apart, and the characters we're manipulating are occupying more than one cell, we can end up erasing a character before we've finished copying it.

Let's say we've got a DBCS character in columns 1 and 2, which we're copying one column to the right, so we start by writing the trailing byte to column 3. However, because this is a two-cell character, the code immediately replaces both columns 2 and 3. But this leaves column 1 in a broken state (it's now half a DBCS character), so that gets erased as well.

Now we come to copying what would have been the leading byte from column 1 to column 2, but it's column 1 is now a blank. And once we write a blank into column 2, that leaves column 3 in a broken state, so again that needs to be erased, and everything is blank. At least I that's my understanding of what's going on.

I suspect the erasure of half-DBCS characters is a new thing that was added in the ROW update, and that's possibly why we didn't have this issue before. But I don't want to change that - I'm fairly certain that is the right thing to do. We just need to update the various scrolling operations to take that into account, and I think I've got a simple way we can achieve that.

All we need to do is read two cells from the source, before writing to the target, so the code snippet from above would change to something like this:

auto next = OutputCell(*textBuffer.GetCellDataAt(sourcePos));
do
{
    const auto current = next;
    source.WalkInBounds(sourcePos, walkDirection);
    next = OutputCell(*textBuffer.GetCellDataAt(sourcePos));
    textBuffer.WriteLine(OutputCellIterator({ &current, 1 }), targetPos);
} while (target.WalkInBounds(targetPos, walkDirection));

Now I realise this is not a long term fix - this will assumedly not work once we have characters that can span more than two cells. But it does at least gets things working again without requiring massive changes to the code. And once Leonard is back he can come up with a proper solution.

@ghost ghost added the In-PR This issue has a related PR label Jan 8, 2023
@ghost ghost closed this as completed in #14650 Jan 13, 2023
ghost pushed a commit that referenced this issue Jan 13, 2023
When the buffer contains wide characters that occupy more than one cell,
and those cells are scrolled horizontally by exactly one column, that
operation can result in the wide characters being completely erased.
This PR attempts to fix that bug, although it's not an ideal long term
solution.

Although not really to blame, it was PR #13626 that exposed this issue.

The root of the problem is that scrolling operations copy cells one by
one, but wide characters are written to the buffer two cells at a time.
So when you move a wide character one position to the left or right, it
can overwrite itself before it's finished copying, and the end result is
the whole character gets erased.

I've attempt to solve this by getting the affected operations to read
two cells in advance before they start writing, so there's no risk of
losing the source data before it's fully output. This may not work in
the long term, with characters wider than two cells, but it should at
least be good enough for now.

I've also changed the `TextBuffer::Write` call to a `WriteLine` call to
improve the handling of a wide character on the end of the line, where
moving it right by one column would place it half off screen. It should
just be dropped, but with the `Write` method, it would end up pushed
onto the following line.

## Validation Steps Performed

I've manually confirmed this fixes all the test cases described in
#14626, and also added some unit tests that replicate those scenarios.

Closes #14626
@ghost ghost added Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release. and removed In-PR This issue has a related PR labels Jan 13, 2023
@ghost
Copy link

ghost commented Jan 24, 2023

🎉This issue was addressed in #14650, which has now been successfully released as Windows Terminal Preview v1.17.1023.:tada:

Handy links:

This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Output Related to output processing (inserting text into buffer, retrieving buffer text, etc.) Area-VT Virtual Terminal sequence support Issue-Bug It either shouldn't be doing this or needs an investigation. Product-Conhost For issues in the Console codebase Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release. Severity-Blocking We won't ship a release like this! No-siree.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants