Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor VT terminal input #16511

Merged
merged 12 commits into from
Jan 30, 2024
Merged

Conversation

j4james
Copy link
Collaborator

@j4james j4james commented Dec 31, 2023

The primary reason for this refactoring was to simplify the management
of VT input sequences that vary depending on modes, adding support for
the missing application keypad sequences, and preparing the way for
future extensions like S8C1T.

However, it also includes fixes for a number of keyboard related bugs,
including a variety of missing or incorrect mappings for the Ctrl and
Ctrl+Alt key combinations,

References and Relevant Issues

This PR also includes a fix for #10308, which was previously closed as a
duplicate of #10551. I don't think those bugs were related, though, and
although they're both supposed to be fixed in Windows 11, this PR fixes
the issue in Windows 10.

Detailed Description of the Pull Request / Additional comments

The way the input now works, there's a single keyboard map that takes a
virtual key code combined with Ctrl, Alt, and Shift modifier bits
as the lookup key, and the expected VT input sequence as the value. This
map is initially constructed at startup, and then regenerated whenever a
keyboard mode is changed.

This map takes care of the cursor keys, editing keys, function keys, and
keys like BkSp and Return which can be affected by mode changes. The
remaining "graphic" key combinations are determined manually at the time
of input.

The order of precedence looks like this:

  1. If the virtual key is 0 or VK_PACKET, it's considered to be a
    synthesized keyboard event, and the UnicodeChar value is used
    exactly as given.

  2. If it's a numeric keypad key, and Alt is pressed (but not Ctrl),
    then it's assumedly part of an Alt-Numpad composition, so the key
    press is ignored (the generated character will be transmitted when
    the Alt is released).

  3. If the virtual key combined with modifier bits is found in the key
    map described above, then the matched escape sequence will be used
    used as the output.

  4. If a UnicodeChar value has been provided, that will be used as the
    output, but possibly with additional Ctrl and Alt modifiers applied:

    a. If it's an AltGr key, and we've got either two Ctrl keys
    pressed or a left Ctrl key that is distinctly separate from a
    right Alt key, then we will try and convert the character into
    a C0 control code.

    b. If an Alt key is pressed (or in the case of an AltGr value,
    both Alt keys are pressed), then we will convert it into an
    Alt-key sequence by prefixing the character with an ESC.

  5. If we don't have a UnicodeChar, we'll use the ToUnicodeEx API to
    check whether the current keyboard state reflects a dead key, and if
    so, return nothing.

  6. Otherwise we'll make another ToUnicodeEx call but with any Ctrl
    and Alt modifiers removed from the state to determine the base key
    value. Once we have that, we can apply the modifiers ourself.

    a. If the Ctrl key is pressed, we'll try and convert the base value
    into a C0 control code. But if we can't do that, we'll try again
    with the virtual key code (if it's alphanumeric) as a fallback.

    b. If the Alt key is pressed, we'll convert the base value (or
    control code value) into an Alt-key sequence by prefixing it with
    an ESC.

For step 4-a, we determine whether the left Ctrl key is distinctly
separate from the right Alt key by recording the time that those keys
are pressed, and checking for a time gap greater than 50ms. This is
necessary to distinguish between the user pressing Ctrl+AltGr, or
just pressing AltGr alone, which triggers a fake Ctrl key press at
the same time.

Validation Steps Performed

I created a test script to automate key presses in the terminal window
for every relevant key, along with every Ctrl/Alt/Shift modifier, and
every relevant mode combination. I then compared the generated input
sequences with XTerm and a DEC VT240 terminal. The idea wasn't to match
either of them exactly, but to make sure the places where we differed
were intentional and reasonable.

This mostly dealt with the US keyboard layout. Comparing international
layouts wasn't really feasible because DEC, Linux, and Windows keyboard
assignments tend to be quite different. However, I've manually tested a
number of different layouts, and tried to make sure that they were all
working in a reasonable manner.

In terms of unit testing, I haven't done much more than patching the
ones that already existed to get them to pass. They're honestly not
great tests, because they aren't generating events in the form that
you'd expect for a genuine key press, and that can significantly affect
the results, but I can't think of an easy way to improve them.

PR Checklist

@microsoft-github-policy-service microsoft-github-policy-service bot added Issue-Bug It either shouldn't be doing this or needs an investigation. Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. Area-Input Related to input processing (key presses, mouse, etc.) Area-VT Virtual Terminal sequence support Priority-2 A description (P2) Product-Conhost For issues in the Console codebase Product-Conpty For console issues specifically related to conpty Product-Terminal The new Windows Terminal. labels Dec 31, 2023
@j4james j4james marked this pull request as ready for review January 2, 2024 11:30
@j4james
Copy link
Collaborator Author

j4james commented Jan 2, 2024

If you don't like the overall refactoring concept in this PR, it's worth noting that at least some of the bug fixes could easily be ported back to the current TerminalInput implementation, so it's not like it's all or nothing.

And on the flip side, if you like the refactoring, but don't like specific bug fixes, some of them can easily be dropped (I can imagine things like the #10308 fix, and the #16509 change might be controversial).

@zadjii-msft
Copy link
Member

okay 11 bug fixes in January is setting the bar REAL high for 2024 😆 This might take us a bit to get to, but we should most certainly do this for 1.20

src/terminal/input/terminalInput.cpp Outdated Show resolved Hide resolved
src/terminal/input/terminalInput.cpp Show resolved Hide resolved
src/terminal/input/terminalInput.cpp Outdated Show resolved Hide resolved
src/terminal/input/terminalInput.cpp Outdated Show resolved Hide resolved
src/terminal/input/terminalInput.cpp Outdated Show resolved Hide resolved
src/terminal/input/terminalInput.cpp Show resolved Hide resolved
src/terminal/input/terminalInput.cpp Outdated Show resolved Hide resolved
src/terminal/input/terminalInput.cpp Outdated Show resolved Hide resolved
// -> Use the vkey to determine if Ctrl+@ is being pressed and produce ^[^@.
if (ch == UNICODE_NULL && vkey == LOBYTE(OneCoreSafeVkKeyScanW(0)))
auto charSequence = _makeCharOutput(unicodeChar);
if (!charSequence.empty())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By moving the surrogate pair check up in this function we could make it so that _makeCharOutput never returns an empty string.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what you meant here. Did you want to move the whole if til::is_leading_surrogate(ch)) _leadingSurrogate.emplace(ch) test out of the _makeCharOutput function? Because this isn't the only place that's being called. It's also used for the VK_PACKET and 0 virtual key events (although I'm not actually sure we should be delaying the surrogate pairs in that case).

Copy link
Member

@lhecker lhecker Jan 16, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I meant is that we could consider moving the leading-surrogate check right past the if (!keyEvent.bKeyDown) branch. It would then store any leading-surrogate into a member variable and could then early-return. That is:

class TerminalInput
{
    // no need for an optional
    wchar_t _leadingSurrogate = 0;
};

TerminalInput::OutputType TerminalInput::HandleKey(const INPUT_RECORD& event)
{
    // ...

    if (!keyEvent.bKeyDown)
    {
        // ...
    }

    // Unpaired surrogates are no good -> early return.
    if (til::is_leading_surrogate(unicodeChar))
    {
        _leadingSurrogate = unicodeChar;
        return MakeOutput({});
    }
    // Using a scope_exit ensures that a previous leading surrogate is forgotten
    // even if the KEY_EVENT that followed didn't end up calling _makeCharOutput.
    const auto leadingSurrogateReset = wil::scope_exit([&]() {
        _leadingSurrogate = 0;
    });

    // ...
}

TerminalInput::StringType TerminalInput::_makeCharOutput(const wchar_t ch)
{
    StringType str;

    if (_leadingSurrogate && til::is_trailing_surrogate(ch))
    {
        str.push_back(_leadingSurrogate);
    }

    str.push_back(ch);
    return str;
}

That way _makeCharOutput never returns an empty string which may allow you to simplify the if (unicodeChar != 0) branch.

However, there's one thing that made me wonder a bit now. The VK_PACKET handler doesn't deal with Alt key combinations. Why's that? Shouldn't it use the exact same code as the if (unicodeChar != 0) branch? Because if it did, you could abstract it away and then call that function directly when encountering a trailing surrogate key, since no character from a higher Unicode plane should ever hit any of the VK_* handlers, etc., anyways (I think?).


BTW: It may be worth introducing a _makeNoOutput() function or similar, since we now have 7 instances of MakeOutput({}), but it's conceptually the same thing every time.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. That makes sense. I've incorporated those changes now. It does simplify things. Thanks.

As for VK_PACKET, my understanding was that it serves as a kind of "pass through" mode, when you just want to send a raw unicode character as input. It will typically come from something like an emoji picker, or a SendInput call with KEYEVENTF_UNICODE.

It's not an actual keypress, so I don't think it's supposed to be affected by modifiers. If you wanted to simulate a VT Alt key with VK_PACKET you'd literally send through the sequence of unicode characters that make up that Alt key sequence.

src/terminal/input/terminalInput.cpp Outdated Show resolved Hide resolved
// the Ctrl key is manually pressed in addition to the AltGr key, we have to
// be able to detect when the Ctrl key isn't genuine. We do so by tracking
// the time between the Alt and Ctrl key presses, and only consider the Ctrl
// key to really be pressed if the difference is more than 50ms.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

okay this is pretty clever I'll give you that

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't entirely take credit for this. I did some research to see how other people dealt with the issue, and the typical approach was to handle it at the time the Ctrl key message was read from the queue. The idea was that you would peek the next message looking for an Alt key, and compare the two MSG.time fields. If it was a fake Ctrl+Alt combo, they would have identical values, so you would know to ignore the Ctrl.

But that exact approach wouldn't have worked for us over conpty, so that's why I track the time manually and allow a bit of leeway in the time difference, but the general idea is the same.

@lhecker lhecker added the AutoMerge Marked for automatic merge by the bot when requirements are met label Jan 29, 2024
@lhecker
Copy link
Member

lhecker commented Jan 29, 2024

(I've taken the liberty to merge my suggestion as I didn't want to bother you with such a small nit. I hope you didn't mind this.)

return MakeUnhandled();
// If NumLock is on, and this is an Alt release with a unicode char,
// it must be the generated character from an Alt-Numpad composition.
if (WI_IsFlagSet(controlKeyState, NUMLOCK_ON) && virtualKeyCode == VK_MENU && unicodeChar != 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder - will Alt+Numpad ever produce a leading/trailing surrogate pair?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I remember correctly, Alt+Numpad is just for codepoints 0 to 255, and there's another variant (involving the plus key maybe?) which is supposed to handle Unicode characters. That's not currently supported.

// it must be the generated character from an Alt-Numpad composition.
if (WI_IsFlagSet(controlKeyState, NUMLOCK_ON) && virtualKeyCode == VK_MENU && unicodeChar != 0)
{
return MakeOutput({ &unicodeChar, 1 });
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

note to self: see if this differs from makeCharOutput

Copy link
Member

@DHowett DHowett Jan 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it does; that inserts the leading s.p., this does not

src/terminal/input/terminalInput.cpp Outdated Show resolved Hide resolved
@lhecker lhecker removed the AutoMerge Marked for automatic merge by the bot when requirements are met label Jan 29, 2024
@DHowett DHowett enabled auto-merge (squash) January 29, 2024 22:34
@DHowett DHowett disabled auto-merge January 30, 2024 00:58
@DHowett DHowett merged commit dccc1f4 into microsoft:main Jan 30, 2024
15 checks passed
DHowett pushed a commit that referenced this pull request Feb 6, 2024
This fixes two issues where the `Space` key wasn't being handled
correctly:

* Keyboards with an `AltGr`+`Space` mapping were not generating the
  expected character.
* Pressing a dead key followed by `Space` is supposed to generate the
  accent character associated with that key, but it wasn't doing so.

## References and Relevant Issues

These were both regressions from the keyboard refactor in PR #16511.

## Detailed Description of the Pull Request / Additional comments

The problem was that we were treating `VK_SPACE` as a "functional" key,
which means it gets hardcoded VT mappings which take precedence over
whatever is in the keyboard layout. This was deemed necessary to deal
with the fact that many keyboards incorrectly map `Ctrl`+`Space` as a
`SP` character, when it's expected to be `NUL`.

I've now dropped `VK_SPACE` from the functional mapping table and allow
it be handled by the default mapping algorithm for "graphic" keys.
However, I've also introduced a special case check for `Ctrl`+`Space`
(and other modifier variants), so we can bypass any incorrect keyboard
layouts for those combinations.

## Validation Steps Performed

I couldn't test with a French-BEPO keyboard layout directly, because the
MS Keyboard Layout Creator wouldn't accept a `Space` key mapping that
wasn't whitespace. However, if I remapped the `AltGr`+`Space` combo to
`LF`, I could confirm that we are now generating that correctly.

I've also tested the dead key `Space` combination on various keyboard
layouts and confirmed that that is now working correctly, and checked
that the `Ctrl`+`Space` combinations are still working too.

Closes #16641
Closes #16642
DHowett pushed a commit that referenced this pull request Feb 6, 2024
DECKPAM originally tracked in #16506.
Support was added in #16511.
But turns out people didn't expect the Terminal to actually be like,
compliant: #16654

This closes #16654 while we think over in #16672 how we want to solve
this
@j4james j4james deleted the refactor-vt-keyboard branch February 10, 2024 13:03
DHowett pushed a commit that referenced this pull request Feb 21, 2024
This fixes two issues where the `Space` key wasn't being handled
correctly:

* Keyboards with an `AltGr`+`Space` mapping were not generating the
  expected character.
* Pressing a dead key followed by `Space` is supposed to generate the
  accent character associated with that key, but it wasn't doing so.

## References and Relevant Issues

These were both regressions from the keyboard refactor in PR #16511.

## Detailed Description of the Pull Request / Additional comments

The problem was that we were treating `VK_SPACE` as a "functional" key,
which means it gets hardcoded VT mappings which take precedence over
whatever is in the keyboard layout. This was deemed necessary to deal
with the fact that many keyboards incorrectly map `Ctrl`+`Space` as a
`SP` character, when it's expected to be `NUL`.

I've now dropped `VK_SPACE` from the functional mapping table and allow
it be handled by the default mapping algorithm for "graphic" keys.
However, I've also introduced a special case check for `Ctrl`+`Space`
(and other modifier variants), so we can bypass any incorrect keyboard
layouts for those combinations.

## Validation Steps Performed

I couldn't test with a French-BEPO keyboard layout directly, because the
MS Keyboard Layout Creator wouldn't accept a `Space` key mapping that
wasn't whitespace. However, if I remapped the `AltGr`+`Space` combo to
`LF`, I could confirm that we are now generating that correctly.

I've also tested the dead key `Space` combination on various keyboard
layouts and confirmed that that is now working correctly, and checked
that the `Ctrl`+`Space` combinations are still working too.

Closes #16641
Closes #16642

(cherry picked from commit ec91be5)
Service-Card-Id: 91738880
Service-Version: 1.20
DHowett pushed a commit that referenced this pull request Feb 21, 2024
DECKPAM originally tracked in #16506.
Support was added in #16511.
But turns out people didn't expect the Terminal to actually be like,
compliant: #16654

This closes #16654 while we think over in #16672 how we want to solve
this

(cherry picked from commit 71efdcb)
Service-Card-Id: 91781164
Service-Version: 1.20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Input Related to input processing (key presses, mouse, etc.) Area-VT Virtual Terminal sequence support Issue-Bug It either shouldn't be doing this or needs an investigation. Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. Priority-2 A description (P2) Product-Conhost For issues in the Console codebase Product-Conpty For console issues specifically related to conpty Product-Terminal The new Windows Terminal.
Projects
None yet
4 participants