Fix /etc/inputrc{,.keys} #84

Merged
merged 2 commits into from
Feb 22, 2021
akinomyoga
Contributor

@bitstreamout Is this the right place to suggest changes in /etc/inputrc{,.keys}? Following the code comment at the top of /etc/inputrc, here I submit a PR. There are two types of issues with the current /etc/inputrc and /etc/inputrc.keys:

  • Some key sequences are specified in the wrong format. There are also duplicate keybindings, i.e., the same key sequence is bound more than once under different representations.
  • The keybindings to raw 8-bit C1 characters conflict with UTF-8 encoding.

Incorrect key-sequence format

Currently, some key sequences are specified in incorrect formats. Here are the examples:

1. \C-\eOD (/etc/inputrc)

Up to Readline 8.0, this is interpreted as ^\ (Control-Backslash = FS) + e + O + D. In Readline 8.1+, it is interpreted as \e (ESC) + O + D. FS e O D doesn't make sense, so I guess the intended key sequence here is ESC O D (the 7-bit representation of SS3 D). However, keybindings for this key sequence are already defined.

2. \M-\eOD (/etc/inputrc)

Up to Readline 8.0, this is interpreted as M-\ (Meta-Backslash = raw Ü) + e + O + D. In Readline 8.1+, it is interpreted as \233 (raw CSI) + O + D. Neither Ü e O D nor CSI O D makes sense as a key sequence. I guess the original intent is \e\217D (ESC SS3 D). However, keybindings for this key sequence are already defined.

3. \C-\M-OD (/etc/inputrc)

Up to Readline 8.0, this is interpreted as \e (ESC) + ^O (Control-O) + D. In Readline 8.1+, it is interpreted as \217 (raw SS3) + D. I guess the original intent is \217D, as in Readline 8.1+. However, keybindings for this key sequence are already defined.

4. \M-[2~ (/etc/inputrc.keys)

This is interpreted as M-[ (Meta-Left-Bracket = raw Û) + 2 + ~. It is interpreted as \e + [ + 2 + ~ only when convert-meta is set to on, but convert-meta is turned off at the beginning of /etc/inputrc. Û 2 ~ doesn't make sense as a key sequence. I guess the intent is \e[2~, but the corresponding keybindings are already registered.

5. \C-\M-[D (/etc/inputrc.keys)

Up to Readline 8.0, this is interpreted as \e (ESC) + \e (ESC = C-[) + D. In Readline 8.1+, it is interpreted as \233 (raw CSI) + D. I guess this is intended to be \233D.

6. \C-^[[D (/etc/inputrc.keys)

This is interpreted as ^^ (Control-circumflex) + [ + [ + D, which doesn't make sense as a key sequence. I guess this is intended to be \e[D (\C-\C-[ + [ + D = \e + [ + D).

I have adjusted these key sequences, and also removed duplicate keybindings in the commit ba286f5.
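
The byte values quoted in the six examples can be checked with a little arithmetic. The following is a minimal sketch, assuming the conventional ASCII model in which Control keeps the low five bits of a character and Meta sets the eighth bit. The helper names `ctrl` and `meta` are hypothetical, and this is not Readline's actual parser, which also differs between versions as described above.

```python
def ctrl(ch: str) -> int:
    """Control-modified byte: keep the low five bits of the ASCII code."""
    return ord(ch.upper()) & 0x1F


def meta(byte: int) -> int:
    """Meta-modified byte: set the eighth bit."""
    return byte | 0x80


ESC = 0x1B

# Control combinations mentioned in the examples
print("C-\\ = %#04x (FS)" % ctrl("\\"))
print("C-^  = %#04x" % ctrl("^"))
print("C-[  = %#04x (ESC)" % ctrl("["))

# Meta combinations: raw 8-bit bytes, shown in octal and as Latin-1 characters
print("M-\\   = \\%03o (%s)" % (meta(ord("\\")), bytes([meta(ord("\\"))]).decode("latin-1")))
print("M-[   = \\%03o (%s)" % (meta(ord("[")), bytes([meta(ord("["))]).decode("latin-1")))
print("M-ESC = \\%03o (raw CSI)" % meta(ESC))
print("M-C-O = \\%03o (raw SS3)" % meta(ctrl("O")))
```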

Conflicts with UTF-8

The keybindings that involve raw 8-bit C1 characters (specifically, \233 (raw CSI) and \217 (raw SS3)) conflict with UTF-8 encoding. The bytes \233 and \217 can appear as the second or later bytes of a multibyte UTF-8 sequence. Because these raw 8-bit C1 characters are used directly in keybindings such as \233A or \233F, it is currently impossible to input certain Unicode strings, e.g. "ΛC", "力F", "子供A": the last byte of one character followed by the next ASCII character matches one of these UTF-8-violating keybindings.
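
To see the collision concretely, here is a minimal sketch that encodes the three example strings as UTF-8 and looks for a raw CSI or SS3 byte immediately followed by an ASCII byte. The function name `find_c1_collisions` is hypothetical and only illustrates the byte-level overlap; it does not model how Readline dispatches keys.

```python
# Raw 8-bit C1 control bytes as they appear in the keybindings.
RAW_CSI = b"\x9b"  # octal \233
RAW_SS3 = b"\x8f"  # octal \217


def find_c1_collisions(text):
    """Return (byte offset, two-byte sequence) pairs where a raw CSI or SS3
    byte is directly followed by an ASCII byte in the UTF-8 form of text."""
    data = text.encode("utf-8")
    hits = []
    for i in range(len(data) - 1):
        if data[i:i + 1] in (RAW_CSI, RAW_SS3) and data[i + 1] < 0x80:
            hits.append((i, data[i:i + 2]))
    return hits


for s in ["ΛC", "力F", "子供A"]:
    print(s, s.encode("utf-8"), find_c1_collisions(s))
```

Each of the three strings contains a byte pair such as `\x9bC`, `\x9bF`, or `\x9bA`, which is exactly what a keybinding on \233C, \233F, or \233A would match.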

  • First of all, the 8-bit representation of C1 characters is rarely used today. Even terminal emulators with 8-bit C1 support use the 7-bit representation by default and switch to the 8-bit representation only when the terminal application requests it with S8C1T (\e G). On the application side, there is no reason to request the 8-bit representation, because an application cannot expect 8-bit C1 support from every terminal, whereas it can expect 7-bit C1 support.
  • Next, the 8-bit representation of C1 characters is not always transmitted as raw bytes. Depending on the terminal implementation, it may be encoded using the UTF-8 scheme when the terminal is working with LC_CTYPE=*.UTF-8. For example, CSI (U+009B) should be encoded as \302\233 in UTF-8, and SS3 (U+008F) should be encoded as \302\217. To avoid conflicts with UTF-8 encoding, 8-bit C1 characters should be encoded by the UTF-8 scheme. In reality, however, many terminals still send raw C1 characters, and only a small number send UTF-8 encoded 8-bit C1 characters.
  • /etc/inputrc{,.keys} cannot be configured so as to switch key bindings depending on LC_CTYPE, so it should not contain the encoding dependent keybindings.
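
The UTF-8 encoded forms of the 8-bit C1 characters quoted above can be verified with a short sketch. The helper name `utf8_octal` is hypothetical; it just prints the UTF-8 bytes of a code point in the octal escape notation used in inputrc.

```python
def utf8_octal(code_point):
    """Return the UTF-8 encoding of a code point as octal escapes."""
    return "".join("\\%03o" % b for b in chr(code_point).encode("utf-8"))


print("CSI U+009B ->", utf8_octal(0x9B))
print("SS3 U+008F ->", utf8_octal(0x8F))
```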

For these reasons, the keybindings involving \233 and \217 (which are inherently ambiguous, rarely used, and in conflict with UTF-8) are now commented out (ad4f969). Alternatively, these lines could be deleted completely, or converted to use the UTF-8 encoded 8-bit C1 characters instead of the raw ones. If there are any suggestions, I can edit the commit.

@bitstreamout
Member

Hmmm ... yep, the 8-bit Controls mode of XTerm (in VT220 emulation mode) does conflict with UTF-8.

@bitstreamout bitstreamout merged commit 425f3e9 into openSUSE:master Feb 22, 2021
@akinomyoga
Contributor Author

Thank you!
