readline: Improve Unicode handling #25723
@Avi-D-coder thanks a lot! Would you be so kind to add further tests that verify the correct positions for these two cases:
Please also use the following helper function to retrieve the number of characters in a string:

function characters(str) {
  let count = 0;
  // eslint-disable-next-line no-unused-vars
  for (const char of str) {
    count++;
  }
  return count;
}

In my mini benchmark this function outperforms |
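To illustrate why counting by iteration differs from `str.length`, here is a small sketch built around the helper above. The sample strings are assumptions chosen for illustration and are not taken from the PR's tests.

```javascript
// Count code points by iterating the string. The string iterator
// yields one whole code point per step, even for surrogate pairs.
function characters(str) {
  let count = 0;
  // eslint-disable-next-line no-unused-vars
  for (const char of str) {
    count++;
  }
  return count;
}

// One astral code point, stored as two UTF-16 code units.
const heart = '\u{1F495}';

console.log(heart.length);                  // 2 (code units)
console.log(characters(heart));             // 1 (code points)
console.log(characters('a' + heart + 'b')); // 3
```

The gap between the two numbers is what allows a cursor tracked in code units to land in the middle of an emoji.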
@BridgeAR Will do |
@Avi-D-coder you could also calculate the correct length offset with the cursor instead of tracking it as well. That would probably be easier (just a bit more CPU expensive but it should not hurt all that much).

function cursorToLength(cursor, str) {
  let len = 0;
  for (const char of str) {
    len += char.length;
    if (--cursor === 0)
      break;
  }
  return len;
}

This function could be used in all places that use the cursor position to slice something or that is compared to the line length. Please also note that all occurrences of e.g. |
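A minimal sketch of how such a conversion might be used, assuming the cursor counts code points and the result is a UTF-16 offset suitable for `slice`. The name `cursorToOffset` and the early return are assumptions for illustration: as written above, a cursor of 0 never reaches the `break`, so the whole string length would be returned, and this sketch assumes 0 is the intended result in that case.

```javascript
// Convert a cursor measured in code points into an offset measured
// in UTF-16 code units. `cursorToOffset` is a hypothetical name.
function cursorToOffset(cursor, str) {
  // Assumption: a cursor at or before the start maps to offset 0.
  if (cursor <= 0) return 0;
  let len = 0;
  for (const char of str) {
    len += char.length; // 1 for BMP characters, 2 for surrogate pairs
    if (--cursor === 0) break;
  }
  return len;
}

const line = 'a\u{1F495}b';
const offset = cursorToOffset(2, line);

console.log(offset);                // 3
console.log(line.slice(0, offset)); // 'a' followed by the emoji
```

A cursor larger than the number of code points simply yields the full length of the string.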
@BridgeAR I added the tests and optimizations. I was not able to produce corrupt characters on this commit or the previous. I renamed your I'm not entirely sure why |
This is looking really good. I just left a few comments, and ctrl + b and ctrl + f should also be changed accordingly.
@BridgeAR I included your code. |
LGTM, just Lines 836 to 837 in 91adbe1
Lines 840 to 841 in 91adbe1
|
@BridgeAR my bad. I could have sworn I adjusted |
@Avi-D-coder thanks a lot for sticking to it! I just found the issue that confused me originally and it seems to be the last obstacle. Thanks again for reacting so quickly to everything! ❤️ |
Does anyone know the reason for the inconsistent CI? |
Do any changes need to be made? Why are those two CI environments failing? |
@Avi-D-coder It looks like this is failing tests when Node.js is compiled with
|
@addaleax This fix requires node to correctly handle unicode. I am not exactly sure what How can these tests be blacklisted for |
|
The |
The failing tests are under a security embargo. What is the next step here? |
Waiting until the CI is unlocked after the security releases are done. |
Resumed CI https://ci.nodejs.org/job/node-test-pull-request/21093/ ✅ (besides Windows fanned). Rebuild Windows fanned: https://ci.nodejs.org/job/node-test-commit-windows-fanned/25073/ ✔️ |
Prevents moving left or right from placing the cursor in between code units comprising a code point.

PR-URL: nodejs#25723
Fixes: nodejs#25693
Reviewed-By: Anna Henningsen <anna@addaleax.net>
Reviewed-By: James M Snell <jasnell@gmail.com>
Reviewed-By: Ruben Bridgewater <ruben@bridgewater.de>
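The idea in the commit message can be sketched as follows. This is not the code that landed in `lib/readline.js`; `stepRight` and `stepLeft` are hypothetical helpers that return how many code units the cursor should move so that it never stops inside a surrogate pair.

```javascript
// Code units to move right from UTF-16 index `i` (hypothetical helper).
function stepRight(str, i) {
  if (i >= str.length) return 0;
  // A code point at or above 0x10000 occupies two code units.
  return str.codePointAt(i) >= 0x10000 ? 2 : 1;
}

// Code units to move left from UTF-16 index `i` (hypothetical helper).
function stepLeft(str, i) {
  if (i <= 0) return 0;
  // If the two code units before `i` form a surrogate pair, skip both.
  if (i > 1 && str.codePointAt(i - 2) >= 0x10000) return 2;
  return 1;
}

const text = 'a\u{1F495}b';
let cursor = 0;
const visited = [cursor];
let step;
while ((step = stepRight(text, cursor)) > 0) {
  cursor += step;
  visited.push(cursor);
}
console.log(visited); // [ 0, 1, 3, 4 ]
```

Index 2, which sits between the two halves of the emoji, is never visited in either direction.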
Landed in fedc31b 🎉 @Avi-D-coder congratulations on your first commit to Node.js and a big thanks for sticking to it! ❤️ This was great work. |
@BridgeAR thanks for all your help. |
Fixes: #25693
In plain English: deleting and moving the cursor across emoji works now, but grapheme clusters/characters like 👨👩👧👦 still have trailing blank spaces. Ideally grapheme clusters would be used instead of code points, but until the proposal-intl-segmenter lands, Node seems to be without built-in support for handling them.
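For comparison, here is a sketch of grapheme-cluster counting, assuming a runtime in which `Intl.Segmenter` has since become available (it was still a proposal when this PR was written). The helper name `graphemeCount` is hypothetical, and the family emoji is spelled out with explicit zero-width joiners so the sequence is unambiguous.

```javascript
// Count user-perceived characters (grapheme clusters).
// Assumes Intl.Segmenter exists in the running environment.
function graphemeCount(str) {
  const segmenter = new Intl.Segmenter(undefined, { granularity: 'grapheme' });
  let count = 0;
  // eslint-disable-next-line no-unused-vars
  for (const segment of segmenter.segment(str)) {
    count++;
  }
  return count;
}

// Man, woman, girl, boy joined by U+200D (zero-width joiner).
const family = '\u{1F468}\u200D\u{1F469}\u200D\u{1F467}\u200D\u{1F466}';

console.log(family.length);         // 11 code units
console.log([...family].length);    // 7 code points
console.log(graphemeCount(family)); // 1 grapheme cluster
```

The three different counts show why stepping by code point handles single emoji but still splits joined sequences like this one.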
Checklist
make -j4 test (UNIX), or vcbuild test (Windows) passes