-
Notifications
You must be signed in to change notification settings - Fork 30k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
util: improve unicode support #31319
util: improve unicode support #31319
Conversation
81735c6
to
02515e8
Compare
This comment has been minimized.
This comment has been minimized.
02515e8
to
80fe23b
Compare
This comment has been minimized.
This comment has been minimized.
da70f98
to
fc7f090
Compare
I'm not sure who to ping for a review. @srl295 maybe? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Could optimize the existing code, but generally LGTM.
const isFullWidthCodePoint = (code) => { | ||
// Code points are partially derived from: | ||
// http://www.unicode.org/Public/UNIDATA/EastAsianWidth.txt | ||
return code >= 0x1100 && ( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So this might be doable as a regex… it could be compiled as a regex, i don't think there's an East Asian Width property available in regex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could definitely be a regular expression. I guess it's slower that way but I did not check. I'll have a look soon.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ICU4C also has API to get the East Asian Width.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we use that in case Node.js is build with ICU
but this is the fallback code.
The array grouping function relies on the width of the characters. It was not calculated correct so far, since it used the string length instead. This improves the unicode output by calculating the mono-spaced font width (other fonts might differ).
fc7f090
to
1b9547f
Compare
Relevant test failures in the no-intl host on CI? |
The array grouping function relies on the width of the characters. It was not calculated correct so far, since it used the string length instead. This improves the unicode output by calculating the mono-spaced font width (other fonts might differ). PR-URL: #31319 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Steven R Loomis <srloomis@us.ibm.com> Reviewed-By: Rich Trott <rtrott@gmail.com> Reviewed-By: Minwoo Jung <nodecorelab@gmail.com>
Landed in 8fb5fe2 🎉 |
The array grouping function relies on the width of the characters. It was not calculated correct so far, since it used the string length instead. This improves the unicode output by calculating the mono-spaced font width (other fonts might differ). PR-URL: #31319 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Steven R Loomis <srloomis@us.ibm.com> Reviewed-By: Rich Trott <rtrott@gmail.com> Reviewed-By: Minwoo Jung <nodecorelab@gmail.com>
@BridgeAR if this should go back to |
The array grouping function relies on the width of the characters. It was not calculated correct so far, since it used the string length instead. This improves the unicode output by calculating the mono-spaced font width (other fonts might differ). PR-URL: nodejs#31319 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Steven R Loomis <srloomis@us.ibm.com> Reviewed-By: Rich Trott <rtrott@gmail.com> Reviewed-By: Minwoo Jung <nodecorelab@gmail.com>
The array grouping function relies on the width of the characters. It was not calculated correct so far, since it used the string length instead. This improves the unicode output by calculating the mono-spaced font width (other fonts might differ). PR-URL: #31319 Reviewed-By: James M Snell <jasnell@gmail.com> Reviewed-By: Steven R Loomis <srloomis@us.ibm.com> Reviewed-By: Rich Trott <rtrott@gmail.com> Reviewed-By: Minwoo Jung <nodecorelab@gmail.com>
The array grouping function relies on the width of the characters.
It was not calculated correct so far, since it used the string
length instead.
This improves the unicode output by calculating the mono-spaced
font width (other fonts might differ).
I had to move some functions. Otherwise we'd have to load the utils functions by default and that did not seem necessary.
Checklist
make -j4 test
(UNIX), orvcbuild test
(Windows) passes