Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CodepointWidthDetector: reclassify U+25FB, U+25FC as Narrow #5914

Merged
1 commit merged into from
May 14, 2020

Conversation

DHowett-MSFT
Copy link
Contributor

@DHowett-MSFT DHowett-MSFT commented May 14, 2020

This seems to be in line with the emoji-sequences table in the latest
version of the Unicode standard: those glyphs require U+FE0F to activate
their emoji presentation. Since we don't support composing U+FE0F, we
should not present them as emoji by default.

Fixes #5910.

Yes, I hate this.

This seems to be in line with the emoji-sequences table in the latest
version of the unicode standard: those glyphs require U+FE0F to activate
their emoji presentation. Since we don't support composing U+FE0F, we
should not present them as emoji by default.

Fixes #5910.
@DHowett-MSFT DHowett-MSFT requested a review from leonMSFT May 14, 2020 17:57
Copy link
Contributor

@leonMSFT leonMSFT left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can get behind these two, but just looking at emoji-sequences.txt, there's a lot of emojis that have FE0F added on, should all of those be classified as narrow too?

@DHowett-MSFT
Copy link
Contributor Author

@leonMSFT honestly not sure. Looking at that list I see a bunch of things I can imagine people want to be emoji. I sorta hate this problem. :)

@DHowett-MSFT
Copy link
Contributor Author

I was somewhat informed by printing them all out and seeing what Cascadia Code did. If they were monochrome, I made them narrow. Sorta hate it.

Copy link
Member

@zadjii-msft zadjii-msft left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hate everything - we're never going to have a definitive answer to this, are we?

@skyline75489
Copy link
Collaborator

I made #5407 back then to “have a definitive answer” to most of these. I think we can revisit this when 1.0 is out.

@DHowett-MSFT
Copy link
Contributor Author

@msftbot merge this in 1 minute

@ghost ghost added the AutoMerge Marked for automatic merge by the bot when requirements are met label May 14, 2020
@ghost
Copy link

ghost commented May 14, 2020

Hello @DHowett-MSFT!

Because you've given me some instructions on how to help merge this pull request, I'll be modifying my merge approach. Here's how I understand your requirements for merging this pull request:

  • I won't merge this pull request until after the UTC date Thu, 14 May 2020 23:48:30 GMT, which is in 1 minute

If this doesn't seem right to you, you can tell me to cancel these instructions and use the auto-merge policy that has been configured for this repository. Try telling me "forget everything I just told you".

@ghost ghost merged commit c39f9c6 into master May 14, 2020
@ghost ghost deleted the dev/duhowett/continue_reclassifying branch May 14, 2020 23:49
DHowett-MSFT pushed a commit that referenced this pull request May 14, 2020
This seems to be in line with the emoji-sequences table in the latest
version of the Unicode standard: those glyphs require U+FE0F to activate
their emoji presentation. Since we don't support composing U+FE0F, we
should not present them as emoji by default.

Fixes #5910.

Yes, I hate this.

(cherry picked from commit c39f9c6)
@j4james
Copy link
Collaborator

j4james commented May 15, 2020

I can get behind these two, but just looking at emoji-sequences.txt, there's a lot of emojis that have FE0F added on, should all of those be classified as narrow too?

I suspect they should be. I just did a quick test of the first 65536 codepoints in XTerm, Gnome Terminal, and Mintty to see which characters they treated as wide. There were 90 or so characters that we had as wide, that nobody else did. I haven't checked them all, but my impression is that they're mostly (if not entirely) from the FE0F list.

U+2139, U+2328, U+23CF, U+23ED..U+23EF, U+23F1..U+23F2, U+23F8..U+23FA, U+24C2, U+25AA..U+25AB, U+25FB..U+25FC, U+2600..U+2604, U+260E, U+2611, U+2618, U+261D, U+2620, U+2622..U+2623, U+2626, U+262A, U+262E..U+262F, U+2638..U+263A, U+2640, U+2642, U+265F..U+2660, U+2663, U+2665..U+2666, U+2668, U+267B, U+267E, U+2692, U+2694..U+2697, U+2699, U+269B..U+269C, U+26A0, U+26A7, U+26B0..U+26B1, U+26C8, U+26CF, U+26D1, U+26D3, U+26E9, U+26F0..U+26F1, U+26F4, U+26F7..U+26F9, U+2702, U+2708..U+2709, U+270C..U+270D, U+270F, U+2712, U+2714, U+2716, U+271D, U+2721, U+2733..U+2734, U+2744, U+2747, U+2763..U+2764, U+27A1, U+302A..U+302D, U+3099..U+309A, U+321F

I wasn't using the very latest build, so I think a couple of those have already been fixed, but that gives you a general idea of the range of characters that might need to be updated.

@DHowett-MSFT
Copy link
Contributor Author

Apart from the ideograph tone marks (U+302A..U+302D, which look like combining characters) and the other combining ones (3099, 309A), it looks like we've gotten the width right on the ones that we (DirectWrite(!)) have given a default emoji presentation.

image

While I agree that it's non-ideal, we need to get support into our renderer for variations (and default variations) to display them as monochrome glyphs instead of squashed single-column color glyphs if we wanna fix that 😄

@DHowett-MSFT
Copy link
Contributor Author

(the combining character problem is a whole other can of worms)

@DHowett-MSFT
Copy link
Contributor Author

DHowett-MSFT commented May 16, 2020

Okay, James and Leon, you’re both correct. All of the lower codepoints that were retroactively reclassified (and require FE0F as a variation selector) must be set to 1-wide and we’ll deal with what fallout comes from that. I’m putting together this change for last minute inclusion in v1.

@DHowett-MSFT
Copy link
Contributor Author

I’d rather be correct and look slightly wrong than incorrect and introduce a way for third party app developers to take a dependency on some silly way Windows differs from everyone else.

DHowett-MSFT pushed a commit that referenced this pull request May 16, 2020
This removes all glyphs from the emoji list that do not default to
"emoji presentation" (EPres). It removes all local overrides, but retain
the comments about the emoji we left out that are Microsoft-specific.

This brings us fully in line with the most popular Terminals on OS X,
except that we squash our emoji down to fit in one cell and they let
them hang over the edges and damage other characters. Oh well.

Refs #900, #5914.
DHowett pushed a commit that referenced this pull request May 17, 2020
This removes all glyphs from the emoji list that do not default to
"emoji presentation" (EPres). It removes all local overrides, but retains
the comments about the emoji we left out that are Microsoft-specific.

This brings us fully in line with the most popular Terminals on OS X,
except that we squash our emoji down to fit in one cell and they let
them hang over the edges and damage other characters. Oh well.

## Detailed Description of the Pull Request / Additional comments

Late Friday evening, I tested my emoji test file on iTerm2. In so doing, I realized
that @j4james and @leonMSFT were right the entire time in #5914: Emoji
that require `U+FE0F` must not be double-width by default.

I finally banged up a powershell script that parses the UCD and emits a codepoint
width table. Once checked in, this will be definitive.

Refs #900, #5914.
Fixes #5941.
DHowett pushed a commit that referenced this pull request May 17, 2020
This removes all glyphs from the emoji list that do not default to
"emoji presentation" (EPres). It removes all local overrides, but retains
the comments about the emoji we left out that are Microsoft-specific.

This brings us fully in line with the most popular Terminals on OS X,
except that we squash our emoji down to fit in one cell and they let
them hang over the edges and damage other characters. Oh well.

## Detailed Description of the Pull Request / Additional comments

Late Friday evening, I tested my emoji test file on iTerm2. In so doing, I realized
that @j4james and @leonMSFT were right the entire time in #5914: Emoji
that require `U+FE0F` must not be double-width by default.

I finally banged up a powershell script that parses the UCD and emits a codepoint
width table. Once checked in, this will be definitive.

Refs #900, #5914.
Fixes #5941.

(cherry picked from commit ba1a298)
jelster pushed a commit to jelster/terminal that referenced this pull request May 28, 2020
…t#5914)

This seems to be in line with the emoji-sequences table in the latest
version of the Unicode standard: those glyphs require U+FE0F to activate
their emoji presentation. Since we don't support composing U+FE0F, we
should not present them as emoji by default.

Fixes microsoft#5910.

Yes, I hate this.
jelster pushed a commit to jelster/terminal that referenced this pull request May 28, 2020
This removes all glyphs from the emoji list that do not default to
"emoji presentation" (EPres). It removes all local overrides, but retains
the comments about the emoji we left out that are Microsoft-specific.

This brings us fully in line with the most popular Terminals on OS X,
except that we squash our emoji down to fit in one cell and they let
them hang over the edges and damage other characters. Oh well.

## Detailed Description of the Pull Request / Additional comments

Late Friday evening, I tested my emoji test file on iTerm2. In so doing, I realized
that @j4james and @leonMSFT were right the entire time in microsoft#5914: Emoji
that require `U+FE0F` must not be double-width by default.

I finally banged up a powershell script that parses the UCD and emits a codepoint
width table. Once checked in, this will be definitive.

Refs microsoft#900, microsoft#5914.
Fixes microsoft#5941.
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
AutoMerge Marked for automatic merge by the bot when requirements are met
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Emoji character width / spacing too large (Black Medium Square)
7 participants