Commit

Remove RGI qualification stuff to make things simpler
janlelis committed Nov 17, 2024
1 parent 09f9743 commit 4ccf617
Showing 5 changed files with 17 additions and 48 deletions.
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -8,6 +8,9 @@
ZWJ/modifier sequence (`:all`). The latter is more common and more efficient
to implement.
- Add alias `emoji: :auto` for `emoji: true` and `emoji: :none` for `emoji: false`
- Unify `rgi_*` options to just `rgi` to keep things simpler (corresponds to
the former `:rgi_uqe` option). Most terminals that want to support the RGI set
will probably want to catch Emoji sequences with missing VS16s.
- Add new `:all_no_vs16` mode
- Only consider terminal cells needed when recommending Emoji support level
(Emoji themselves might display differently)
18 changes: 3 additions & 15 deletions README.md
@@ -110,35 +110,23 @@ Option | Description | Example Terminals
`emoji: true` or `emoji: :auto` | Automatically use recommended Emoji setting for your terminal | -
`emoji: false` or `emoji: :none` | No Emoji adjustments, Emoji characters with VS16 not handled | Gnome Terminal, many older terminals
`emoji: :basic` | Full-width VS16-Emoji, but no width adjustments for Emoji sequences: All partial Emoji treated separately with a width of 2 | ?
`emoji: :rgi_fqe` | Full-width VS16-Emoji, all fully-qualified RGI Emoji sequences are considered to have a width of 2 | ?
`emoji: :rgi_mqe` | Full-width VS16-Emoji, all fully- and minimally-qualified RGI Emoji sequences are considered to have a width of 2 | ?
`emoji: :rgi_uqe` | Full-width VS16-Emoji, all RGI Emoji sequences, regardless of qualification status are considered to have a width of 2 | Apple Terminal
`emoji: :rgi` | Full-width VS16-Emoji, all RGI Emoji sequences are considered to have a width of 2 | Apple Terminal
`emoji: :possible`| Full-width VS16-Emoji, all possible/well-formed Emoji sequences are considered to have a width of 2 | ?
`emoji: :all` | Full-width VS16-Emoji, all ZWJ/modifier/keycap sequences have a width of 2, even if they are not well-formed Emoji sequences | foot, Contour
`emoji: :all_no_vs16` | VS16-Emoji not handled, all ZWJ/modifier/keycap sequences to have a width of 2, even if they are not well-formed Emoji sequences | WezTerm

- *RGI Emoji:* Emoji Recommended for General Interchange
- *Qualification:* Whether an Emoji sequence has all required VS16 codepoints
- *ZWJ:* Zero-width Joiner: Codepoint `U+200D`,used in many Emoji sequences

Example:

```ruby
Unicode::DisplayWidth.of "🐻‍❄", emoji: :rgi_mqe # => 3 (2 for U+1f43b, 1 for U+2744)
Unicode::DisplayWidth.of "🐻‍❄", emoji: :rgi_uqe # => 2
```

See [emoji-test.txt](https://www.unicode.org/Public/emoji/16.0/emoji-test.txt), the [unicode-emoji gem](https://github.com/janlelis/unicode-emoji) and [UTS-51](https://www.unicode.org/reports/tr51/#def_qualified_emoji_character) for more details about qualified and unqualified Emoji sequences.

#### Emoji Support in Terminals

Unfortunately, the level of Emoji support varies a lot between terminals. While some of them are able to display (almost) all Emoji sequences correctly, others fall back to displaying sequences of basic Emoji. When `emoji: true` or `emoji: :auto` is used, the gem will attempt to set the best fitting Emoji setting for you (e.g. `:rgi_uqe` on "Apple_Terminal" or `:none` on Gnome's terminal widget).
Unfortunately, the level of Emoji support varies a lot between terminals. While some of them are able to display (almost) all Emoji sequences correctly, others fall back to displaying sequences of basic Emoji. When `emoji: true` or `emoji: :auto` is used, the gem will attempt to set the best fitting Emoji setting for you (e.g. `:rgi` on "Apple_Terminal" or `:none` on Gnome's terminal widget).

Note that Emoji display and number of terminal columns used might differs a lot. For example, it might be the case that a terminal does not understand which Emoji to display, but still manages to calculate the proper amount of terminal cells. The automatic Emoji support level per terminal only considers the latter (cursor position), not the actual Emoji image(s) displayed. Please [open an issue](https://github.com/janlelis/unicode-display_width/issues/new) if you notice your terminal application could use a better default value. Also see the [ucs-detect project], which is a great resource that compares various terminal's Unicode/Emoji capabilities.

---

To terminal implementors reading this: Although handling Emoji/ZWJ sequences as always having a width of 2 (`:all` mode described above) has some advantages, it does not lead to a particularly good developer experience. Since there is always the possibility of well-formed Emoji that are currently not supported (non-RGI / future Unicode) appearing, those sequences will take more cells. Instead of overflowing, cutting off sequences or displaying placeholder-Emoji, could it be worthwile to implement the `:rgi_uqe` option (see table above) and just give those unknown Emoji the space they need? It is painful to implement, I know, but it kind of underlines the idea that the meaning of an unknown Emoji sequence can still be conveyed (without messing up the terminal at the same time). Just a thought…
To terminal implementors reading this: Although handling Emoji/ZWJ sequences as always having a width of 2 (`:all` mode described above) has some advantages, it does not lead to a particularly good developer experience. Since there is always the possibility of well-formed Emoji that are currently not supported (non-RGI / future Unicode) appearing, those sequences will take more cells. Instead of overflowing, cutting off sequences or displaying placeholder-Emoji, could it be worthwile to implement the `:rgi` option (see table above) and just give those unknown Emoji the space they need? It is painful to implement, I know, but it kind of underlines the idea that the meaning of an unknown Emoji sequence can still be conveyed (without messing up the terminal at the same time). Just a thought…

---

4 changes: 1 addition & 3 deletions lib/unicode/display_width.rb
@@ -25,9 +25,7 @@ class DisplayWidth
WIDTH_TWO: decompress_index(INDEX[:WIDTH_TWO][0][0], 1),
}
EMOJI_SEQUENCES_REGEX_MAPPING = {
rgi_fqe: :REGEX,
rgi_mqe: :REGEX_INCLUDE_MQE,
rgi_uqe: :REGEX_INCLUDE_MQE_UQE,
rgi: :REGEX_INCLUDE_MQE_UQE,
possible: :REGEX_WELL_FORMED,
}
REGEX_EMOJI_BASIC_OR_KEYCAP = Regexp.union(Unicode::Emoji::REGEX_BASIC, Unicode::Emoji::REGEX_EMOJI_KEYCAP)
2 changes: 1 addition & 1 deletion lib/unicode/display_width/emoji_support.rb
@@ -21,7 +21,7 @@ def self.recommended
when "iTerm.app"
return :all
when "Apple_Terminal" # Also: If first Emoji part is EAW 1, gives whole ZWJ seqs width 1
return :rgi_uqe
return :rgi
when "WezTerm"
return :all_no_vs16
end
38 changes: 9 additions & 29 deletions spec/display_width_spec.rb
@@ -201,7 +201,7 @@
end

it 'counts default-text presentation Emoji with Emoji Presentation (VS16) as 2 (in a sequence)' do
expect( "❣️‍❣️".display_width(emoji: :rgi_fqe) ).to eq 4
expect( "❣️‍❣️".display_width(emoji: :rgi) ).to eq 4
end

it 'counts default-emoji presentation Emoji according to EAW (always 2)' do
@@ -229,11 +229,11 @@

describe '(modifiers and zwj sequences)' do
it 'counts RGI Emoji ZWJ sequence as width 2' do
expect( "🤾🏽‍♀️".display_width(1, emoji: :rgi_fqe) ).to eq 2
expect( "🤾🏽‍♀️".display_width(1, emoji: :rgi) ).to eq 2
end

it 'works for emoji involving characters which are east asian ambiguous' do
expect( "🤾🏽‍♀️".display_width(2, emoji: :rgi_fqe) ).to eq 2
expect( "🤾🏽‍♀️".display_width(2, emoji: :rgi) ).to eq 2
end
end

@@ -253,33 +253,13 @@
end
end

describe ':rgi_fqe' do
it 'will ignore shorter width of MQE / UQE / non-RQI sequences' do
expect( "🤾🏽‍♀️".display_width(1, emoji: :rgi_fqe) ).to eq 2 # FQE
expect( "🤾🏽‍♀".display_width(1, emoji: :rgi_fqe) ).to eq 5 # MQE
expect( "❤‍🩹".display_width(1, emoji: :rgi_fqe) ).to eq 3 # UQE
expect( "🤠‍🤢".display_width(1, emoji: :rgi_fqe) ).to eq 4 # Non-RGI/well-formed
expect( "🚄🏾‍▶️".display_width(1, emoji: :rgi_fqe) ).to eq 6 # Invalid/non-Emoji sequence
end
end

describe ':rgi_mqe' do
it 'will ignore shorter width of UQE / non-RQI sequences' do
expect( "🤾🏽‍♀️".display_width(1, emoji: :rgi_mqe) ).to eq 2 # FQE
expect( "🤾🏽‍♀".display_width(1, emoji: :rgi_mqe) ).to eq 2 # MQE
expect( "❤‍🩹".display_width(1, emoji: :rgi_mqe) ).to eq 3 # UQE
expect( "🤠‍🤢".display_width(1, emoji: :rgi_mqe) ).to eq 4 # Non-RGI/well-formed
expect( "🚄🏾‍▶️".display_width(1, emoji: :rgi_mqe) ).to eq 6 # Invalid/non-Emoji sequence
end
end

describe ':rgi_uqe' do
describe ':rgi' do
it 'will ignore shorter width of non-RQI sequences' do
expect( "🤾🏽‍♀️".display_width(1, emoji: :rgi_uqe) ).to eq 2 # FQE
expect( "🤾🏽‍♀".display_width(1, emoji: :rgi_uqe) ).to eq 2 # MQE
expect( "❤‍🩹".display_width(1, emoji: :rgi_uqe) ).to eq 2 # UQE
expect( "🤠‍🤢".display_width(1, emoji: :rgi_uqe) ).to eq 4 # Non-RGI/well-formed
expect( "🚄🏾‍▶️".display_width(1, emoji: :rgi_uqe) ).to eq 6 # Invalid/non-Emoji sequence
expect( "🤾🏽‍♀️".display_width(1, emoji: :rgi) ).to eq 2 # FQE
expect( "🤾🏽‍♀".display_width(1, emoji: :rgi) ).to eq 2 # MQE
expect( "❤‍🩹".display_width(1, emoji: :rgi) ).to eq 2 # UQE
expect( "🤠‍🤢".display_width(1, emoji: :rgi) ).to eq 4 # Non-RGI/well-formed
expect( "🚄🏾‍▶️".display_width(1, emoji: :rgi) ).to eq 6 # Invalid/non-Emoji sequence
end
end
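
Callers that still pass one of the removed symbols (`:rgi_fqe`, `:rgi_mqe`, `:rgi_uqe`) will need to switch to `:rgi`. The following is a minimal migration sketch, not code from this commit: `REMOVED_RGI_MODES` and `normalize_emoji_option` are hypothetical names introduced here for illustration and are not part of the gem. Note that `:rgi` corresponds to the former `:rgi_uqe`, so code that previously used `:rgi_fqe` or `:rgi_mqe` may see smaller widths for minimally qualified or unqualified sequences, as the removed specs above suggest.

```ruby
# Hypothetical helper, NOT part of unicode-display_width.
# Maps the emoji option names removed in this commit onto the unified :rgi.
REMOVED_RGI_MODES = %i[rgi_fqe rgi_mqe rgi_uqe].freeze

def normalize_emoji_option(mode)
  # Any other mode (:all, :possible, :basic, ...) is passed through unchanged.
  REMOVED_RGI_MODES.include?(mode) ? :rgi : mode
end

# Usage, assuming the gem is loaded (left commented so the sketch runs alone):
# Unicode::DisplayWidth.of(text, emoji: normalize_emoji_option(:rgi_uqe))
```

A wrapper like this might be useful while downstream configuration still carries the old names; once those are updated, passing `:rgi` directly is simpler.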
