New check: Check that glyph for U+0675 ARABIC LETTER HIGH HAMZA is not a mark #4290

Open
2 of 7 tasks
khaledhosny opened this issue Oct 1, 2023 · 6 comments
@khaledhosny
Contributor

What needs to be checked?

U+0675 ARABIC LETTER HIGH HAMZA should be a base glyph, not a mark, and should be the same size as U+0621 ARABIC LETTER HAMZA but sit slightly higher above the baseline.

Detailed description of the problem

Many fonts incorrectly treat U+0675 ARABIC LETTER HIGH HAMZA as a variant of U+0654 ARABIC HAMZA ABOVE and make it a combining mark of the same size. But U+0675 is a base letter and should be a variant of U+0621 ARABIC LETTER HAMZA, raised slightly above the baseline.
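
As a rough illustration of the "base glyph, not a mark" part, the sketch below looks up a code point in the font's cmap and tests whether its glyph has GDEF glyph class 3 (mark). It is only a sketch under stated assumptions, not the check as implemented: the helper names `is_mark` and `load_cmap_and_classes` are hypothetical, the code point is a parameter, and the loader assumes fontTools is installed.

```python
MARK_CLASS = 3  # GDEF glyph class value for mark glyphs (1 = base, 2 = ligature)


def is_mark(codepoint, cmap, glyph_classes):
    """Return True if the glyph for `codepoint` is classified as a mark.

    `cmap` maps code points to glyph names and `glyph_classes` maps glyph
    names to GDEF class values. Returns None when the code point is not
    encoded in the font, so the caller can skip the check.
    """
    glyph_name = cmap.get(codepoint)
    if glyph_name is None:
        return None
    return glyph_classes.get(glyph_name) == MARK_CLASS


def load_cmap_and_classes(font_path):
    """Hypothetical loader: read both mappings from a font file with fontTools."""
    from fontTools.ttLib import TTFont  # imported lazily; assumes fontTools is available

    font = TTFont(font_path)
    cmap = font.getBestCmap()
    glyph_classes = {}
    if "GDEF" in font and font["GDEF"].table.GlyphClassDef is not None:
        glyph_classes = font["GDEF"].table.GlyphClassDef.classDefs
    return cmap, glyph_classes
```

With a real font, `is_mark(0x0675, *load_cmap_and_classes(path))` would return True for the fonts described above. A font without a GDEF class definition yields an empty mapping, so the function returns False there rather than guessing.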

Resources and steps needed to reproduce the problem

The current version of Noto Sans Arabic has this issue, as do many other fonts.

Suggested profile

  • Vendor-specific: Google Fonts
  • Vendor-specific: Adobe Fonts
  • OpenType (requirements imposed by the OpenType specification)
  • Universal (broadly accepted best practices in the type design community)
  • Other:

Suggested result

Which log result level should the check have:

  • 🔥 FAIL if U+0675 is a mark and its size differs from that of U+0621
  • ⚠️ WARN (A potential issue that may need to be addressed)
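
The size part of the proposed FAIL condition could be approximated by comparing glyph bounding boxes. The following is a minimal sketch, assuming bounds are `(xMin, yMin, xMax, yMax)` tuples in font units. The 10% tolerance and the names `size_matches`, `sits_higher` and `glyph_bounds` are assumptions made for illustration; nothing in this proposal specifies a threshold.

```python
def size_matches(bounds, ref_bounds, tolerance=0.1):
    """Return True if width and height are within `tolerance` of the reference.

    The 10% default is an arbitrary illustrative value, not a specified one.
    """
    width, height = bounds[2] - bounds[0], bounds[3] - bounds[1]
    ref_width, ref_height = ref_bounds[2] - ref_bounds[0], ref_bounds[3] - ref_bounds[1]
    return (
        abs(width - ref_width) <= tolerance * ref_width
        and abs(height - ref_height) <= tolerance * ref_height
    )


def sits_higher(bounds, ref_bounds):
    """Return True if the glyph's lowest point is above the reference's lowest point."""
    return bounds[1] > ref_bounds[1]


def glyph_bounds(font, codepoint):
    """Hypothetical helper: bounding box of the glyph mapped to `codepoint`.

    `font` is a fontTools TTFont. Returns None if the code point is not
    encoded or the glyph has no outline.
    """
    from fontTools.pens.boundsPen import BoundsPen  # assumes fontTools is available

    glyph_name = font.getBestCmap().get(codepoint)
    if glyph_name is None:
        return None
    glyph_set = font.getGlyphSet()
    pen = BoundsPen(glyph_set)
    glyph_set[glyph_name].draw(pen)
    return pen.bounds
```

Bounding boxes are a coarse proxy for "same size": two glyphs with different shapes can share a box, so a real check might need a stricter comparison.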

Severity assessment

4 as this effectively makes it useless for Jawi and possibly Kazakh as well.

@khaledhosny khaledhosny added the New check proposal We expect new check proposals to include a detailed rationale description and a suggested check-id label Oct 1, 2023
@felipesanches felipesanches added this to the 0.10.2 milestone Oct 20, 2023
felipesanches added a commit to felipesanches/fontbakery that referenced this issue Oct 20, 2023
Many fonts incorrectly treat ARABIC LETTER HIGH HAMZA (U+0675) as a variant of
ARABIC HAMZA ABOVE (U+0654) and make it a combining mark of the same size.

But U+0675 is a base letter and should be a variant of ARABIC LETTER HAMZA
(U+0621) but raised slightly above baseline.

Not doing so effectively makes the font useless for Jawi and
possibly Kazakh as well.

Added to the Universal Profile
com.google.fonts/check/arabic_high_hamza (experimental)
(issue fonttools#4290)
felipesanches added a commit that referenced this issue Oct 20, 2023
Many fonts incorrectly treat ARABIC LETTER HIGH HAMZA (U+0675) as a variant of
ARABIC HAMZA ABOVE (U+0654) and make it a combining mark of the same size.

But U+0675 is a base letter and should be a variant of ARABIC LETTER HAMZA
(U+0621) but raised slightly above baseline.

Not doing so effectively makes the font useless for Jawi and
possibly Kazakh as well.

Added to the Universal Profile
com.google.fonts/check/arabic_high_hamza (experimental)
(issue #4290)
@simoncozens
Collaborator

All this is true, but it should apply to U+0674 (HIGH HAMZA), not U+0675 (HIGH HAMZA ALEF).

@simoncozens simoncozens reopened this May 30, 2024
@bobh0303
Contributor

bobh0303 commented Aug 1, 2024

> [U+0674] ARABIC LETTER HIGH HAMZA should be ... the same size as U+0621 ARABIC LETTER HAMZA but slightly higher above baseline.

Where does this information come from? Unicode's current code charts certainly do not suggest this.

@khaledhosny
Contributor Author

khaledhosny commented Aug 4, 2024

> [U+0674] ARABIC LETTER HIGH HAMZA should be ... the same size as U+0621 ARABIC LETTER HAMZA but slightly higher above baseline.
>
> Where does this information come from? Unicode's current code charts certainly do not suggest this.

This is based on its use as the three-quarter hamza in Jawi and on native readers' preferences. See https://www.unicode.org/L2/L2022/22051-jawi-hamza.pdf and the Unicode response in https://www.unicode.org/L2/L2022/22068-script-adhoc-rept.pdf. Though the recommendation there is to have a Jawi-specific variant glyph and rely on language tagging, my understanding is that this extra complication is unnecessary, as a full-size hamza seems to be acceptable for Kazakh, as in many of the samples shown here, for example: https://www.unicode.org/L2/L2020/20289-kazakh-kyrgyz-uyghur-annot.pdf.

@bobh0303
Contributor

bobh0303 commented Aug 7, 2024

Thanks. So the result of Script Ad Hoc two years ago says:

> However, the general consensus amongst the group was to unify THREE QUARTER HIGH HAMZA with U+0674 HIGH HAMZA, and to highly encourage font designers to support the glyph with the expected shape and positioning for Jawi, employing a language tag or creating a Jawi-specific font. Encoding a new character would take several years, and text representation would be inconsistent, causing other problems for users.

This suggests to me that the portion of this fontbakery test that confirms the glyph size is similar to U+0621 should apply only to fonts that have such a language tag or are known to be designed specifically for Jawi.

In fact one could argue that in all other cases (i.e., no Jawi language tag and font not known to be specific to Jawi) the size check should be against U+0654.

@bobh0303
Contributor

bobh0303 commented Aug 7, 2024

> should apply only to fonts that have such language tag

Actually, it is slightly more nuanced than that.

If the font has a Jawi language tag, then fontbakery should test the 0674 glyph size with and without that language applied, and the glyph size should be similar to 0621 in the first case and 0654 in the second.

@bobh0303
Contributor

bobh0303 commented Sep 5, 2024

The soon-to-be-released Unicode 16 has added information about Jawi usage to the discussion of high hamza. The Review Draft of Ch 9 now reflects the Script Ad Hoc recommendation, specifically saying:

> Malay Jawi uses U+0674 ٴ ARABIC LETTER HIGH HAMZA. In Jawi, the letter is the same size as U+0621 ء ARABIC LETTER HAMZA; however, unlike U+0621, it is positioned above the baseline at three-quarters height of the U+0627 ا ARABIC LETTER ALEF. Font designers can use language tagging in order to support the preferred shapes for both Kazakh and Jawi in multilingual fonts

With that in view, it seems this test really needs to evaluate the size of high hamza based on langtag-specific rendering within the font. Whether this is even possible with the facilities within fontbakery I do not know. But it seems clear that the following logic would be needed:

  • If the MLY language system tag is in the font for Arabic script, then fontbakery should apply that tag and then check that the size of the resulting high hamza glyph is about that of U+0621 hamza
  • If the KAZ language system tag is in the font for Arabic script, then fontbakery should apply that tag and then check that the size of the resulting high hamza glyph is about that of U+0654 hamza above
  • Regarding the default glyph: Unless fontbakery knows that the font is intended only for Jawi — and I'm not sure how it could know that — then it seems to me the default glyph for high hamza should be as specified in the Unicode code charts, namely the size of U+0654 hamza above.
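
The three rules above can be written down as a small lookup, sketched here under assumptions: given the language system tags a font declares for Arabic script, it returns which code point the high hamza should be compared against in each context. The function names are hypothetical, and the step of actually applying a language tag and obtaining the resulting glyph (which needs a shaping engine) is deliberately not shown.

```python
HAMZA = 0x0621        # ARABIC LETTER HAMZA
HAMZA_ABOVE = 0x0654  # ARABIC HAMZA ABOVE


def reference_codepoints(arabic_langsys_tags):
    """Map each rendering context to the code point whose size U+0674 should match.

    The default context always compares against hamza above; MLY and KAZ
    entries are added only when the font declares those language systems.
    """
    tags = {tag.strip() for tag in arabic_langsys_tags}  # OpenType tags are space-padded
    expected = {"dflt": HAMZA_ABOVE}
    if "MLY" in tags:
        expected["MLY"] = HAMZA
    if "KAZ" in tags:
        expected["KAZ"] = HAMZA_ABOVE
    return expected


def arabic_langsys_tags(font, table_tag="GSUB"):
    """Hypothetical helper: list language system tags under the 'arab' script.

    `font` is a fontTools TTFont; returns an empty list if the table or
    its script list is missing.
    """
    if table_tag not in font or font[table_tag].table.ScriptList is None:
        return []
    tags = []
    for record in font[table_tag].table.ScriptList.ScriptRecord:
        if record.ScriptTag == "arab":
            tags.extend(ls.LangSysTag for ls in record.Script.LangSysRecord)
    return tags
```

For a font that declares only Malay, `reference_codepoints(["MLY "])` asks for a comparison against U+0621 with the tag applied and against U+0654 by default, which matches the two-case test described in the earlier comment.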
