Request: Ŋ ŋ #64

Open
AshtarBalynestjar opened this issue Jun 22, 2019 · 15 comments
Labels
addressed in source files completed in dev versions, but not yet released as fonts

Comments

@AshtarBalynestjar

AshtarBalynestjar commented Jun 22, 2019

I’d like to request adding support for the letter eng. I am aware that the letter is used in IPA and is part of the AL-5 character set, so it is already in the pipeline. However, adding it on its own would already support several languages such as Ganda (6.5 million users), Wolof (5.2 million users) and Lakota (2100 users; would require using combining diacritics for ǧ and ȟ), and would bring you close to supporting languages like Northern Sami (26 thousand users; ŧ missing), Dinka (1.3 million users; ɛ ɔ ɣ missing) and Fula (29 million users; ɓ ɗ ƴ missing, as well as ɲ in some orthographies).

The relevant codepoints are:

Ŋ U+014A LATIN CAPITAL LETTER ENG
ŋ U+014B LATIN SMALL LETTER ENG
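
As a quick sanity check, the two codepoints and their casing relationship can be confirmed with Python's standard `unicodedata` module. This is only a minimal sketch; the helper name `describe` is made up for the example, and the check says nothing about which glyph shape a given font draws.

```python
import unicodedata

# The two codepoints named in the request.
ENG_CODEPOINTS = [0x014A, 0x014B]

def describe(codepoint: int) -> str:
    """Return a 'X U+XXXX NAME' line for one codepoint."""
    char = chr(codepoint)
    return f"{char} U+{codepoint:04X} {unicodedata.name(char)}"

for cp in ENG_CODEPOINTS:
    print(describe(cp))

# The two characters form a casing pair.
assert "Ŋ".lower() == "ŋ"
assert "ŋ".upper() == "Ŋ"
```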

The capital Eng has two main alternative glyphs: a capital N with a descender (preferred by Sami users), and an enlarged version of the lowercase eng (preferred by African users). Given all of this, the reference glyph should be the enlarged lowercase form, but most typefaces on Google Fonts seem to have it default to the capital-N form. I’m not sure if this is simply because they do not intend to support African languages, or whether there is something I’m missing, but Source Sans Pro seems to be alone among the most downloaded typefaces in defaulting to the enlarged lowercase form. In any case, both glyphs should be available, either through the locl feature or stylistic sets.

@frankrolf
Member

Thank you for this detailed request.
There currently is no immediate goal to extend Source Serif to AL-5 or add all of IPA, but adding individual glyphs (especially with a strong use case) is not out of the question.

In fact, Ŋ and ŋ already exist in the Roman masters, because they were needed for Source Han Serif.
However, they are not fully “wired up” – there are no variants such as small caps, there is no localized alternate, they are not in any kerning pairs, etc.

If you feel like testing the current engs, you can add two lines to the GlyphOrderAndAliasDB file found in the Roman subdirectory:

Eng	Eng
eng	eng

(the divider is a single tab character)
If you rebuild the fonts after that via ./build.sh, the Eng/eng should be available.
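
The manual edit described above could also be scripted. The following is a minimal sketch, assuming the file is plain text with one tab-separated entry per line as shown; the function name `add_goadb_entries` is hypothetical and not part of the project's build scripts.

```python
from pathlib import Path

def add_goadb_entries(goadb_path, glyph_names):
    """Append 'name<TAB>name' lines for glyphs not yet listed in the file.

    Returns the list of names that were actually added.
    """
    path = Path(goadb_path)
    text = path.read_text(encoding="utf-8")
    # The first tab-separated column of each non-empty line is the glyph name.
    existing = {line.split("\t")[0] for line in text.splitlines() if line.strip()}
    missing = [name for name in glyph_names if name not in existing]
    if missing:
        # Make sure new entries start on their own line.
        if text and not text.endswith("\n"):
            text += "\n"
        text += "".join(f"{name}\t{name}\n" for name in missing)
        path.write_text(text, encoding="utf-8")
    return missing
```

It would be called as `add_goadb_entries(path_to_goadb, ["Eng", "eng"])`, where `path_to_goadb` points at the GlyphOrderAndAliasDB file in the Roman subdirectory of your checkout. Running it twice adds nothing the second time.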

I cannot comment on why the Sami Eng variant seems to be the standard (despite the presumably lower number of users). I think it might be because it’s easier to derive from capital N and J.
I also don’t know why @pauldhunt chose to make the African variant the default (which is a deviation from every Adobe font before Source Sans) – which form do you think should be the standard?

@AshtarBalynestjar
Author

AshtarBalynestjar commented Jun 23, 2019

> I also don’t know why @pauldhunt chose to make the African variant the default (which is a deviation from every Adobe font before Source Sans) – which form do you think should be the standard?

If the only addition to the current character set is eng, the only Sami language that will be fully supported would be Lule Sami (with less than 2000 speakers). However, it will be enough to support at least Wolof (5.2 million), Mandinka (1.3 million) and Ganda (6.5 million), so in my opinion the African form should be the default.

That said, it is not entirely unreasonable to ask for the Sami glyph as an alternate.

@frankrolf
Member

Thanks!
Since Source Sans is already taking that route, I will implement the African variant (cap-size lowercase n form) as a default, and add the N-shaped variant via language NSM in the locl feature.
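
The plan above could be sketched roughly as follows. The snippet is Python that prints OpenType feature-file text for the `locl` block only. The alternate glyph name `Eng.locl` and the helper `locl_feature` are hypothetical, and the actual source files may organize the feature differently.

```python
def locl_feature(substitutions, language_tags, script="latn"):
    """Build feature-file text for a 'locl' feature.

    substitutions: mapping of default glyph name -> localized glyph name
    language_tags: OpenType language system tags that get the alternates
    """
    lines = ["feature locl {", f"    script {script};"]
    for tag in language_tags:
        lines.append(f"    language {tag};")
        for default, localized in substitutions.items():
            lines.append(f"        sub {default} by {localized};")
    lines.append("} locl;")
    return "\n".join(lines)

# 'Eng.locl' is a made-up name for the N-shaped capital; NSM is the
# language tag mentioned in the comment above.
print(locl_feature({"Eng": "Eng.locl"}, ["NSM"]))
```

Passing more tags in `language_tags` would repeat the same substitution for each language, which is how further languages preferring the N-shaped form could be added later.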

@AshtarBalynestjar
Author

AshtarBalynestjar commented Jun 23, 2019

Would it be a good idea to have the N-variant under ss03 as well, mirroring the way you’ve handled Bulgarian and Serbian Cyrillic?

@frankrolf
Member

I will think about it. Usually I am not in favor of using up a stylistic set for a single glyph variant, but I agree there should be a secondary way of accessing regional alternates. (ss03 is already taken in the Italics)

@FloraCanou

> If the only addition to the current character set is eng, the only Sami language that will be fully supported would be Lule Sami (with less than 2000 speakers). However, it will be enough to support at least Wolof (5.2 million), Mandinka (1.3 million) and Ganda (6.5 million), so in my opinion the African form should be the default.

I don't see the number of speakers to be a valid reason to pick one as default over the other.

@frankrolf
Member

@FloraCanou Why is that? Which other reasons would you suggest are valid?

@FloraCanou

Unicode takes the N-with-descender form as default. I suggest following Unicode.

@frankrolf
Member

The document you linked states “glyph may also have appearance of large form of the small letter”.
This example demonstrates that “following Unicode” often is ambiguous – therefore, (IMO) other practical factors (such as the number of speakers) may also be considered.

@AshtarBalynestjar
Author

> I will think about it. Usually I am not in favor of using up a stylistic set for a single glyph variant, but I agree there should be a secondary way of accessing regional alternates. (ss03 is already taken in the Italics)

I think I’ve found it: the features cv01 through cv99. Here's what Microsoft Typography has to say about them:

> The function of these features is similar to the function of the Stylistic Alternates feature ('salt') and the Stylistic Set features (see 'ss01' – 'ss20'). Whereas the Stylistic Set features assume recurring stylistic variations that apply to a broad set of Unicode characters, these features are intended for scenarios in which particular characters have variations not applicable to a broad set of characters. The Stylistic Alternates feature provides access to glyph variants, but does not allow an application to control these on a character-by-character basis; the Character Variant features provide the greater granularity of control.
>
> The function of these features is also related to that of the Localized Forms ('locl') feature, in that particular variations for a character may be preferred for particular languages. In practice, though, it may not be feasible to associate particular glyph variants with particular language systems for all the relevant languages; for example, the requirements of particular languages may not be known when a font is being developed.
>
> The distinction between these features and the Stylistic set features is most easily understood in terms of variations applying to a single character versus variations applying across a range of characters. In practice, if a variation applies to a character in a bicameral script, then the casing-pair character may have the same variation. Also, Unicode includes pre-composed characters for certain base + mark combinations, hence a single abstract character may be incorporated into a number of Unicode characters. Therefore, a variation for a particular abstract character may be applicable to several related Unicode characters. The Character Variant features can be used for sets of related characters in these cases. The key distinction between such use and the intended use for Stylistic Set features is that a Character Variant feature should apply only to one character or a set of characters closely related in this way, while Stylistic Set features are intended for broader sets of characters.

@AshtarBalynestjar
Author

AshtarBalynestjar commented Apr 12, 2020

Update:

The new Kazakh Latin alphabet presented by the Baitursynov Institute of Linguistics uses the letter eng, with the capital being in the N-form. Now that there is an actual compromise to be made between Kazakh and, say, Wolof and Ganda, the preferred default form isn’t so obvious. However, Kazakhstan has a population of 18.2 million, over 76% of whom are Internet users and at least 9.9 million of whom actively use the Kazakh language, so the use case for eng is even stronger.

frankrolf added a commit that referenced this issue Sep 3, 2020
This adds the codepoints for Eng and eng (as well as the glyph Eng.sc), as requested in #64.
While Ŋŋ are not in any Adobe charset below AL-5, the introduction of the new Kazakh Latin Alphabet makes this request a bit more relevant.

Note: this is adding the N-like version of the capital Eng.
The arch-like version has to be added in a future update focused on African languages.
@frankrolf frankrolf added the addressed in source files completed in dev versions, but not yet released as fonts label Sep 4, 2020
@moyogo

moyogo commented Nov 3, 2020

Most of the languages (Ganda, Wolof, Mandinka-Bambara-Dyoula, Dinka, or Lakota) mentioned in the original post use the n-form uppercase more often than the N-form uppercase. Many minor West African languages that do not have OpenType language system tags also mostly use the n-form.

The Sami languages that use Ŋ prefer the N-form.

It may be better to do as Source Sans, with the n-form as the default and the N-form as a locl feature for Sami languages.

@andjc

andjc commented Jan 15, 2021

Actually, using the locl feature is complicated, and the amount of research needed to implement it properly would be extensive. Beyond that, many languages use Eng: one variant is used extensively in Africa (the n-form with a descender and, in at least one case, the n-form without a descender). The N-form is used in Northern Europe, Northern Australia, North America, and other locations.

It is possible to use cvNN features, but there is no one source of information identifying all the languages that use Eng, so locl will be incomplete, and can't be the only method to access those variant glyphs.
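
To illustrate the concern, here is a toy model (not a real shaping engine) of how the two access paths differ. The glyph names, the set of languages covered by `locl`, the choice of `cv01`, and the assumption that the n-form is the default are all made up for the example.

```python
# Toy model only: real text shaping is done by an OpenType layout engine.
DEFAULT_GLYPH = "Eng"        # n-form, assumed here to be the default
N_FORM_GLYPH = "Eng.alt"     # hypothetical name for the N-form

# Languages the font designer happened to know about when writing 'locl'.
LOCL_LANGUAGES = {"NSM"}

def pick_eng(language_tag=None, features=()):
    """Return the glyph name a font with both mechanisms would show."""
    if language_tag in LOCL_LANGUAGES:
        return N_FORM_GLYPH      # automatic, but only for tagged languages
    if "cv01" in features:
        return N_FORM_GLYPH      # manual opt-in, works for any language
    return DEFAULT_GLYPH

print(pick_eng("NSM"))               # covered by locl
print(pick_eng())                    # untagged text falls back to the default
print(pick_eng(features=["cv01"]))   # untagged text with the variant requested
```

In this model, text in a language missing from `LOCL_LANGUAGES` can only reach the N-form through the explicit feature, which is why a second access path matters when the language list is incomplete.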

@frankrolf
Member

Thanks for your thoughts, @andjc!
I agree that locl tagging is a spotty way of implementing character alternates, and this concern has come up before (with Bulgarian alternates, for example).
The Ŋ now added to Source Serif (with the optical size extension) is the N-like form – simply because it already was drawn for Source Han Serif.
What I would like to see is a focus on extending Source Serif toward African languages – perhaps the proper way of doing this would be to create per-language (or language-group) forks. Not making any promises, just thinking out loud.
I think there was some interest in the past, and I think this work can start soon.

@andjc

andjc commented Jan 24, 2021

@frankrolf you will need a strategy to handle glyph variation; for widespread African language support you will need a cohesive approach to variant glyphs. The SIL repertoire is the most extensive, but I still find occasional gaps in their coverage.
