Combining diacritics from diacritics already in fonts #17

moyogo · 2015-09-13T10:00:05Z

No description provided.

…flexcmb, dieresiscmb, dotaccentcmb, gravecmb, hungarumlautcmb, macroncmb, ogonekcmb, ringcmb, tildecmb These combining marks are using spacing or legacy diacritics as components. It might be better the other way around.

@CMB

…s components * Rename acute.cap, caron.cap, circumflex.cap, grave.cap to acutecmb.cap, caroncmb.cap, circumflexcmb.cap, gravecmb.cap * Adjust their advance width to zero * Add groups @CMB and @cmbcap and substitution after @uppercase in 'calt' * Rename components in other glyphs
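
For readers unfamiliar with the 'calt' substitution mentioned in the commit message above, here is a rough Python simulation of the idea: a combining mark that directly follows an uppercase glyph is swapped for its .cap variant. The class contents below are assumptions for illustration; the real @uppercase, @CMB and @cmbcap groups live in the font source and are not listed in this thread.

```python
# Hypothetical, shortened glyph classes (the font's real groups are larger).
UPPERCASE = {"A", "E", "I", "O", "U"}
CMB_TO_CAP = {
    "acutecmb": "acutecmb.cap",
    "caroncmb": "caroncmb.cap",
    "circumflexcmb": "circumflexcmb.cap",
    "gravecmb": "gravecmb.cap",
}


def apply_cap_marks(glyphs):
    """Swap a combining mark for its .cap form when it directly follows an uppercase glyph."""
    out = []
    for i, name in enumerate(glyphs):
        follows_upper = i > 0 and glyphs[i - 1] in UPPERCASE
        # Only marks listed in CMB_TO_CAP change; every other glyph passes through.
        out.append(CMB_TO_CAP.get(name, name) if follows_upper else name)
    return out


print(apply_cap_marks(["E", "acutecmb", "e", "acutecmb"]))
```

This simplified rule only looks one glyph back, so a second stacked mark would keep its default form; whether the actual feature handles stacked marks is not stated here.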

weiweihuanghuang · 2015-09-13T18:57:21Z

Thank you Denis! By missing anchors, I assume there's a standard on what anchors each glyph requires. Is there some resource I can refer to in the future? Of course Glyphs App itself has its own GlyphData.xml but I'd still like a source for reference.

moyogo · 2015-09-13T19:29:14Z

No there is no standard per se on what anchors each glyph requires.
Unicode says any base letter can be combined with any diacritic. For Latin that means the letter characters and the “common” diacritics shared with Greek, Cyrillic and in some cases other scripts, but other symbols can also be used with diacritics.

I keep a list for African orthographies with the Latin alphabet in https://github.com/moyogo/anloc-data. I also have an unpublished list for other languages.
Adobe also has multiple Latin sets: http://blog.typekit.com/2008/08/28/extended_latin/

If you venture into phonetic transcriptions or historical orthographies there are even more combinations.

The bottom line is: you might as well assume any base character can be combined with any diacritic.

Which brings me to a couple of questions I have about how you’d like things to go.

Can I add the bottomC (cedilla) anchor and the ogonek anchor to all the base letters?
Some North American languages seem to prefer a straight and centered ogonek, and some languages use the letters with cedilla (classic cedilla and comma cedilla) but should have only one shape. Is it OK if I add variants and features to access these as well?
You have a specific acute for the letter with ascender lacute 013A instead of using the acute or acute.cap (acutecmb and acutecmb.cap in this branch). That means a full set of acutecmb.asc, etc. would need to be designed to go with letters with ascenders. Would you consider using acutecmb.cap directly (replacing the current acute in lacute) or as a composite in acutecmb.asc (for positioning)?

davelab6 · 2015-09-13T19:49:38Z

Cool. This sounds like something I should be standardising in all libre fonts. Do you agree?

googlefonts/gf-docs@9b1c6bd

moyogo · 2015-09-13T19:56:35Z

@davelab6 Yes, I agree.

A good starting point is actually to use the combining diacritic characters instead of the spacing or legacy diacritics to build the precomposed accented characters (those in Unicode). That way you can easily extend the character set, and you also support composing letters with combining diacritics.

Just to make sure there is no misunderstanding: the main point of these anchors is to end up in the 'mark' feature in GPOS, not to build all possible combinations as precomposed glyphs (whether precomposed characters in Unicode, e.g. e+cedillacmb, or combinations that exist in Unicode only as character sequences, e.g. q+acutecmb).
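
As a minimal sketch of what the anchors feed into: mark-to-base positioning moves the mark so that its attaching anchor lands on the matching anchor of the base glyph. The coordinates below are made up for illustration and are not taken from the Work Sans sources.

```python
from typing import NamedTuple


class Point(NamedTuple):
    x: int
    y: int


def mark_offset(base_anchor: Point, mark_anchor: Point) -> Point:
    """Offset, relative to the base glyph's origin, that puts the mark anchor on the base anchor."""
    return Point(base_anchor.x - mark_anchor.x, base_anchor.y - mark_anchor.y)


# Hypothetical anchor positions in font units.
q_top = Point(290, 500)         # "top" anchor on the base glyph q
acutecmb_top = Point(0, 500)    # attaching "_top" anchor on acutecmb

print(mark_offset(q_top, acutecmb_top))
```

With one anchor per base and one per mark, any base can take any mark without a dedicated precomposed glyph, which is the point made above.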

davelab6 · 2015-09-13T20:04:15Z

I agree; but https://github.com/twardoch/ttfdiet#test-results shows that Adobe, Quark and Word fail to use mark/ccmp correctly and rely on the legacy chars

davelab6 · 2015-09-13T20:05:00Z

So I think that, like KERN vs GPOS kerning, the 'master' TTFs should have both, and then subset for platforms that can use smaller/newer techniques as needed.

moyogo · 2015-09-13T20:15:53Z

Yes, the precomposed characters should still be there. I meant to say that not every possible combination should be a precomposed glyph. I would not include precomposed glyphs for q́, q̀, q̂, q̌, q̈, etc. in a font, but they can still be composed with the font.
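
To check which combinations have a precomposed character in Unicode and which exist only as sequences, a small sketch using Python's standard unicodedata module can help. The helper name has_precomposed is just an illustrative choice.

```python
import unicodedata


def has_precomposed(base: str, mark: str) -> bool:
    """True if NFC normalization composes base + combining mark into a single code point."""
    return len(unicodedata.normalize("NFC", base + mark)) == 1


ACUTE = "\u0301"  # COMBINING ACUTE ACCENT

print(has_precomposed("e", ACUTE))  # e + acute has a precomposed form (U+00E9)
print(has_precomposed("q", ACUTE))  # q + acute exists only as a sequence
```

Sequences like q + acute are the ones that rely entirely on the anchors and the 'mark' feature.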

weiweihuanghuang · 2015-09-14T16:06:33Z

Thanks for answering and your continued contributions!

Can I add the bottomC (cedilla) anchor and the ogonek anchor to all the base letters?

Yes.

Some North American languages seem to prefer a straight and centered ogonek, and some languages use the letters with cedilla (classic cedilla and comma cedilla) but should have only one shape. Is it OK if I add variants and features to access these as well?

If you think it's appropriate, sure. But I don't understand what you mean here:

Does "some languages use the letters with cedilla" mean that a letter such as (I'm making this up) yogonek ends up using a cedilla instead?
And does "but should have only one shape" mean they need to be consistent?
What's a straight and centered ogonek?

You have a specific acute for the letter with ascender lacute 013A instead of using the acute or acute.cap (acutecmb and acutecmb.cap in this branch). That means a full set of acutecmb.asc, etc. would need to be designed to go with letters with ascenders. Would you consider using acutecmb.cap directly (replacing the current acute in lacute) or as a composite in acutecmb.asc (for positioning)?

I tried using acutecmb.cap and I think it's too tall. Can we not just take the acute in lacute and create an acutecmb.asc? And what is the "etc."?

moyogo · 2015-09-14T17:29:41Z

Does "some languages use the letters with cedilla" mean that a letter such as (I'm making this up) yogonek ends up using a cedilla instead?
And does "but should have only one shape" mean they need to be consistent?

Marshallese uses Ļ ļ M̧ m̧ Ņ ņ O̧ o̧. Because of the preferred comma shaped cedilla Ļ ļ Ņ ņ have a comma shaped cedilla. But M̧ m̧ O̧ o̧ have the classic cedilla. It would be best if the locl feature or an optional stylistic feature would make them consistent indeed.

What's a straight and centered ogonek?

It’s something the fonts on Navajo resources are doing:

moyogo · 2015-09-14T18:31:54Z

Can we not just take the acute in lacute and create an acutecmb.asc? And what is the "etc."?

Yes, that’s fine as well. The other diacritics can also go above letters with ascenders. For example, circumflex is used on h in Esperanto, grave on f, t, k in the ISO 9 romanization of Cyrillic, caron on h in Lakota or Romani in Finland, dieresis on h in Kurmanji, macron on l in Votic, tilde on l in Lithuanian, and dot above was used on b or d in Irish. There are some romanizations or historical orthographies using breve and ring above letters with ascenders as well. I’m not aware of double acute on ascenders.

weiweihuanghuang · 2015-09-16T13:08:32Z

Yes, that’s fine as well. The other diacritics can also go above letters with ascenders. For example, circumflex is used on h in Esperanto, grave on f, t, k in the ISO 9 romanization of Cyrillic, caron on h in Lakota or Romani in Finland, dieresis on h in Kurmanji, macron on l in Votic, tilde on l in Lithuanian, and dot above was used on b or d in Irish. There are some romanizations or historical orthographies using breve and ring above letters with ascenders as well. I’m not aware of double acute on ascenders.

I see, I can add .asc versions where appropriate too. Should it conform to the 125% of UPM max height?

weiweihuanghuang · 2015-09-16T13:12:38Z

Marshallese uses Ļ ļ M̧ m̧ Ņ ņ O̧ o̧. Because of the preferred comma shaped cedilla Ļ ļ Ņ ņ have a comma shaped cedilla. But M̧ m̧ O̧ o̧ have the classic cedilla. It would be best if the locl feature or an optional stylistic feature would make them consistent indeed.

Even if those glyphs are made with a ̧ 0327 COMBINING CEDILLA, why do L l N n default to the comma cedilla? and M m O o to the classical cedilla?

moyogo · 2015-09-20T13:20:12Z

I see, I can add .asc versions where appropriate too. Should it conform to the 125% of UPM max height?

If that’s what you used for lacute, yes.
I think lacute is the tallest reaching 930 and descenders or diacritics below go just below 210. The hhea, typo and win metrics each add up to 1173, so 117.3%. If you’re planning on supporting Vietnamese at some point, 125% or something around that would be better.
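
The arithmetic behind those percentages can be sketched as follows. The UPM of 1000 is an assumption inferred from "1173, so 117.3%", and the lowest point is taken as roughly 210 units below the baseline; neither value is read from the font file here.

```python
def percent_of_upm(total_height: int, upm: int) -> float:
    """Express a vertical extent in font units as a percentage of the UPM."""
    return total_height * 100 / upm


UPM = 1000                   # assumption, implied by 1173 -> 117.3 %
current_total = 1173         # sum of the hhea/typo/win metrics quoted above
glyph_extent = 930 + 210     # lacute height plus approximate depth below the baseline

print(percent_of_upm(current_total, UPM))   # current line height as % of UPM
print(percent_of_upm(glyph_extent, UPM))    # what the tallest/lowest glyphs need
print(125 * UPM // 100)                     # total units available at 125 %
```

The gap between the glyph extent and the metric total is the headroom left for taller stacks such as Vietnamese.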

Even if those glyphs are made with a ̧ 0327 COMBINING CEDILLA, why do L l N n default to the comma cedilla? and M m O o to the classical cedilla?

Around the end of the 19th century the comma below was a common shape for the cedilla in fonts.
When Latvian orthography started using the cedilla, it was common to see it with either the comma or classic cedilla and eventually the comma shape became the most common. The same thing happened with Romanian.
By the time character encodings were made, the G K L N R with cedilla were encoded for Latvian as letters with cedilla but with the most common shape. For Romanian, there were S T with cedilla, but they had a classic cedilla (as S cedilla is used in Turkish). Unicode did not differentiate those characters G K L N R S T with cedilla from any with comma, but it has a combining comma below separate from the combining cedilla. This made the cedilla ambiguous: it can have both shapes, while the combining comma below can only have one shape.
The Romanian standards association eventually asked for separate S T with comma below, and they were encoded in ISO 8859-16 and Unicode. Shortly after that, Unicode stopped encoding precomposed characters.

So, the cedilla can have several shapes. In Latvian the preferred shape is now the comma cedilla under Latvian letters. In Romanian, separate characters were created so the comma below diacritic can be used. In Marshallese or some other context, the cedilla should have a single shape.
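
The encoding situation described above can be inspected with Python's standard unicodedata module: the Latvian letters decompose to the combining cedilla even though they are usually drawn with a comma, while the later Romanian letters decompose to the combining comma below. The helper name below_mark is an illustrative choice.

```python
import unicodedata


def below_mark(char: str) -> str:
    """Unicode name of the combining mark in a precomposed letter's canonical decomposition."""
    parts = unicodedata.decomposition(char).split()
    return unicodedata.name(chr(int(parts[-1], 16)))


# l and n with cedilla (Latvian), s and t with cedilla, s and t with comma below (Romanian).
for ch in "\u013C\u0146\u015F\u0163\u0219\u021B":
    print(f"U+{ord(ch):04X}", unicodedata.name(ch), "->", below_mark(ch))
```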

I’ve finished adding anchors.

weiweihuanghuang · 2015-09-20T15:50:12Z

Thanks for the information! I noticed the anchors in the Black masters for .asc glyphs are wrong (the asc height of the Black master changes). I'm currently adding the .asc versions of diacritics.

moyogo · 2015-09-20T19:50:54Z

Cool. I’m fixing the top anchor on those ascenders setting them all to 730.
I noticed I missed adding anchors to j. I’ll also add jdotless.

moyogo · 2015-09-23T10:19:24Z

FYI about the ogonek: adobe-fonts/source-sans#75

weiweihuanghuang · 2015-09-23T17:00:18Z

BTW if you are going to move anchors on any base glyphs you need to disable automatic alignment on the related diacritic glyphs. I.e. you moved the bottom anchor on T, then the commaccent in Tcommaccent will be moved to a new position that is not what I intended.

Many of the diacritics have moved out of place now, and I don't know of any easy way to go through and put the components back in the correct place. cc @schriftgestalt @mekkablue?
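
One way to reason about which composites are at risk when a base anchor moves is sketched below. This is not the Glyphs API; the Composite record, its fields and some of the glyph names (for example commaaccentcmb and l.ss02) are hypothetical stand-ins for whatever the font source actually contains.

```python
from dataclasses import dataclass


@dataclass
class Composite:
    name: str
    base: str            # base glyph used as a component
    mark: str            # mark glyph used as a component
    anchor: str          # anchor on the base that the mark attaches to
    auto_aligned: bool   # True if the editor repositions the mark from anchors


def affected_by_anchor_move(composites, base, anchor):
    """Names of composites whose mark would shift if `anchor` on `base` is moved."""
    return [
        c.name
        for c in composites
        if c.base == base and c.anchor == anchor and c.auto_aligned
    ]


# Hypothetical sample data.
glyphs = [
    Composite("Tcommaaccent", "T", "commaaccentcmb", "bottom", auto_aligned=True),
    Composite("Tcaron", "T", "caroncmb.cap", "top", auto_aligned=True),
    Composite("uni013C.ss02", "l.ss02", "commaaccentcmb", "bottom", auto_aligned=False),
]

print(affected_by_anchor_move(glyphs, "T", "bottom"))
```

Composites with automatic alignment disabled keep their manually placed components, which is why they are filtered out.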

schriftgestalt · 2015-09-23T20:37:08Z

Why would you change the anchors in a way that would make the Tcommaaccent look bad?

moyogo · 2015-09-23T20:44:19Z

Why would you change the anchors in a way that would make the Tcommaaccent look bad?

Good question :-)

There should probably be a different anchor for bottom anchors like dot below, macron below, circumflex below. These should be centered on the stem of T instead.

I see I need to also realign and disable automatic alignment of the grave and double acute on ÀÌÒÙŰ in two masters.

weiweihuanghuang · 2015-09-23T20:45:08Z

Because the other anchors look better that way. And then if I changed the anchor in the commaaccent itself it wouldn't look balanced elsewhere and if I balanced those other ones then it threw more off. I find some diacritics don't work with a single anchor.

moyogo · 2015-09-23T21:12:30Z

What should I do when symmetric diacritics are positioned differently on the same base letter, or similar diacritics are positioned differently on a symmetric base letter?

Compare Ecircumflex and Ecaron in master and in this branch:

In master the circumflex and the caron have different horizontal offsets.

In this branch the circumflex and the caron have the same horizontal offset.

Compare Igrave and Icircumflex in master and in this branch:

In master, the grave is more to the left and the acute is more centered.

In this branch, the grave and the acute are symmetrically centered on top of the I.

Do you want me to realign these as in master?

schriftgestalt · 2015-09-23T23:02:06Z

That all looks like a mistake. And I didn’t see a case where marks that were positioned properly on the I (uppercase i) would need a different horizontal position on any of the wider letters. So center the anchors in the I, and position the anchors in the marks so that they look good on the I. Then you can position the anchors in all other letters.

weiweihuanghuang · 2015-10-11T10:52:57Z

I noticed just now, looking quickly at the Regular master, that the grave is now too far right, more noticeably here:

I'm going through and using manual alignment for some of these.

weiweihuanghuang · 2015-10-11T11:10:58Z

Cool. I’m fixing the top anchor on those ascenders setting them all to 730.

Why 730? It should be 700. I'm changing them now too.

schriftgestalt · 2015-10-11T11:18:19Z

The acute looks fine to me. But if you don't like it, move the anchor in the grave.

weiweihuanghuang · 2015-10-11T11:33:53Z

The acute looks fine to me. But if you don't like it, move the anchor in the grave.

Can't, it will move it everywhere else too!

weiweihuanghuang · 2015-10-11T11:35:16Z

@moyogo new branch with your changes https://github.com/weiweihuanghuang/Work-Sans/tree/moyogo-diac

schriftgestalt · 2015-10-11T15:29:59Z

The base of the acute/grave should not be centered above the glyph: http://diacritics.typo.cz/index.php?id=4

The placement on your I and A is exactly as it should be.

weiweihuanghuang · 2015-10-11T16:35:50Z

The base of the acute/grave should not be centered above the glyph: http://diacritics.typo.cz/index.php?id=4

Why not? [Edit] These guidelines don't say it's bad practise to center it:

Horizontal placing of acute may prove difficult: it should incline towards the right slightly, but at the same time it should not “fall off” the character. The more steep the acute, the closer the lower tip can be to the optical centre of the letter; the more horizontal it is, the more the whole accent needs to be optically centered above the letter. The angle of the acute to the vertical axis of the type should be the same as of grave. Because characters with acute are included in most western typefaces, there are many examples as how to draw it properly. If the weight of the acute stroke varies, it should narrow towards the bottom

Having looked on MyFonts I do see a lot of examples where the lower edge goes beyond the base glyph.

moyogo · 2015-10-16T07:11:30Z

@moyogo new branch with your changes https://github.com/weiweihuanghuang/Work-Sans/tree/moyogo-diac

@weiweihuanghuang thanks. I rebased my branch on yours.
It looks good to me.
Is there anything else I can do?

weiweihuanghuang · 2015-10-16T16:51:46Z

@moyogo Great, I don't think so, I'll generate fonts and check that none of the custom TTFA hinting has changed. I'll also test the GPOS combos.

Btw do you know of a resource/tool where I can find what languages are now supported with these extended combining diacritics + anchors — or do you have an idea?

moyogo · 2015-10-17T16:51:12Z

Btw do you know of a resource/tool where I can find what languages are now supported with these extended combining diacritics + anchors — or do you have an idea?

The CLDR has data to get such a list, but it’s still a work in progress and many languages are missing or have incomplete data.
Comparing the font before and after doesn’t give any difference in the number of CLDR languages supported. On OS X, Font Book uses the CLDR data to list what languages are supported by a font.

You’ll get a much larger CLDR-language count if you add all the precomposed characters that use the current diacritics.

A bunch of languages that benefit from these combining marks also use missing characters (either precomposed character like ŵ, ỹ, etc. or additional characters like ɛ, ɔ, etc.).
I had started adding some additional characters used in African orthographies: https://github.com/moyogo/work-sans/tree/latext/ɔ; but I still have quite a few things to do.
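
A rough sketch of the kind of coverage check discussed above: a letter counts as supported if the font encodes it directly or encodes every code point of its canonical decomposition. The function name supports and the tiny cmap are assumptions for illustration; real CLDR exemplar data and real shaping behaviour are more involved, and the marks also need working anchors to render well.

```python
import unicodedata


def supports(cmap, exemplars):
    """True if every exemplar is encoded directly or can be built from encoded parts."""
    for text in exemplars:
        if all(ch in cmap for ch in text):
            continue
        decomposed = unicodedata.normalize("NFD", text)
        if not all(ch in cmap for ch in decomposed):
            return False
    return True


# The Marshallese letters mentioned earlier in this thread, written with escapes.
marshallese = ["\u013C", "m\u0327", "\u0146", "o\u0327"]

with_cedilla = set("lmno") | {"\u0327"}   # hypothetical cmap with the combining cedilla
without_cedilla = set("lmno")             # same cmap without it

print(supports(with_cedilla, marshallese))
print(supports(without_cedilla, marshallese))
```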

davelab6 · 2016-01-03T10:40:43Z

My Pyfontaine aspires to be such a tool

moyogo added 3 commits September 13, 2015 10:46

Adjust width of commaaccent 0326, it’s a non-spacing combining mark

78fae58

moyogo force-pushed the latext/diacritics branch 3 times, most recently from 86aef7f to df72fc1 Compare September 13, 2015 13:01

Add top and bottom anchors to uppercase where they were missing

3b6b64f

moyogo force-pushed the latext/diacritics branch from df72fc1 to 3b6b64f Compare September 13, 2015 13:10

moyogo added 5 commits September 23, 2015 20:13

Add top and bottom anchors to lowercase where they were missing; disa…

d827c3c

…ble automatic alignment on uni013C.ss02

Add substitution for @CMB after ascender lc in 'calt', and add top an…

3dd9df6

…d bottom to lowercase missed in previous commit

Add hungarumlautcmb.cap and adjust glyphs using combining diacritics …

76f0463

…as components from previous commits

Adjust Egrave, gravecmb.cap misplaced due to previous commit

b337a1f

Add top and bottom anchors to small caps where they were missing

8933728

moyogo and others added 7 commits September 25, 2015 08:41

Center caroncmb.cap in Bold master

275e838

Remove unicode from uniF8FF.001

30725e7

Fixed Lcaron.swsh

bb966f9

Previously used ordinary caron above L

Change hungarumlautcmb.cap to 0 width

80c11f0

Edit acutecmb.cap, caroncmb.cap outline

96b5427

Adjusted weight and length of acutecmb.asc to match rest of diacritics

1331d9a

Adjusted outline of caroncmb.asc to match rest of diacritics

2496c72

Fixed position of some acute.cap, grave.cap, caron.cap, circumflex.ca…

ade11d2

…p glyphs

Added *cmb.asc, fixed top anchor in Black asc glyphs, Added fea for c…

aebdcbe

…mb.asc

weiweihuanghuang added 3 commits October 11, 2015 13:24

Fixed N.swsh RSB

2b0b1f8

Adjusted outlines of caroncmb.asc gravecmb.asc

c9c47a4

Fixed lacute.ss02

bdceb70

weiweihuanghuang merged commit bdceb70 into weiweihuanghuang:master Jan 3, 2016

moyogo deleted the latext/diacritics branch August 11, 2016 09:18
