Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GDEF feature writer #480

Merged
merged 21 commits into from
Mar 19, 2021
Merged

Add GDEF feature writer #480

merged 21 commits into from
Mar 19, 2021

Conversation

moyogo
Copy link
Collaborator

@moyogo moyogo commented Mar 5, 2021

Fixes #456.
This adds a GDEF feature writer to the default feature writers (kern feature writer, mark feature writer).

The GdefFeatureWriter first compiles the GDEF from the feature code (feaLib uses the GDEF table definition or builds it from lookups) and uses it to get Glyph Class Definitions (bases, ligatures, marks, components) and Ligature Carets.
It then updates the Glyph Class Definitions with the UFO lib "public.openTypeCategories" data, except for glyphs already defined, and updates the Ligature Carets with the UFO anchors with names starting with "caret_" and "vcaret_", except for glyphs with ligature carets already defined.

@moyogo moyogo mentioned this pull request Mar 5, 2021
definition if present or from the lookups.
Then it updates the GDEF with the public.openTypeCategories and the
ligature anchors for glyphs that are not already in GlyphClassDefs or
do not have ligature carets.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to say this glyph has no GDEF class but please don’t auto-generate one, would setting public.openTypeCategories to an empty string signal that?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, public.openTypeCategories values can only be base, ligature, mark, component.

If no GDEF is defined in the feature code and the glyph is not in public.openTypeCategories, it will only be in a GDEF Glyph Class if a lookup type makes it so.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK.

Copy link
Collaborator

@madig madig Mar 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://unifiedfontobject.org/versions/ufo3/lib.plist/#publicopentypecategories

Values must be one of base, mark, ligature or component.

So you can only leave it out, but then the writer would assume auto :o

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

but then the writer would assume auto :o

Which is exactly what I’m trying to avoid. I’m converting fonts from SFD where you can leave out glyph class while opting out from automatic behavior. I currently also generate a GDEF table, but I was checking if I can skip that bit now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you describe how glyphs end up in GDEF Glyphs Class when you don't want to? What's the actual use case?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no specific use case, if the SFD says this glyph has no GDEF class and don’t guess one, I want to maintain that. Say I have a place holder glyph that gets decomposed in ccmp feature but uses components, it might get anchors propagated to it and classified as base glyph, but GDEF class does not matter for it and it would save a few bytes to not et a GDEF glyph class to it (and if you have a few hundreds of these glyphs it might add up).

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 6, 2021

I have no specific use case, if the SFD says this glyph has no GDEF class and don’t guess one, I want to maintain that. Say I have a place holder glyph that gets decomposed in ccmp feature but uses components, it might get anchors propagated to it and classified as base glyph, but GDEF class does not matter for it and it would save a few bytes to not et a GDEF glyph class to it (and if you have a few hundreds of these glyphs it might add up).

OK, I see. It would be nice to be able to indicate a glyph should be assigned the GDEF Glyph Class 0. That could also be used by the mark feature writer to avoid using the anchors of such a glyph.
Values could then be unassigned for class 0, or base, ligature, mark, component for class 1, 2, 3, 4.

When a glyph does not have a value assigned, its default would be class 0 in the GDEF or any of the other values depending on anchor propagation and lookups it ends up in.

Let's open a PR for the UFO spec.

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 6, 2021

@khaledhosny Actually, would assigning the component glyph class not have the same effect?

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 6, 2021

@khaledhosny Actually, would assigning the component glyph class not have the same effect?

Nevermind, of course it doesn't.

@anthrotype
Copy link
Member

first compiles the GDEF from the feature code ... then updates the Glyph Class Definitions with the UFO lib "public.openTypeCategories", except for glyphs already defined

I'm not 100% convinced this is the best approach. I think if one opts in to using this new openTypeCategories, those should be used as is (whether they are correct or not), instead of only when a class definition can't be inferred automatically by feaLib. Garbage in, garbage out.
Having to compile the whole of GPOS/GSUB when no explicit GDEF table definition is present in features.fea (which is all the time basically) is too costly in my opinion, just for the sake of obtaining the feaLib-inferred GDEF glyph class definitions -- and then throwing the resulting tables away (just to be recompiled later on in the build process).

@anthrotype
Copy link
Member

when a user provides an explicit GDEF table definition in features.fea containing GlyphClassDefs statements, both ufo2ft and feaLib currently take that at face value, they don't question whether those are correct or complete. Similarly I would expect from this new public.openTypeCategories.
Either the compiler guesses when no explicit data from the user is present, or it's the responsibility of the font developer to make sure the classifications provided are correct and complete. A middle ground only makes things more confusing.

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 9, 2021

The difference between GDEF in feature code and public.openTypeCategories is that GDEF in feature code is final, when it is defined in the feature code all the glyphs have a GDEF glyph class assigned, whereas public.openTypeCategories may be incomplete and only some glyphs have a glyph class defined while others don't.

If no GDEF is defined in the feature code, public.openTypeCategories should be used by the feature writers to exclude the glyphs that have public.openTypeCategories values that would create a conflict (for example don't include bases as marks in the mark feature).

If we don't want to spend cycles building a GDEF to get the glyph classes and then build it again (which we do for the GSUB, but that's fine for some reason), then the GDEF feature writer should be called after the features have been compiled and should be a GDEF table post processor.

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 9, 2021

We can take the public.openTypeCategories that are defined instead of the glyph class inferred by the features, but that glyph class of the glyphs that don't have any public.openTypeCategories defined still need to come from somewhere.

@anthrotype
Copy link
Member

public.openTypeCategories may be incomplete

that's what I am not conviced about. Why?

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 9, 2021

that's what I am not conviced about. Why?

https://unifiedfontobject.org/versions/ufo3/lib.plist/#publicopentypecategories

The dictionary may contain glyph names that are not in the font. The dictionary may not contain a key, value pair for all glyphs in the font.

@anthrotype
Copy link
Member

glyph names that are not in the font

we'd just skip those

dictionary may not contain a key, value pair for all glyphs in the font.

so what? if we don't have the data for specific glyphs, we do nothing for those

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 9, 2021

we do nothing for those

What does that mean?

@anthrotype
Copy link
Member

I mean, if a glyph has no public.openTypeCategory we don't add it to the GDEF

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 9, 2021

I mean, if a glyph has no public.openTypeCategory we don't add it to the GDEF

Then we do something, we add to the GDEF glyph class 0. That's what that means.

@anthrotype
Copy link
Member

I guess so, yeah.

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 9, 2021

There are two issues I think:

  1. What to do when the GDEF is in the feature code and there are public.openTypeCategories values.
    We either take the GDEF only, take the GDEF and augment or update it with public.openTypeCategories values, or only take the public.openTypeCategories values to write the GDEF. Feature writers should follow the same logic.

  2. What to do when a glyph is used as a base, ligature or mark in a feature when it doesn't have any public.openTypeCategories value defined, or when the public.openTypeCategories value contradicts how the glyph is used (which shouldn't happen in theory with feature writers).

@anthrotype
Copy link
Member

when the GDEF is in the feature code and there are public.openTypeCategories

the default behavior would be "skip" like the other writers. If a GDEF GlyphClassDefs are already present they would take priority

when a glyph is used as ... when it doesn't have any public.openTypeCategories, or when the value contradicts how the glyph is used

then the compiler would not be at fault, but source data that's incorrect.

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 9, 2021

#456 (comment)

another approach could be that you teach the mark feature writer about these new public.openTypeCategory key, so that they have the same effect of an explicit GDEF table override in the features.fea.

Then you only add a LigatureCaretsFeatureWriter or something to handle the carets only

Maybe this approach would work better then. At least for me and your past self.

If a GDEF is not defined in the feature code, only use the public.openTypeCategories values in feature writers and let feaLib build the GDEF.
Then update the ligature carets based on anchors with a post processor.

@anthrotype
Copy link
Member

that could work. But I feel a bit uneasy that we'd ignore a user's explicit settings (in the form of public.openTypeCategories) when building the GDEF table itself.
Can we think of a legitimate case when one would want to override the feaLib-generated GDEF? I.e. omit some classifications or add some that won't be automatically inferred?

Also this may be relevant: fonttools/fonttools#2210

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 9, 2021

Can we think of a legitimate case when one would want to override the feaLib-generated GDEF? I.e. omit some classifications or add some that won't be automatically inferred?

@khaledhosny assigning /space to the mark glyph class so it can be ignored in lookups with the ignore marks flag, for a style of Arabic script with contextual features happening between words separated by zero-width spaces.
But more generally some mark glyphs may be positioned correctly without needing anchors and mark feature lookups, while still needing to be ignored by lookups with ignore marks flag.

@khaledhosny
Copy link
Collaborator

Here is what I think would cause the least surprises and would give users the control they need:

  1. If the font has no public.openTypeCategory, do what ufo2ft is currently doing (i.e. before this PR):
  2. If the font has public.openTypeCategory, use these categories in feature writers and write a GDEF table based on them, no guessing at all.

This has the only downside of not allowing the user to explicitly define some categories and let the compiler guess the rest. Although I recognize this as legitimate use case (a one that I used in some fonts), in reality public.openTypeCategory is likely to come from glyphsLib and similar that know how to fill the data for most glyphs anyway and their guess is as good as feaLib’s.

@anthrotype
Copy link
Member

I agree with least surprises and no guessing.

If we cared to support what Denis proposes, we could add an additional mode whereby the GdefFeatureWriter would infer GDEF classes (that were not explicitly defined via public.openTypeCategories) from the current GPOS/GSUB. But I would make that not the default.

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 9, 2021

@khaledhosny See unified-font-object/ufo-spec#171

If we cared to support what Denis proposes, we could add an additional mode whereby the GdefFeatureWriter would infer GDEF classes (that were not explicitly defined via public.openTypeCategories) from the current GPOS/GSUB. But I would make that not the default.

That's an easy fix, and is almost the same, at least with feature writers that use public.openTypeCategories.
The only difference would be for lookups already in the feature code using glyphs in a way that conflicts with the public.openTypeCategories values.
I don't have a preference for glyph classes inferred from lookups or openTypeCategories.
The current PR does, to follow the "skip" logic (even if not explicit).

I can change the PR that way if that's fine.

@anthrotype
Copy link
Member

anthrotype commented Mar 9, 2021

explicit openTypeCategories should always take precedence over (implicit) inferred categories.

Is the intention of this proposed "unassigned" public.openTypeCategory to have the compiler automatically infer the GDEF classes for the glyphs that were not explicitly listed in openTypeCategories (neither assigned, nor explicitly "unassigned")?

I still don't like that we have to build the temporary GPOS and GSUB and the throw them away. For the kern feature writer, we have no way to classify pairs as LTR or RTL so we need to do the GSUB closure, but for this GDEF classes we do have explicit user settings (public.openTypeCategories).

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 9, 2021

explicit openTypeCategories should always take precedence over (implicit) inferred categories.
OK.

Is the intention of this proposed "unassigned" public.openTypeCategory to have the compiler automatically infer the GDEF classes for the glyphs that were not explicitly listed in openTypeCategories (neither assigned, nor explicitly "unassigned")?

No. It's up to authoring tools (feature writers, compilers) to decide what to do with glyphs that have no openTypeCategories value, whether they follow the AFDKO spec and infer GDEF glyph class from lookup types or do something else it up to them.
The point of the "unassigned" openTypeCategories value is to tell the authoring tools, that those should be in the the GDEF glyph class 0 and no other glyph class.

The point is to make it easier for user to define what GDEF glyph class certain glyphs should be in, which authoring tools may put in another GDEF glyph class for othre reason, without having to do the same as writing a complete GDEF in the feature code.

I still don't like that we have to build the temporary GPOS and GSUB and the throw them away. For the kern feature writer, we have no way to classify pairs as LTR or RTL so we need to do the GSUB closure, but for this GDEF classes we do have explicit user settings (public.openTypeCategories).

We can have a GDEF post processor instead of a feature writer. Or we can store the GPOS and GSUB in the compiler object, like the kern and mark feature writer do for the GSUB, and use them.

@anthrotype
Copy link
Member

Again, I'd like to keep things simple, basically what Khaled suggested above. If one wants to take advantage of the automatic GDEF class inference in feaLib, one does not use any public.openTypeCategories; or one can write a simple script that compiles the features and extract the inferred GDEF class info and store them as explicit public.openTypeCategories (and possibly amend them manually as they wish).

@madig
Copy link
Collaborator

madig commented Mar 9, 2021

I vote in favor of doing the simplest thing first (Khaled's suggestion) and going from there.

@khaledhosny
Copy link
Collaborator

An alternative, more complex approach:

  1. Have a pre-processor that checks for glyphs without public.openTypeCategory:
    • If the feature code has GDEF table, fill missing public.openTypeCategory from it, no guessing.
    • If not, do the guessing done in this PR and fill public.openTypeCategory.
  2. Feature writers as well as GDEF writer work exclusively with public.openTypeCategory.

But I vote for doing the simple thing first until someone has a compelling use case for the complex one.

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 9, 2021

With a little change in the current PR, we should avoid building the GSUB and GPOS for inferred glyph classes when all glyphs have openTypeCategories values defined.

That said, I'm warming up to the idea of expecting the user or application generating the sources to define all glyphs' openTypeCategories values.
Inferring glyph classes is isn't that clearly defined and is rather messy if lookups use glyphs in multiple ways (like having two marks substituted for a single mark with a ligature substitution lookup inferring they are two components and a ligature).

@moyogo moyogo force-pushed the gdefFeatureWriter branch from 9635c94 to 90e0727 Compare March 17, 2021 17:50
Copy link
Member

@anthrotype anthrotype left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@moyogo moyogo force-pushed the gdefFeatureWriter branch from cdcdf67 to 02cbdf1 Compare March 18, 2021 21:18
@moyogo moyogo merged commit dd88011 into googlefonts:main Mar 19, 2021
@anthrotype
Copy link
Member

Thanks Denis!

@moyogo
Copy link
Collaborator Author

moyogo commented Mar 19, 2021

Thanks for the reviews @anthrotype, @madig, @khaledhosny.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Write GDEF feature writer
4 participants