Avoid "reloading" TTFonts in PostProcessor.__init__ #485

Closed
madig opened this issue Mar 18, 2021 · 20 comments · Fixed by #671

Comments

@madig
Collaborator

madig commented Mar 18, 2021

Going from fonttools/fonttools#1095, quite some time is spent serializing and deserializing fonts somewhere in between the steps of compiling the final binary, which slows down the entire process. PostProcessor.__init__ seems to "reload" every font, maybe to clean up loose data? I think it should maybe just work with what it's given?

@simoncozens
Contributor

simoncozens commented Mar 18, 2021

One thing the postprocessor does is rename the glyphs to production names. It's very hard to do this without a save/load, because internally to fonttools, glyphs are referred to by name, not by GID. If you change alef-ar to uni0627 in GlyphOrder, but font["GSUB"].table.LookupList.Lookups[5].Subtable[0].InputCoverage[0] still refers to alef-ar, your font will no longer save. So it's easier to save and reload than to go through every subtable everywhere trying to remap things.
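
The failure mode described above can be sketched with a toy model. This is not fontTools code: `glyph_order`, `coverage`, and `compile_coverage` are hypothetical stand-ins, assuming only that tables store glyph names and that names are resolved to glyph IDs at compile time.

```python
# Toy model, not fontTools: tables hold glyph *names*;
# glyph IDs (GIDs) only exist once the table is compiled.
glyph_order = [".notdef", "alef-ar", "beh-ar"]
coverage = ["alef-ar"]  # stands in for a coverage table inside GSUB


def compile_coverage(coverage, glyph_order):
    """Resolve glyph names to glyph IDs, as a binary compiler has to."""
    name_to_gid = {name: gid for gid, name in enumerate(glyph_order)}
    return [name_to_gid[name] for name in coverage]


gids_before_rename = compile_coverage(coverage, glyph_order)

# Rename only in the glyph order; the coverage still says "alef-ar".
glyph_order[1] = "uni0627"
try:
    compile_coverage(coverage, glyph_order)
    stale_reference_detected = False
except KeyError:
    # The old name can no longer be resolved, so compiling fails.
    stale_reference_detected = True
```

In this sketch, saving and reloading corresponds to resolving every name to a GID before the rename and mapping GIDs back to names afterwards, which is why it sidesteps the stale reference.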

@madig
Collaborator Author

madig commented Mar 18, 2021

Argh. That's what I'm running into locally... So, what do?

@anthrotype
Member

anthrotype commented Mar 18, 2021

yes, what Simon said.
however note that for VFs I think we do the glyph renaming only on the final VF, not on each master TTF

@anthrotype
Member

maybe the postprocessor should only do the reloading if it's going to actually rename glyphs, not unconditionally like now

@madig
Collaborator Author

madig commented Mar 18, 2021

There are a few other things that happen; I'll push a branch.

@behdad
Collaborator

behdad commented Mar 18, 2021

Renaming glyphs is such a misunderstood op. One can build a font, make sure all tables are compiled, then rename the glyphs, which would involve just loading and saving the post table...
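
A rough illustration of that idea, again as a toy model rather than fontTools code: once a table is compiled it refers to glyph IDs, so a rename only has to touch the list of names. `compiled_coverage` and `rename_glyphs` are hypothetical names used for this sketch.

```python
# Toy model, not fontTools: a compiled table holds glyph IDs, not names.
glyph_order = [".notdef", "alef-ar", "beh-ar"]
compiled_coverage = [1]  # GID of "alef-ar", already resolved


def rename_glyphs(glyph_order, mapping):
    """Return a new glyph-name list; compiled tables are left untouched."""
    return [mapping.get(name, name) for name in glyph_order]


renamed_order = rename_glyphs(glyph_order, {"alef-ar": "uni0627"})

# Decompiling with the new name list yields the production name.
covered_names = [renamed_order[gid] for gid in compiled_coverage]
```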

@behdad
Collaborator

behdad commented Mar 18, 2021

Or it should even be possible to provide the rename mapping at load time.

@behdad
Collaborator

behdad commented Mar 18, 2021

So, a lot of this would be a non-issue if we were dealing with immutable data types, such that any modification would involve copy-on-write duplication... Many functional languages work that way. In Python, I'm sure there's some dark black magic with metaclasses to get some of that, but there will be dragons. Also, there's no way to get compile-time errors for violations with Python.

Anyway, attrs2 on this: https://www.attrs.org/en/stable/examples.html#immutability
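
As a minimal sketch of what immutable records could look like in Python, the example below uses the standard library's frozen dataclasses in place of attrs, to stay dependency-free. `GlyphRecord` is a hypothetical type for illustration only, and violations are still caught at run time, not compile time.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class GlyphRecord:  # hypothetical type, for illustration only
    name: str
    advance: int


original = GlyphRecord(name="alef-ar", advance=500)

# "Modifying" a frozen instance means building a changed copy.
renamed = replace(original, name="uni0627")

try:
    original.name = "uni0627"  # in-place mutation is rejected
    mutation_blocked = False
except FrozenInstanceError:
    mutation_blocked = True
```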

@madig
Collaborator Author

madig commented Mar 18, 2021

I'd take some immutability, but the issues I'm seeing are data normalization (e.g. post.italicAngle of 0.0 does not get normalized to 0 like when reloading), data sorting (e.g. name table being in a different order) and glyph names staying as they are 😐

@behdad
Collaborator

behdad commented Mar 18, 2021

I see.

@madig
Collaborator Author

madig commented Mar 18, 2021

So, what do? Ideally the compilation pass spits out something that we don't need to reload to make it mergeable. Or we reload as little as possible? Any insights appreciated over at #486.

@behdad
Collaborator

behdad commented Mar 18, 2021

> So, what do? Ideally the compilation pass spits out something that we don't need to reload to make it mergeable.

Yes, I'm happy to work on making that reality.

> Or we reload as little as possible? Any insights appreciated over at #486.

Okay great. That gives me a point to start. Let's continue there.

@madig
Collaborator Author

madig commented Apr 1, 2021

So, going from #486, I stumbled over the following when reloading strictly only for renaming or post version changes:

  • maxp table missing values
  • various post attributes being filled with verbatim Python types rather than the final value
  • the post extraNames list not being filled in
  • the STAT table not having AxisValueCount filled in
  • the name table being unsorted

Doing the following in the middle of PostProcessor.process makes the tests pass:

        # Sort the name table in case varLib added new entries.
        if "name" in self.otf:
            self.otf["name"].names.sort()
        # Compile maxp to fill in its missing values.
        if "maxp" in self.otf:
            self.otf["maxp"].compile(self.otf)
        # Round-trip STAT through compile/decompile so AxisValueCount gets set.
        if "STAT" in self.otf:
            data = self.otf["STAT"].compile(self.otf)
            self.otf["STAT"].decompile(data, self.otf)
        # Initialize the extraNames list of a format 2.0 post table, then compile it.
        if "post" in self.otf and self.otf["post"].formatType == 2.0:
            self.otf["post"].extraNames = []
            self.otf["post"].compile(self.otf)
@behdad
Collaborator

behdad commented Apr 1, 2021

> • the post extraNames list not being filled in

We should completely remove extraNames. It serves no functional purpose.

@behdad
Collaborator

behdad commented Apr 1, 2021

> • the STAT table not having AxisValueCount filled in

Just make the STAT builder set it. The Count values in all of otData are also redundant and could be removed if we are willing to take the breakage and update code depending on it.

@khaledhosny
Collaborator

> The Count values in all of otData are also redundant and could be removed

Yes, please! It always seemed odd to have a separate Count field when the data is stored in a Python list anyway.

@behdad
Collaborator

behdad commented Apr 4, 2021

I feel like a fonttools API break is gaining support...

@madig
Collaborator Author

madig commented Apr 4, 2021

Would this conflict with fT's ability to carry and serialize meaningless/wrong input data in loaded fonts or is it sufficiently high level that that makes little sense? Is that even a design goal?

@khaledhosny
Collaborator

These *Counts are written as comments in TTX, so they can probably be kept regardless of the API change. In Python, LookupCount would be replaced by len(Lookup) and so on. Maybe the first step would be to stop using these *Counts internally but still set them when decompiling, so code building tables from scratch doesn’t have to set them; later they could be removed entirely.

@behdad
Collaborator

behdad commented Apr 6, 2021

> Would this conflict with fT's ability to carry and serialize meaningless/wrong input data in loaded fonts or is it sufficiently high level that that makes little sense?

The count values are always ignored when compiling the font.

> Is that even a design goal?

Nope. We don't even pretend.

> These *Counts are written as comments in TTX, so they can probably be kept regardless of the API change. In Python, LookupCount would be replaced by len(Lookup) and so on. Maybe the first step would be to stop using these *Counts internally but still set them when decompiling, so code building tables from scratch doesn’t have to set them; later they could be removed entirely.

Or replace them with properties that warn when accessed; set a deadline for removing them, and advise code to switch. The replacement code can even be offered in the warning.
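
One way such a warning property might look is sketched below. `LookupList` here is a hypothetical stand-in and not the fontTools otData class; the warning text and the choice of `DeprecationWarning` are assumptions made for illustration.

```python
import warnings


class LookupList:  # hypothetical stand-in, not the fontTools class
    def __init__(self, lookups=None):
        self.Lookup = list(lookups or [])

    @property
    def LookupCount(self):
        # The count is derived from the list and the caller is nudged
        # toward the replacement spelled out in the message.
        warnings.warn(
            "LookupCount is deprecated; use len(Lookup) instead",
            DeprecationWarning,
            stacklevel=2,
        )
        return len(self.Lookup)


lookup_list = LookupList(["lookup0", "lookup1"])
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    count = lookup_list.LookupCount
```

Because the property has no setter, code that still assigns a count would fail loudly under this sketch; a transitional version might accept and ignore the assignment instead.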

5 participants