Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

when building cmap we should error/warn if multiple glyphs map to same unicode #213

Closed
anthrotype opened this issue Feb 1, 2018 · 5 comments

Comments

@anthrotype
Copy link
Member

Since we only build one cmap subtable, it's impossible for the same unicode codepoint to map to multiple different glyphs (of course the opposite is possible, multiple unicodes mapping to the same glyph).

However, this code:

https://github.com/googlei18n/ufo2ft/blob/7821e5f50afbc152eb30b4f5be3c2ea58867fca2/Lib/ufo2ft/outlineCompiler.py#L152-L165

does not check if uni is already in mapping but simply overwrites a pre-existing value. And it is iterating over the items of a dictionary (unsorted collection) so the last value that prevails may be different upon every run (because of the randomized hash seed).

The UFO spec doesn't say anything about this, so I must assume it doesn't disallow multiple glyphs sharing the same unicode value.
I don't know how common this situation is, but we need to handle this in a clear and deterministic way.

I think we should iterate in the order of the font.glyphOrder and only take the first one, then all the following we warn and skip.

Or we could be stricter and raise an error, but that's probably too harsh.

@moyogo
Copy link
Collaborator

moyogo commented Feb 5, 2018

The UFO spec doesn't say anything about this, so I must assume it doesn't disallow multiple glyphs sharing the same unicode value.

That’s not good but may happen. Keeping only one (first or last in font.glyphOrder) and warning is good enough. The user should sort that out.

@sladen
Copy link

sladen commented Feb 5, 2018

Using the last (and warning the user that it is redefining/overriding probably makes more sense).

This is closer to what eg. a compiler does warning about (re-/defining twice), whilst still making it relatively easy to experiment with changing glyphs by copying-and-pasting the existing one and leaving it in place.

@behdad
Copy link
Collaborator

behdad commented Feb 6, 2018

I like error (at least by default) better. This is not a warning. This is just bad input. Compiler comparison is not accurate. Compiler errs if the two definition are different.

@khaledhosny
Copy link
Collaborator

I agree with Behdad, multiple glyphs having same unicode values means some charters will be unsupported by the font and in a random way, so it is rather a bug in the input whether or not the UFO spec is silent about it.

It would be helpful, though, to collect all such errors in the font and report them (glyph names, unicode value) and not just error at the first occurrence of duplication.

@moyogo
Copy link
Collaborator

moyogo commented Feb 7, 2018

As long as the user gets a useful error, failing is fine.

anthrotype added a commit that referenced this issue Mar 2, 2018
Take the first glyph found in the font.glyphOrder, and print a warning
in case other glyphs map to the same unicode value.

Fixes #213
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants