Letter ä not showing in pdf #8234

pchtri · 2017-04-04T09:44:13Z

Link to PDF file (or attach file here):
FF52_CR_ä_missing.pdf

Configuration:

Web browser and its version: Firefox 52.0.1
Operating system and its version: Windows 7 and 10
PDF.js version: 1.7.406
Is an extension:

Steps to reproduce the problem:

Open the attached file in https://mozilla.github.io/pdf.js/web/viewer.html or in Firefox 52.0.1

What is the expected behavior? (add screenshot)

The same pdf works fine in Acrobat Reader, Chrome and all other viewers I've tested.

What went wrong? (add screenshot)

The small letter ä is shown in the same color as the background (white). If you select the text and paste it into Notepad, the ä is there, but it doesn't show in the PDF.

Link to a viewer (if hosted on a site other than mozilla.github.io/pdf.js or as Firefox/Chrome extension):
https://mozilla.github.io/pdf.js/web/viewer.html

THausherr · 2017-04-05T16:34:55Z

PDFJS-8234_reduced.pdf
Reduced version of the file. The font does not have a glyph name for code 32 (or it is .notdef), but it has a ToUnicode entry (ä) and a glyph (ä). Likely a bug in the subsetter of the library that produced this file.

There have been lots of problems with trying to map glyphs to their Unicode values. It's more reliable to just use the private use areas so the browser's font renderer doesn't mess with the glyphs. Using the private use area for all glyphs did highlight other issues that this patch also had to fix:

* small private use area - Previously, only the BMP private use area was used, which can't map many glyphs. Now, the (much bigger) plane 16 private use area can also be used.
* glyph zero not shown - Browsers will not use a glyph from a font if its glyph id is 0. This issue was less prevalent when we mapped to Unicode values, since the fallback font would be used. However, when using the private use area, the glyph would not be drawn at all. This is illustrated in one of the current test cases (issue mozilla#8234), where there's an "ä" glyph at position zero. The PDF looked like it rendered correctly, but it was actually not using the glyph from the font. To properly show the first glyph, it is always duplicated and appended to the glyphs, and the maps are adjusted.
* supplementary characters - Code points in the plane 16 private use area lie outside the BMP and take 4 bytes (two UTF-16 code units), so String.fromCodePoint must be used where we previously used String.fromCharCode. This is actually an issue that should have been fixed regardless of this patch.
* charset - FreeType fails to load fonts when the charset size doesn't match the number of glyphs in the font. We now write out a fake charset with the correct length. This also brought up the issue that glyphs with seac/endchar should only ever write a standard charset, but we now write a custom one. To get around this, the seac analysis is permanently enabled so those glyphs are instead always drawn as two glyphs.
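The first three points above can be illustrated with a minimal sketch. It is not the actual PDF.js code: the helper names (`mapGlyphsToPrivateUse`, `duplicateFirstGlyph`) and the data shapes are hypothetical. Only the two Unicode ranges and the `String.fromCharCode` / `String.fromCodePoint` behavior are standard.

```javascript
// Illustrative only: helper names and data shapes are assumptions,
// not taken from the PDF.js source.
const PRIVATE_USE_AREAS = [
  [0xe000, 0xf8ff], // BMP Private Use Area (6400 code points)
  [0x100000, 0x10fffd], // Supplementary Private Use Area-B (plane 16)
];

// Assign every glyph id its own private use code point, moving on to
// the plane 16 range once the BMP range is exhausted.
function mapGlyphsToPrivateUse(numGlyphs) {
  const glyphToCodePoint = new Map();
  let area = 0;
  let next = PRIVATE_USE_AREAS[0][0];
  for (let glyphId = 0; glyphId < numGlyphs; glyphId++) {
    if (next > PRIVATE_USE_AREAS[area][1]) {
      area++;
      if (area >= PRIVATE_USE_AREAS.length) {
        throw new Error("Ran out of private use code points");
      }
      next = PRIVATE_USE_AREAS[area][0];
    }
    glyphToCodePoint.set(glyphId, next++);
  }
  return glyphToCodePoint;
}

// Append a copy of glyph 0 and point every mapping that used glyph 0
// at the copy, so nothing relies on glyph id 0 being drawn.
function duplicateFirstGlyph(glyphs, charCodeToGlyphId) {
  const newGlyphs = glyphs.concat([glyphs[0]]);
  const duplicateId = newGlyphs.length - 1;
  const newMap = new Map();
  for (const [charCode, glyphId] of charCodeToGlyphId) {
    newMap.set(charCode, glyphId === 0 ? duplicateId : glyphId);
  }
  return { glyphs: newGlyphs, charCodeToGlyphId: newMap };
}

// String.fromCharCode keeps only the low 16 bits of its argument, so a
// plane 16 code point is silently mangled; String.fromCodePoint emits
// the correct surrogate pair.
const broken = String.fromCharCode(0x100000); // "\u0000"
const correct = String.fromCodePoint(0x100000); // "\uDBC0\uDC00", length 2
```

With these assumed ranges, glyph 6399 lands on U+F8FF and glyph 6400 is the first to spill into plane 16 at U+100000, which is where the `fromCharCode` truncation would start to matter.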

timvandermeij · 2018-09-08T15:53:19Z

Fixed by #9340.

Snuffleupagus added font-conversion font-truetype labels Apr 4, 2017

brendandahl mentioned this issue Apr 5, 2017

Don’t skip glyph 0 in cmap. #8243

Merged

Snuffleupagus closed this as completed in #8243 Apr 6, 2017

brendandahl mentioned this issue Jan 4, 2018

Map all glyphs to the private use area and duplicate the first glyph. #9340

Merged
