chunk with newline char "\n" triggers font substitution (LiberationSans) #1024

pichlerm · 2024-01-17T11:26:48Z

Describe the bug
When a Chunk contains a newline character \n, as in "Line 1 of 2 Hello\nLine 2 of 2 Ansi abcdef", the specified font (e.g. Helvetica) is replaced by LiberationSans, which is neither wanted nor needed.

To Reproduce
Code to reproduce the issue

Font font = new Font(Font.HELVETICA, 10, Font.NORMAL, Color.BLACK);
doc.add(new Chunk("Line 1 Hello World", font));
doc.add(Chunk.NEWLINE);
doc.add(new Chunk("Line 2 Ansi abcdef", font));
doc.add(Chunk.NEWLINE);

doc.add(new Chunk("Line 1 of 2 Hello\nLine 2 of 2 Ansi abcdef", font));
doc.add(Chunk.NEWLINE);

doc.add(new Chunk("123 ÄäÖöÜüß Stëᶂañoš Đoğīć ψάρι борщ", font));
// Rendered by OpenPDF:      123 ÄäÖöÜüß Stëᶂañoš Đoğīć ψάρι борщ
// Rendered by itext 2.1.7:  123 ÄäÖöÜüß Stëañoš o

Expected behavior
The Chunk "Line 1 of 2 Hello\nLine 2 of 2 Ansi abcdef" contains two lines but no special characters, so it should use the specified Font.HELVETICA, just as it does when each line is added separately, followed by a Chunk.NEWLINE.

For special chars like "123 ÄäÖöÜüß Stëᶂañoš Đoğīć ψάρι борщ" the automatic font replacement is wanted and works fine, because otherwise the non-Ansi chars, which are missing from the Helvetica font, would not be rendered (as was the case in itext 2.1.7).

Screenshots
Current behavior:

"Line 1 of 2 Hello" uses a different font (LiberationSans, where the digit "1" has a base line) than "Line 1 Hello World" (Helvetica, where the digit "1" has no base line).

Expected behavior:

"Line 1 of 2 Hello" uses the same font as "Line 1 Hello World" (Helvetica, where the digit "1" has no base line).
The font substitution in "123 ..." is fine, since it is needed to show the special chars that are not included in Helvetica.

System (please complete the following information):

OS: Windows 11
Used Font: Font.HELVETICA

Additional context
This seems to be caused by the character range check in
https://github.com/LibrePDF/OpenPDF/blob/master/openpdf/src/main/java/com/lowagie/text/pdf/PdfChunk.java#L213
which got extended by the tab char c == 0x09 in
#519

Changing the code from
allMatch(c -> ((c >= 0x20 && c <= 0xFF) || c == 0x09))
to
allMatch(c -> (c >= 0x0 && c <= 0xFF))
solved the problem for me, but I'm not sure whether other Ansi chars below 0x20 make a difference.
https://de.wikipedia.org/wiki/Windows-1252
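
To make the open question about chars below 0x20 concrete, here is a minimal sketch that lists exactly which values the proposed check accepts and the current one rejects. It only copies the two lambda expressions quoted above; the class name RangeDiff and the method newlyAccepted are made up for illustration and are not part of OpenPDF.

```java
import java.util.List;
import java.util.function.IntPredicate;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper, not OpenPDF code: compares the two range checks from this issue.
final class RangeDiff {
    // Check as quoted in this issue (0x20..0xFF plus the tab char).
    static final IntPredicate CURRENT = c -> (c >= 0x20 && c <= 0xFF) || c == 0x09;
    // Check as proposed in this issue (everything from 0x0 to 0xFF).
    static final IntPredicate PROPOSED = c -> c >= 0x0 && c <= 0xFF;

    // All char values in 0x0000..0xFFFF that PROPOSED accepts but CURRENT rejects.
    static List<Integer> newlyAccepted() {
        return IntStream.rangeClosed(0x0000, 0xFFFF)
                .filter(c -> PROPOSED.test(c) && !CURRENT.test(c))
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Print each newly accepted value in hex, one per line.
        for (int c : newlyAccepted()) {
            System.out.printf("0x%02X%n", c);
        }
    }
}
```

The difference is limited to the control chars below 0x20 other than tab, which includes \n (0x0A) and \r (0x0D). Whether any of them needs special treatment elsewhere in the layout code is not something this sketch can answer.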


pichlerm · 2024-01-17T12:38:08Z

Sorry for the link to the German page on Ansi/Cp1252; here is the English one: https://en.wikipedia.org/wiki/Windows-1252#Codepage_layout

andreasrosdal · 2024-01-17T13:30:05Z

@pichlerm Thank you for reporting this.
So I understand that this problem was introduced in this pull request:
#519 @Wugengxian

Are either of you able to create a new pull request to solve this?
This seems like a good solution:

Changing the code from
allMatch(c -> ((c >= 0x20 && c <= 0xFF) || c == 0x09))
to
allMatch(c -> (c >= 0x0 && c <= 0xFF))

pichlerm · 2024-01-17T13:41:28Z

Yes, this is my suggested solution (I copied and compiled the source file locally, and it works for my use case).
The bug is not caused by merge #519 but happens at the same place:

old code: 0x20 ... 0xFF (neither tab nor newline)
if (chunk.getContent().chars().allMatch(c -> (c >= 0x20 && c <= 0xFF))) {

fix of bug #454 adds the tab char 0x09 to 0x20 ... 0xFF
if (chunk.getContent().chars().allMatch(c -> ((c >= 0x20 && c <= 0xFF) || c == 0x09))) {

suggested fix for this bug, which also fixes #454: handle all chars 0x0 ... 0xFF with the base font, since the control chars from 0x0 up to (but not including) 0x20 are not expected to have a visual representation.
if (chunk.getContent().chars().allMatch(c -> (c >= 0x0 && c <= 0xFF))) {

I have uploaded a pull request.
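
For anyone who wants to check the three variants without building OpenPDF, here is a small stand-alone sketch. It assumes that the result of the allMatch(...) condition is what decides whether the chunk keeps its base font, which is how I read the code but is not verified here. The class BaseFontCheck and its helper method are hypothetical; only the three lambda expressions are taken from the snippets above.

```java
import java.util.function.IntPredicate;

// Hypothetical test harness, not OpenPDF code.
final class BaseFontCheck {
    // Old code: 0x20 ... 0xFF (neither tab nor newline).
    static final IntPredicate ORIGINAL = c -> c >= 0x20 && c <= 0xFF;
    // Fix of #454: additionally accepts the tab char 0x09.
    static final IntPredicate WITH_TAB = c -> (c >= 0x20 && c <= 0xFF) || c == 0x09;
    // Suggested fix: accepts everything from 0x0 to 0xFF.
    static final IntPredicate PROPOSED = c -> c >= 0x0 && c <= 0xFF;

    // Same shape as the quoted condition: every char of the content must pass the check.
    static boolean allMatch(String content, IntPredicate check) {
        return content.chars().allMatch(check);
    }

    public static void main(String[] args) {
        String[] samples = {
            "Line 1 Hello World",
            "Column A\tColumn B",
            "Line 1 of 2 Hello\nLine 2 of 2 Ansi abcdef",
            "\u03c8\u03ac\u03c1\u03b9" // the Greek word from the sample text, written as escapes
        };
        System.out.println("old    tab    new    content");
        for (String s : samples) {
            // Escape control chars so every sample stays on one output line.
            String shown = s.replace("\n", "\\n").replace("\t", "\\t");
            System.out.printf("%-6b %-6b %-6b %s%n",
                    allMatch(s, ORIGINAL), allMatch(s, WITH_TAB), allMatch(s, PROPOSED), shown);
        }
    }
}
```

The two-line sample only passes the suggested check, while the Greek sample fails all three, so chunks with chars above 0xFF would still be handled as before.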

pichlerm added the bug label Jan 17, 2024

pichlerm mentioned this issue Jan 17, 2024 in pull request #1026, "issue 1024: newline (or other control char below 0x20) should not switch font" (merged)

andreasrosdal closed this as completed Jan 17, 2024
