chunk with newline char "\n" triggers font substitution (LiberationSans) #1024

Closed
pichlerm opened this issue Jan 17, 2024 · 3 comments

@pichlerm
Contributor

Describe the bug
When a Chunk contains a newline character \n, as in "Line 1 of 2 Hello\nLine 2 of 2 Ansi abcdef", the specified font (e.g. Helvetica) is replaced by LiberationSans. This substitution is neither wanted nor needed.

To Reproduce
Code to reproduce the issue

Font font = new Font(Font.HELVETICA, 10, Font.NORMAL, Color.BLACK);
doc.add(new Chunk("Line 1 Hello World", font));
doc.add(Chunk.NEWLINE);
doc.add(new Chunk("Line 2 Ansi abcdef", font));
doc.add(Chunk.NEWLINE);

doc.add(new Chunk("Line 1 of 2 Hello\nLine 2 of 2 Ansi abcdef", font));
doc.add(Chunk.NEWLINE);

doc.add(new Chunk("123 ÄäÖöÜüß Stëᶂañoš Đoğīć ψάρι борщ", font));
// Open PDF:       123 ÄäÖöÜüß Stëᶂañoš Đoğīć ψάρι борщ
// itext 2.1.7:    123 ÄäÖöÜüß Stëañoš o

Expected behavior
A Chunk that contains two lines, such as "Line 1 of 2 Hello\nLine 2 of 2 Ansi abcdef", but no special characters should use the specified Font.HELVETICA, just as it does when each line is added separately and followed by Chunk.NEWLINE.

For special characters, as in "123 ÄäÖöÜüß Stëᶂañoš Đoğīć ψάρι борщ", the automatic font replacement is wanted and works fine. Without it, the non-ANSI characters would be missing because Helvetica does not contain them (that was the case in itext 2.1.7).

Screenshots
Current behavior:
[screenshot]
"Line 1 of 2 Hello" is rendered in a different font (LiberationSans, where the digit 1 has a base line) than "Line 1 Hello World" (Helvetica, where the digit 1 has no base line).

Expected behavior:
[screenshot]
"Line 1 of 2 Hello" is rendered in the same font as "Line 1 Hello World" (Helvetica, where the digit 1 has no base line).
The font substitution in "123 ..." is fine, because it is needed to show all special characters that are not included in Helvetica.

System (please complete the following information):

  • OS: Windows 11
  • Used Font: Font.HELVETICA

Additional context
This seems to be caused by the character range check in
https://github.com/LibrePDF/OpenPDF/blob/master/openpdf/src/main/java/com/lowagie/text/pdf/PdfChunk.java#L213
which was extended to also accept the tab character (c == 0x09) in
#519

Changing the code from
allMatch(c -> ((c >= 0x20 && c <= 0xFF) || c == 0x09))
to
allMatch(c -> (c >= 0x0 && c <= 0xFF))
solved the problem for me, but I am not sure whether other ANSI characters below 0x20 make a difference.
https://de.wikipedia.org/wiki/Windows-1252
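
To see which character in a given string makes the quoted check fail, a small standalone sketch can help. It only reimplements the range condition shown above in plain Java and does not call OpenPDF; the names `OffendingChars`, `inCurrentRange` and `offending` are made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class OffendingChars {

    // Same condition as the check quoted above: 0x20..0xFF plus the tab char.
    // This is a copy for illustration, not the OpenPDF source itself.
    static boolean inCurrentRange(int c) {
        return (c >= 0x20 && c <= 0xFF) || c == 0x09;
    }

    // Lists every char that fails the condition as "index:U+XXXX".
    static List<String> offending(String text) {
        List<String> hits = new ArrayList<>();
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            if (!inCurrentRange(c)) {
                hits.add(String.format("%d:U+%04X", i, (int) c));
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Two-line chunk from the reproduction: only the newline is reported.
        System.out.println(offending("Line 1 of 2 Hello\nLine 2 of 2 Ansi abcdef"));
        // Single-line chunk from the reproduction: nothing is reported.
        System.out.println(offending("Line 1 Hello World"));
    }
}
```

For the two-line string from the reproduction, the only character reported is the newline (U+000A) at index 17, which matches the observation that the text itself contains nothing outside Helvetica's range.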

pichlerm added the bug label Jan 17, 2024
@pichlerm
Contributor Author

Sorry for linking to the German page on ANSI/Cp1252; here is the English one: https://en.wikipedia.org/wiki/Windows-1252#Codepage_layout

@andreasrosdal
Contributor

andreasrosdal commented Jan 17, 2024

@pichlerm Thank you for reporting this.
So I understand that this problem was introduced in this pull request:
#519 @Wugengxian

Would either of you be able to create a new pull request to fix this?
This seems like a good solution:

Changing the code from
allMatch(c -> ((c >= 0x20 && c <= 0xFF) || c == 0x09))
to
allMatch(c -> (c >= 0x0 && c <= 0xFF))

@pichlerm
Contributor Author

pichlerm commented Jan 17, 2024

Yes, this is my suggested solution (I copied the source file, compiled it locally, and it works for my use case).
The bug was not caused by the merge of #519, but it occurs in the same place:

Old code: accepts 0x20 ... 0xFF (neither tab nor newline):
if (chunk.getContent().chars().allMatch(c -> (c >= 0x20 && c <= 0xFF))) {

The fix for bug #454 adds the tab char 0x09 to the range 0x20 ... 0xFF:
if (chunk.getContent().chars().allMatch(c -> ((c >= 0x20 && c <= 0xFF) || c == 0x09))) {

Suggested fix for this bug, which also covers #454: handle all chars 0x0 ... 0xFF with the base font, because the special chars from 0x0 up to (but not including) 0x20 are not expected to have a visual representation:
if (chunk.getContent().chars().allMatch(c -> (c >= 0x0 && c <= 0xFF))) {

I have uploaded a pull request.
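
The three variants of the check discussed above can be compared side by side in a standalone sketch. It assumes, as described in this issue, that a chunk keeps its base font only when every char passes the check; the names `RangeCheckComparison`, `OLD`, `WITH_TAB`, `PROPOSED` and `passes` are hypothetical, and nothing here calls OpenPDF.

```java
import java.util.function.IntPredicate;

public class RangeCheckComparison {

    // Old code: 0x20..0xFF only.
    static final IntPredicate OLD = c -> c >= 0x20 && c <= 0xFF;
    // After the fix for #454: 0x20..0xFF plus the tab char.
    static final IntPredicate WITH_TAB = c -> (c >= 0x20 && c <= 0xFF) || c == 0x09;
    // Suggested fix: everything from 0x0 to 0xFF.
    static final IntPredicate PROPOSED = c -> c >= 0x0 && c <= 0xFF;

    // True when every UTF-16 code unit of the text satisfies the check.
    static boolean passes(String text, IntPredicate check) {
        return text.chars().allMatch(check);
    }

    public static void main(String[] args) {
        String[] samples = {
            "Line 1 Hello World",                          // plain text
            "a\tb",                                        // contains a tab
            "Line 1 of 2 Hello\nLine 2 of 2 Ansi abcdef",  // contains a newline
            "123 \u00C4 \u0431\u043E\u0440\u0449"          // A-umlaut plus Cyrillic chars above 0xFF
        };
        for (String s : samples) {
            System.out.printf("old=%b withTab=%b proposed=%b%n",
                    passes(s, OLD), passes(s, WITH_TAB), passes(s, PROPOSED));
        }
    }
}
```

Under these assumptions the newline sample fails the first two checks and passes only the proposed one, while the Cyrillic sample fails all three, so the wanted substitution for non-ANSI text is unaffected.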
