Bug fix 529: error /toUnicode #530

Merged 1 commit into LibrePDF:master from the fix-529 branch on May 2, 2021

@Wugengxian commented on Apr 26, 2021

Description of the new Feature/Bugfix

OpenPDF should map each processed character back to its original character when it writes the /ToUnicode entry.

The /ToUnicode CMap after the change: [image]

The copied value of "ετε" in HTML: [image]

Related Issue: #529
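
To illustrate the idea of mapping processed characters back to their originals, here is a minimal sketch in plain Java. It is not the code changed in FopGlyphProcessor.java: the class `ToUnicodeMappingSketch`, its methods, and the codes 0x65 and 0x74 are hypothetical and chosen only for illustration. It handles BMP characters only and emits just the `bfchar` part of a CMap, without the surrounding boilerplate or any splitting of long lists into several blocks.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration only; this is not the code in FopGlyphProcessor.java.
class ToUnicodeMappingSketch {
    // Code written to the content stream -> original Unicode character (BMP only).
    private final Map<Integer, Character> processedToOriginal = new LinkedHashMap<>();

    // Remember which original character produced a given processed code.
    void record(int processedCode, char originalChar) {
        processedToOriginal.putIfAbsent(processedCode, originalChar);
    }

    // Emit the bfchar section of a /ToUnicode CMap for the recorded pairs.
    String toBfChar() {
        StringBuilder sb = new StringBuilder();
        sb.append(processedToOriginal.size()).append(" beginbfchar\n");
        for (Map.Entry<Integer, Character> e : processedToOriginal.entrySet()) {
            int original = e.getValue().charValue();
            sb.append(String.format("<%04X> <%04X>\n", e.getKey(), original));
        }
        return sb.append("endbfchar\n").toString();
    }

    // Roughly what a text extractor does with the map: code -> original character.
    String decode(int[] processedCodes) {
        StringBuilder sb = new StringBuilder();
        for (int code : processedCodes) {
            Character original = processedToOriginal.get(code);
            // Unknown codes become the Unicode replacement character.
            sb.append(original != null ? original.charValue() : '\uFFFD');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        ToUnicodeMappingSketch sketch = new ToUnicodeMappingSketch();
        // Illustrative codes for the three characters of the sample text.
        sketch.record(0x65, '\u03B5'); // epsilon
        sketch.record(0x74, '\u03C4'); // tau
        sketch.record(0x65, '\u03B5'); // epsilon again, already recorded
        System.out.print(sketch.toBfChar());
        System.out.println(sketch.decode(new int[] {0x65, 0x74, 0x65}));
    }
}
```

If the map stored the processed code on both sides instead of the original character, `decode` would return the processed characters, which is the kind of mismatch a round-trip test through a text extractor would catch.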

Unit-Tests for the new Feature/Bugfix

  • Unit-Tests added to reproduce the bug
  • Unit-Tests added to the added feature
@Test
void testToUnicode() throws Exception {
    Document document = new Document();
    // Disable stream compression so the /ToUnicode CMap can be inspected in the output file.
    Document.compress = false;
    FileOutputStream outputStream = new FileOutputStream("output.pdf");
    PdfWriter.getInstance(document, outputStream);
    document.open();

    // Write Greek text using the built-in Symbol font.
    document.add(new Chunk("ετε", new Font(Font.SYMBOL)));
    document.close();

    // Extracting the text again must yield the original characters.
    PdfTextExtractor pdfTextExtractor = new PdfTextExtractor(new PdfReader("output.pdf"));
    Assertions.assertEquals("ετε", pdfTextExtractor.getTextFromPage(1));
}

Other test:
With a correct Unicode map in place, the previously failing test from pull request #521 now passes:

@Test
void getTextFromPageWithParagraphs_expectsTextHasNoMultipleSpaces() throws IOException {
    // given
    final Paragraph loremIpsumParagraph = new Paragraph(LOREM_IPSUM);
    loremIpsumParagraph.setAlignment(Element.ALIGN_JUSTIFIED);
    byte[] pdfBytes = createSimpleDocumentWithElements(
            loremIpsumParagraph,
            Chunk.NEWLINE,
            loremIpsumParagraph
    );
    final String expected = LOREM_IPSUM + " " + LOREM_IPSUM;
    // when
    final String extracted = new PdfTextExtractor(new PdfReader(pdfBytes)).getTextFromPage(1);
    // then
    assertThat(extracted, equalToCompressingWhiteSpace(expected));
    // No run of two consecutive spaces may appear in the extracted text.
    assertThat(extracted, not(containsString("  ")));
}

Compatibility Issues

No

Testing details

No

@sonarqubecloud

Kudos, SonarCloud Quality Gate passed!

  • Bugs: 0 (rating A)
  • Vulnerabilities: 0 (rating A)
  • Security Hotspots: 0 (rating A)
  • Code Smells: 0 (rating A)
  • Coverage: no coverage information
  • Duplication: 0.0%

@Wugengxian changed the title from "Update FopGlyphProcessor.java" to "Bug fix 529: error /toUnicode" on Apr 26, 2021
@asturio merged commit 76c20ac into LibrePDF:master on May 2, 2021
@Wugengxian deleted the fix-529 branch on May 2, 2021, 19:18
@asturio added this to the 1.3.26 milestone on May 2, 2021