[Bug]: find highlight matches wrong index with normalization for compound word with a line break after the hyphen. #19120

Closed
Mario185 opened this issue Nov 28, 2024 · 1 comment · Fixed by #19122

@Mario185

Attach (recommended) or Link to PDF file

compound_word_with_line_break_after_hyphen.pdf

Web browser and its version

Firefox 133

Operating system and its version

Windows 10

PDF.js version

v4.8.69

Is the bug present in the latest PDF.js version?

Yes

Is a browser extension

No

Steps to reproduce the problem

  1. Open the attached file "compound_word_with_line_break_after_hyphen.pdf"
  2. Search for the letter "o"

What is the expected behavior?

The letter "o" should be highlighted

What went wrong?

Because of the normalization, the index is off by 1 and the letter "n" is highlighted instead.
The span created for the highlight contains only the letter "n" instead of "o".

Link to a viewer

No response

Additional context

It seems that during normalization in pdf_find_controller.js the position calculated for the case "Compound word with a line break after the hyphen." is wrong.
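
To make the suspected failure mode concrete, here is a minimal sketch of why a normalization step that drops a character needs a position map. It is not the code in pdf_find_controller.js: `normalizeWithMap`, `toSource`, and the sample string are hypothetical names and data chosen for illustration, and the rule is reduced to "remove a line break that follows a hyphen between two letters".

```javascript
// Illustrative only: NOT the code in pdf_find_controller.js.
// Removes the line break that follows a hyphen between two letters and
// records, for every kept character, its index in the source string.
function normalizeWithMap(text) {
  const letter = /\p{L}/u;
  let normalized = "";
  const toSource = []; // toSource[i] is the index in `text` of normalized[i]
  for (let i = 0; i < text.length; i++) {
    const breakAfterHyphen =
      text[i] === "\n" &&
      text[i - 1] === "-" &&
      letter.test(text[i - 2] ?? "") &&
      letter.test(text[i + 1] ?? "");
    if (breakAfterHyphen) {
      continue; // dropped: every later normalized index shifts by one
    }
    normalized += text[i];
    toSource.push(i);
  }
  return { normalized, toSource };
}

// Hypothetical sample; the attached PDF's actual text is not reproduced here.
const source = "self-\nconfident";
const { normalized, toSource } = normalizeWithMap(source);

const hit = normalized.indexOf("o"); // position in the normalized string
console.log(JSON.stringify(source[hit])); // "c": raw index lands one character early
console.log(JSON.stringify(source[toSource[hit]])); // "o": index mapped back first
```

The direction of the error depends on which string the index is applied to. The point is only that a match position found in normalized text is not valid in any other string until it has been mapped.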

@Snuffleupagus
Collaborator

Is this perhaps a regression from PR #18730, since #18693 (comment) mentions that fixing that one could lead to other issues?

calixteman added a commit to calixteman/pdf.js that referenced this issue Nov 28, 2024
…t contains a compound word on two lines

It fixes mozilla#19120.

The original text doesn't contain the cr so we must take that into account.
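
One reading of that note, sketched under assumptions: if a match index is computed against a string that still contains the line break, but is then applied to text that has no line break, the index must be reduced by the number of breaks before it. `toBreakFreeIndex` and the sample strings below are hypothetical and are not taken from the actual fix.

```javascript
// Illustrative only: a hypothetical helper, not the change made in the fix.
// Converts an index into text that still contains line breaks into an index
// into the same text with those line breaks absent.
function toBreakFreeIndex(index, text) {
  let breaksBefore = 0;
  for (let i = 0; i < index; i++) {
    if (text[i] === "\n") {
      breaksBefore++;
    }
  }
  return index - breaksBefore;
}

// Hypothetical sample strings.
const withBreak = "self-\nconfident"; // string the match index was computed against
const displayed = "self-confident"; // text being highlighted, with no line break

const matchIndex = withBreak.indexOf("o"); // 7
console.log(displayed[matchIndex]); // n: index used as-is, one character late
console.log(displayed[toBreakFreeIndex(matchIndex, withBreak)]); // o
```

With these sample strings the uncorrected index selects "n" rather than "o", the same symptom as in the report.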