-
Notifications
You must be signed in to change notification settings - Fork 10k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add initial support for "Whole words" searching in the viewer #10028
Conversation
c30a528
to
fbb22bc
Compare
69cd3a2
to
fd8607a
Compare
fd8607a
to
def39e0
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good with the comment addressed and a rebase. If you would like to add it, the Dutch translation for the new string is:
find_entire_word_label=Hele woorden
',': CharacterType.PUNCT, | ||
'.': CharacterType.PUNCT, | ||
';': CharacterType.PUNCT, | ||
':': CharacterType.PUNCT, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to add a few more tests here for the missing character types, namely:
- HAN_LETTER
- KATAKANA_LETTER
- HIRAGANA_LETTER
- HALFWIDTH_KATAKANA_LETTER
- THAI_LETTER
I think we can simply find them with Wikipedia or perhaps the Mozilla test suite. It would be good to have test coverage for those too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it be possible to add a few more tests here for the missing character types [...]
Of course; this was just me being lazy and not wanting to spend to much time on it until there was feedback on the overall approach :-)
If you would like to add it, the Dutch translation for the new string is:
find_entire_word_label=Hele woorden
Thanks, it's been added to the patch.
def39e0
to
a50353b
Compare
As outlined in https://bugzilla.mozilla.org/show_bug.cgi?id=1282759 the internal Firefox name for the feature is `entireWord`, hence that name is used here as well for consistency (with "Whole words" being limited to the UI). Given existing limitations of the PDF.js search functionality, e.g. the existing problems of searching across "new lines", there's some edge-cases where "Whole words" searching will ignore (valid) results. However, considering that this is a pre-existing issue related to the way that the find controller joins text-content together, that shouldn't have to block this new feature in my opionion. *Please note:* In order to enable this feature in the `MOZCENTRAL` version, a small follow-up patch for [PdfjsChromeUtils.jsm](https://hg.mozilla.org/mozilla-central/file/tip/browser/extensions/pdfjs/content/PdfjsChromeUtils.jsm) will be required once this has landed in `mozilla-central`.
The property is intended to contain a reference to a DOM element, which not only is nowhere to be found *now* but appears to never have existed in the first place.
a50353b
to
b4edcce
Compare
From: Bot.io (Windows)ReceivedCommand cmd_unittest from @Snuffleupagus received. Current queue size: 0 Live output at: http://54.215.176.217:8877/0f5be8cc5759682/output.txt |
From: Bot.io (Linux m4)ReceivedCommand cmd_unittest from @Snuffleupagus received. Current queue size: 0 Live output at: http://54.67.70.0:8877/e8b0c5914d19193/output.txt |
From: Bot.io (Windows)SuccessFull output at http://54.215.176.217:8877/0f5be8cc5759682/output.txt Total script time: 5.46 mins
|
From: Bot.io (Linux m4)SuccessFull output at http://54.67.70.0:8877/e8b0c5914d19193/output.txt Total script time: 8.31 mins
|
/botio-windows preview |
From: Bot.io (Windows)ReceivedCommand cmd_preview from @timvandermeij received. Current queue size: 0 Live output at: http://54.215.176.217:8877/18ac274adac108d/output.txt |
From: Bot.io (Windows)SuccessFull output at http://54.215.176.217:8877/18ac274adac108d/output.txt Total script time: 4.67 mins Published |
Looks good now. Thank you! |
As outlined in https://bugzilla.mozilla.org/show_bug.cgi?id=1282759 the internal Firefox name for the feature is
entireWord
, hence that name is used here as well for consistency (with "Whole words" being limited to the UI).Given existing limitations of the PDF.js search functionality, e.g. the existing problems of searching across "new lines", there's some edge-cases where "Whole words" searching will ignore (valid) results.
However, considering that this is a pre-existing issue related to the way that the find controller joins text-content together, that shouldn't have to block this new feature in my opionion.
Please note: In order to enable this feature in the
MOZCENTRAL
version, a small follow-up patch for PdfjsChromeUtils.jsm will be required once this has landed inmozilla-central
.Edit: See also https://bugzilla.mozilla.org/show_bug.cgi?id=1282759