Add normalization for Hyphen -> Hyphen-minus #13492

Merged
merged 1 commit into from
Jun 4, 2021


MMeent
Contributor

@MMeent MMeent commented Jun 4, 2021

Previously, these two characters could not be searched interchangeably, even though some text-to-PDF pipelines replace Hyphen-Minus with Hyphen.

Example of what it fixes: the CA/B Forum Baseline Requirements document (v1.7.5, https://cabforum.org/wp-content/uploads/CA-Browser-Forum-BR-1.7.5.pdf) currently contains "SHA-1" on page 15 (search for "Subordinate CA certificates using the SHA"), but a search for "SHA-1" finds no matches because the document uses Hyphen (\u2010) instead of Hyphen-Minus (\u002D).
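
To illustrate the idea, here is a minimal sketch of hyphen normalization applied to both the page text and the query before matching. It is not the code from this PR: `normalizeHyphens` and `findAll` are hypothetical helper names chosen for the example, and the actual pdf.js find logic may be structured differently.

```javascript
// Hypothetical helpers for illustration only; not the pdf.js implementation.
const HYPHEN = /\u2010/g; // U+2010 HYPHEN
const HYPHEN_MINUS = "\u002D"; // U+002D HYPHEN-MINUS, the ASCII "-"

// Map every Hyphen to Hyphen-Minus. Both are single UTF-16 code units,
// so string offsets are unchanged by the replacement.
function normalizeHyphens(text) {
  return text.replace(HYPHEN, HYPHEN_MINUS);
}

// Return the start offsets of all matches of `query` in `pageText`,
// treating Hyphen and Hyphen-Minus as the same character.
function findAll(pageText, query) {
  const haystack = normalizeHyphens(pageText);
  const needle = normalizeHyphens(query);
  const matches = [];
  if (needle.length === 0) {
    return matches;
  }
  let index = haystack.indexOf(needle);
  while (index !== -1) {
    matches.push(index);
    index = haystack.indexOf(needle, index + 1);
  }
  return matches;
}
```

Because the mapping is one character to one character, offsets found in the normalized text can be used directly against the original page text, for example when highlighting matches.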

@timvandermeij
Contributor

/botio-linux preview

@pdfjsbot

pdfjsbot commented Jun 4, 2021

From: Bot.io (Linux m4)


Received

Command cmd_preview from @timvandermeij received. Current queue size: 1

Live output at: http://54.67.70.0:8877/628b3238bd68b7c/output.txt

@pdfjsbot

pdfjsbot commented Jun 4, 2021

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/628b3238bd68b7c/output.txt

Total script time: 3.71 mins

Published

@timvandermeij timvandermeij merged commit ed0990a into mozilla:master Jun 4, 2021
@timvandermeij
Contributor

Thank you for improving this!
