unicode link label normalization (fix test 539) #277
Very nice improvement and fix! I have two minor suggestions. I'll stage my suggestions (and perhaps merge conflict cleanup) as a PR into this PR, unless you prefer (and have time) to address those things before I can get to it :D
LGTM!
Thanks for the review @shonfeder. I added extra tests to cover the normalization of labels in 49d3ca2.
@@ -8,7 +8,7 @@ let protect ~finally f =
       finally ();
       r
 
-let disabled = [ 206; 215; 216; 519; 539 ]
+let disabled = [ 206; 215; 216; 519 ]
Outstanding progress. You're demolishing these deviations. Thanks so much @tatchi! 🎉
CHANGES:
- Expose the HTML escape function `htmlentities` (ocaml-community/omd#295, @cuihtlauac)
- Support generation of identifiers in headers (ocaml-community/omd#294, @tatchi)
- Support GitHub-Flavoured Markdown tables (ocaml-community/omd#292, @bobatkey)
- Update parser to support CommonMark Spec 0.30 (ocaml-community/omd#266, @SquidDev)
- Preserve the order of input files in the HTML output to stdout (ocaml-community/omd#258, @patricoferris)
- Fix all deviations from CommonMark Spec 0.30 (ocaml-community/omd#284, ocaml-community/omd#283, ocaml-community/omd#278, ocaml-community/omd#277, ocaml-community/omd#269, @tatchi)
Input:
This is the result in master:
The issue is that the two labels are not matched, hence the input is not recognized as a link. To match labels, we need to `normalize` them (strip off leading/trailing whitespace, ...) and do a case-insensitive comparison. The Unicode version of that is a bit more complex, as we need to do a Unicode case folding. From the spec:

This PR adapts the `normalize` function to work with Unicode labels too. Fortunately, I could rely on some libs (`uutf`, `uucp`, and `uunf`), and I even found a piece of code in the doc that does almost what's needed.

With that adapted `normalize` function, `ẞ` and `SS` are matched. The result is now a link as expected.