adoc escaping problems in source code: angle brackets #763
Here's the relevant code: mrdocs/src/lib/Gen/adoc/AdocGenerator.cpp Lines 35 to 58 in e41d780
The only way I see to solve this problem consistently is for HTML to escape whatever goes inside An alternative would be to try to escape as much as possible with
In this case, it would have to be a more complex algorithm that keeps track of the context so that we know what kind of commands we are escaping. There's also a weird problem of precedence between commands:
I don't even know how to achieve So, we must identify "effective" opening tags by examining the current context and looking ahead to ensure the tag is closing somewhere. Whenever in a command context, we can't try to escape anything else. It's a complex algorithm that would evolve with time. However, now that I'm thinking more about it, escaping HTML inside pass is not such a bad idea because by using pass, the content is coupled with the output format anyway. It's just coupled in a way that's not what we want for any output format (such as literal |
Another thing we could implement here is making the |
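The look-ahead idea described above (treat a formatting mark as an "effective" opener only if it closes somewhere later, and escape nothing else while inside that command) could be sketched roughly as follows. This is only an illustration under stated assumptions: `escapeEffectiveMarks` is a hypothetical name, not a function from mrdocs, and the sketch only handles the single-character marks `_ # * ` ~ ^`. Whether a leading backslash actually suppresses the formatting in Asciidoctor is a separate question.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper (not part of mrdocs): backslash-escape a formatting
// mark only when it is an "effective" opener, i.e. the same mark occurs
// again later in the text. Everything up to and including the matching
// closer is copied verbatim, so nothing inside the command is escaped.
std::string escapeEffectiveMarks(std::string_view text)
{
    // Single-character marks considered by this sketch (an assumption).
    constexpr std::string_view marks = "_#*`~^";
    std::string out;
    out.reserve(text.size());
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        char const c = text[i];
        if (marks.find(c) != std::string_view::npos)
        {
            // Look ahead for a closing occurrence of the same mark.
            std::size_t const close = text.find(c, i + 1);
            if (close != std::string_view::npos)
            {
                out += '\\';
                out.append(text.substr(i, close - i + 1));
                i = close; // resume scanning after the closer
                continue;
            }
        }
        out += c;
    }
    return out;
}
```

With this sketch, `*a*` becomes `\*a*` and `_*a*_` becomes `\_*a*_`, while an unmatched mark such as the one in `a*b` is left alone.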
I just did a small experiment with nested commands. The results are almost random. Sometimes, But at least we can identify what should be escaped based on the context. Nested formatters
_a_, _#a#_, _*a*_, _`a`_, _~a~_, _^a^_, _[[a]]_, _{a}_, _<<a>>_
#_a_#, #a#, #*a*#, #`a`#, #~a~#, #^a^#, #[[a]]#, #{a}#, #<<a>>#
*_a_*, *#a#*, *a*, *`a`*, *~a~*, *^a^*, *[[a]]*, *{a}*, *<<a>>*
`_a_`, `#a#`, `*a*`, `a`, `~a~`, `^a^`, `[[a]]`, `{a}`, `<<a>>`
~_a_~, ~#a#~, ~*a*~, ~`a`~, ~a~, ~^a^~, ~[[a]]~, ~{a}~, ~<<a>>~
^_a_^, ^#a#^, ^*a*^, ^`a`^, ^~a~^, ^a^, ^[[a]]^, ^{a}^, ^<<a>>^
[[_a_]], [[#a#]], [[*a*]], [[`a`]], [[~a~]], [[^a^]], [[a]], [[{a}]], [[<<a>>]]
{_a_}, {#a#}, {*a*}, {`a`}, {~a~}, {^a^}, {[[a]]}, {a}, {<<a>>}
<<_a_>>, <<#a#>>, <<*a*>>, <<`a`>>, <<~a~>>, <<^a^>>, <<[[a]]>>, <<{a}>>, <<a>>
Escape outer
\_a_, \_#a#_, \_*a*_, \_`a`_, \_~a~_, \_^a^_, \_[[a]]_, \_{a}_, \_<<a>>_
\#_a_#, \#a#, \#*a*#, \#`a`#, \#~a~#, \#^a^#, \#[[a]]#, \#{a}#, \#<<a>>#
\*_a_*, \*#a#*, \*a*, \*`a`*, \*~a~*, \*^a^*, \*[[a]]*, \*{a}*, \*<<a>>*
\`_a_`, \`#a#`, \`*a*`, \`a`, \`~a~`, \`^a^`, \`[[a]]`, \`{a}`, \`<<a>>`
\~_a_~, \~#a#~, \~*a*~, \~`a`~, \~a~, \~^a^~, \~[[a]]~, \~{a}~, \~<<a>>~
\^_a_^, \^#a#^, \^*a*^, \^`a`^, \^~a~^, \^a^, \^[[a]]^, \^{a}^, \^<<a>>^
\[[_a_]], \[[#a#]], \[[*a*]], \[[`a`]], \[[~a~]], \[[^a^]], \[[a]], \[[{a}]], \[[<<a>>]]
\{_a_}, \{#a#}, \{*a*}, \{`a`}, \{~a~}, \{^a^}, \{[[a]]}, \{a}, \{<<a>>}
\<<_a_>>, \<<#a#>>, \<<*a*>>, \<<`a`>>, \<<~a~>>, \<<^a^>>, \<<[[a]]>>, \<<{a}>>, \<<a>>
Escape inner
_a_, _\#a#_, _\*a*_, _\`a`_, _\~a~_, _\^a^_, _\[[a]]_, _\{a}_, _\<<a>>_
#\_a_#, #a#, #\*a*#, #\`a`#, #\~a~#, #\^a^#, #\[[a]]#, #\{a}#, #\<<a>>#
*\_a_*, *\#a#*, *a*, *\`a`*, *\~a~*, *\^a^*, *\[[a]]*, *\{a}*, *\<<a>>*
`\_a_`, `\#a#`, `\*a*`, `a`, `\~a~`, `\^a^`, `\[[a]]`, `\{a}`, `\<<a>>`
~\_a_~, ~\#a#~, ~\*a*~, ~\`a`~, ~a~, ~\^a^~, ~\[[a]]~, ~\{a}~, ~\<<a>>~
^\_a_^, ^\#a#^, ^\*a*^, ^\`a`^, ^\~a~^, ^a^, ^\[[a]]^, ^\{a}^, ^\<<a>>^
[[\_a_]], [[\#a#]], [[\*a*]], [[\`a`]], [[\~a~]], [[\^a^]], [[a]], [[\{a}]], [[\<<a>>]]
{\_a_}, {\#a#}, {\*a*}, {\`a`}, {\~a~}, {\^a^}, {\[[a]]}, {a}, {\<<a>>}
<<\_a_>>, <<\#a#>>, <<\*a*>>, <<\`a`>>, <<\~a~>>, <<\^a^>>, <<\[[a]]>>, <<\{a}>>, <<a>>
Escape both
\_a_, \_\#a#_, \_\*a*_, \_\`a`_, \_\~a~_, \_\^a^_, \_\[[a]]_, \_\{a}_, \_\<<a>>_
\#\_a_#, \#a#, \#\*a*#, \#\`a`#, \#\~a~#, \#\^a^#, \#\[[a]]#, \#\{a}#, \#\<<a>>#
\*\_a_*, \*\#a#*, \*a*, \*\`a`*, \*\~a~*, \*\^a^*, \*\[[a]]*, \*\{a}*, \*\<<a>>*
\`\_a_`, \`\#a#`, \`\*a*`, \`a`, \`\~a~`, \`\^a^`, \`\[[a]]`, \`\{a}`, \`\<<a>>`
\~\_a_~, \~\#a#~, \~\*a*~, \~\`a`~, \~a~, \~\^a^~, \~\[[a]]~, \~\{a}~, \~\<<a>>~
\^\_a_^, \^\#a#^, \^\*a*^, \^\`a`^, \^\~a~^, \^a^, \^\[[a]]^, \^\{a}^, \^\<<a>>^
\[[\_a_]], \[[\#a#]], \[[\*a*]], \[[\`a`]], \[[\~a~]], \[[\^a^]], \[[a]], \[[\{a}]], \[[\<<a>>]]
\{\_a_}, \{\#a#}, \{\*a*}, \{\`a`}, \{\~a~}, \{\^a^}, \{\[[a]]}, \{a}, \{\<<a>>}
\<<\_a_>>, \<<\#a#>>, \<<\*a*>>, \<<\`a`>>, \<<\~a~>>, \<<\^a^>>, \<<\[[a]]>>, \<<\{a}>>, \<<a>>
And here are the results:
Nested formatters
a, a, *a*, `a`, a, a, , {a}, [a] a, a, a, a, a, a,
a, a, a, a, a, a, [[a]], [[a]], [[a]], [[ {a}, {a}, {a}, {
Escape outer
_a_, _#a#_, _*a*_, _`a`_, _a_, _a_, __, _{a}_, _[a]_
#a#, #a#, #a#, #
*a*, *a*, *a*, *
`a`, `a`, `a`, `a`, `a`, `a`, ``, `{a}`, `[a]`
~a~, ~a~, ~a~, ~
^a^, ^a^, ^a^, ^
\[[a]], \[[a]], \[[a]], \[[
\{a}, \{a}, \{a}, \{
Escape inner
a, #a#, \*a*, \`a`, ~a~, ^a^, [[a]], {a}, <<a>>
_a_, a, *a*, `a`, ~a~, ^a^, [[a]], {a}, <<a>>
_a_, #a#, a, `a`, ~a~, ^a^, [[a]], {a}, <<a>>
_a_, #a#, *a*, `a`, a, ^a^, [[a]], {a}, <<a>>
_a_, #a#, *a*, `a`, ~a~, a, [[a]], {a}, <<a>>
{_a_}, {#a#}, {*a*}, {`a`}, {~a~}, {^a^}, {[[a]]}, {a}, {<<a>>}
Escape both
_a_, _\#a#_, _\*a*_, _\`a`_, _~a~_, _^a^_, _[[a]]_, _{a}_, _<<a>>_
#_a_#, #a#, #*a*#, #`a`#, #~a~#, #^a^#, #[[a]]#, #{a}#, #<<a>>#
*_a_*, *#a#*, *a*, *`a`*, *~a~*, *^a^*, *[[a]]*, *{a}*, *<<a>>*
`_a_`, `#a#`, `*a*`, `a`, `~a~`, `^a^`, `[[a]]`, `{a}`, `<<a>>`
~_a_~, ~#a#~, ~*a*~, ~`a`~, ~a~, ~^a^~, ~[[a]]~, ~{a}~, ~<<a>>~
^_a_^, ^#a#^, ^*a*^, ^`a`^, ^~a~^, ^a^, ^[[a]]^, ^{a}^, ^<<a>>^
[[_a_]], \[[#a#]], \[[*a*]], \[[`a`]], \[[~a~]], \[[^a^]], [[a]], \[[{a}]], \[[<<a>>]]
{_a_}, \{#a#}, \{*a*}, \{`a`}, \{~a~}, \{^a^}, \{[[a]]}, {a}, \{<<a>>}
<<_a_>>, <<#a#>>, \<<*a*>>, \<<`a`>>, \<<~a~>>, \<<^a^>>, \<<[[a]]>>, <<{a}>>, <<a>>
Replace passthrough-based escaping with universal escaping based on character substitutions: https://docs.asciidoctor.org/asciidoc/latest/subs/replacements/ fix cppalliance#763
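To illustrate what escaping by character substitution could look like, here is a minimal sketch that replaces every character that might be read as markup with a decimal character reference. The assumption is that such references survive into the output where Asciidoctor's replacements substitution is applied, which may not hold in verbatim contexts. `escapeAdoc` and the set of characters treated as special are assumptions made for this example; this is not the code from the referenced commit.

```cpp
#include <string>
#include <string_view>

// Hypothetical helper (not the mrdocs implementation): replace characters
// that AsciiDoc may interpret as markup with decimal character references,
// e.g. '<' becomes "&#60;" and '*' becomes "&#42;".
std::string escapeAdoc(std::string_view text)
{
    // Characters treated as special in this sketch (an assumption).
    constexpr std::string_view special = "<>&_#*`~^[]{}|+\\";
    std::string out;
    out.reserve(text.size());
    for (char const c : text)
    {
        if (special.find(c) != std::string_view::npos)
        {
            out += "&#";
            out += std::to_string(static_cast<unsigned char>(c));
            out += ';';
        }
        else
        {
            out += c;
        }
    }
    return out;
}
```

For example, `escapeAdoc("std::vector<T>")` yields `std::vector&#60;T&#62;`. Because every special character is replaced unconditionally, this approach needs no knowledge of the surrounding formatting context.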
The following generates problems:
For repro/trait/value.adoc, it generates:
Which renders as:
I'm worried that we're trying to solve an unsolvable problem, as the lack of a robust escaping strategy has been an open issue in asciidoc for 10 years (asciidoctor/asciidoctor#901).