Dokuwiki output: List elements with double line breaks fall back to HTML #7413

Closed
joneuhauser opened this issue Jun 27, 2021 · 10 comments

@joneuhauser

joneuhauser commented Jun 27, 2021

Consider the following Pandoc Markdown:

-   This is the first item

-   This is an item

    with a double line break

Converting it with pandoc file.txt -f markdown -t dokuwiki -o output.txt

results in

<HTML><ul></HTML>
<HTML><li></HTML><HTML><p></HTML>This is the first item<HTML></p></HTML><HTML></li></HTML>
<HTML><li></HTML><HTML><p></HTML>This is an item<HTML></p></HTML>
<HTML><p></HTML>with a double line break<HTML></p></HTML><HTML></li></HTML><HTML></ul></HTML>

which is overly complicated: DokuWiki's documentation recommends the following for this case (producing output very similar to Pandoc's Markdown):

  * This is the first item
  * This is an item \\ \\ with a double line break

which renders like this in native DokuWiki:
[image: rendered list]

Since allowing HTML in Dokuwiki is a potential security problem, I'd expect Pandoc to try to convert as much as possible without relying on native HTML tags.

Note: The file I'm originally starting from is a LaTeX file with a double line break in a list item, but the issue is reproducible with Pandoc's markdown format as well.

Pandoc version:

pandoc 2.14.0.3
Compiled with pandoc-types 1.22, texmath 0.12.3, skylighting 0.10.5.2,
citeproc 0.4.0.1, ipynb 0.1.0.1

Ubuntu 20.04
@jgm
Owner

jgm commented Jun 28, 2021

Semantically, a paragraph with internal hard line breaks is different from two paragraphs. So the output you're recommending would change the meaning.
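
To illustrate the point, here is a minimal Python sketch (not pandoc's code; the `BREAK` token and both function names are hypothetical) that models a list item as a list of paragraphs, each a list of inline tokens. Rendering paragraph boundaries as two DokuWiki forced breaks (`\\ \\`) maps two structurally different items to the same output, so the distinction is lost:

```python
BREAK = object()  # hypothetical stand-in for a hard line break inside a paragraph


def render_inlines(tokens):
    """Render inline tokens, writing each hard break as DokuWiki's two backslashes."""
    return " ".join("\\\\" if tok is BREAK else tok for tok in tokens)


def render_item_flat(paragraphs):
    """Flatten all paragraphs of one list item onto a single DokuWiki line.

    Paragraph boundaries are written as two forced breaks, as proposed
    in the issue description.
    """
    flat = []
    for index, paragraph in enumerate(paragraphs):
        if index > 0:
            flat += [BREAK, BREAK]
        flat += paragraph
    return "  * " + render_inlines(flat)


# Two paragraphs inside one list item.
two_paragraphs = [["This is an item"], ["with a double line break"]]
# One paragraph that contains two hard line breaks.
one_paragraph = [["This is an item", BREAK, BREAK, "with a double line break"]]

print(render_item_flat(two_paragraphs))
# Both inputs produce the same line, so the paragraph structure cannot be recovered.
assert render_item_flat(two_paragraphs) == render_item_flat(one_paragraph)
```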

@xrat

xrat commented Sep 18, 2024

Please note that the use of HTML in DokuWiki is strongly discouraged since DokuWiki v2023-04-04 "Jack Jackrum":

The options to embed HTML and PHP have been completely removed for security reasons

@jgm
Owner

jgm commented Sep 18, 2024

@xrat that is good to know. What we should probably do is make dokuwiki output sensitive to the raw_html extension, and turn this OFF by default.

However, this would require finding some sensible alternative to use in lists with complex block-level content. Is there anything in current dokuwiki that would be the equivalent of, for example, the markdown list item

1.  this is a paragraph.

    another.

    > and a block quote

    ```ruby
     and.some.code()
    ```

    - and a nested list here
    - with more items

Any pandoc writer has to render this somehow. Currently we fall back to raw HTML in such cases for dokuwiki. What is the alternative?

@xrat

xrat commented Sep 19, 2024

However, this would require finding some sensible alternative to use in lists with complex block-level content.
Is there anything in current dokuwiki that would be the equivalent of, for example, the markdown list item

No, not in core DokuWiki without syntax extensions/plugins. Cf. wiki:syntax [DokuWiki]. There is even a FAQ "Multiline List Items" in faq:lists [DokuWiki]: "The list syntax expects you to put each item in a single line".

The following DokuWiki extensions/plugins are worth mentioning in this context:

  • adhoctags: Allows selected HTML tags (whitelist approach)
  • htmlok: Allows all HTML, discouraged for security reasons
  • wrap: Allows you to "wrap wiki text inside containers (divs or spans)"

To illustrate the situation, here are best-effort renditions of your example, except that I am using a two-line block quote:

Core DokuWiki list syntax (all on 1 line with the exception of <code> blocks):

  - this is a paragraph.\\ \\ another.\\ \\ > and a\\ > block quote\\ \\ <code ruby>
     and.some.code()
</code>
    - and a nested list here
    - with more items

which is rendered as

<li class="level1 node"><div class="li"> this is a paragraph.<br>
<br>
another.<br>
<br>
&gt; and a<br>
&gt; block quote<br>
<br>
<pre class="code ruby">     and.some.code()</pre>
</div>
<ol>
<li class="level2"><div class="li"> and a nested list here</div></li>
<li class="level2"><div class="li"> with more items</div></li>
</ol>
</li>

DokuWiki list syntax with wrap extension:

  - this is a paragraph \\ <WRAP>

another

> and a
> block quote

<code ruby>
     and.some.code()
</code>

  - and a nested list here
  - with more items
</WRAP>

renders as

<li class="level1"><div class="li"> this is a paragraph <br>
<div class="plugin_wrap">
<p>another</p>
<blockquote><div class="no">
 and a<br>
 block quote</div></blockquote>
<pre class="code ruby">     and.some.code()</pre>
<ol>
<li class="level1"><div class="li"> and a nested list here</div></li>
<li class="level1"><div class="li"> with more items</div></li>
</ol>
</div></div>
</li>

@jgm
Owner

jgm commented Sep 19, 2024

The WRAP extension looks like it might be the best approach. Is that in wide use?

@xrat

xrat commented Sep 19, 2024

DokuWiki collects voluntary statistics. According to this data, the wrap extension is actually the most popular one. Cf. the table at plugins [DokuWiki], accessed just now (not counting the "upgrade" plugin, for obvious reasons).

@jgm
Owner

jgm commented Sep 20, 2024

One more question about WRAP. Can you instead do this?

  - <WRAP>
this is a paragraph

another

> and a
> block quote

<code ruby>
     and.some.code()
</code>

  - and a nested list here
  - with more items
</WRAP>

That would be more uniform and easier to generate.
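
As a rough illustration of why the uniform form is easy to generate, here is a minimal Python sketch. It is not pandoc's implementation; `render_list_item` is a hypothetical helper, and it assumes every block of the item has already been rendered to DokuWiki text. A simple item stays on one line; anything else is wrapped as a whole:

```python
def render_list_item(blocks, ordered=True, level=1):
    """Render one list item from already-rendered DokuWiki blocks.

    A single one-line block uses plain list syntax. Any other content is
    placed inside <WRAP>...</WRAP> (requires the wrap plugin), with blank
    lines between blocks.
    """
    marker = "-" if ordered else "*"
    indent = "  " * level
    if len(blocks) == 1 and "\n" not in blocks[0]:
        return f"{indent}{marker} {blocks[0]}"
    body = "\n\n".join(blocks)
    return f"{indent}{marker} <WRAP>\n{body}\n</WRAP>"


blocks = [
    "this is a paragraph",
    "another",
    "> and a\n> block quote",
    "<code ruby>\n     and.some.code()\n</code>",
    "  - and a nested list here\n  - with more items",
]
print(render_list_item(blocks))
```

With these inputs the function prints the same text as the example above. The writer never has to decide which block may stay on the item's first line.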

@xrat

xrat commented Sep 20, 2024

I fully agree. And I have good news: It works.

<ol>
<li class="level1"><div class="li"> <div class="plugin_wrap">
<p>this is a paragraph</p>
<p>another</p>
<blockquote><div class="no">
 and a<br>
 block quote</div></blockquote>
<pre class="code ruby">     and.some.code()</pre>
<ol>
<li class="level1"><div class="li"> and a nested list here</div></li>
<li class="level1"><div class="li"> with more items</div></li>
</ol>
</div></div>
</li>
</ol>

(I've removed some newlines to make the code more compact, e.g. before </p> and </li>.)

@jgm
Owner

jgm commented Sep 20, 2024

Can it handle nested cases, where you have another <WRAP>..</WRAP> in the inner list?

@xrat

xrat commented Sep 21, 2024

Yes.
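
For reference, a nested case of the kind asked about could be generated recursively. The following Python sketch is only an assumption about what such a generator might look like, not pandoc's code: `render_list` and `render_item` are hypothetical names, an item is a list of blocks, and a block is either rendered DokuWiki text or a nested list of items. The exact markup it emits has not been checked against a DokuWiki installation here.

```python
def render_list(items, level=1):
    """Render an ordered list; each item is a list of blocks."""
    return "\n".join(render_item(blocks, level) for blocks in items)


def render_item(blocks, level):
    """Render one item, wrapping it in <WRAP> unless it is a single text line."""
    indent = "  " * level
    is_simple = (
        len(blocks) == 1
        and isinstance(blocks[0], str)
        and "\n" not in blocks[0]
    )
    if is_simple:
        return f"{indent}- {blocks[0]}"
    # Inside a WRAP container a nested list starts again at level 1,
    # as in the rendered examples earlier in this thread.
    parts = [
        block if isinstance(block, str) else render_list(block, level=1)
        for block in blocks
    ]
    return f"{indent}- <WRAP>\n" + "\n\n".join(parts) + "\n</WRAP>"


inner = [
    ["inner paragraph", "second inner paragraph"],  # needs its own WRAP
    ["simple inner item"],
]
outer = [["outer paragraph", "second outer paragraph", inner]]
print(render_list(outer))
```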

@jgm jgm closed this as completed in 16a9df8 Sep 22, 2024