when --toc, GFM now outputs raw HTML instead of Markdown syntax for TOC #8131

Closed
cderv opened this issue Jun 17, 2022 · 5 comments

Comments

@cderv
Contributor

cderv commented Jun 17, 2022

This is a follow-up to https://groups.google.com/g/pandoc-discuss/c/gQZrKunCvB4/m/j7177M-3BwAJ, which remained unanswered.

With recent Pandoc (here 2.18), we get this kind of output:

❯ ./pandoc --to gfm -f markdown --toc -s
# Head

Content
^Z
- <a href="#head" id="toc-head">Head</a>

# Head

 Content

I believe this is a consequence of the change in #7907. Because that change adds ids to all TOC elements, the writer falls back to raw HTML to keep the id in the Markdown output whenever raw_html is enabled.
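
To make the suspected mechanism concrete, here is a toy Python model of the fallback. It is only a sketch of the behavior observed above, not pandoc's actual writer code; the function name `render_link` and its parameters are made up for illustration.

```python
from html import escape


def render_link(text, target, attrs=None, raw_html=True):
    """Toy model of the observed fallback; NOT pandoc's implementation."""
    attrs = attrs or {}
    if not attrs:
        # No attributes to preserve: plain Markdown link syntax is enough.
        return f"[{text}]({target})"
    if raw_html:
        # Attributes cannot be written in GFM link syntax, so the whole
        # link is emitted as raw HTML in order to keep them.
        attr_str = "".join(f' {k}="{escape(v)}"' for k, v in attrs.items())
        return f'<a href="{escape(target)}"{attr_str}>{text}</a>'
    # raw_html disabled: the attributes are silently dropped.
    return f"[{text}]({target})"


print(render_link("Head", "#head", {"id": "toc-head"}))
print(render_link("HEAD", "#head", {"id": "toc-head"}, raw_html=False))
```

Under these assumptions, the first call reproduces the raw HTML entry shown above, and the second reproduces the Markdown entry obtained with gfm-raw_html.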

Is it expected that enabling the TOC for Markdown output now always produces HTML unless the raw_html extension is disabled?

❯ pandoc --to gfm-raw_html -f markdown --toc -s
# HEAD

Content
^Z
-   [HEAD](#head)

# HEAD

Content

I am not sure that adding an ID to TOC links by default is expected for Markdown output. For HTML, I guess it does not hurt.
Perhaps the ID on TOC links could be off by default for Markdown output, or be part of an extension that can be disabled?

We have adapted to the TOC now being raw HTML, but I wanted to bring this change up for discussion and confirm that it is expected and not an unnoticed side effect.

Thanks.

@jgm
Owner

jgm commented Jun 17, 2022

It's a consequence of #7907. I can see why this would be undesirable for markdown output if the markdown flavor doesn't allow you to encode the ID attribute.

I can think of a few potential solutions:

  1. If the markdown flavor doesn't support attributes, then omit the id attribute on the links produced in the TOC.
  2. Modify the writer for Link so that, if the link contains attributes, instead of falling back to HTML for the whole link, we wrap the link in raw HTML tags <span ...attributes> .. </span>. This would still be a bit ugly in gfm output, but less ugly.

1 seems simplest, but it's hard to predict whether people are already relying on these anchors for their gfm output.
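
As a rough illustration of what each option would produce for a single TOC entry, here is a small Python sketch. It models only the resulting strings, not the writer itself; the function names are hypothetical, and the placement of the span in option 2 is one possible reading of the proposal.

```python
def toc_entry_option1(text, target, ident, supports_attributes):
    """Option 1: keep the id only when the flavor can express attributes."""
    if supports_attributes:
        # pandoc-style link attribute syntax, e.g. [Head](#head){#toc-head}
        return f"[{text}]({target}){{#{ident}}}"
    # The flavor (e.g. gfm) cannot encode the id: omit it entirely.
    return f"[{text}]({target})"


def toc_entry_option2(text, target, ident):
    """Option 2: keep Markdown link syntax and carry the id on a raw span."""
    return f'[<span id="{ident}">{text}</span>]({target})'


print(toc_entry_option1("Head", "#head", "toc-head", supports_attributes=False))
print(toc_entry_option2("Head", "#head", "toc-head"))
```

Option 1 yields a pure Markdown entry for gfm at the cost of losing the anchor; option 2 keeps the anchor but still leaves some raw HTML in the output.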

@cderv
Contributor Author

cderv commented Jun 17, 2022

1 is indeed the simplest, and probably what I would have expected when I asked on pandoc-discuss. I should perhaps have opened an issue here back then, since I discovered this when it was only in the nightly build. I understand that 2 could be more desirable now, in case people are already relying on the anchors. Sorry for that.

Regarding 2, do you mean something like how it was before with --toc --number-sections?

❯ ./pandoc --to gfm -f markdown --toc -s --number-sections
# Head

Content
^Z
-   [<span class="toc-section-number">1</span> Head](#head)

# Head

Content

With Pandoc 2.18, following the ID change, this is now:

❯ pandoc --to gfm -f markdown --toc -s -N
# Head

Content
^Z
-   <a href="#head" id="toc-head"><span class="toc-section-number">1</span>
    Head</a>

# Head

Content

If so, this would give two nested spans in this case:

-  [<span id="toc-head"><span class="toc-section-number">1</span> Head</span>](#head)

Unless we can merge them into one and put the ID toc-head on the number span, or --number-sections stops adding numbers here altogether, since the headers in the body won't be numbered and only the TOC entries will be (I believe --number-sections does not really work for GFM).
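
To compare the two span layouts side by side, here is a minimal Python sketch. It only builds the strings under discussion; the function names are hypothetical and nothing here reflects how pandoc would implement either variant.

```python
def nested_spans(number, title, target, ident):
    """Outer span carries the id, inner span carries the section number."""
    inner = f'<span class="toc-section-number">{number}</span> {title}'
    return f'[<span id="{ident}">{inner}</span>]({target})'


def merged_span(number, title, target, ident):
    """Single span: the id is attached to the section-number span itself."""
    span = f'<span id="{ident}" class="toc-section-number">{number}</span>'
    return f"[{span} {title}]({target})"


print(nested_spans("1", "Head", "#head", "toc-head"))
print(merged_span("1", "Head", "#head", "toc-head"))
```

The merged form is shorter, but it only works when a section number is present to carry the id.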

@tarleb
Collaborator

tarleb commented Dec 18, 2022

I'd strongly prefer the first option; the switch to a new epoch version seems like a good opportunity to introduce some minor breakage. Besides, I'd be quite surprised to learn that people rely on this when targeting gfm.

@jgm
Owner

jgm commented Dec 18, 2022

Yes, the first option seems best to me too.

@tarleb
Collaborator

tarleb commented Dec 20, 2022

If #8485 gets merged in its current form then the old behavior could be restored with a custom writer:

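-- Use pandoc's default Markdown template for standalone output.
Template = pandoc.template.default 'markdown'

function Writer (doc, opts)
  -- Build the table of contents from the document's headings.
  local toc = pandoc.structure.table_of_contents(doc)
  -- Render the TOC to GFM on its own and hand it to the template.
  opts.variables['table-of-contents'] =
    pandoc.write(pandoc.Pandoc{toc}, 'gfm')
  -- Render the document itself as GFM with the modified options.
  return pandoc.write(doc, 'gfm', opts)
end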
