YAML metadata block treated as table since 2.8.0.1 #6823

Closed
Zash opened this issue Nov 10, 2020 · 2 comments
Zash commented Nov 10, 2020

I have a tool that converts some XML format into Markdown so that I can use Pandoc to turn it into EPUB for reading on an e-ink device.

It doesn't trim whitespace everywhere, so given input like

<doc>
  <abstract>
    Lorem ipsum
  </abstract>
</doc>

it ended up injecting "\n Lorem ipsum\n " into the YAML metadata. This somehow resulted in the metadata block and the first section being treated as a table.

With the whitespace in the abstract trimmed, it produces the expected result.
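
For illustration, here is a minimal Python sketch of the kind of trimming described above. It is not the actual conversion tool, whose language and structure are not shown in this report. The function names `abstract_from_xml` and `yaml_metadata_line` are hypothetical, and using `json.dumps` for quoting assumes that a JSON string is acceptable as a double-quoted YAML scalar.

```python
import json
import xml.etree.ElementTree as ET


def abstract_from_xml(xml_text: str) -> str:
    """Return the <abstract> text with whitespace trimmed and collapsed."""
    root = ET.fromstring(xml_text)
    node = root.find("abstract")  # direct child of the root element
    raw = node.text if node is not None and node.text else ""
    # str.split() with no arguments drops leading/trailing whitespace and
    # collapses internal runs (including newlines) into single spaces.
    return " ".join(raw.split())


def yaml_metadata_line(key: str, value: str) -> str:
    """Format one metadata line, quoting the value as a JSON string."""
    # Assumption: a JSON string is also a valid double-quoted YAML scalar.
    return f"{key}: {json.dumps(value, ensure_ascii=False)}"


doc = "<doc>\n  <abstract>\n    Lorem ipsum\n  </abstract>\n</doc>"
print(yaml_metadata_line("abstract", abstract_from_xml(doc)))
```

Collapsing internal whitespace as well as trimming the ends is a design choice of this sketch; trimming only the ends with `str.strip()` would also remove the leading newline and indentation.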

Here's a reduced example that produces the bug:

---
author: John Doe <jdoe@example.com>
title: 'ABC-1234: Lorem ipsum'
abstract: "\n      Nullam blandit imperdiet venenatis. Sed efficitur euismod nisi ut
  varius malesuada.\n    "
...

Introduction {#intro}
============

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Some subsection {#subsection}
---------------

Fusce eget massa risus. Sed dolor risus, posuere vel est eget, pellentesque sagittis risus.

I bisected using the released Debian packages and found that it works in 2.8 but not in 2.8.0.1.
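
A comparison between versions like this could be automated. The Python sketch below is an illustration only, not the procedure actually used for the bisect. It assumes the installed `pandoc` accepts `-f markdown -t json` and that its JSON output has top-level `meta` and `blocks` keys. The function names and the file name `reduced.md` are hypothetical.

```python
import json
import subprocess


def looks_like_regression(ast: dict) -> bool:
    """True if metadata is empty and the first block is a Table.

    This matches the symptom reported in this issue: the YAML block is
    not parsed as metadata and the document starts with a table.
    """
    blocks = ast.get("blocks", [])
    first_is_table = bool(blocks) and blocks[0].get("t") == "Table"
    return not ast.get("meta") and first_is_table


def pandoc_ast(markdown_path: str) -> dict:
    """Run the installed pandoc and return its JSON AST as a dict."""
    result = subprocess.run(
        ["pandoc", "-f", "markdown", "-t", "json", markdown_path],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


# Usage, assuming the reduced example is saved as reduced.md (hypothetical):
#     print(looks_like_regression(pandoc_ast("reduced.md")))
```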

pandoc 2.8

Installed pandoc-2.8-1-amd64.deb from the releases on Debian 10.

pandoc -t markdown -s output:

---
abstract: Nullam blandit imperdiet venenatis. Sed efficitur euismod nisi
  ut varius malesuada.
author: 'John Doe <jdoe@example.com>'
title: 'ABC-1234: Lorem ipsum'
---

Introduction {#intro}
============

Lorem ipsum dolor sit amet, consectetur adipiscing elit.

Some subsection {#subsection}
---------------

Fusce eget massa risus. Sed dolor risus, posuere vel est eget,
pellentesque sagittis risus.

This is what I expected.

-t native output:

Pandoc (Meta {unMeta = fromList [("abstract",MetaInlines [Str "Nullam",Space,Str "blandit",Space,Str "imperdiet",Space,Str "venenatis.",Space,Str "Sed",Space,Str "efficitur",Space,Str "euismod",Space,Str "nisi",Space,Str "ut",Space,Str "varius",Space,Str "malesuada."]),("author",MetaInlines [Str "John",Space,Str "Doe",Space,Link ("",["email"],[]) [Str "jdoe@example.com"] ("mailto:jdoe@example.com","")]),("title",MetaInlines [Str "ABC-1234:",Space,Str "Lorem",Space,Str "ipsum"])]})
[Header 1 ("intro",[],[]) [Str "Introduction"]
,Para [Str "Lorem",Space,Str "ipsum",Space,Str "dolor",Space,Str "sit",Space,Str "amet,",Space,Str "consectetur",Space,Str "adipiscing",Space,Str "elit."]
,Header 2 ("subsection",[],[]) [Str "Some",Space,Str "subsection"]
,Para [Str "Fusce",Space,Str "eget",Space,Str "massa",Space,Str "risus.",Space,Str "Sed",Space,Str "dolor",Space,Str "risus,",Space,Str "posuere",Space,Str "vel",Space,Str "est",Space,Str "eget,",Space,Str "pellentesque",Space,Str "sagittis",Space,Str "risus."]]

pandoc 2.8.0.1

Installed pandoc-2.8.0.1-1-amd64.deb from the releases on Debian 10.

Note how the YAML metadata and the first section become a single-column table.

  ---------------------------
  author: John Doe
  <jdoe@example.com> title:
  'ABC-1234: Lorem ipsum'
  abstract:
  "`\n      `{=tex}Nullam
  blandit imperdiet
  venenatis. Sed efficitur
  euismod nisi ut varius
  malesuada.`\n    `{=tex}"
  ...

  Introduction {\#intro}
  ============

  Lorem ipsum dolor sit amet,
  consectetur adipiscing
  elit.

  Some subsection
  {\#subsection}
  ---------------------------

Fusce eget massa risus. Sed dolor risus, posuere vel est eget,
pellentesque sagittis risus.

This is not what I expected.

-t native output:

Pandoc (Meta {unMeta = fromList []})
[Table [] [AlignDefault] [5.555555555555555e-2]
 [[]]
 [[[Plain [Str "author:",Space,Str "John",Space,Str "Doe",Space,Link ("",["email"],[]) [Str "jdoe@example.com"] ("mailto:jdoe@example.com",""),SoftBreak,Str "title:",Space,Quoted SingleQuote [Str "ABC-1234:",Space,Str "Lorem",Space,Str "ipsum"],SoftBreak,Str "abstract:",Space,Quoted DoubleQuote [RawInline (Format "tex") "\\n      ",Str "Nullam",Space,Str "blandit",Space,Str "imperdiet",Space,Str "venenatis.",Space,Str "Sed",Space,Str "efficitur",Space,Str "euismod",Space,Str "nisi",Space,Str "ut",SoftBreak,Str "varius",Space,Str "malesuada.",RawInline (Format "tex") "\\n    "],SoftBreak,Str "\8230"]]]
 ,[[Plain [Str "Introduction",Space,Str "{#intro}",SoftBreak,Str "============"]]]
 ,[[Plain [Str "Lorem",Space,Str "ipsum",Space,Str "dolor",Space,Str "sit",Space,Str "amet,",Space,Str "consectetur",Space,Str "adipiscing",Space,Str "elit."]]]
 ,[[Plain [Str "Some",Space,Str "subsection",Space,Str "{#subsection}"]]]]
,Para [Str "Fusce",Space,Str "eget",Space,Str "massa",Space,Str "risus.",Space,Str "Sed",Space,Str "dolor",Space,Str "risus,",Space,Str "posuere",Space,Str "vel",Space,Str "est",Space,Str "eget,",Space,Str "pellentesque",Space,Str "sagittis",Space,Str "risus."]]

jgm closed this as completed in 7d01887 Nov 10, 2020

jgm (Owner) commented Nov 10, 2020

Fixed the bug, but you should trim whitespace from the beginning of the value; otherwise it may be interpreted as a code block.
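
As a rough illustration of that advice, the Python sketch below flags metadata values whose first non-blank line is indented far enough to read as an indented code block. It assumes the usual Markdown rule of four spaces or one tab and simplifies it, ignoring mixed indentation. The function name `starts_as_indented_code` is hypothetical and not part of pandoc.

```python
def starts_as_indented_code(value: str) -> bool:
    """True if the first non-blank line begins with four spaces or a tab."""
    for line in value.splitlines():
        if line.strip():
            # Only the first non-blank line is inspected in this sketch.
            return line.startswith("    ") or line.startswith("\t")
    return False  # empty or whitespace-only value


# The abstract from the reduced example, after YAML unescapes "\n".
untrimmed = "\n      Nullam blandit imperdiet venenatis.\n    "
print(starts_as_indented_code(untrimmed))
print(starts_as_indented_code(untrimmed.strip()))
```

Calling `str.strip()` on the value before writing it into the metadata block removes the leading newline and indentation that trigger the check.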

Zash (Author) commented Nov 10, 2020

Thanks!

I'll double-check my metadata handling.
