underscore conversion for italics broken in pandoc-2.4 #5053

Closed
ousia opened this issue Nov 7, 2018 · 9 comments

Contributor

ousia commented Nov 7, 2018

pandoc-2.4 converts

—_legibility_—

into

<p>—_legibility_—</p>

pandoc-2.3 parsed it as:

<p>—<em>legibility</em>—</p>

I’m afraid it might be a bug.

Owner

jgm commented Nov 7, 2018

My guess is that this is due to commit 9b0bd4e

Owner

jgm commented Nov 7, 2018

Note: commit 9b0bd4e made possible the fix to #4635, so simply reverting it breaks that test.

The whole lastStrPos thing is a kludge due to not being able to look backwards in the token stream when parsing. The fix for #4635 relies on "(" and other symbols triggering an update to lastStrPos, but the parser for _ emphasis apparently relied on lastStrPos being updated only by regular non-symbol strings.

It would be nice to find a cleaner overall solution. But a simple fix for this would also be welcome.
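To make the interaction described above concrete, here is a toy model in Python. It is not pandoc's Haskell code: the function name render_emphasis and the flag symbols_update_last_str_pos are hypothetical, and the opener rule ("an underscore opens emphasis only when it does not sit directly after the last string") is an assumption based on the description in this thread.

```python
def render_emphasis(text, symbols_update_last_str_pos):
    """Toy scanner for _emphasis_ that mimics a 'last string position' check.

    last_str_pos is the index just past the most recent string token.
    An underscore may open emphasis only if it is NOT at that index,
    which is what keeps intraword underscores (foo_bar_baz) literal.
    """
    out = []
    last_str_pos = None
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "_":
            close = text.find("_", i + 1)
            # Opener: not directly after a string, and a non-empty body exists.
            if i != last_str_pos and close > i + 1:
                out.append("<em>" + text[i + 1:close] + "</em>")
                last_str_pos = close  # the emphasised word ended here
                i = close + 1
                continue
            out.append(ch)  # treated as a literal underscore
            i += 1
        elif ch.isalnum():
            # A run of alphanumerics is a regular string token.
            j = i
            while j < len(text) and text[j].isalnum():
                j += 1
            out.append(text[i:j])
            last_str_pos = j
            i = j
        else:
            out.append(ch)
            # Assumed 2.4 behaviour: symbols also move last_str_pos.
            if symbols_update_last_str_pos and not ch.isspace():
                last_str_pos = i + 1
            i += 1
    return "".join(out)


# Symbols update the position: the underscore after a symbol stays literal.
print(render_emphasis("—_legibility_—", True))
print(render_emphasis("_filename_|_filetype_", True))
# Only regular strings update the position: emphasis is recognised again.
print(render_emphasis("—_legibility_—", False))
print(render_emphasis("_filename_|_filetype_", False))
```

With the flag set, the underscore that follows a symbol looks as if it followed a word, so it is rejected as an opener. With the flag cleared, both inputs from this thread produce emphasis, while foo_bar_baz stays literal in either mode.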


bpj commented Nov 11, 2018

Workaround:

—[_legibility_]{}—

outputs as

<p>—<span><em>legibility</em></span>—</p>

Tip: add a dummy class so that you can remove the hack with a filter once the bug gets fixed!
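A minimal sketch of such a clean-up step, assuming the workaround is written as —[_legibility_]{.emfix}— (the class name emfix is made up for illustration). It works directly on pandoc's JSON AST using only the Python standard library. The function names unwrap, is_dummy_span and run are hypothetical helpers, not part of any pandoc library.

```python
import json

DUMMY_CLASS = "emfix"  # hypothetical marker class used in the workaround


def is_dummy_span(node):
    """True for a Span element whose class list contains DUMMY_CLASS."""
    if not (isinstance(node, dict) and node.get("t") == "Span"):
        return False
    # A Span's content is [[identifier, classes, key-value pairs], inlines].
    (_ident, classes, _keyvals), _inlines = node["c"]
    return DUMMY_CLASS in classes


def unwrap(node):
    """Rebuild the AST, replacing each dummy Span with its own contents."""
    if isinstance(node, list):
        out = []
        for item in node:
            if is_dummy_span(item):
                out.extend(unwrap(item["c"][1]))  # splice the inlines in place
            else:
                out.append(unwrap(item))
        return out
    if isinstance(node, dict):
        return {key: unwrap(value) for key, value in node.items()}
    return node


def run(stdin, stdout):
    """Read a JSON AST from stdin and write the cleaned AST to stdout."""
    json.dump(unwrap(json.load(stdin)), stdout)
```

To use it as a JSON filter, a script would end with a call to run(sys.stdin, sys.stdout) and be passed to pandoc via --filter. Spans without the marker class are left untouched.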

Contributor

agusmba commented Nov 12, 2018

Another workaround would be to switch the order, wouldn't it?

_—legibility—_

Contributor Author

ousia commented Nov 12, 2018

@bpj, another workaround would be to use the <em> tag.

@agusmba, I’m afraid the reply is straightforward:

[image: no-1]

The glyph for the em-dash for roman and italic fonts isn’t the same (at least, in TeX Gyre Pagella).

Contributor

agusmba commented Nov 13, 2018

Ah, I didn't zoom all the way in to see the difference.
For a moment I thought you were shouting back!
Sorry for the noise.

Although in the example, shouldn't both em-dashes be in italic?

Contributor Author

ousia commented Nov 13, 2018

@agusmba, I only intended to show the difference between the roman and italic glyphs (it was the best way that came to mind).


craigbarnes commented Nov 14, 2018

Another example that seems to trigger this regression is:

_filename_|_filetype_

Before 2.4, it used to render as 2 emphasis spans, but now it renders as:

<p><em>filename</em>|_filetype_</p>

Owner

jgm commented Nov 26, 2018

Fixed by edc6510
