Latex reader: cannot use macro in image path? #3236

Closed
sgdjs opened this issue Nov 16, 2016 · 3 comments

Comments

@sgdjs

sgdjs commented Nov 16, 2016

So that the same LaTeX document can come out in different "flavors", I put a macro in the image path, like this:

\documentclass{report}

\usepackage{graphicx}
\newcommand{\mycolor}{red}

\begin{document}

\includegraphics[width=17cm]{\mycolor /header}
Magnificent \mycolor{} header.

\end{document}

So with images named the same in both red and blue folders, changing the value of mycolor once changes all the images of the file.

Pandoc's conversion to HTML expands only the second macro correctly: pandoc source.tex -o index.html --default-image-extension=png gives

<p><img src="\mycolor /header.png" alt="image" width="642" /></p>
<p>Magnificent <span>red</span> header.</p>

How can I convert such a LaTeX document to HTML with images? Thanks.

@cagix
Contributor

cagix commented Nov 17, 2016

As far as I can tell, the arguments of LaTeX macros (and environments) are passed to the writer as "raw tex". This works fine for the LaTeX writer, since LaTeX itself can expand your macro inside \includegraphics. The HTML writer, however, does not understand the TeX code inside \includegraphics (i.e. \mycolor /header in the given example), so the argument is processed neither by pandoc nor by the HTML writer (it stays "raw tex").

You could try some kind of pandoc filter, like

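import re

from pandocfilters import toJSONFilter, Str, Image

# Matches a raw \includegraphics command and captures its path argument.
image = re.compile(r'\\includegraphics.*?\{(.*)\}$')

def textohtml(key, value, format, meta):
    # Turn raw TeX \includegraphics inlines into Image elements.
    if key == 'RawInline':
        fmt, s = value
        if fmt == "tex":
            m = image.match(s)
            if m:
                # pandocfilters < 1.3.0 signature: Image(inlines, [url, title])
                return Image([Str("description")], [m.group(1), ""])

if __name__ == "__main__":
    toJSONFilter(textohtml)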

(Not tested. It should work for pandocfilters < 1.3.0; Image takes some more parameters in pandocfilters >= 1.3.0.)

@sgdjs
Author

sgdjs commented Nov 17, 2016

Hello, thanks for the answer. I have not used filters before but I'll test your solution when I can. I suppose this issue can be closed if it's the best way.
Thanks again

@cagix
Contributor

cagix commented Nov 18, 2016

Hmmm, it seems, I had an "out of coffee exception" without proper exception handler in place ;(

The proposed solution will not solve your problem completely, since the argument of \includegraphics is still not processed, i.e. you would still end up with <img src="\mycolor /header.png"> in the HTML. The filter needs to process m.group(1) before returning the image ...
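
A minimal sketch of that missing step, under some loud assumptions: the macro table MACROS is hypothetical and has to be kept in sync with the \newcommand definitions by hand, only argument-less macros are handled, and the image is assumed to reach the filter as an Image element whose URL still contains the macro (as the HTML output above suggests). The function names are made up for illustration. Not tested against a real pandoc run.

```python
import re

# Hypothetical macro table; in practice it has to mirror the \newcommand
# definitions in the LaTeX source (here: \newcommand{\mycolor}{red}).
MACROS = {"mycolor": "red"}

# A control word, plus the "{}" or the whitespace that TeX would swallow after it.
MACRO_CALL = re.compile(r"\\([A-Za-z]+)(?:\{\}|\s*)")


def expand_macros(text, macros=MACROS):
    """Replace known argument-less macros in text; leave unknown ones untouched."""
    def substitute(match):
        return macros.get(match.group(1), match.group(0))
    return MACRO_CALL.sub(substitute, text)


def expand_image_path(key, value, format, meta):
    """Filter action: rewrite the URL of every Image element."""
    if key != "Image":
        return None  # leave all other elements unchanged
    # The target [url, title] is the last item in both AST layouts:
    # [inlines, target] (older) and [attr, inlines, target] (newer).
    *head, (url, title) = value
    return {"t": "Image", "c": head + [[expand_macros(url), title]]}
```

To run it as an actual filter, expand_image_path could be handed to pandocfilters' toJSONFilter in an `if __name__ == "__main__":` block, and the script passed to pandoc with --filter.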

jgm added a commit that referenced this issue Jul 5, 2017
This rewrite is primarily motivated by the need to
get macros working properly (#982, #934, #3779, #3236,
 #1390, #2888, #2118).

We now tokenize the input text, then parse the token stream.
Macros modify the token stream, so they should now be effective in any
context, including math. (Thus, we no longer need the clunky macro
processing capacities of texmath.)

A custom state LaTeXState is used instead of ParserState.
This, plus the tokenization, will require some rewriting
of the exported functions rawLaTeXInline, inlineCommand,
rawLaTeXBlock.
jgm added a commit that referenced this issue Jul 6, 2017
This rewrite is primarily motivated by the need to
get macros working properly (#982, #934, #3779, #3236,
 #1390, #2888, #2118).  A side benefit is that the
reader is significantly faster (27s -> 19s in one
benchmark, and there is a lot of room for further
optimization).

We now tokenize the input text, then parse the token stream.

Macros modify the token stream, so they should now be effective
in any context, including math. Thus, we no longer need the clunky
macro processing capacities of texmath.

A custom state LaTeXState is used instead of ParserState.
This, plus the tokenization, will require some rewriting
of the exported functions rawLaTeXInline, inlineCommand,
rawLaTeXBlock.

* Added Text.Pandoc.Readers.LaTeX.Types (new exported module).
  Exports Macro, Tok, TokType, Line, Column.  [API change]
* Text.Pandoc.Parsing: adjusted type of `insertIncludedFile`
  so it can be used with token parser.
* Removed old texmath macro stuff from Parsing.
  Use Macro from Text.Pandoc.Readers.LaTeX.Types instead.
* Removed texmath macro material from Markdown reader.
* Changed types for Text.Pandoc.Readers.LaTeX's
  rawLaTeXInline and rawLaTeXBlock.  (Both now return a String,
  and they are polymorphic in state.)
* Added orgMacros field to OrgState.  [API change]
* Removed readerApplyMacros from ReaderOptions.
  Now we just check the `latex_macros` reader extension.
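
As a rough illustration of the idea described in the commit message above (tokenize first, then let macros rewrite the token stream), here is a toy Python sketch. It is not pandoc's implementation, which is written in Haskell; the token format, the function names, and the restriction to argument-less macros are all assumptions made for the example.

```python
import re
from collections import deque

# Toy tokenizer: control words (swallowing trailing blanks, as TeX does),
# other backslash escapes, braces, and runs of plain text.
TOKEN = re.compile(r"\\(?P<word>[A-Za-z]+)[ \t]*|\\.?|[{}]|[^\\{}]+", re.DOTALL)


def tokenize(src):
    """Split LaTeX source into (kind, text) tokens."""
    tokens = []
    for m in TOKEN.finditer(src):
        if m.group("word"):
            tokens.append(("ctrl", m.group("word")))
        else:
            tokens.append(("text", m.group(0)))
    return tokens


def expand(tokens, macros):
    """Replace macro tokens by the tokens of their definition.

    Expansions are pushed back onto the input, so nested macros expand too.
    No recursion guard: a self-referential macro would loop forever.
    """
    pending = deque(tokens)
    out = []
    while pending:
        kind, text = pending.popleft()
        if kind == "ctrl" and text in macros:
            pending.extendleft(reversed(tokenize(macros[text])))
        else:
            out.append((kind, text))
    return out


def render(tokens):
    """Turn tokens back into source, re-inserting a blank where one is needed."""
    parts = []
    for i, (kind, text) in enumerate(tokens):
        if kind == "ctrl":
            nxt = tokens[i + 1][1] if i + 1 < len(tokens) else ""
            parts.append("\\" + text + (" " if nxt[:1].isalpha() else ""))
        else:
            parts.append(text)
    return "".join(parts)


macros = {"mycolor": "red"}  # stands in for \newcommand{\mycolor}{red}
source = r"\includegraphics[width=17cm]{\mycolor /header}"
expanded = render(expand(tokenize(source), macros))
```

Because expansion happens on tokens before any command-specific parsing, the macro is replaced no matter where it occurs, including inside the path argument of \includegraphics.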
jgm added a commit that referenced this issue Jul 7, 2017
@jgm jgm closed this as completed in 0feb750 Jul 7, 2017