Relative images are relative to working directory, not file #3752
Comments
Yes, current behavior is intended. Pandoc just acts on a stream of text that may come from files (possibly several files in different directories) or from stdin; it doesn't keep track of what directory the text came from. Thus See #852, which added a |
I can think of a somewhat complex way in which this might be improved. Not sure it's worth it, though. Instead of having the reader take a Text as argument, we could have it take something like a list of pairs of filenames and Texts:
Most of the readers use parsec parsers; we could define a custom Stream instance for
We could store the name of the current source file in the "common state" of the pandoc monad. The readers could then check for the image file, first in the local directory and then in the working directory or resource path, and adjust the path accordingly. (Note: we don't normally do anything like this until the writers.) This would help with your use case, at the expense of making the pandoc API considerably more complicated. Not sure it's worth it. @mb21 @jkr I'd be curious if you have any thoughts about this. @jkr, can |
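The idea of handing the reader a list of (filename, text) pairs instead of a single text can be illustrated outside of pandoc. The following is only a rough Python sketch, not pandoc's Haskell API: the name `lines_with_origin` and the tuple layout are assumptions made for illustration. It shows how a concatenated input stream could still remember which file each line came from.

```python
from typing import Iterator, List, Tuple

# Hypothetical input shape: one (filename, text) pair per source file.
Sources = List[Tuple[str, str]]


def lines_with_origin(sources: Sources) -> Iterator[Tuple[str, int, str]]:
    """Yield (filename, line number, line) for every line of every source.

    The sources are consumed in order, so the result behaves like one
    concatenated stream that keeps track of the file each line belongs to.
    """
    for name, text in sources:
        for lineno, line in enumerate(text.splitlines(), start=1):
            yield name, lineno, line
```

A parser working on such a stream could look up the current filename whenever it meets an image and resolve the path against that file's directory.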
That's a hard one. I've run into this (kind of unexpected behaviour) myself. Then again, it's really nice to have pandoc behave consistently whether input is piped in or read from a file. With that in mind, I don't think making those intrusive and complicating changes is worth it.
The unfortunate thing is that this is inconsistent with the way GitHub processes Markdown. So when I make my paths relative to the working directory the images render as broken in the online view. Maybe I'll just write a script to pre-process all my files into another directory before running pandoc... |
another thought: we could abstract the file handling from the readers, so they would only get a mediabag or similar interface of files to query which could be instantiated with either files from the working directory or current source file directory – depending on a command line setting for example. |
@mb21 Any change that would search for images in the working directory of the source file (when multiple files are specified on the command line) would have to keep track of which source file includes the given image, so we'd need the more complex interface I sketched above. I did have one thought for a more limited change: perhaps we could automatically set the resource path for images to include the directory of the first file argument. When |
@mb21 isn't this (specifying the image path via command line option) helpful also for multi-target publishing scenarios (HTML vs. PDF) where you need images in different resolution, which could be accomplished by changing folders? I find using the |
@jgm I did actually try |
A possible workaround for now could be writing a wrapper script that cd's into the correct directory, generates an output file for each input, and later concatenates the result. This is cumbersome, and I'm not even sure if it can be done with Another equally cumbersome workaround could be to pipe the markdown to I suppose we shall have to wait for the |
Does anyone know if |
Actually, |
I guess to emulate my desired behaviour I can provide all the directories as |
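Assuming the option meant here is pandoc's `--resource-path` (it is mentioned elsewhere in this thread), a wrapper could collect the directories of all input files and pass them as one search path. This is a minimal sketch; the helper name `resource_path_arg` is made up for illustration.

```python
import os
from pathlib import Path
from typing import Iterable


def resource_path_arg(input_files: Iterable[str]) -> str:
    """Build a --resource-path argument from the input files' directories.

    The working directory "." comes first, then each distinct input
    directory in order of first appearance. os.pathsep is ":" on
    Unix-like systems and ";" on Windows.
    """
    dirs = ["."]
    for f in input_files:
        d = Path(f).parent.as_posix()
        if d not in dirs:
            dirs.append(d)
    return "--resource-path=" + os.pathsep.join(dirs)
```

Note that this only emulates file-relative lookup: if two directories contain an image with the same name, the search order decides which one is used, not the file that contains the link.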
@Porges, depending on your workflow, wrappers that automatically cd into different directories such as |
@agusmba I'm generating a single file output from all the input files, so I don't think that would work. |
I came across this problem when exploring Pandoc. I ended up renaming each image to something custom, like the endpoint or the title of the post, then a hyphen, then the image name. Then I ran this script to generate the command to run. YMMV https://gist.github.com/jacebenson/f6eba3a293def19bf8184defbf274dcc
@tarleb With --resource-path=.., images defined using Markdown syntax work, but those defined using LaTeX syntax don't.
I've run into this issue as well. My directory structure is like this. And I just define input files like A filter like this one, except for pandoc 2, would work nicely. Except we'd still need a way to determine what folder our markdown files are in. #3342 (comment) |
I keep re-reading this sentence and not sure I get it. I mean, I get:
but why does this imply that:
Why do we have to call the reader with a list of inputs There seem to be four modes:
I'm not sure what's the best name for this |
I already kind of do this but lean on the shell to expand globs: |
@mb21 if you do it the way we do with file-scope, calling the reader for each one, then things like reference links and footnotes won't work between files (as with file-scope). |
@jgm I meant not do it exactly the same way... but in a similar way: call the reader multiple times in App.hs, once for each input file, but share/pass the state along each time. I guess that would mean exporting a second function from each reader, e.g. |
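To make the "call the reader once per file, but pass the state along" idea concrete, here is a toy Python sketch. It is not pandoc code: the "reader" below only collects reference definitions of the form `[label]: url`, and the names `read_with_state` and `read_all` are hypothetical.

```python
import re
from typing import Dict, List, Tuple

# Toy pattern for a reference definition such as "[home]: https://example.com".
REF_DEF = re.compile(r"^\[([^\]]+)\]:\s*(\S+)\s*$")


def read_with_state(text: str, state: Dict[str, str]) -> List[str]:
    """Read one input, recording reference definitions in the shared state.

    Returns the remaining body lines; `state` is updated in place.
    """
    body = []
    for line in text.splitlines():
        match = REF_DEF.match(line)
        if match:
            state[match.group(1).lower()] = match.group(2)
        else:
            body.append(line)
    return body


def read_all(sources: List[Tuple[str, str]]) -> Tuple[List[str], Dict[str, str]]:
    """Call the toy reader once per (filename, text) pair with shared state."""
    state: Dict[str, str] = {}
    body: List[str] = []
    for _name, text in sources:
        body.extend(read_with_state(text, state))
    return body, state
```

Because the state outlives each call, a reference used in one file can be defined in another, which is the property that a plain per-file invocation loses.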
It's tricky because state types are very heterogeneous. I suppose one approach would be to put something in PandocMonad for the inputs: Indeed, a lot of this logic is already included in our current facility for handling include files in latex, RST, and other formats. |
As for implementation, I'm thinking this would involve a new reader option and changes to the Markdown reader. It would also be possible to implement this as an AST transform after the reader, but this would still require changes to the reader, which would have to insert the needed information about the source location of the image or link elements. |
Btw I think there's an ambiguity in |
I would suggest |
I'm not quite sure what you have in mind here. I'm somewhat tempted to drop the argument and just always make the rewriting relative to the working directory. I can think of a few cases where the argument could be helpful, but in all those cases you could just change directories before calling pandoc to deal with the issue.
Oh, I thought |
Further simplification, removing the optional argument. (If there is a demand for this, we can always add it later without compromising backwards compatibility.)
Rewrite relative paths for Link and Image elements, depending The use of this option is best understood by example.
Without this option, you would have to use |
Implementing this would require:
I'm having the same issue with relative paths for ReST. The ReST documentation at https://docutils.sourceforge.io/docs/ref/rst/directives.html#including-an-external-document-fragment states
Therefore for a directory structure as follows with the following contents: foo.rst
bar.rst
This should work if I'm reading the specification of reST correctly. I'm not sure about the comment of @jgm #3752 (comment) , even if the main file passed could be from stdin such as using cat, the include statements still have an actual file location, and as the only file initially given, is the main file, the only assumption that would have to be made would be that about the location of the main file. In which case I would assume the working directory of pandoc.
It would make life so much easier, as one could write modules containing documentation and move them around freely without worrying about the context they live in.
One last piece of feedback from me, as a person who encountered this issue in the wild, I would also like to propose this: Hint for ambiguities Flags: nothing special
Hint for broken references Flags: nothing special
Ignore ambiguities if a specific flag is specified Flags: The pattern of prefixing with One shortcoming of this proposal is that it doesn't work well if you want to opt one Markdown file into rebasing relative paths but not another, though I don't envisage that being a popular use case.
Here are some more options that should be considered: Option 4: No extra command-line option. Do the image resolution (and load binary data into the media bag) in the Markdown reader. Always look for the image first relative to the file containing the image link, and then in the resource path. Modify the image path to match the first matching image. Emit an INFO message indicating which path has been matched. Emit a WARNING message if nothing is found (this is done already now, but in the writers, and only for formats that include image data and not just links). Potential advantages:
Potential drawbacks:
Option 5: Like Option 4, but activate this behavior only if a command-line option is used (say, |
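The lookup order described in Option 4 can be sketched in a few lines of Python. This is an illustration only, not how pandoc implements anything: the function name `resolve_image`, its parameters, and the use of Python's `logging` module for the INFO and WARNING messages are all assumptions.

```python
import logging
from pathlib import Path
from typing import Optional, Sequence

log = logging.getLogger("image-resolution-sketch")


def resolve_image(target: str, containing_file: str,
                  resource_path: Sequence[str] = (".",)) -> Optional[str]:
    """Return the first existing candidate path for an image, or None.

    Candidates are tried in this order:
      1. relative to the directory of the file containing the image link,
      2. relative to each directory of the resource path.
    """
    candidates = [Path(containing_file).parent / target]
    candidates += [Path(directory) / target for directory in resource_path]
    for candidate in candidates:
        if candidate.is_file():
            log.info("%s (referenced in %s) resolved to %s",
                     target, containing_file, candidate)
            return candidate.as_posix()
    log.warning("could not find image %s referenced in %s",
                target, containing_file)
    return None
```

Option 5 would wrap the same lookup behind a switch, so that the file-relative candidate is only considered when the user asks for it.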
- Add manual entry for `--rebase-relative-paths`.
- Add option `--rebase-relative-paths`, which rewrites relative image and link paths by prepending the (relative) directory of the containing file.
- Enable `rebase-relative-paths` in defaults files.
- Add `readerRebaseRelativePaths` to ReaderOptions record [API change].
- Make Markdown reader sensitive to `readerRebaseRelativePaths`.
- Add tests for #3752.

Closes #3752.
I've pushed a |
- Add manual entry for (non-default) extension `rebase_relative_paths`.
- Add constructor `Ext_rebase_relative_paths` to `Extensions` in Text.Pandoc.Extensions [API change]. When enabled, this extension rewrites relative image and link paths by prepending the (relative) directory of the containing file.
- Make Markdown reader sensitive to the new extension.
- Add tests for #3752.

Closes #3752.

NB. currently the extension applies to markdown and associated readers but not commonmark/gfm.
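The rewriting rule in this commit message ("prepending the (relative) directory of the containing file") can be illustrated with a small Python sketch. It is not pandoc's implementation; the function names are hypothetical, and leaving absolute paths, URLs, empty paths, and pure fragments untouched is an assumption about what counts as "relative".

```python
import posixpath
import re

# A URL scheme such as "https:" or "mailto:" at the start of the target.
SCHEME = re.compile(r"^[A-Za-z][A-Za-z0-9+.-]*:")


def is_relative(target: str) -> bool:
    """True for targets that are neither absolute, fragments, nor URLs."""
    return not (target.startswith(("/", "#")) or SCHEME.match(target))


def rebase(target: str, containing_file: str) -> str:
    """Prepend the containing file's directory to a relative target.

    `containing_file` is taken to be a path relative to the working
    directory, written with forward slashes.
    """
    directory = posixpath.dirname(containing_file)
    if not target or not directory or not is_relative(target):
        return target
    return posixpath.join(directory, target)
```

With the layout from the original report, and assuming a source file named `src/001/001.md`, `rebase("001.jpg", "src/001/001.md")` gives `src/001/001.jpg`, which is the path that had to be written by hand before.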
My new thought is that it makes more sense for this to be an extension. |
This is in master branch now. Currently the extension only works for |
The immediate reason for this is to allow the test output of #3752 to work on both windows and linux.
TLDR
<!-- ej1.md -->
# Hello world
data:image/s3,"s3://crabby-images/f947e/f947ebffbe25dea194cc4102f093db043f1bdc6d" alt="Image"
Render every chapter with: pandoc "--from=markdown+rebase_relative_paths" --output=Book.pdf Chapter**/*.md
More info
Search for
I'm using 1.19.2.1.
I have a setup like:
In the .md files I'm trying to include the images via relative links like
data:image/s3,"s3://crabby-images/b51c3/b51c3abf8efd79d11d9f2d0e8a004e7960b87db1" alt=""
, however when I do this and build from the top level (passing all the .md files as arguments), Pandoc cannot find the images. Instead I must supply the path as relative to the working directory (so src/001/001.jpg), which is a bit clunky.
So: a) is there any way to get my desired behaviour, and b) is the current behaviour intended? I would have expected paths to be relative to the file that they appear in.
Thanks!