-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Reference] fix image embedding in DOCX output #40
Conversation
import subprocess | ||
import os | ||
import sys | ||
from pandocfilters import toJSONFilter, Para, Image |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How is pandocfilters
getting installed? Should it be added to environment.yml
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It should be. I just forgot it. Will add it in.
except OSError: | ||
mtime = -1 | ||
if mtime < os.path.getmtime(src): | ||
cmd_line = ['inkscape', option[0], eps_name, file_name] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this method depend on inkscape? That's a pretty heavy dependency? Can inkscape be installed via conda?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dhimmel Yest this method does depend on inkscape. There seems to have been a working conda install, but that seems broken now inkscape-feedstock.
content/03.figures.md
Outdated
@@ -5,4 +5,4 @@ The figures can be referenced in the text by using `@fig:label`. | |||
|
|||
Figure @fig:googletrends shows the interest for "Sci-Hub" and "LibGen" over time. | |||
|
|||
![Google Trends Search interest for Sci-Hub and LibGen.](https://cdn.rawgit.com/greenelab/scihub/7891082161dbcfcd5eeb1d7b76ee99ab44b95064/explore/trends/google-trends.svg){#fig:googletrends} | |||
![Google Trends Search interest for Sci-Hub and LibGen.](images/google-trends.svg){#fig:googletrends} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the method work with URL images?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dhimmel Yes it can work with url's just need to modify the code to regex check if url.
@@ -52,6 +52,8 @@ then | |||
--to=docx \ | |||
--filter pandoc-fignos \ | |||
--filter pandoc-tablenos \ | |||
--filter pandoc-img-glob \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does this do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
--filter pandoc-img-glob
@dhimmel is restructuring the relative path to the image for docx conversion to absolute paths relative to markdown. Without it I get these errors:
[pandoc warning] Could not find image "images/image2.png", skipping...
@vsmalladi thanks a lot for helping us find workaround for the SVG issues. Before you go further with this pull request, I should note that this may not be a feature we want to merge for the following reasons:
Correct me on any of these considerations if I'm wrong. Also I'd like @agitter's sense whether I'm being too picky here. I am really interested in resolving our image issues. But we have to balance the benefit-to-complexity ratio carefully. At a minimum, we can leave this PR open and reference it in the README as a workaround for users where this functionality is essential. So why this is in no way a final decision at this point, I wanted to bring it up in case it changes your calculus on how much time to spend here. |
Agreed it is heavy and maybe there is a a better way that would resolve this later
Actually this can also work for PDF files, I just wanted to try it with DOCX output first. Other implementation have other use cases. So users could change the behavior they would require.
I will explore a little more. But I think its at least a workaround that we can reference. |
Updated Document using PDF for images, fixes transparency. |
I agree that the SVG issues are important, so thanks for working on this.
@vsmalladi should I be able to see the figure in this latest version? Is the image not embedded in the docx file? In Word 2013 on Windows I get: @dhimmel I don't have a firm opinion on merging this or the direction we want to go with images. I think it depends in part on whether we expect users will mostly include images using URLs pointing to third party resources or include the images in their manuscript's repository. If they are managing the images themselves in the repo, then we can state the current limitations and suggest suitable file formats and workarounds. If they are going to be incorporating third party images via URL, users might expect that the build system is robust enough to work for most image types. That would involve a heavier build system and perhaps inkscape or some graphics software. As long as we can still set up the environment completely within conda, I'm okay with more dependencies. |
@agitter You should be able to see the image. I though it was embedded in the docx file. I uploaded another version. I have had no issues with opening in microsoft365 on mac |
@vsmalladi that version of the docx works for me. The icons look like they didn't convert perfectly. They are warped and cropped a little on the right. |
@agitter I will look into that. Might be an issue with image formatting in word. Hopefully a quick fix. Also working on seeing how to export to PDF without going to the HTML first. |
@vsmalladi before moving away from |
@agitter agreed. I will experiment and see what problem is fixes and what new problems it might introduce. |
@agitter Fixed the ratio of the images in the word document, but had to specify width in the markdown.
|
The images are broken for me again in this latest version. |
@agitter I see what the issue is. The mac word saves the document in compatibility mode, which makes it a doc file compatibility with word 97-2003. If you convert, in word, then the document saves to docx format and zips the images into the document. This is a little annoying but workable. |
@vsmalladi I'm not sure what's going on. If I open your latest docx file as a zip file, I see a In |
@agitter Thanks for looking into this. I will look closer at what is causing this issue. |
@agitter What version of word are you using to open the document and what system? |
@vsmalladi I'm on Word 2013 on Windows 10. I looked at the version that did work to see if anything jumped out in the xml. The figures had been converted to emf files in
Those are referenced in
|
@agitter I am still investigating, but decided to follow up on your emf vs pdf issue. I have made both versions. Let me know if the images can be opened in either one? |
@vsmalladi The images in |
With LibreOffice on Linux (Version: |
I tested These docx issues are making me less excited about supporting that as one of our build formats. I don't think we want to continue debugging multiple docx editors. @vsmalladi do you think we should work toward a specific editor or subset of editors? Or continue to recommend copy/paste from HTML? |
@agitter The more and more issues we are seeing I am not sure we can support docx as a format right now. I think we can open and issue and post some examples of how this could be done. And let users decided on their own. Thoughts? |
@vsmalladi I agree that general docx support is too much to take on based on your experiments. We can illustrate what is known to work - combinations of image types and docx editors - but not support all combinations. @dhimmel Where should this information go? Would you like to leave this pull request open, create a new issue, or something else? |
I updated the PR title and added a note to the first comment. Adding a note to the DOCX output documentation in the README that references this PR is what I'd recommend. |
This build is based on 247fdbe. This commit was created by the following Travis CI build and job: https://travis-ci.org/greenelab/manubot-rootstock/builds/259134255 https://travis-ci.org/greenelab/manubot-rootstock/jobs/259134256 [ci skip] The full commit message that triggered this build is copied below: README: reference DOCX image embedding PR (#43) Refs #40
This build is based on 247fdbe. This commit was created by the following Travis CI build and job: https://travis-ci.org/greenelab/manubot-rootstock/builds/259134255 https://travis-ci.org/greenelab/manubot-rootstock/jobs/259134256 [ci skip] The full commit message that triggered this build is copied below: README: reference DOCX image embedding PR (#43) Refs #40
[{{author.github}}](https://github.com/{{author.github}}) | ||
{%- endif %} | ||
{%- if author.twitter is defined %} | ||
· ![Twitter icon](images/twitter.svg){height="13px"} | ||
· ![Twitter icon](images/twitter.svg){height="13px" width="13px"} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vsmalladi would specifying width="13px"
for these SVGs be valuable independent of the rest of the changes in this PR? These SVGs export to DOCX currently, just they are misshapen?
If these changes are valuable alone, you should open a quick PR with just those changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dhimmel I will test them out based on the updates and see if works.
Now width and height are both specified at 13 pixels, to constrain the aspect ratio of these SVGs as square. Previously, the icons appeared squished in DOCX exports. See #40
This build is based on aba5246. This commit was created by the following Travis CI build and job: https://travis-ci.org/greenelab/manubot-rootstock/builds/283792648 https://travis-ci.org/greenelab/manubot-rootstock/jobs/283792649 [ci skip] The full commit message that triggered this build is copied below: Specify width of front-matter SVGs (#79) Now width and height are both specified at 13 pixels, to constrain the aspect ratio of these SVGs as square. Previously, the icons appeared squished in DOCX exports. See #40
This build is based on aba5246. This commit was created by the following Travis CI build and job: https://travis-ci.org/greenelab/manubot-rootstock/builds/283792648 https://travis-ci.org/greenelab/manubot-rootstock/jobs/283792649 [ci skip] The full commit message that triggered this build is copied below: Specify width of front-matter SVGs (#79) Now width and height are both specified at 13 pixels, to constrain the aspect ratio of these SVGs as square. Previously, the icons appeared squished in DOCX exports. See #40
Now width and height are both specified at 13 pixels, to constrain the aspect ratio of these SVGs as square. Previously, the icons appeared squished in DOCX exports. See manubot/rootstock#40
With Pandoc 2.9 (and possibly before), SVGs are converting to DOCX fine on my system. This probably depends on whether |
I cannot verify this on macOS. I have pandoc 2.9.1 installed. |
Now width and height are both specified at 13 pixels, to constrain the aspect ratio of these SVGs as square. Previously, the icons appeared squished in DOCX exports. See manubot/rootstock#40
Do not merge: See discussion below. Pull request is now for reference.
Fix export of svg images to eps images for word document. All links currently have to be local
manuscript.docx