Comments in docx writer #2994

Closed
jkr opened this issue Jun 23, 2016 · 27 comments

@jkr
Collaborator

jkr commented Jun 23, 2016

We added track-changes comments to the docx reader with 8bb739f. (See discussion in #2884). It would be nice to add them to the writer, just as we have with insertions and deletions.

Note that since commented-upon sections can extend across blocks, we have two spans:

<span class="comment-start" id="3" author="XYZ" date="DATE">comment</span>This is the 
text that is commented on.<span class="comment-end" id="3"></span>

Not the prettiest markup ever, but it seems to be the best at capturing the meaning of the source.
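
Because the start and end markers may land in different blocks, a consumer has to pair them by id rather than by nesting. The following is a minimal sketch of such a check over pandoc's JSON AST, using only the Python standard library. The function names are hypothetical, and it assumes the id may sit either in the span's identifier slot or in an `id` key-value attribute.

```python
def iter_spans(node):
    """Yield every Span node found anywhere inside a pandoc JSON value."""
    if isinstance(node, dict):
        if node.get("t") == "Span":
            yield node
        for value in node.values():
            yield from iter_spans(value)
    elif isinstance(node, list):
        for item in node:
            yield from iter_spans(item)


def comment_id(span):
    """Read the comment id from an `id` attribute, else from the identifier."""
    identifier, _classes, attrs = span["c"][0]
    return dict(attrs).get("id", identifier)


def unmatched_comments(doc):
    """Return (ids with a start but no end, ids with an end but no start)."""
    starts, ends = set(), set()
    for span in iter_spans(doc):
        classes = span["c"][0][1]
        if "comment-start" in classes:
            starts.add(comment_id(span))
        if "comment-end" in classes:
            ends.add(comment_id(span))
    return starts - ends, ends - starts
```

A comment that starts in one paragraph and ends in the next yields two empty sets, since the walk ignores block boundaries.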

@jgm
Owner

jgm commented Dec 7, 2016

Can you give the OOXML that should be emitted for the sample above?

@jkr
Collaborator Author

jkr commented Dec 8, 2016

It might be easier to look in tests/docx/comments.{docx,native}. The native of the first paragraph there maps to the following markdown:

I want <span class="comment-start" id="0" author="Jesse Rosenthal"
date="2016-05-09T16:13:00Z">I left a comment.</span>some text to have a
comment <span class="comment-end" id="0"></span>on it.

The corresponding OOXML, with indentation, is in two files:

word/document.xml:

    <w:p w14:paraId="645F12BE" w14:textId="408376B0"
    w:rsidR="0008235A" w:rsidRDefault="0008235A">
      <w:r>
        <w:t xml:space="preserve">I want </w:t>
      </w:r>
      <w:commentRangeStart w:id="0" />
      <w:r>
        <w:t xml:space="preserve">some text to have a comment </w:t>
      </w:r>
      <w:commentRangeEnd w:id="0" />
      <w:r>
        <w:rPr>
          <w:rStyle w:val="CommentReference" />
        </w:rPr>
        <w:commentReference w:id="0" />
      </w:r>
      <w:r>
        <w:t>on it.</w:t>
      </w:r>
    </w:p>

and word/comments.xml:

  <w:comment w:id="0" w:author="Jesse Rosenthal"
  w:date="2016-05-09T16:13:00Z" w:initials="jkr">
    <w:p w14:paraId="49E49F95" w14:textId="77777777"
    w:rsidR="0008235A" w:rsidRDefault="0008235A">
      <w:pPr>
        <w:pStyle w:val="CommentText" />
      </w:pPr>
      <w:r>
        <w:rPr>
          <w:rStyle w:val="CommentReference" />
        </w:rPr>
        <w:annotationRef />
      </w:r>
      <w:r>
        <w:t>I left a comment.</w:t>
      </w:r>
    </w:p>
  </w:comment>
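
To make the mapping concrete, here is a rough sketch that builds the two fragments above with Python's standard `xml.etree.ElementTree`. It is not how the pandoc writer does it; the helper names (`w`, `run`, `commented_paragraph`, `comment_entry`) are made up for illustration, and the revision attributes (`w14:paraId`, `w:rsidR`, and so on) and `w:initials` are left out on the assumption that they are optional.

```python
import xml.etree.ElementTree as ET

# WordprocessingML main namespace, bound to the usual "w" prefix.
W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
XML_SPACE = "{http://www.w3.org/XML/1998/namespace}space"
ET.register_namespace("w", W)


def w(tag):
    """Qualify a tag or attribute name with the w: namespace."""
    return f"{{{W}}}{tag}"


def run(text):
    """A plain text run whose whitespace is preserved."""
    r = ET.Element(w("r"))
    t = ET.SubElement(r, w("t"), {XML_SPACE: "preserve"})
    t.text = text
    return r


def reference_run(parent, marker_tag, attrib=None):
    """A run styled as CommentReference that holds a single marker element."""
    r = ET.SubElement(parent, w("r"))
    rpr = ET.SubElement(r, w("rPr"))
    ET.SubElement(rpr, w("rStyle"), {w("val"): "CommentReference"})
    ET.SubElement(r, w(marker_tag), attrib or {})
    return r


def commented_paragraph(before, commented, after, comment_id):
    """Paragraph for document.xml: text, commented range, reference, text."""
    p = ET.Element(w("p"))
    p.append(run(before))
    ET.SubElement(p, w("commentRangeStart"), {w("id"): comment_id})
    p.append(run(commented))
    ET.SubElement(p, w("commentRangeEnd"), {w("id"): comment_id})
    reference_run(p, "commentReference", {w("id"): comment_id})
    p.append(run(after))
    return p


def comment_entry(comment_id, author, date, text):
    """Matching <w:comment> element for comments.xml."""
    c = ET.Element(w("comment"), {w("id"): comment_id, w("author"): author, w("date"): date})
    p = ET.SubElement(c, w("p"))
    ppr = ET.SubElement(p, w("pPr"))
    ET.SubElement(ppr, w("pStyle"), {w("val"): "CommentText"})
    reference_run(p, "annotationRef")
    p.append(run(text))
    return c
```

Calling `ET.tostring(commented_paragraph("I want ", "some text to have a comment ", "on it.", "0"), encoding="unicode")` serializes the paragraph; the same id string ties the range markers, the reference run, and the entry in comments.xml together.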

@tolot27
Contributor

tolot27 commented Aug 10, 2017

Is there any progress on this issue?

@jgm jgm added this to the pandoc 2.0 milestone Aug 10, 2017
@jgm
Owner

jgm commented Aug 10, 2017

This looks straightforward enough, and it would be handy to retain these things on round-trip.
I've added this to the pandoc 2.0 milestone.

@jgm jgm closed this as completed in 418bda8 Aug 13, 2017
@iandol
Contributor

iandol commented Aug 13, 2017

Hm, so if I follow the formatting, I should be able to make a filter to convert something like this bracketed span [here is some text]{.comment comment="blah blah"} into something that the DOCX writer can understand?

@jgm
Owner

jgm commented Aug 14, 2017 via email

@kschach

kschach commented Oct 27, 2018

@iandol did you end up coming up with a Markdown to Word Comment filter? I did a quick search through your (very interesting) repo but couldn't find it. I'd be interested in it if you've created one. Thanks!

@tolot27
Contributor

tolot27 commented Oct 27, 2018

@kschach What is your intent? Pandoc almost fully supports track changes, including comments, during conversion from and to docx. There is also a Lua filter which extends this to LaTeX/PDF and HTML and can also filter it out for other output formats.

@kschach

kschach commented Oct 27, 2018 via email

@iandol
Contributor

iandol commented Oct 31, 2018

@kschach — I didn't write a filter as I use Scrivener for all my writing and it allows me to transform its native RTF comments into the markup automatically on Compile so there was no need for me to write a Pandoc filter in the end, sorry.

@tolot27
Contributor

tolot27 commented Oct 31, 2018

The problem is that pandoc does not have a spec or AST support for comments. See #2873 for related discussion. You may want to try the pandoc preprocessor pancritic.

Nested comments are still problematic, though not with the current syntax that the docx reader produces. I thought about simplifying my Lua filter so that it automatically adds the author, ID and date attributes if they are not provided in the markdown. For the date attribute that does not make much sense, though, unless you don't want to track the time of the comment/modification or do a round-trip conversion from docx back to your md.

@dbaynard

dbaynard commented Oct 31, 2018

@kschach I have something. It's a haskell filter — are you set up for that?

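I'll have to separate it from the rest of the module. Here's the outline (PB is Text.Pandoc.Builder). I'd need a short while to filter out the superfluous stuff.

{-# LANGUAGE DataKinds, FlexibleContexts, OverloadedStrings #-}
{-# LANGUAGE PatternSynonyms, TupleSections #-}

-- Assumes pandoc-types >= 1.20 (Text-based AST) plus the mtl and tagged packages.
import Control.Monad.State (MonadState, evalState, state)
import Data.Tagged (Tagged (..))
import qualified Data.Text as T
import qualified Text.Pandoc.Builder as PB
import Text.Pandoc.JSON
import Text.Pandoc.Walk (walkM)

main :: IO ()
main = toJSONFilter docFilter

-- Thread one counter through the whole document so that every comment gets
-- its own id (running the state per inline would restart at 0 each time).
docFilter :: Pandoc -> Pandoc
docFilter doc = evalState (walkM docxComment doc) (Tagged 0)

-- Input markup, e.g. [some text]{.comment comment="blah blah"}
pattern Comment on c = Span ("", ["comment"], [("comment", c)]) on
pattern Todo on t = Span ("", ["todo"], [("todo", t)]) on
pattern TodoEx on t = Span ("", ["todo", "experiment"], [("todo", t)]) on

-- Output markup understood by the docx writer.
-- TODO add date and author
pattern DocxCommentBegin i c = Span (i, ["comment-start"], []) c
pattern DocxCommentEnd i = Span (i, ["comment-end"], []) []

type CommentCount = Tagged "Comment" Int

docxComment
    :: MonadState CommentCount m
    => Inline
    -> m Inline
docxComment (Comment on c) = docxComment' on c
docxComment x = pure x

-- Wrap the commented inlines between a start marker (carrying the comment
-- text) and an end marker, then bump the counter.
docxComment'
    :: MonadState CommentCount m
    => [Inline]
    -> T.Text
    -> m Inline
docxComment' on c = state $ \(Tagged i) ->
    (, Tagged (i + 1)) . Span PB.nullAttr . mconcat $
        [ [DocxCommentBegin (tshow i) . PB.toList . PB.str $ c]
        , on
        , [DocxCommentEnd (tshow i)]
        ]
  where
    tshow = T.pack . show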
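
For readers who are not set up for Haskell, the same idea can be sketched against pandoc's JSON AST with nothing but the Python standard library. This is only an illustration, not a drop-in filter: the names `span`, `expand_comment` and `expand_comments` are hypothetical, it handles a single list of inlines rather than walking the whole document, and it places the id in the key-value attributes on the assumption that this mirrors the markup shown earlier in the thread.

```python
def span(identifier, classes, attrs, inlines):
    """Build a pandoc JSON Span node."""
    return {"t": "Span", "c": [[identifier, classes, attrs], inlines]}


def expand_comment(inline, next_id):
    """Turn one `.comment` span into [start marker, content..., end marker].

    Returns the replacement inlines and the next unused comment id.
    Anything that is not a `.comment` span is passed through unchanged.
    """
    if inline.get("t") != "Span":
        return [inline], next_id
    (_identifier, classes, attrs), content = inline["c"]
    comment = dict(attrs).get("comment")
    if "comment" not in classes or comment is None:
        return [inline], next_id
    cid = str(next_id)
    start = span("", ["comment-start"], [["id", cid]], [{"t": "Str", "c": comment}])
    end = span("", ["comment-end"], [["id", cid]], [])
    return [start, *content, end], next_id + 1


def expand_comments(inlines, next_id=0):
    """Expand every `.comment` span in a list of inlines, numbering from next_id."""
    out = []
    for inline in inlines:
        replacement, next_id = expand_comment(inline, next_id)
        out.extend(replacement)
    return out, next_id
```

Returning the next free id lets a caller carry the counter from one paragraph to the next, which is the same job the state monad does in the Haskell outline.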

@v4dkou

v4dkou commented Jan 13, 2022

I've just stumbled upon a peculiar case, where two markers of ending comments are on the same character index.
Tags become nested in each other:

[[]{.comment-end id="1"}]{.comment-end
id="3"}

@jgm Is there a reason why "comment-end" tags nest internally, or is it possible to make them appear in a sequence like this for ease of parsing?

[]{.comment-end id="1"}[]{.comment-end id="3"}

P.S. "comment-start" tags on the same index do not nest.
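
Until the reader's behaviour is clarified, one possible workaround is to flatten the nested markers after the fact. Below is a small sketch over pandoc's JSON AST in plain Python; the function names are hypothetical, and it assumes that a "comment-end" span only ever contains other "comment-end" spans, so hoisting its content is safe.

```python
def is_comment_end(inline):
    """True for a Span carrying the comment-end class."""
    return inline.get("t") == "Span" and "comment-end" in inline["c"][0][1]


def flatten_comment_ends(inlines):
    """Rewrite nested comment-end spans as a flat sequence, innermost first."""
    out = []
    for inline in inlines:
        if is_comment_end(inline):
            attr, content = inline["c"]
            # Hoist whatever was nested inside, then emit this marker empty.
            out.extend(flatten_comment_ends(content))
            out.append({"t": "Span", "c": [attr, []]})
        else:
            out.append(inline)
    return out
```

Applied to the nested example above, this produces the marker with id 1 followed by the marker with id 3, matching the sequential form.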

@jgm
Owner

jgm commented Jan 13, 2022

@v4dkou I have no idea. @jkr is the one who would know!

@v4dkou

v4dkou commented Jan 31, 2022

@jkr Sorry to ping, but is there any useful info regarding nesting of "comment-end" tags?

@rkingett

rkingett commented Oct 4, 2023

Following this because I'd love a way to write in Markdown and have other users read the comments with their native program's comment function. For example, if I comment in an MD file with some kind of syntax, it should work with PDF's and Word's built-in comment tracking systems. I know about the reader's interpretation of track changes, but now I am looking for a way to make suggestions in a Markdown file and have the DOCX writer see them as suggestions and comments.

@tolot27
Contributor

tolot27 commented Oct 4, 2023

@rkingett You can try the track-changes Lua filter. Currently, it supports plain comments but not nested ones.

@marviro

marviro commented Feb 7, 2024

Hi, this feature is very useful! Thank you @jgm ! Two questions:

  1. I think this is not well documented in the pandoc documentation. There is information about docx2md (with the option --track-changes), but I find nothing about the possibility of creating comments in the md and transforming them into docx.
  2. It would be helpful to modify the other writers (especially tex and pdf)... at least to make them ignore the comments? Should I open an issue about that?

@jgm
Owner

jgm commented Feb 7, 2024

The other writers will automatically ignore all raw HTML, including HTML comments.

@marviro

marviro commented Feb 7, 2024

> The other writers will automatically ignore all raw HTML, including HTML comments.

Yes, if it's a comment like <!-- some text -->. But not if it's like:

I want [I left a comment.]{.comment-start id="0"
author="Jesse Rosenthal" date="2016-05-09T16:13:00Z"}some text to have a
comment []{.comment-end id="0"}on it.

which is the kind of comment that this issue is about, if I understand correctly.
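
As a stopgap for writers that do not know about these spans, the markers can be dropped before the document reaches the writer. The sketch below does this on a list of inlines from pandoc's JSON AST using plain Python. The function name is hypothetical, and it only descends into nested spans, not into other containers such as emphasis.

```python
COMMENT_CLASSES = {"comment-start", "comment-end"}


def strip_comments(inlines):
    """Remove comment-start/comment-end spans, keeping the commented-on text."""
    out = []
    for inline in inlines:
        if inline.get("t") == "Span":
            attr, content = inline["c"]
            if COMMENT_CLASSES & set(attr[1]):
                # Drop the marker; for comment-start this also drops the
                # comment text it carries.
                continue
            inline = {"t": "Span", "c": [attr, strip_comments(content)]}
        out.append(inline)
    return out
```

The text between the two markers is left in place, since it lives outside the spans rather than inside them.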

@jgm
Owner

jgm commented Feb 7, 2024

Sorry, I confused two different threads.

@jgm
Owner

jgm commented Feb 7, 2024

Perhaps you should open a new issue for these things.

@marviro

marviro commented Feb 7, 2024

I'll do it! Thanks

@tolot27
Contributor

tolot27 commented Feb 7, 2024

As mentioned above, you can use the track-changes Lua filter with the parameter --track-changes=reject to remove the comments in the output format.

@marviro

marviro commented Feb 8, 2024

> As mentioned above, you can use the track-changes Lua filter with the parameter --track-changes=reject to remove the comments in the output format.

Thank you! Sorry, I did not understand that this was actually the purpose of the filter!

Anyway, I have the feeling that this should be addressed by pandoc itself, since the comments are in the AST. So I guess that an issue is pertinent.

@tolot27
Contributor

tolot27 commented Feb 8, 2024

> I did not understand that this was actually the purpose of the filter!

The main purpose of the filter is to show the changes in LaTeX/PDF and HTML as well. Filtering them out was just a necessary side effect.

> Anyway, I have the feeling that this should be addressed by pandoc itself, since the comments are in the AST. So I guess that an issue is pertinent.

Yes indeed, and a pull request would be welcome. 😄

@marviro

marviro commented Feb 8, 2024

> Yes indeed, and a pull request would be welcome. 😄

I'm still working on my Haskell skills... not ready yet ;)

9 participants