Use ODT native footnotes instead of references/hyperlinks #144

Closed
jgm opened this issue Jun 10, 2011 · 7 comments

@jgm
Owner

jgm commented Jun 10, 2011

I'm looking for a way to convert LaTeX files with complex footnotes to ODT
and thence to MS Word. Pandoc is almost perfect, but when I open the
output in OpenOffice, the footnotes are not the sort of thing you get when
you use OpenOffice to insert a footnote. Rather, it is a hyperlink to text
at the end of the file. It would be great if Pandoc could be configured to
use OpenOffice's native footnote and/or endnote type instead.

Also, a spurious paragraph break is generated after the footnote mark if it
appears mid-paragraph (using version 0.46).

Best wishes,

Peter

Google Code Info:
Issue #: 222
Author: phes...@gmail.com
Created On: 2010-03-04T15:15:23.000Z
Closed On: 2010-03-22T16:54:16.000Z

@ghost ghost assigned jgm Jun 10, 2011
@jgm jgm closed this as completed Jun 10, 2011
@jgm
Owner Author

jgm commented Jun 10, 2011

Pandoc does use native opendocument footnotes. Here's an example:

% pandoc -t opendocument
hi^[there]

<text:p text:style-name="Text_20_body">hi<text:note text:id="ftn0"
text:note-class="footnote"><text:note-citation>1</text:note-citation>
<text:note-body><text:p text:style-name="Footnote">there</text:p>
</text:note-body></text:note></text:p>

And, when I look at pandoc-generated ODTs in OpenOffice, the footnotes appear as
proper footnotes, not as you describe. Have you tried opening the ODT in
OpenOffice? Maybe it's an issue with Word's conversion. Another possibility is that
it's something that was fixed in a more recent version of pandoc (we're on 1.4 now),
though I can't remember changing footnote behavior.

If all that fails, it would be helpful if you'd send a text file that produces the
undesired results.

Google Code Info:
Author: fiddloso...@gmail.com
Created On: 2010-03-04T16:49:18.000Z

@jgm
Owner Author

jgm commented Jun 10, 2011

You are right. Pandoc 1.4 does the right thing with footnotes. I was
using Ubuntu's out-of-date version. I did try to check that this had
not already been fixed before submitting the feature request by trying
it out on the on-line web-based pandoc, which I assumed would be up to
date, but I guess it is not. I am using OpenOffice, not Word (that's
just the destination format).

Sorry for the false alarm.

I did discover a bug in Pandoc 1.4, however. When converting a
complex LaTeX file, OpenOffice (version 3.1) refuses to open the
resulting ODT file. It throws up an error message in a dialog box:

Format error discovered in the file in sub-document content.xml at
700,167 (row,col).

When I unzip it and examine line 700 of content.xml, there is indeed
an anomaly at column 167:

699 <style:style style:name="T696" style:family="text"><style:text-properties
fo:font-style="italic" style:font-style-asian="italic"
style:font-style-complex="italic" /></style:style>
700 <style:style style:name="T697"
style:family="text"><style:text-properties fo:font-style="italic"
style:font-style-asian="italic" style:font-style-complex="italic" *
fo:font-style="italic" style:font-style-asian="italic"
style:font-style-complex="italic" /></style:style>

I have put an asterisk at column 167, where several of the attributes
of this element are spuriously repeated. This happens again in one
other line of that file. When I edit the file to remove these
extraneous attributes, zip up the ODT and open it in OpenOffice, the
error is gone and the file looks fine.
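
The manual check described above (unzip, inspect content.xml, look for repeated attributes) could be automated along these lines. This is only a rough sketch: the function names are hypothetical, it uses a regular expression rather than an XML parser (a parser would reject a tag with duplicated attributes outright), and it assumes attribute values are double-quoted, as in the excerpt above.

```python
import re
import zipfile

# One start tag or empty-element tag: captures the tag name and the raw attribute text.
TAG_RE = re.compile(r'<([A-Za-z_][\w:.-]*)((?:\s+[\w:.-]+\s*=\s*"[^"]*")*)\s*/?>')
# One attribute: captures only the attribute name.
ATTR_RE = re.compile(r'([\w:.-]+)\s*=\s*"[^"]*"')


def find_duplicate_attributes(xml_text):
    """Return (tag_name, sorted duplicated attribute names) for each offending tag."""
    problems = []
    for match in TAG_RE.finditer(xml_text):
        names = ATTR_RE.findall(match.group(2))
        dupes = sorted({name for name in names if names.count(name) > 1})
        if dupes:
            problems.append((match.group(1), dupes))
    return problems


def check_odt(path):
    """Scan content.xml inside an ODT file (a zip archive) for duplicated attributes."""
    with zipfile.ZipFile(path) as odt:
        xml_text = odt.read("content.xml").decode("utf-8")
    return find_duplicate_attributes(xml_text)
```

Calling check_odt on the path of a generated ODT would list every tag in content.xml that repeats an attribute name; an empty list means none were found.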

Another glitch:

In my file, pandoc fails to convert the \section{foobar}
command, even though I see that there is code to handle sectioning
commands in pandoc. It removes the \section but leaves the curly
braces and the section title as normal body text.

Also, pandoc similarly does not handle the \textsc{} command for small
caps, but in this case there seems to be no code to do so. It would
be a nice, small addition.

Thanks for a very nice piece of software, which means I can stop using latex2rtf.

With best wishes,

Peter

Google Code Info:
Author: phes...@gmail.com
Created On: 2010-03-06T21:41:11.000Z

@jgm
Owner Author

jgm commented Jun 10, 2011

PS. Also, the LaTeX reader does not know about "~" as a non-breaking space, though
the writer does.

Google Code Info:
Author: phes...@gmail.com
Created On: 2010-03-06T23:03:28.000Z

@jgm
Owner Author

jgm commented Jun 10, 2011

Thanks for all of this. I've contacted the author of the opendocument module to see
if we can figure out what's causing the duplicated attributes. But if we don't get
to the bottom of that, I can do a simpler fix, telling pandoc's XML formatting module
not to allow duplicated attributes.
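
To illustrate the idea of the simpler fix, here is a minimal sketch of an XML-rendering step that refuses to emit the same attribute twice. It is written in Python purely for illustration and is not pandoc's code or the change that was eventually committed; dedupe_attributes and render_empty_tag are hypothetical names.

```python
from xml.sax.saxutils import quoteattr


def dedupe_attributes(attrs):
    """Drop repeated attribute names, keeping the first occurrence and the original order."""
    seen = set()
    unique = []
    for name, value in attrs:
        if name not in seen:
            seen.add(name)
            unique.append((name, value))
    return unique


def render_empty_tag(tag, attrs):
    """Render an empty-element tag from (name, value) pairs, never repeating a name."""
    parts = [tag] + [
        f"{name}={quoteattr(value)}" for name, value in dedupe_attributes(attrs)
    ]
    return "<" + " ".join(parts) + " />"
```

Keeping the first occurrence is an arbitrary choice here; in the reported case the repeated attributes carry identical values, so either choice gives the same output.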

On the \section{foobar} issue: I can't reproduce this. Can you send
a few lines from the actual text? Do you perhaps have some bracketed options or
something in the section command?

Google Code Info:
Author: fiddloso...@gmail.com
Created On: 2010-03-10T22:31:01.000Z

@jgm
Owner Author

jgm commented Jun 10, 2011

I did some experimenting with Pandoc 1.4 and it appears that the \section{foobar} bug
is triggered when that command is not the first text on its line -- any text, even
leading whitespace, will confuse the parser. You may object that any normal, sane
LaTeX file will have \section{} commands on their own lines, but I am trying to use
Pandoc to convert files that have already been partially converted by machine, so it
can happen that \section{} commands occur in the middle of a run of text. What
exactly I am trying to do is explained here:
http://www.dur.ac.uk/p.j.heslin/Software/Latex/latex2doc.php.
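
Until the parser copes with this, one possible workaround is to preprocess the input so that every sectioning command starts its own line. The sketch below is an assumption about how that could be done, not part of pandoc; isolate_section_commands is a hypothetical name, and the pattern deliberately ignores forms with a bracketed option such as \section[short]{long}.

```python
import re

# Optional leading spaces/tabs, then \section{, \subsection{ or \subsubsection{
# (starred variants included). Only the command itself is captured.
SECTION_RE = re.compile(r"[ \t]*(\\(?:sub){0,2}section\*?\{)")


def isolate_section_commands(latex_text):
    """Put a blank line before every sectioning command and drop whitespace in front of it."""
    return SECTION_RE.sub(lambda m: "\n\n" + m.group(1), latex_text)
```

Commands that already start a line simply gain extra blank lines in front of them, which LaTeX-style paragraph handling treats the same as a single blank line.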

With regard to the other issue, would it be possible to convert the LaTeX tilde (~)
to a non-breaking space in the output?
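
As a rough illustration of the requested behaviour, the sketch below maps an unescaped tilde to U+00A0 (NO-BREAK SPACE). It is a simplistic text-level substitution with a hypothetical function name, not how a real LaTeX reader would do it: it ignores verbatim environments and URLs, and it treats any tilde preceded by a backslash as the \~ accent command.

```python
import re

NBSP = "\u00a0"  # Unicode NO-BREAK SPACE

# A tilde that is not preceded by a backslash; \~ (the accent command) is left alone.
UNESCAPED_TILDE = re.compile(r"(?<!\\)~")


def tilde_to_nbsp(latex_text):
    """Replace each unescaped LaTeX tilde with a non-breaking space character."""
    return UNESCAPED_TILDE.sub(NBSP, latex_text)
```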

Best wishes,

Peter

Google Code Info:
Author: phes...@gmail.com
Created On: 2010-03-12T17:05:12.000Z

@jgm
Owner Author

jgm commented Jun 10, 2011

Thanks for the clarification. And yes, I'm on board with the ~ change.

Google Code Info:
Author: fiddloso...@gmail.com
Created On: 2010-03-12T17:30:57.000Z

@jgm
Owner Author

jgm commented Jun 10, 2011

The remaining bug (with duplicate attributes) is resolved in 1b1ba25.
I hope to include this in a released version very soon.

All the other issues are addressed in 1.5.0.1.

So I'm closing the bug.

Google Code Info:
Author: fiddloso...@gmail.com
Created On: 2010-03-22T16:54:16.000Z

jgm added a commit that referenced this issue Feb 27, 2017
Add keywords to HTML templates; realign.