OpenDocument writer: Allow references for internal links
This commit adds two extensions to the OpenDocument writer,
`references_over_links` and `number_prefix_references`.

The first extension, `references_over_links`, replaces document-internal
links to headers, figures and tables with references. The text of a
reference stays consistent with the referenced header, table or figure,
which is an improvement if the document is edited after being generated
by pandoc.

The second extension, `number_prefix_references`, prefixes header
references with the heading number, following the style of the
referenced heading in the final document. As noted in MANUAL.txt,
indexes must be updated in the document for these numbers to be
generated, as with the table of contents in OpenDocument. Figure and
table references are not number-prefixed because those numbers are
already inline in the caption.
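
The decision described above can be sketched in a few lines of plain
Haskell. This is a simplified illustration, not the writer's code: it
uses `String` instead of `Doc Text`, the `Options` record and the
`tag`, `ref` and `renderLink` helpers are hypothetical names, the
`office:name` attribute and the "Definition" span are left out, and no
XML escaping is done.

```haskell
import Data.List (find)

-- Kinds of link target the writer distinguishes (mirrors ReferenceType in the diff).
data ReferenceType = HeaderRef | TableRef | ImageRef
  deriving (Eq, Show)

-- Hypothetical stand-in for the two extension flags.
data Options = Options
  { referencesOverLinks    :: Bool
  , numberPrefixReferences :: Bool
  }

-- Render an element with attributes. No XML escaping: sketch only.
tag :: String -> [(String, String)] -> String -> String
tag name attrs body =
  "<" ++ name ++ concatMap attr attrs ++ ">" ++ body ++ "</" ++ name ++ ">"
  where
    attr (k, v) = " " ++ k ++ "=\"" ++ v ++ "\""

-- A reference element with a display format and a target name.
ref :: String -> String -> String -> String -> String
ref element format ident =
  tag element [("text:reference-format", format), ("text:ref-name", ident)]

-- Emit a reference for known internal targets, otherwise a plain link.
renderLink :: Options -> [(String, ReferenceType)] -> String -> String -> String
renderLink opts identTypes target label =
  case (referencesOverLinks opts, lookupTarget) of
    (True, Just (ident, HeaderRef))
      | numberPrefixReferences opts ->
          -- The number part is empty until indexes are refreshed in an editor.
          ref "text:bookmark-ref" "number" ident ""
            ++ "<text:s />"
            ++ ref "text:bookmark-ref" "text" ident label
      | otherwise -> ref "text:bookmark-ref" "text" ident label
    (True, Just (ident, TableRef)) -> ref "text:bookmark-ref" "text" ident label
    (True, Just (ident, ImageRef)) -> ref "text:sequence-ref" "text" ident label
    _ -> tag "text:a" [("xlink:type", "simple"), ("xlink:href", target)] label
  where
    -- Only targets of the form "#ident" can be internal references.
    lookupTarget = case target of
      '#' : ident -> find ((== ident) . fst) identTypes
      _           -> Nothing
```

With both flags off, or with a target that is not a known identifier,
the sketch falls back to an ordinary `text:a` link, which matches the
behaviour the commit keeps as the default.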
Nils Carlson authored and pyssling committed Nov 21, 2020
1 parent 56ceaf4 commit a76e73e
Showing 4 changed files with 142 additions and 18 deletions.
28 changes: 28 additions & 0 deletions MANUAL.txt
@@ -3027,6 +3027,34 @@ This extension can be enabled/disabled for the following formats:
output formats
: `odt`, `opendocument`

#### Extension: `references_over_links` ####

Links to headings, figures and tables inside the document are
substituted with references which will be updated if the heading
or caption is modified within the generated document.

This extension can be enabled/disabled for the following formats:

output formats
: `odt`, `opendocument`

#### Extension: `number_prefix_references` ####

Add a number prefix to references within the document that lack one.
For example, a first heading "Introduction" that is numbered
"1 Introduction" in the final document will also appear as
"1 Introduction" in references.

This typically requires indexes to be refreshed in the final document
using a native editor such as LibreOffice, as pandoc is not aware of
the numbering style of the document.

This further assumes that references are used within the document
instead of links; see the `references_over_links` extension.

output formats
: `odt`, `opendocument`

#### Extension: `styles` #### {#ext-styles}

When converting from docx, read all docx styles as divs (for
4 changes: 4 additions & 0 deletions src/Text/Pandoc/Extensions.hs
@@ -127,13 +127,15 @@ data Extension =
| Ext_native_spans -- ^ Use Span inlines for contents of <span>
| Ext_native_numbering -- ^ Use output format's native numbering for figures and tables
| Ext_ntb -- ^ ConTeXt Natural Tables
| Ext_number_prefix_references -- ^ If substituting links for references - prefix references with number as needed
| Ext_old_dashes -- ^ -- = em, - before number = en
| Ext_pandoc_title_block -- ^ Pandoc title block
| Ext_pipe_tables -- ^ Pipe tables (as in PHP markdown extra)
| Ext_raw_attribute -- ^ Allow explicit raw blocks/inlines
| Ext_raw_html -- ^ Allow raw HTML
| Ext_raw_tex -- ^ Allow raw TeX (other than math)
| Ext_raw_markdown -- ^ Parse markdown in ipynb as raw markdown
| Ext_references_over_links -- ^ Prefer references over links in formats which support both
| Ext_shortcut_reference_links -- ^ Shortcut reference links
| Ext_simple_tables -- ^ Pandoc-style simple tables
| Ext_smart -- ^ "Smart" quotes, apostrophes, ellipses, dashes
@@ -465,6 +467,8 @@ getAllExtensions f = universalExtensions <> getAll f
getAll "opendocument" = extensionsFromList
[ Ext_empty_paragraphs
, Ext_native_numbering
, Ext_references_over_links
, Ext_number_prefix_references
]
getAll "odt" = getAll "opendocument" <> autoIdExtensions
getAll "muse" = autoIdExtensions <>
88 changes: 70 additions & 18 deletions src/Text/Pandoc/Writers/OpenDocument.hs
@@ -17,6 +17,7 @@ module Text.Pandoc.Writers.OpenDocument ( writeOpenDocument ) where
import Control.Arrow ((***), (>>>))
import Control.Monad.State.Strict hiding (when)
import Data.Char (chr)
import Data.Foldable (find)
import Data.List (sortOn, sortBy, foldl')
import qualified Data.Map as Map
import Data.Maybe (fromMaybe, isNothing)
@@ -35,6 +36,7 @@ import Text.DocLayout
import Text.Pandoc.Shared (linesToPara, tshow, blocksToInlines)
import Text.Pandoc.Templates (renderTemplate)
import qualified Text.Pandoc.Translations as Term (Term(Figure, Table))
import Text.Pandoc.Walk
import Text.Pandoc.Writers.Math
import Text.Pandoc.Writers.Shared
import qualified Text.Pandoc.Writers.AnnotatedTable as Ann
@@ -54,6 +56,11 @@ plainToPara x = x

type OD m = StateT WriterState m

data ReferenceType
= HeaderRef
| TableRef
| ImageRef

data WriterState =
WriterState { stNotes :: [Doc Text]
, stTableStyles :: [Doc Text]
@@ -69,6 +76,7 @@ data WriterState =
, stImageId :: Int
, stTableCaptionId :: Int
, stImageCaptionId :: Int
, stIdentTypes :: [(Text,ReferenceType)]
}

defaultWriterState :: WriterState
@@ -86,6 +94,7 @@ defaultWriterState =
, stImageId = 1
, stTableCaptionId = 1
, stImageCaptionId = 1
, stIdentTypes = []
}

when :: Bool -> Doc Text -> Doc Text
@@ -235,6 +244,12 @@ writeOpenDocument opts (Pandoc meta blocks) = do
meta
((body, metadata),s) <- flip runStateT
defaultWriterState $ do
let collectInlineIdent (Image (ident,_,_) _ _) = [(ident,ImageRef)]
collectInlineIdent _ = []
let collectBlockIdent (Header _ (ident,_,_) _) = [(ident,HeaderRef)]
collectBlockIdent (Table (ident,_,_) _ _ _ _ _) = [(ident,TableRef)]
collectBlockIdent _ = []
modify $ \s -> s{ stIdentTypes = query collectBlockIdent blocks ++ query collectInlineIdent blocks }
m <- metaToContext opts
(blocksToOpenDocument opts)
(fmap chomp . inlinesToOpenDocument opts)
@@ -398,7 +413,7 @@ blockToOpenDocument o bs
inTags True "text:list" [ ("text:style-name", "L" <> tshow ln)]
<$> orderedListToOpenDocument o pn b
table :: PandocMonad m => Ann.Table -> OD m (Doc Text)
table (Ann.Table _ (Caption _ c) colspecs thead tbodies _) = do
table (Ann.Table (ident, _, _) (Caption _ c) colspecs thead tbodies _) = do
tn <- length <$> gets stTableStyles
pn <- length <$> gets stParaStyles
let genIds = map chr [65..]
@@ -419,7 +434,7 @@ blockToOpenDocument o bs
then return empty
else inlinesToOpenDocument o (blocksToInlines c) >>=
if isEnabled Ext_native_numbering o
then numberedTableCaption
then numberedTableCaption ident
else unNumberedCaption "TableCaption"
th <- colHeadsToOpenDocument o (map fst paraHStyles) thead
tr <- mapM (tableBodyToOpenDocument o (map fst paraStyles)) tbodies
@@ -428,36 +443,39 @@ blockToOpenDocument o bs
, ("table:style-name", name)
] (vcat columns $$ th $$ vcat tr)
return $ captionDoc $$ tableDoc
figure attr caption source title | null caption =
figure attr@(ident, _, _) caption source title | null caption =
withParagraphStyle o "Figure" [Para [Image attr caption (source,title)]]
| otherwise = do
imageDoc <- withParagraphStyle o "FigureWithCaption" [Para [Image attr caption (source,title)]]
captionDoc <- inlinesToOpenDocument o caption >>=
if isEnabled Ext_native_numbering o
then numberedFigureCaption
then numberedFigureCaption ident
else unNumberedCaption "FigureCaption"
return $ imageDoc $$ captionDoc


numberedTableCaption :: PandocMonad m => Doc Text -> OD m (Doc Text)
numberedTableCaption caption = do
numberedTableCaption :: PandocMonad m => Text -> Doc Text -> OD m (Doc Text)
numberedTableCaption ident caption = do
id' <- gets stTableCaptionId
modify (\st -> st{ stTableCaptionId = id' + 1 })
capterm <- translateTerm Term.Table
return $ numberedCaption "TableCaption" capterm "Table" id' caption
return $ numberedCaption "TableCaption" capterm "Table" id' ident caption

numberedFigureCaption :: PandocMonad m => Doc Text -> OD m (Doc Text)
numberedFigureCaption caption = do
numberedFigureCaption :: PandocMonad m => Text -> Doc Text -> OD m (Doc Text)
numberedFigureCaption ident caption = do
id' <- gets stImageCaptionId
modify (\st -> st{ stImageCaptionId = id' + 1 })
capterm <- translateTerm Term.Figure
return $ numberedCaption "FigureCaption" capterm "Illustration" id' caption
return $ numberedCaption "FigureCaption" capterm "Illustration" id' ident caption

numberedCaption :: Text -> Text -> Text -> Int -> Doc Text -> Doc Text
numberedCaption style term name num caption =
numberedCaption :: Text -> Text -> Text -> Int -> Text -> Doc Text -> Doc Text
numberedCaption style term name num ident caption =
let t = text $ T.unpack term
r = num - 1
s = inTags False "text:sequence" [ ("text:ref-name", "ref" <> name <> tshow r),
ident' = case ident of
"" -> "ref" <> name <> tshow r
_ -> ident
s = inTags False "text:sequence" [ ("text:ref-name", ident'),
("text:name", name),
("text:formula", "ooow:" <> name <> "+1"),
("style:num-format", "1") ] $ text $ show num
@@ -593,7 +611,9 @@ inlineToOpenDocument o ils
else do
report $ InlineNotRendered ils
return empty
Link _ l (s,t) -> mkLink s t <$> inlinesToOpenDocument o l
Link _ l (s,t) -> do
identTypes <- gets stIdentTypes
mkLink o identTypes s t <$> inlinesToOpenDocument o l
Image attr _ (s,t) -> mkImg attr s t
Note l -> mkNote l
where
@@ -605,10 +625,6 @@ inlineToOpenDocument o ils
unhighlighted s = inlinedCode $ preformatted s
preformatted s = handleSpaces $ escapeStringForXML s
inlinedCode s = return $ inTags False "text:span" [("text:style-name", "Source_Text")] s
mkLink s t = inTags False "text:a" [ ("xlink:type" , "simple")
, ("xlink:href" , s )
, ("office:name", t )
] . inSpanTags "Definition"
mkImg (_, _, kvs) s _ = do
id' <- gets stImageId
modify (\st -> st{ stImageId = id' + 1 })
@@ -635,6 +651,42 @@ inlineToOpenDocument o ils
addNote nn
return nn

mkLink :: WriterOptions -> [(Text,ReferenceType)] -> Text -> Text -> Doc Text -> Doc Text
mkLink o identTypes s t d =
let maybeIdentAndType = case T.uncons s of
Just ('#', ident) -> find ((ident ==) . fst) identTypes
_ -> Nothing
d' = inSpanTags "Definition" d
ref refType format ident = inTags False refType
[ ("text:reference-format", format ),
("text:ref-name", ident) ]
bookmarkRef = ref "text:bookmark-ref"
sequenceRef = ref "text:sequence-ref"
headerReference ident = if isEnabled Ext_number_prefix_references o
-- The number prefix on header references is empty
-- as generated by pandoc - indices need to be
-- refreshed for the number to be generated.
then bookmarkRef "number" ident mempty
<>
selfClosingTag "text:s" []
<>
bookmarkRef "text" ident d
else bookmarkRef "text" ident d
tableReference ident = bookmarkRef "text" ident d
imageReference ident = sequenceRef "text" ident d
link = inTags False "text:a" [ ("xlink:type" , "simple")
, ("xlink:href" , s )
, ("office:name", t )
] d'
linkOrReference = case maybeIdentAndType of
Just (ident, HeaderRef) -> headerReference ident
Just (ident, TableRef) -> tableReference ident
Just (ident, ImageRef) -> imageReference ident
_ -> link
in if isEnabled Ext_references_over_links o
then linkOrReference
else link

bulletListStyle :: PandocMonad m => Int -> OD m (Int,(Int,[Doc Text]))
bulletListStyle l = do
let doStyles i = inTags True "text:list-level-style-bullet"
40 changes: 40 additions & 0 deletions test/command/6774.md
@@ -0,0 +1,40 @@
```
% pandoc -f native -t opendocument --quiet
[Header 1 ("chapter1",[],[]) [Str "The",Space,Str "Chapter"]
,Para [Str "Chapter",Space,Str "1",Space,Str "references",Space,Link ("",[],[]) [Str "The",Space,Str "Chapter"] ("#chapter1","")]]
^D
<text:h text:style-name="Heading_20_1" text:outline-level="1"><text:bookmark-start text:name="chapter1" />The
Chapter<text:bookmark-end text:name="chapter1" /></text:h>
<text:p text:style-name="First_20_paragraph">Chapter 1 references
<text:a xlink:type="simple" xlink:href="#chapter1" office:name=""><text:span text:style-name="Definition">The
Chapter</text:span></text:a></text:p>
```
```
% pandoc -f native -t opendocument+references_over_links --quiet
[Header 1 ("chapter1",[],[]) [Str "The",Space,Str "Chapter"]
,Para [Str "Chapter",Space,Str "1",Space,Str "references",Space,Link ("",[],[]) [Str "The",Space,Str "Chapter"] ("#chapter1","")]
,Para [Image ("lalune",[],[]) [Str "lalune"] ("lalune.jpg","fig:Voyage dans la Lune")]
,Para [Str "Image",Space,Str "1",Space,Str "references",Space,Link ("",[],[]) [Str "La",Space,Str "Lune"] ("#lalune","")]]
^D
<text:h text:style-name="Heading_20_1" text:outline-level="1"><text:bookmark-start text:name="chapter1" />The
Chapter<text:bookmark-end text:name="chapter1" /></text:h>
<text:p text:style-name="First_20_paragraph">Chapter 1 references
<text:bookmark-ref text:reference-format="text" text:ref-name="chapter1">The
Chapter</text:bookmark-ref></text:p>
<text:p text:style-name="FigureWithCaption"><draw:frame draw:name="img1"><draw:image xlink:href="lalune.jpg" xlink:type="simple" xlink:show="embed" xlink:actuate="onLoad" /></draw:frame></text:p>
<text:p text:style-name="FigureCaption">lalune</text:p>
<text:p text:style-name="Text_20_body">Image 1 references
<text:sequence-ref text:reference-format="text" text:ref-name="lalune">La
Lune</text:sequence-ref></text:p>
```
```
% pandoc -f native -t opendocument+references_over_links+number_prefix_references --quiet
[Header 1 ("chapter1",[],[]) [Str "The",Space,Str "Chapter"]
,Para [Str "Chapter",Space,Str "1",Space,Str "references",Space,Link ("",[],[]) [Str "The",Space,Str "Chapter"] ("#chapter1","")]]
^D
<text:h text:style-name="Heading_20_1" text:outline-level="1"><text:bookmark-start text:name="chapter1" />The
Chapter<text:bookmark-end text:name="chapter1" /></text:h>
<text:p text:style-name="First_20_paragraph">Chapter 1 references
<text:bookmark-ref text:reference-format="number" text:ref-name="chapter1"></text:bookmark-ref><text:s /><text:bookmark-ref text:reference-format="text" text:ref-name="chapter1">The
Chapter</text:bookmark-ref></text:p>
```
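
The outputs in this test file depend on a pre-pass in the writer that
records which identifiers belong to headers, tables and images before
any link is rendered, and on numbered captions now using the element's
identifier as their reference name. A minimal sketch of both ideas
follows. It assumes a much-reduced, hypothetical document model
(`Block`, `Inline`) in place of pandoc's real AST and `query`
traversal, and `collectIdents` and `captionRefName` are illustrative
names that do not appear in the commit.

```haskell
-- Kinds of link target the writer distinguishes (mirrors ReferenceType in the diff).
data ReferenceType = HeaderRef | TableRef | ImageRef
  deriving (Eq, Show)

-- Hypothetical reduced model; pandoc's Block and Inline carry far more.
data Inline
  = Str String
  | Image String          -- image identifier
  | Link String [Inline]  -- target and label

data Block
  = Header String [Inline]  -- header identifier and text
  | Table String            -- table identifier
  | Para [Inline]

-- Collect (identifier, kind) pairs: block-level targets first, then images,
-- in the same order as the commit's two query passes.
collectIdents :: [Block] -> [(String, ReferenceType)]
collectIdents blocks =
  concatMap blockIdent blocks ++ concatMap inlineIdent (concatMap inlines blocks)
  where
    blockIdent (Header ident _) = [(ident, HeaderRef)]
    blockIdent (Table ident)    = [(ident, TableRef)]
    blockIdent _                = []

    inlineIdent (Image ident)  = [(ident, ImageRef)]
    inlineIdent (Link _ label) = concatMap inlineIdent label
    inlineIdent _              = []

    inlines (Header _ is) = is
    inlines (Para is)     = is
    inlines (Table _)     = []

-- Reference name for a numbered caption: the element's own identifier when
-- it has one, otherwise the generated name used before the commit.
captionRefName :: String -> Int -> String -> String
captionRefName name num ident
  | null ident = "ref" ++ name ++ show (num - 1)
  | otherwise  = ident
```

Falling back to the generated name keeps captions without an explicit
identifier unchanged, while captions with an identifier become
addressable by the same name that links in the source use.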