Extensions

Vladimir Schneider edited this page May 1, 2023 · 261 revisions

Extensions extend the parser, the HTML renderer, the formatter, or any combination of these. To use an extension, configure the builder objects with a list of extensions. Because extensions are optional, they live in separate artifacts and require additional dependencies in the project.

Let's look at how to enable tables from GitHub Flavored Markdown or MultiMarkdown, whichever you prefer. First, add all modules as a dependency (see Maven Central for individual modules):

<dependency>
    <groupId>com.vladsch.flexmark</groupId>
    <artifactId>flexmark-all</artifactId>
    <version>0.60.0</version>
</dependency>

Configure the extension on the builders:

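import com.vladsch.flexmark.ext.gfm.strikethrough.StrikethroughExtension;
import com.vladsch.flexmark.ext.tables.TablesExtension;
import com.vladsch.flexmark.html.HtmlRenderer;
import com.vladsch.flexmark.parser.Parser;
import com.vladsch.flexmark.util.data.DataHolder;
import com.vladsch.flexmark.util.data.MutableDataSet;

import java.util.Arrays;

class SomeClass {
    // Extensions are registered through Parser.EXTENSIONS so that both builders pick them up
    static final DataHolder OPTIONS = new MutableDataSet()
            .set(Parser.EXTENSIONS, Arrays.asList(TablesExtension.create(), StrikethroughExtension.create()))
            .toImmutable();

    // Build the parser and the renderer from the same options
    Parser parser = Parser.builder(OPTIONS).build();
    HtmlRenderer renderer = HtmlRenderer.builder(OPTIONS).build();
}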

Configuring Options

A generic options API allows easy configuration of the parser, renderer and extensions. It consists of DataKey<T> instances defined by various components. Each data key defines the type of its value and a default value.

The values are accessed via the DataHolder and MutableDataHolder interfaces, with the former being a read only container. Since the data key provides a unique identifier for the data there is no collision for options.
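
The idea of a typed key that carries its own default can be sketched without the library. The following is a simplified model, not flexmark's implementation: `OptionKey`, `OptionSet` and `TypedKeyDemo` are hypothetical names invented for illustration, and key identity (rather than the key name) is assumed to be what prevents collisions.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a typed data key: a name plus a typed default value.
final class OptionKey<T> {
    final String name;
    final T defaultValue;

    OptionKey(String name, T defaultValue) {
        this.name = name;
        this.defaultValue = defaultValue;
    }
}

// Hypothetical stand-in for a mutable data holder.
final class OptionSet {
    // Keys use identity equality, so two keys with the same name never collide.
    private final Map<OptionKey<?>, Object> values = new HashMap<>();

    <T> OptionSet set(OptionKey<T> key, T value) {
        values.put(key, value);
        return this;
    }

    // Returns the stored value, or the key's default when the key was never set.
    @SuppressWarnings("unchecked")
    <T> T get(OptionKey<T> key) {
        return values.containsKey(key) ? (T) values.get(key) : key.defaultValue;
    }
}

class TypedKeyDemo {
    static final OptionKey<Integer> INDENT_SIZE = new OptionKey<>("INDENT_SIZE", 0);
    static final OptionKey<Boolean> PERCENT_ENCODE_URLS = new OptionKey<>("PERCENT_ENCODE_URLS", false);

    public static void main(String[] args) {
        OptionSet options = new OptionSet().set(INDENT_SIZE, 2);
        int indent = options.get(INDENT_SIZE);               // explicitly set
        boolean encode = options.get(PERCENT_ENCODE_URLS);   // falls back to the key's default
        System.out.println(indent + " " + encode);           // prints "2 false"
    }
}
```

Because the value type is part of the key, a mismatched `set` call fails at compile time rather than at lookup time.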

To configure the parser or renderer, pass a data holder to the builder() method with the desired options configured, including extensions.

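// package names as of 0.60.0
import com.vladsch.flexmark.ext.tables.TablesExtension;
import com.vladsch.flexmark.html.HtmlRenderer;
import com.vladsch.flexmark.parser.Parser;
import com.vladsch.flexmark.util.ast.KeepType;
import com.vladsch.flexmark.util.data.DataHolder;
import com.vladsch.flexmark.util.data.MutableDataSet;

import java.util.Arrays;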

public class SomeClass {
    static final DataHolder OPTIONS = new MutableDataSet()
            .set(Parser.REFERENCES_KEEP, KeepType.LAST)
            .set(HtmlRenderer.INDENT_SIZE, 2)
            .set(HtmlRenderer.PERCENT_ENCODE_URLS, true)

            // for full GFM table compatibility add the following table extension options:
            .set(TablesExtension.COLUMN_SPANS, false)
            .set(TablesExtension.APPEND_MISSING_COLUMNS, true)
            .set(TablesExtension.DISCARD_EXTRA_COLUMNS, true)
            .set(TablesExtension.HEADER_SEPARATOR_COLUMN_MATCH, true)
            .set(Parser.EXTENSIONS, Arrays.asList(TablesExtension.create()))
            .toImmutable();

    static final Parser PARSER = Parser.builder(OPTIONS).build();
    static final HtmlRenderer RENDERER = HtmlRenderer.builder(OPTIONS).build();
}

In the code sample above, Parser.REFERENCES_KEEP defines the behavior of references when duplicate references are defined in the source. In this case it is configured to keep the last value, whereas the default behavior is to keep the first value.

HtmlRenderer.INDENT_SIZE and HtmlRenderer.PERCENT_ENCODE_URLS define options used for rendering. Extension options can be set in the same way. Any options not set fall back to the defaults defined by their data keys.

All markdown element reference types should be stored using a subclass of NodeRepository<T> as is the case for references, abbreviations and footnotes. This provides a consistent mechanism for overriding the default behavior of these references for duplicates from keep first to keep last.
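
The keep-first versus keep-last behavior for duplicate definitions can be modeled in a few lines. This is a sketch under stated assumptions, not the library's `NodeRepository`: `Keep`, `ReferenceStore` and `KeepDemo` are hypothetical names, and ids are lowercased as a rough stand-in for reference label normalization.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical policy enum mirroring the first/last choice described above.
enum Keep { FIRST, LAST }

// Simplified repository of reference definitions.
final class ReferenceStore {
    private final Map<String, String> urls = new HashMap<>();
    private final Keep keep;

    ReferenceStore(Keep keep) {
        this.keep = keep;
    }

    void define(String id, String url) {
        // Simplification: ids are compared case-insensitively.
        String key = id.toLowerCase(Locale.ROOT);
        if (keep == Keep.LAST) {
            urls.put(key, url);          // later definitions overwrite earlier ones
        } else {
            urls.putIfAbsent(key, url);  // the first definition wins
        }
    }

    String resolve(String id) {
        return urls.get(id.toLowerCase(Locale.ROOT));
    }
}

class KeepDemo {
    public static void main(String[] args) {
        for (Keep keep : Keep.values()) {
            ReferenceStore store = new ReferenceStore(keep);
            store.define("ref", "/first");
            store.define("ref", "/second");
            System.out.println(keep + " -> " + store.resolve("ref"));
        }
    }
}
```

Running `KeepDemo` prints `FIRST -> /first` followed by `LAST -> /second`.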

By convention, data keys are defined and described in their respective extension class; core keys are defined and described in Parser or HtmlRenderer.

Core

Core implements parser, HTML renderer and formatter functionality for CommonMark markdown elements.

Parser

Unified options handling was added. The same options are also used to selectively disable loading of core parsers and processors.

Parser.builder() now implements MutableDataHolder so you can use get/set to customize properties directly on it or pass it a DataHolder with predefined options.

Defined in Parser class:

  • Parser.ASTERISK_DELIMITER_PROCESSOR : default true enable asterisk delimiter inline processing.

  • Parser.BLANK_LINES_IN_AST default false, set to true to include blank line nodes in the AST.

  • Parser.BLOCK_QUOTE_ALLOW_LEADING_SPACE : default true, when true leading spaces before > are allowed

  • Parser.BLOCK_QUOTE_EXTEND_TO_BLANK_LINE : default false, when true block quotes extend to the next blank line. Enables more customary block quote parsing than the strict CommonMark standard.

  • Parser.BLOCK_QUOTE_IGNORE_BLANK_LINE : default false, when true block quotes will include blank lines between block quotes and treat them as if the blank lines are also preceded by the block quote marker

  • Parser.BLOCK_QUOTE_INTERRUPTS_ITEM_PARAGRAPH : default true, when true block quotes can interrupt list item text, else need blank line before to be included in list items

  • Parser.BLOCK_QUOTE_INTERRUPTS_PARAGRAPH : default true, when true block quote can interrupt a paragraph, else needs blank line before

  • Parser.BLOCK_QUOTE_PARSER : default true, when true parsing of block quotes is enabled

  • Parser.BLOCK_QUOTE_WITH_LEAD_SPACES_INTERRUPTS_ITEM_PARAGRAPH : default true, when true block quotes with leading spaces can interrupt list item text, else need blank line before or no leading spaces

  • Parser.CODE_BLOCK_INDENT, default Parser.LISTS_ITEM_INDENT, setting which determines how many leading spaces will treat the following text as an indented block.

    ℹ️ In parsers, use state.getParsing().CODE_BLOCK_INDENT to ensure that all parsers have the same setting. Parsing copies the setting from options on creation so having this option changed after parsing phase has started will have no effect.

  • Parser.CODE_SOFT_LINE_BREAKS : default false, set to true to include soft line break nodes in code blocks

  • Parser.CODE_STYLE_HTML_CLOSE : default (String)null override HTML to use for wrapping style, if null then no override

  • Parser.CODE_STYLE_HTML_OPEN : default (String)null override HTML to use for wrapping style, if null then no override

  • Parser.EMPHASIS_STYLE_HTML_CLOSE : default (String)null override HTML to use for wrapping style, if null then no override

  • Parser.EMPHASIS_STYLE_HTML_OPEN : default (String)null override HTML to use for wrapping style, if null then no override

  • Parser.EXTENSIONS : default empty list, list of extensions to use for builders. Can be used instead of passing extensions to the parser builder and renderer builder.

  • Parser.FENCED_CODE_BLOCK_PARSER : default true enable parsing of fenced code blocks

  • Parser.FENCED_CODE_CONTENT_BLOCK : default false add CodeBlock as child to contain the code of a fenced block. Otherwise the code is contained in a Text node.

  • Parser.HARD_LINE_BREAK_LIMIT : default false, when true only the last 2 spaces of a hard line break are included in the HardLineBreak node. Otherwise all spaces are used.

  • Parser.HEADING_CAN_INTERRUPT_ITEM_PARAGRAPH : default true allow headings to interrupt list item paragraphs

  • Parser.HEADING_NO_ATX_SPACE : default false, when true allows headings without a space between # and the heading text

  • Parser.HEADING_NO_EMPTY_HEADING_WITHOUT_SPACE default false, set to true to not recognize empty headings without a space following #

  • Parser.HEADING_NO_LEAD_SPACE : default false, when true does not allow non-indent spaces before # for ATX headings, or before the text or -/= marker for setext headings (pegdown and GFM); when false CommonMark rules apply.

  • Parser.HEADING_PARSER : default true enable parsing of headings

  • Parser.HEADING_SETEXT_MARKER_LENGTH : default 1 sets the minimum number of - or = needed under a setext heading text before it being recognized as a heading.

  • Parser.HTML_ALLOW_NAME_SPACE : default false, when true will allow a name space prefix for HTML elements. ❗ HTML Deep Parser always enables name spaces and ignores this option.

  • Parser.HTML_BLOCK_COMMENT_ONLY_FULL_LINE : default false, when true will allow embedded HTML comments to interrupt paragraphs.

  • Parser.HTML_BLOCK_DEEP_PARSE_BLANK_LINE_INTERRUPTS_PARTIAL_TAG default true, when true a blank line interrupts a partially open tag, i.e. <TAG without a corresponding > and with a blank line before the >

  • Parser.HTML_BLOCK_DEEP_PARSE_BLANK_LINE_INTERRUPTS default true, when true Blank line interrupts HTML block when not in raw tag, otherwise only when closed

  • Parser.HTML_BLOCK_DEEP_PARSE_FIRST_OPEN_TAG_ON_ONE_LINE default false, when true do not parse open tags unless they are contained on one line.

  • Parser.HTML_BLOCK_DEEP_PARSE_INDENTED_CODE_INTERRUPTS default false, when true Indented code can interrupt HTML block without a preceding blank line.

  • Parser.HTML_BLOCK_DEEP_PARSE_MARKDOWN_INTERRUPTS_CLOSED default false, when true Other markdown elements can interrupt a closed HTML block without an intervening blank line

  • Parser.HTML_BLOCK_DEEP_PARSE_NON_BLOCK default true, parse non-block tags inside HTML blocks

  • Parser.HTML_BLOCK_DEEP_PARSER default false - enable deep HTML block parsing

  • Parser.HTML_BLOCK_PARSER : default true enable parsing of html blocks

  • Parser.HTML_BLOCK_START_ONLY_ON_BLOCK_TAGS, default for deep html parser is true and regular parser false, but you can set your desired specific value and override the default. When true will not start an HTML block on a non-block tag, when false any tag will start an HTML block.

  • Parser.HTML_BLOCK_TAGS, default list: address, article, aside, base, basefont, blockquote, body, caption, center, col, colgroup, dd, details, dialog, dir, div, dl, dt, fieldset, figcaption, figure, footer, form, frame, frameset, h1, h2, h3, h4, h5, h6, head, header, hr, html, iframe, legend, li, link, main, math, menu, menuitem, meta, nav, noframes, ol, optgroup, option, p, param, section, source, summary, table, tbody, td, tfoot, th, thead, title, tr, track, ul, sets the HTML block tags

  • Parser.HTML_COMMENT_BLOCKS_INTERRUPT_PARAGRAPH : default true, enables HTML comments to interrupt paragraphs; otherwise comments need a preceding blank line to be interpreted as HTML blocks.

  • Parser.HTML_FOR_TRANSLATOR : default false, when true the parser is modified to allow translator placeholders to be recognized

  • Parser.INDENTED_CODE_BLOCK_PARSER : default true enable parsing of indented code block

  • Parser.INDENTED_CODE_NO_TRAILING_BLANK_LINES : default true enable removing trailing blank lines from indented code blocks

  • Parser.INLINE_DELIMITER_DIRECTIONAL_PUNCTUATIONS default false, set to true to use delimiter parsing rules that take bracket open/close into account

  • Parser.INTELLIJ_DUMMY_IDENTIFIER : default false add '\u001f' to all parse patterns as an allowable character, used by plugin to allow for IntelliJ completion location marker

  • Parser.LINKS_ALLOW_MATCHED_PARENTHESES : default true, when false uses spec 0.27 rules for link addresses and does not resolve matched ()

  • Parser.LIST_BLOCK_PARSER : default true enable parsing of lists

  • Parser.LIST_REMOVE_EMPTY_ITEMS, default false. If true then empty list items or list items which only contain a BlankLine node are removed from the output.

  • Parser.LISTS_END_ON_DOUBLE_BLANK : default false, when true lists are terminated by a double blank line as per spec 0.27 rules.

  • Parser.LISTS_ITEM_PREFIX_CHARS : default "*-+", specify which characters mark a list item

  • Parser.LISTS_LOOSE_WHEN_CONTAINS_BLANK_LINE : default false, when true a list item which contains a blank line will be considered a loose list item.

  • Parser.LISTS_LOOSE_WHEN_LAST_ITEM_PREV_HAS_TRAILING_BLANK_LINE : default false, when true will consider a list with last item followed by blank line as a loose list.

  • Parser.MATCH_CLOSING_FENCE_CHARACTERS : default true, whether the closing fence character has to match the opening character. When false, back ticks can open and tildes close, and vice versa. The number of characters in the opener and closer still has to be the same.

  • Parser.MATCH_NESTED_LINK_REFS_FIRST : default true, custom link ref processors that take nested [] have priority over ones that do not, i.e. [[^f]][test] is a wiki link with ^f as page ref followed by ref link test when this option is true. If false then the same would be a ref link test with a footnote ^f reference for text

  • Parser.PARSE_INNER_HTML_COMMENTS : default false when true will parse inner HTML comments in HTML blocks

  • Parser.PARSE_JEKYLL_MACROS_IN_URLS default false, set to true to allow any characters to appear between {{ and }} in URLs, including spaces, pipes and backslashes

  • Parser.PARSE_MULTI_LINE_IMAGE_URLS : default false, when true allows parsing of multi-line image URLs.

  • Parser.PARSER_EMULATION_PROFILE default ParserEmulationProfile.COMMONMARK, set to desired one of the ParserEmulationProfile values

  • Parser.REFERENCE_PARAGRAPH_PRE_PROCESSOR : default true enable parsing of reference definitions

  • Parser.REFERENCES_KEEP : default KeepType.FIRST, which of the duplicate reference definitions to keep.

  • Parser.REFERENCES : default new repository, the repository for the document's reference definitions

  • Parser.SPACE_IN_LINK_ELEMENTS default false, set to true to allow whitespace between ![] or [] and () of links or images.

  • Parser.SPACE_IN_LINK_URLS : default false, when true accepts spaces in a link address as long as they are not followed by "

  • Parser.STRONG_EMPHASIS_STYLE_HTML_CLOSE : default (String)null override HTML to use for wrapping style, if null then no override

  • Parser.STRONG_EMPHASIS_STYLE_HTML_OPEN : default (String)null override HTML to use for wrapping style, if null then no override

  • Parser.STRONG_WRAPS_EMPHASIS : default false, when true uses spec 0.27 rules for emphasis/strong resolution

  • Parser.THEMATIC_BREAK_PARSER : default true enable parsing of thematic breaks

  • Parser.THEMATIC_BREAK_RELAXED_START : default true enable parsing of thematic breaks which are not preceded by a blank line

  • Parser.TRACK_DOCUMENT_LINES default false. When true document lines are tracked in the document's lineSegments list and offset to line method can be used to get the 0-based line number for the given offset. When false these functions compute the line number by counting EOL sequences before the offset.

  • Parser.UNDERSCORE_DELIMITER_PROCESSOR : default true whether to process underscore delimiters

  • Parser.USE_HARDCODED_LINK_ADDRESS_PARSER : default true, when false uses regex for link address parsing, which causes StackOverflowError for long URLs

  • Parser.WWW_AUTO_LINK_ELEMENT : default false, when true www. prefix is recognized as an auto-link prefix
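
The trade-off described for Parser.TRACK_DOCUMENT_LINES (a precomputed line list versus counting EOL sequences on each lookup) can be illustrated with a standalone sketch. This is not flexmark's code: `LineLookup` and its methods are hypothetical, and only `\n` is treated as a line end for simplicity.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LineLookup {
    // Untracked variant: count line feeds before the offset on every call, O(offset).
    static int lineByCounting(CharSequence text, int offset) {
        int line = 0;
        for (int i = 0; i < offset; i++) {
            if (text.charAt(i) == '\n') line++;
        }
        return line;
    }

    // Tracked variant, step 1: record the start offset of every line once.
    static List<Integer> lineStarts(CharSequence text) {
        List<Integer> starts = new ArrayList<>();
        starts.add(0);
        for (int i = 0; i < text.length(); i++) {
            if (text.charAt(i) == '\n') starts.add(i + 1);
        }
        return starts;
    }

    // Tracked variant, step 2: binary search the recorded starts, O(log lines).
    static int lineByIndex(List<Integer> starts, int offset) {
        int pos = Collections.binarySearch(starts, offset);
        // When not found, binarySearch returns -(insertionPoint) - 1;
        // the containing line is the one just before the insertion point.
        return pos >= 0 ? pos : -pos - 2;
    }

    public static void main(String[] args) {
        String text = "a\nbc\n\nd";
        List<Integer> starts = lineStarts(text);
        // Offset 6 is the 'd' on the fourth line, so the 0-based line number is 3.
        System.out.println(lineByCounting(text, 6) + " " + lineByIndex(starts, 6)); // prints "3 3"
    }
}
```

Both variants return the same 0-based line number; the tracked one pays for an extra list up front in exchange for faster repeated lookups.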

ℹ️ Parser.USE_HARDCODED_LINK_ADDRESS_PARSER defaults to true because regex-based parsing requires much more stack space and causes a StackOverflowError when parsing link URLs longer than about 1.5k characters. The option exists only for backwards compatibility and for cases where someone customizes the parsing regex. Performance of the hard-coded parser is on par with the regex one, while requiring no stack space for parsing.

| Test | Regex Parser | Hardcoded Parser | Options |
|:--------------------------------|---------------:|-----------------:|:--------------|
| emphasisClosersWithNoOpeners | 205 ms | 221 ms | Default |
| emphasisOpenersWithNoClosers | 162 ms | 170 ms | Default |
| linkClosersWithNoOpeners | 61 ms | 65 ms | Default |
| linkOpenersAndEmphasisClosers | 277 ms | 286 ms | Default |
| linkOpenersWithNoClosers | 87 ms | 89 ms | Default |
| StackOverflow longImageLinkTest | 136 ms | 738 ms | Default |
| longLinkTest | 77 ms | 63 ms | Default |
| mismatchedOpenersAndClosers | 264 ms | 342 ms | Default |
| nestedBrackets | 85 ms | 72 ms | Default |
| nestedStrongEmphasis | 8 ms | 6 ms | Default |
| emphasisClosersWithNoOpeners | 173 ms | 113 ms | Space in URLs |
| emphasisOpenersWithNoClosers | 163 ms | 123 ms | Space in URLs |
| linkClosersWithNoOpeners | 55 ms | 56 ms | Space in URLs |
| linkOpenersAndEmphasisClosers | 216 ms | 229 ms | Space in URLs |
| linkOpenersWithNoClosers | 85 ms | 85 ms | Space in URLs |
| longImageLinkTest | Stack Overflow | 684 ms | Space in URLs |
| longLinkTest | Stack Overflow | 71 ms | Space in URLs |
| mismatchedOpenersAndClosers | 214 ms | 327 ms | Space in URLs |
| nestedBrackets | 55 ms | 79 ms | Space in URLs |
| nestedStrongEmphasis | 5 ms | 6 ms | Space in URLs |
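
Why a hand-coded scanner needs no extra stack while a regex may can be shown with a rough, standalone illustration. The pattern and the `LinkAddressScan` class below are made-up stand-ins, not flexmark's actual link address regex or parser, and whether the regex overflows depends on the JVM and its stack size, so the example catches the error instead of assuming it.

```java
import java.util.regex.Pattern;

class LinkAddressScan {
    // Made-up stand-in pattern: a run of escaped characters or
    // characters that are neither a backslash nor whitespace.
    static final Pattern REGEX = Pattern.compile("(?:\\\\.|[^\\\\\\s])*");

    // Roughly equivalent hand-coded scan: a plain loop, so stack use
    // stays constant regardless of input length.
    static int scan(CharSequence s) {
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                i += 2;                                   // escaped character
            } else if (c == '\\' || Character.isWhitespace(c)) {
                break;                                    // end of the address
            } else {
                i++;
            }
        }
        return i;   // number of characters accepted
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 200_000; i++) sb.append('a');
        String longUrl = sb.toString();

        System.out.println("loop accepted " + scan(longUrl) + " chars");
        try {
            System.out.println("regex matched: " + REGEX.matcher(longUrl).matches());
        } catch (StackOverflowError e) {
            // A backtracking engine may recurse once per repetition of the group.
            System.out.println("regex overflowed the stack");
        }
    }
}
```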

List Parsing Options

List parsing is the greatest discrepancy between Markdown parser implementations. Before CommonMark there was no hard specification for parsing lists, and every implementation took artistic license in determining what a list should look like.

flexmark-java implements four parser families based on their list processing characteristics. In addition to the ParserEmulationProfile setting, each family has a standard set of common options that control list processing, with defaults set by the family but modifiable by the end user.

There are a few ways to configure the list parsing options:

  1. the recommended way is to apply ParserEmulationProfile to options via MutableDataHolder.setFrom(ParserEmulationProfile) to have all options configured for a particular parser.
  2. start with the ParserEmulationProfile.getOptions() and modify defaults for the family and then pass it to MutableDataHolder.setFrom(MutableDataSetter)
  3. by configuring an instance of MutableListOptions and then passing it to MutableDataHolder.setFrom(MutableDataSetter)
  4. by setting individual keys

List Item Options

  • Parser.LISTS_CODE_INDENT : default 4, can be the same as LISTS_ITEM_INDENT or double that for parser emulation families which count indentation from first list item indent
  • Parser.LISTS_ITEM_INDENT : default 4, is also the INDENTED CODE INDENT. Parser emulation family either does not use this or expects it to be the number of columns to next indent item (in this case indented code is the same)
  • Parser.LISTS_NEW_ITEM_CODE_INDENT : default 4, new item with content indent >= this value causes an empty item with code indent child, weird standard, so far only for CommonMark
  • empty list items require explicit space after marker Parser.LISTS_ITEM_MARKER_SPACE, ListOptions.itemMarkerSpace
  • mismatch item type starts a new list: Parser.LISTS_ITEM_TYPE_MISMATCH_TO_NEW_LIST, ListOptions.itemTypeMismatchToNewList
  • mismatch item type starts a sub-list: Parser.LISTS_ITEM_TYPE_MISMATCH_TO_SUB_LIST, ListOptions.itemTypeMismatchToSubList
  • bullet or ordered item delimiter mismatch starts a new list: Parser.LISTS_DELIMITER_MISMATCH_TO_NEW_LIST, ListOptions.delimiterMismatchToNewList
  • ordered items only with . after digit, otherwise ) is also allowed: Parser.LISTS_ORDERED_ITEM_DOT_ONLY, ListOptions.orderedItemDotOnly
  • first ordered item prefix sets start number of list: Parser.LISTS_ORDERED_LIST_MANUAL_START, ListOptions.orderedListManualStart
  • item is loose if it contains a blank line after its item text: Parser.LISTS_LOOSE_WHEN_BLANK_LINE_FOLLOWS_ITEM_PARAGRAPH, ListOptions.looseWhenBlankLineFollowsItemParagraph
  • item is loose if it is first item and has trailing blank line or if previous item has a trailing blank line: Parser.LISTS_LOOSE_WHEN_PREV_HAS_TRAILING_BLANK_LINE, ListOptions.looseWhenPrevHasTrailingBlankLine
  • item is loose if it has loose sub-item: Parser.LISTS_LOOSE_WHEN_HAS_LOOSE_SUB_ITEM, ListOptions.looseWhenHasLooseSubItem
  • item is loose if it has trailing blank line in it or its last child: Parser.LISTS_LOOSE_WHEN_HAS_TRAILING_BLANK_LINE, ListOptions.looseWhenHasTrailingBlankLine
  • item is loose if it has non-list children: Parser.LISTS_LOOSE_WHEN_HAS_NON_LIST_CHILDREN, ListOptions.looseWhenHasNonListChildren
  • all items are loose if any in the list are loose: Parser.LISTS_AUTO_LOOSE, ListOptions.autoLoose
  • auto loose list setting Parser.LISTS_AUTO_LOOSE only applies to simple 1 level lists: Parser.LISTS_AUTO_LOOSE_ONE_LEVEL_LISTS, ListOptions.autoLooseOneLevelLists
  • list item marker suffixes, content indent computed after suffix: Parser.LISTS_ITEM_MARKER_SUFFIXES, ListOptions.itemMarkerSuffixes
  • list item marker suffixes apply to numbered items: Parser.LISTS_NUMBERED_ITEM_MARKER_SUFFIXED, ListOptions.numberedItemMarkerSuffixed
  • list item marker suffixes are not used to affect indent offset for child items: Parser.LISTS_ITEM_CONTENT_AFTER_SUFFIX, default false, set to true to treat the item suffix as part of the list item marker after which the content begins for indentation purposes. Gfm Task List uses the default setting.
  • Parser.LISTS_AUTO_LOOSE_ONE_LEVEL_LISTS, default false, when true will determine looseness of list item by its content, otherwise will use looseness of contained lists
  • Parser.LISTS_LOOSE_WHEN_LAST_ITEM_PREV_HAS_TRAILING_BLANK_LINE, default false, when true a last list item which is preceded by a blank line will be marked as loose item.
  • Parser.LISTS_LOOSE_WHEN_CONTAINS_BLANK_LINE, default false, when true an item will be marked as loose if it contains a blank line
  • Parser.LISTS_ITEM_PREFIX_CHARS, default "*-+", set to allowed list item prefixes

⚠️ If both LISTS_ITEM_TYPE_MISMATCH_TO_NEW_LIST and LISTS_ITEM_TYPE_MISMATCH_TO_SUB_LIST are set to true then a new list will be created if the item had a blank line, otherwise a sub-list is created.

List Item Paragraph Interruption Options
  • bullet item can interrupt a paragraph: Parser.LISTS_BULLET_ITEM_INTERRUPTS_PARAGRAPH, ListOptions.itemInterrupt.bulletItemInterruptsParagraph
  • ordered item can interrupt a paragraph: Parser.LISTS_ORDERED_ITEM_INTERRUPTS_PARAGRAPH, ListOptions.itemInterrupt.orderedItemInterruptsParagraph
  • ordered non 1 item can interrupt a paragraph: Parser.LISTS_ORDERED_NON_ONE_ITEM_INTERRUPTS_PARAGRAPH, ListOptions.itemInterrupt.orderedNonOneItemInterruptsParagraph
  • empty bullet item can interrupt a paragraph: Parser.LISTS_EMPTY_BULLET_ITEM_INTERRUPTS_PARAGRAPH, ListOptions.itemInterrupt.emptyBulletItemInterruptsParagraph
  • empty ordered item can interrupt a paragraph: Parser.LISTS_EMPTY_ORDERED_ITEM_INTERRUPTS_PARAGRAPH, ListOptions.itemInterrupt.emptyOrderedItemInterruptsParagraph
  • empty ordered non 1 item can interrupt a paragraph: Parser.LISTS_EMPTY_ORDERED_NON_ONE_ITEM_INTERRUPTS_PARAGRAPH, ListOptions.itemInterrupt.emptyOrderedNonOneItemInterruptsParagraph
  • bullet item can interrupt a paragraph of a list item: Parser.LISTS_BULLET_ITEM_INTERRUPTS_ITEM_PARAGRAPH, ListOptions.itemInterrupt.bulletItemInterruptsItemParagraph
  • ordered item can interrupt a paragraph of a list item: Parser.LISTS_ORDERED_ITEM_INTERRUPTS_ITEM_PARAGRAPH, ListOptions.itemInterrupt.orderedItemInterruptsItemParagraph
  • ordered non 1 item can interrupt a paragraph of a list item: Parser.LISTS_ORDERED_NON_ONE_ITEM_INTERRUPTS_ITEM_PARAGRAPH, ListOptions.itemInterrupt.orderedNonOneItemInterruptsItemParagraph
  • empty bullet item can interrupt a paragraph of a list item: Parser.LISTS_EMPTY_BULLET_ITEM_INTERRUPTS_ITEM_PARAGRAPH, ListOptions.itemInterrupt.emptyBulletItemInterruptsItemParagraph
  • empty ordered item can interrupt a paragraph of a list item: Parser.LISTS_EMPTY_ORDERED_ITEM_INTERRUPTS_ITEM_PARAGRAPH, ListOptions.itemInterrupt.emptyOrderedItemInterruptsItemParagraph
  • empty ordered non 1 item can interrupt a paragraph of a list item: Parser.LISTS_EMPTY_ORDERED_NON_ONE_ITEM_INTERRUPTS_ITEM_PARAGRAPH, ListOptions.itemInterrupt.emptyOrderedNonOneItemInterruptsItemParagraph
  • empty bullet sub-item can interrupt a paragraph of a list item: Parser.LISTS_EMPTY_BULLET_SUB_ITEM_INTERRUPTS_ITEM_PARAGRAPH, ListOptions.itemInterrupt.emptyBulletSubItemInterruptsItemParagraph
  • empty ordered sub-item can interrupt a paragraph of a list item: Parser.LISTS_EMPTY_ORDERED_SUB_ITEM_INTERRUPTS_ITEM_PARAGRAPH, ListOptions.itemInterrupt.emptyOrderedSubItemInterruptsItemParagraph
  • empty ordered non 1 sub-item can interrupt a paragraph of a list item: Parser.LISTS_EMPTY_ORDERED_NON_ONE_SUB_ITEM_INTERRUPTS_ITEM_PARAGRAPH, ListOptions.itemInterrupt.emptyOrderedNonOneSubItemInterruptsItemParagraph

Renderer

Unified options handling was added. Existing configuration options were kept, but they now modify the corresponding unified option.

The renderer builder now has an indentSize(int) method to set the size of indentation for hierarchical tags. This is the same as setting the HtmlRenderer.INDENT_SIZE data key in options.

Defined in HtmlRenderer class:

  • AUTOLINK_WWW_PREFIX : default "http://" prefix to add to autolink if it starts with www.

  • CODE_STYLE_HTML_CLOSE default (String) null, custom inline code close HTML

  • CODE_STYLE_HTML_OPEN default (String) null, custom inline code open HTML

  • DO_NOT_RENDER_LINKS default false, Disable link rendering in the document. This will cause sub-contexts to also have link rendering disabled.

  • EMPHASIS_STYLE_HTML_CLOSE default (String) null, custom emphasis close HTML

  • EMPHASIS_STYLE_HTML_OPEN default (String) null, custom emphasis open HTML

  • ESCAPE_HTML_BLOCKS default value of ESCAPE_HTML, escape html blocks found in the document

  • ESCAPE_HTML_COMMENT_BLOCKS default value of ESCAPE_HTML_BLOCKS, escape html comment blocks found in the document.

  • ESCAPE_HTML default false, escape all html found in the document

  • ESCAPE_INLINE_HTML_COMMENTS default value of ESCAPE_HTML_BLOCKS, escape inline html comments found in the document

  • ESCAPE_INLINE_HTML default value of ESCAPE_HTML, escape inline html found in the document

  • FENCED_CODE_LANGUAGE_CLASS_MAP, default new HashMap(), provides individual language to class mapping string. If language string is not in the map then FENCED_CODE_LANGUAGE_CLASS_PREFIX + language will be used.

  • FENCED_CODE_LANGUAGE_CLASS_PREFIX default "language-", prefix used for generating the <code> class for a fenced code block, only used if info is not empty and language is not defined in FENCED_CODE_LANGUAGE_CLASS_MAP

  • FENCED_CODE_NO_LANGUAGE_CLASS default "", class for the <code> tag of indented code or of a fenced code block without info (language) specification

  • FORMAT_FLAGS default 0, Flags used for FormattingAppendable used for rendering HTML

  • GENERATE_HEADER_ID default false, Generate a header id attribute using the configured HtmlIdGenerator but not render it. The id may be used by another element such as an anchor link.

  • HARD_BREAK default "<br />\n", string to use for rendering hard breaks

  • HEADER_ID_ADD_EMOJI_SHORTCUT, default false. When set to true, emoji shortcut nodes add the shortcut to collected text used to generate heading id.

  • HEADER_ID_GENERATOR_NO_DUPED_DASHES default false, When true duplicate - in id will be replaced by a single -

  • HEADER_ID_GENERATOR_NON_ASCII_TO_LOWERCASE, default true. When set to false changes the default header id generator to not convert non-ascii alphabetic characters to lowercase. Needed for GitHub id compatibility.

  • HEADER_ID_GENERATOR_RESOLVE_DUPES default true, When true will add an incrementing integer to duplicate ids to make them unique

  • HEADER_ID_GENERATOR_TO_DASH_CHARS default "_", set of characters to convert to - in text used to generate id, non-alpha numeric chars not in set will be removed

  • HEADER_ID_REF_TEXT_TRIM_LEADING_SPACES, default true. When set to false then leading spaces in link reference text in heading is not trimmed for text used to generate id.

  • HEADER_ID_REF_TEXT_TRIM_TRAILING_SPACES, default true. When set to false then trailing spaces in link reference text in heading is not trimmed for text used to generate id.

  • HTML_BLOCK_CLOSE_TAG_EOL default true, When false will suppress EOL after HTML block tags which are automatically generated during html rendering.

  • HTML_BLOCK_OPEN_TAG_EOL default true, When false will suppress EOL before HTML block tags which are automatically generated during html rendering.

  • INDENT_SIZE default 0, how many spaces to use for each indent level of nested tags

  • INLINE_CODE_SPLICE_CLASS default (String) null, used for splitting and splicing inline code spans for source line tracking

  • MAX_TRAILING_BLANK_LINES default 1, Maximum allowed trailing blank lines for rendered HTML

  • OBFUSCATE_EMAIL_RANDOM default true, When false will not use random number generator for obfuscation. Used for tests

  • OBFUSCATE_EMAIL default false, When true will obfuscate mailto links

  • PERCENT_ENCODE_URLS default false, percent encode urls

  • RECHECK_UNDEFINED_REFERENCES default false, Recheck the existence of references in Parser.REFERENCES for link and image refs marked undefined. Used when new references are added after parsing

  • RENDER_HEADER_ID default false, Render a header id attribute for headers using the configured HtmlIdGenerator

  • SOFT_BREAK default "\n", string to use for rendering soft breaks

  • SOURCE_POSITION_ATTRIBUTE default "", name of the source position HTML attribute, source position is assigned as startOffset + '-' + endOffset

  • SOURCE_POSITION_PARAGRAPH_LINES default false, if true enables wrapping individual paragraph source lines in span with source position attribute set

  • SOURCE_WRAP_HTML_BLOCKS default value of SOURCE_WRAP_HTML, if generating source position attribute, then wrap HTML blocks in div with source position

  • SOURCE_WRAP_HTML default false, if generating source position attribute, then wrap HTML with source position

  • STRONG_EMPHASIS_STYLE_HTML_CLOSE default (String) null, custom strong emphasis close HTML

  • STRONG_EMPHASIS_STYLE_HTML_OPEN default (String) null, custom strong emphasis open HTML

  • SUPPRESS_HTML_BLOCKS default value of SUPPRESS_HTML, suppress html output for html blocks

  • SUPPRESS_HTML_COMMENT_BLOCKS default value of SUPPRESS_HTML_BLOCKS, suppress html output for html comment blocks

  • SUPPRESS_HTML default false, suppress html output for all html

  • SUPPRESS_INLINE_HTML_COMMENTS default value of SUPPRESS_INLINE_HTML, suppress html output for inline html comments

  • SUPPRESS_INLINE_HTML default value of SUPPRESS_HTML, suppress html output for inline html

  • SUPPRESS_LINKS default value of "javascript:.*", a regular expression to suppress any links that match. The test occurs before the link is resolved using a link resolver. Therefore any link matching will be suppressed before it is resolved. Likewise, a link resolver returning a suppressed link will not be suppressed since this is up to the custom link resolver to handle.

    Suppressed links will render only the child nodes, effectively [Something New](javascript:alert(1)) will render as if it was Something New.

    Link suppression based on URL prefixes does not apply to HTML inline or block elements. Use HTML suppression options for this.

  • TYPE default "HTML", renderer type. Renderer type extensions can add their own. JiraConverterExtension defines JIRA

  • UNESCAPE_HTML_ENTITIES default true, When false will leave HTML entities as is in the rendered HTML.
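
The effect of the default SUPPRESS_LINKS pattern can be sketched with plain java.util.regex. This is an approximation for illustration only: `SuppressLinksDemo` is a hypothetical class, the whole URL is assumed to have to match the pattern, and the `render` helper skips escaping and everything else a real renderer does.

```java
import java.util.regex.Pattern;

class SuppressLinksDemo {
    // Default value listed for SUPPRESS_LINKS.
    static final Pattern SUPPRESSED = Pattern.compile("javascript:.*");

    // Assumption: the whole URL must match the pattern.
    static boolean isSuppressed(String url) {
        return SUPPRESSED.matcher(url).matches();
    }

    // Suppressed links keep only their text; other links render as anchors.
    static String render(String text, String url) {
        return isSuppressed(url) ? text : "<a href=\"" + url + "\">" + text + "</a>";
    }

    public static void main(String[] args) {
        System.out.println(render("Something New", "javascript:alert(1)"));
        System.out.println(render("Docs", "https://example.com/docs"));
    }
}
```

The first call prints `Something New` and the second prints `<a href="https://example.com/docs">Docs</a>`.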

❗ All the escape and suppress options have dynamic default values. This allows you to set ESCAPE_HTML and have all HTML escaped. If you set a value for a specific key, then that value is used for that key. Similarly, comment-affecting keys take their values from their non-comment counterparts. To exclude comments from suppression or escaping, set the corresponding comment key to false and the non-comment key to true.
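
A dynamic default of this kind can be modeled as a key whose fallback is computed from another key. The sketch below is a simplified model and not the library's mechanism: `DerivedKey`, `DerivedOptions` and `DynamicDefaultDemo` are hypothetical names, and the three keys only mirror the ESCAPE_HTML chain listed above.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical key whose default is computed from the options it is read from.
final class DerivedKey<T> {
    final Function<DerivedOptions, T> fallback;

    private DerivedKey(Function<DerivedOptions, T> fallback) {
        this.fallback = fallback;
    }

    // Key with a fixed default value.
    static <T> DerivedKey<T> constant(T value) {
        return new DerivedKey<T>(options -> value);
    }

    // Key that defaults to whatever the parent key currently resolves to.
    static <T> DerivedKey<T> inherits(DerivedKey<T> parent) {
        return new DerivedKey<T>(options -> options.get(parent));
    }
}

final class DerivedOptions {
    private final Map<DerivedKey<?>, Object> values = new HashMap<>();

    <T> DerivedOptions set(DerivedKey<T> key, T value) {
        values.put(key, value);
        return this;
    }

    // An explicitly set value wins; otherwise the key's fallback is evaluated.
    @SuppressWarnings("unchecked")
    <T> T get(DerivedKey<T> key) {
        return values.containsKey(key) ? (T) values.get(key) : key.fallback.apply(this);
    }
}

class DynamicDefaultDemo {
    static final DerivedKey<Boolean> ESCAPE_HTML = DerivedKey.constant(false);
    static final DerivedKey<Boolean> ESCAPE_HTML_BLOCKS = DerivedKey.inherits(ESCAPE_HTML);
    static final DerivedKey<Boolean> ESCAPE_HTML_COMMENT_BLOCKS = DerivedKey.inherits(ESCAPE_HTML_BLOCKS);

    public static void main(String[] args) {
        DerivedOptions options = new DerivedOptions().set(ESCAPE_HTML, true);
        System.out.println(options.get(ESCAPE_HTML_COMMENT_BLOCKS)); // inherited through the chain

        options.set(ESCAPE_HTML_COMMENT_BLOCKS, false);              // exclude comments explicitly
        System.out.println(options.get(ESCAPE_HTML_BLOCKS) + " " + options.get(ESCAPE_HTML_COMMENT_BLOCKS));
    }
}
```

Running the demo prints `true`, then `true false`: setting only the top-level key switches everything on, and an explicit value on the comment key overrides the inherited one.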

Formatter

Formatter renders the AST as markdown with various formatting options to clean up and make the source consistent and possibly convert from one indentation rule set to another. Formatter API allows extensions to provide and handle formatting options for custom nodes.

ℹ️ In versions prior to 0.60.0, formatter functionality was implemented in the flexmark-formatter module and required an additional dependency.

See: Markdown Formatter

Formatter can also be used to help translate the markdown document to another language by extracting translatable strings, replacing non-translatable strings with an identifier and finally replacing the translatable text spans with translated versions.

See: Translation Helper API

Formatter can be used to merge multiple markdown documents into a single document while preserving reference resolution to references within each document, even when reference ids conflict between merged documents.

See: Markdown Merge API

PDF Output Module

HTML to PDF conversion is done using Open HTML To PDF library by PdfConverterExtension in flexmark-pdf-converter module.

See: PDF Renderer Converter

Usage PDF Output

Available Extensions

The following extensions are developed with this library, each in their own artifact.

Extension options are defined in their extension class.

Abbreviation

Allows creating abbreviations which will be replaced in plain text with <abbr></abbr> tags, or optionally with <a></a> tags, with titles holding the abbreviation expansion.

Use class AbbreviationExtension from artifact flexmark-ext-abbreviation.

The following options are available:

Defined in AbbreviationExtension class:

Static Field Default Value Description
ABBREVIATIONS new repository repository for document's abbreviation definitions
ABBREVIATIONS_KEEP KeepType.FIRST which duplicates to keep.
USE_LINKS false use <a> instead of <abbr> tags for rendering html
ABBREVIATIONS_PLACEMENT ElementPlacement.AS_IS formatting option see: Markdown Formatter
ABBREVIATIONS_SORT ElementPlacement.AS_IS formatting option see: Markdown Formatter

Admonition

To create block-styled side content. Based on Admonition Extension, Material for MkDocs (Personal opinion: Material for MkDocs is eye-candy. If you have not taken a look at it, you are missing out on a visual treat.). See Admonition Extension

Use class AdmonitionExtension from artifact flexmark-ext-admonition.

CSS and JavaScript must be included in your page

Default CSS and JavaScript are contained in the jar as resources:

Their content is also available by calling AdmonitionExtension.getDefaultCSS() and AdmonitionExtension.getDefaultScript() static methods.

The script should be included at the bottom of the body of the document and is used to toggle open/closed state of collapsible admonition elements.

AnchorLink

Automatically adds anchor links to headings, using the GitHub id generation algorithm.

⚠️ This extension will only render an anchor link for headers that have an id attribute associated with them. You need to have the HtmlRenderer.GENERATE_HEADER_ID option set to true so that header ids are generated.

Use class AnchorLinkExtension from artifact flexmark-ext-anchorlink.

The following options are available:

Defined in AnchorLinkExtension class:

Static Field Default Value Description
ANCHORLINKS_SET_ID true whether to set the id attribute to the header id, if true
ANCHORLINKS_SET_NAME false whether to set the name attribute to the header id, if true
ANCHORLINKS_WRAP_TEXT true whether to wrap the heading text in the anchor, if true
ANCHORLINKS_TEXT_PREFIX "" raw html prefix. Added before heading text, wrapped or unwrapped
ANCHORLINKS_TEXT_SUFFIX "" raw html suffix. Added after heading text, wrapped or unwrapped
ANCHORLINKS_ANCHOR_CLASS "" class for the a tag

Aside

Same as block quotes but uses | for prefix and generates <aside> tags. To make it compatible with the table extension, aside block lines cannot have | as the last character of the line, and if using this extension the tables must have the lines terminate with a | otherwise they will be treated as aside blocks.
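The disambiguation rule above can be sketched as a small line classifier. This is a hypothetical stand-alone model, not the flexmark parser: it only looks at lines that start with `|`, since those are the ambiguous ones, and how a lone `|` is treated is an assumption.

```java
// Hypothetical line classifier that models the aside/table rule; it does not call flexmark.
final class AsideLineSketch {
    enum Kind { ASIDE, TABLE_ROW, OTHER }

    static Kind classify(String line) {
        String trimmed = line.trim();
        // Lines that do not start with the marker are not candidates in this sketch.
        if (!trimmed.startsWith("|")) return Kind.OTHER;
        // Assumption: a lone "|" is an aside marker, not an empty table row.
        if (trimmed.length() > 1 && trimmed.endsWith("|")) return Kind.TABLE_ROW;
        return Kind.ASIDE;
    }
}
```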

Use class AsideExtension from artifact flexmark-ext-aside.

Defined in AsideExtension class:

Static Field Default Value Description
IGNORE_BLANK_LINE false aside block will include blank lines between aside blocks and treat them as if the blank lines are also preceded by the aside block marker
EXTEND_TO_BLANK_LINE false aside blocks extend to blank line when true. Enables more customary a la block quote parsing than commonmark strict standard
ALLOW_LEADING_SPACE true when true leading spaces before | are allowed
INTERRUPTS_ITEM_PARAGRAPH true when true aside blocks can interrupt list item text, else need blank line before to be included in list items
INTERRUPTS_PARAGRAPH true when true aside block can interrupt a paragraph, else needs blank line before
WITH_LEAD_SPACES_INTERRUPTS_ITEM_PARAGRAPH true when true aside blocks with leading spaces can interrupt list item text, else need blank line before or no leading spaces

AsideExtension option keys are dynamic data keys dependent on corresponding Parser block quote options for their defaults. If they are not explicitly set then they will take their default value from the value of the corresponding block quote value (prefix BLOCK_QUOTE_ to AsideExtension key name to get Parser block quote key name).

⚠️ This can potentially break code relying on versions of the extension before 0.40.20 because parsing rules can change depending on which block quote options are changed from their default values.

To ensure independent options for aside blocks and block quotes, set aside options explicitly. The following will set all aside options to default values, independent from block quote options:

.set(EXTEND_TO_BLANK_LINE, false)
.set(IGNORE_BLANK_LINE, false)
.set(ALLOW_LEADING_SPACE, true)
.set(INTERRUPTS_PARAGRAPH, true)
.set(INTERRUPTS_ITEM_PARAGRAPH, true)
.set(WITH_LEAD_SPACES_INTERRUPTS_ITEM_PARAGRAPH, true)

Attributes

Converts attributes {...} syntax into attributes AST nodes and adds an attribute provider to set attributes for immediately preceding sibling element during HTML rendering. See Attributes Extension

Defined in AttributesExtension from artifact flexmark-ext-attributes

  • ASSIGN_TEXT_ATTRIBUTES, default true. When false attribute assignment rules to nodes are changed not to allow text elements to get attributes.

  • FENCED_CODE_INFO_ATTRIBUTES, default true. When false attribute assignment at end of fenced code info string will be ignored.

  • USE_EMPTY_IMPLICIT_AS_SPAN_DELIMITER, default false. When set to true will treat {#} or {.}, without embedded spaces, as start attribute span delimiter to mark start of attribute assignment to text between {.} or {#} and the matching attributes element.

Use class AttributesExtension from artifact flexmark-ext-attributes.

Full spec: ext_attributes_ast_spec

Autolink

Turns plain links such as URLs and email addresses into links (based on autolink-java).

⚠️ current implementation has significant performance impact on large files.

Use class AutolinkExtension from artifact flexmark-ext-autolink.

Defined in AutolinkExtension class:

  • IGNORE_LINKS, default "", a regular expression to match link text which should not be auto-linked. This can include full urls like www.google.com or parts by including wildcard match patterns. Any recognized auto-link which matches the expression will be rendered as text.

Definition Lists

Converts definition syntax of Php Markdown Extra Definition List to <dl></dl> HTML and corresponding AST nodes.

Definition items can be preceded by : or ~, depending on the configured options.

Use class DefinitionExtension from artifact flexmark-ext-definition.

The following options are available:

Defined in DefinitionExtension class:

Static Field Default Value Description
COLON_MARKER true enable use of : as definition item marker
MARKER_SPACES 1 minimum number of spaces after definition item marker for valid definition item
TILDE_MARKER true enable use of ~ as definition item marker
DOUBLE_BLANK_LINE_BREAKS_LIST false When true double blank line between definition item and next definition term will break a definition list
FORMAT_MARKER_SPACES 3 formatting option see: Markdown Formatter
FORMAT_MARKER_TYPE DefinitionMarker.ANY formatting option see: Markdown Formatter

ℹ️ this extension uses list parsing and indentation rules and will do its best to align list item and definition item parsing according to selected options. For the non-fixed indent family of parsers it will use the definition item content indent column for sub-items, otherwise it uses the Parser.LISTS_ITEM_INDENT value for sub-items.

Wiki: Definition List Extension

Docx Converter

Renders the parsed Markdown AST to docx format using the docx4j library.

artifact: flexmark-docx-converter

See the DocxConverterCommonMark Sample for code and Customizing Docx Rendering for an overview and information on customizing the styles.

Pegdown version can be found in DocxConverterPegdown Sample

For details see Docx Renderer Extension

Emoji

Allows creating image links to emoji images from emoji shortcuts using Emoji-Cheat-Sheet.com, and optionally replacing them with their unicode equivalent character using the mapping by Mark Wunsch found at mwunsch/rumoji

Use class EmojiExtension from artifact flexmark-ext-emoji.

The following options are available:

Defined in EmojiExtension class:

  • ATTR_ALIGN, default "absmiddle", attributes to use for rendering in the absence of attribute provider overrides
  • ATTR_IMAGE_SIZE, default "20" , attributes to use for rendering in the absence of attribute provider overrides
  • ATTR_IMAGE_CLASS, default "", if not empty will set the image tag class attribute to this value.
  • ROOT_IMAGE_PATH, default "/img/" , root path for emoji image files. See https://github.com/arvida/emoji-cheat-sheet.com
  • USE_SHORTCUT_TYPE, default EmojiShortcutType.EMOJI_CHEAT_SHEET, select type of shortcuts:
    • EmojiShortcutType.EMOJI_CHEAT_SHEET
    • EmojiShortcutType.GITHUB
    • EmojiShortcutType.ANY_EMOJI_CHEAT_SHEET_PREFERRED use any shortcut from any source. If image type options is not UNICODE_ONLY, will generate links to Emoji Cheat Sheet files or GitHub URL, with preference given to Emoji Cheat Sheet files.
    • EmojiShortcutType.ANY_GITHUB_PREFERRED - use any shortcut from any source. If image type options is not UNICODE_ONLY, will generate links to Emoji Cheat Sheet files or GitHub URL, with preference given to GitHub URL.
  • USE_IMAGE_TYPE, default EmojiImageType.IMAGE_ONLY, to select what type of images are allowed.
    • EmojiImageType.IMAGE_ONLY, only use image link
    • EmojiImageType.UNICODE_ONLY convert to unicode and if there is no unicode treat as invalid emoji shortcut
    • EmojiImageType.UNICODE_FALLBACK_TO_IMAGE convert to unicode and if no unicode use image. emoji shortcuts using Emoji-Cheat-Sheet.com https://www.webfx.com/tools/emoji-cheat-sheet

Enumerated Reference

Used to create numbered references and numbered text labels for elements in the document. Enumerated References Extension

Use class EnumeratedReferenceExtension from artifact flexmark-ext-enumerated-reference.

❗ Note Attributes extension is needed in order for references to be properly resolved for rendering.

Footnotes

Creates footnote references in the document. Footnotes are placed at the bottom of the rendered document with links from footnote references to footnote and vice-versa. Footnotes Extension

Converts: [^footnote] to footnote references and [^footnote]: footnote definition to footnotes in the document.

Gfm-Issues

Enables Gfm issue reference parsing in the form of #123

Use class GfmIssuesExtension in artifact flexmark-ext-gfm-issues.

The following options are available:

Defined in GfmIssuesExtension class:

  • GIT_HUB_ISSUES_URL_ROOT : default "issues" override root used for generating the URL
  • GIT_HUB_ISSUE_URL_PREFIX : default "/" override prefix used for appending the issue number to URL
  • GIT_HUB_ISSUE_URL_SUFFIX : default "" override suffix used for appending the issue number to URL
  • GIT_HUB_ISSUE_HTML_PREFIX : default "" override HTML to use as prefix for the issue text
  • GIT_HUB_ISSUE_HTML_SUFFIX : default "" override HTML to use as suffix for the issue text

Gfm-Strikethrough/Subscript

Enables strikethrough of text by enclosing it in ~~. For example, in hey ~~you~~, you will be rendered as strikethrough text.

Use class StrikethroughExtension in artifact flexmark-ext-gfm-strikethrough.

Enables subscript of text by enclosing it in ~. For example, in hey ~you~, you will be rendered as subscript text.

Use class SubscriptExtension in artifact flexmark-ext-gfm-strikethrough.

To enable both subscript and strikethrough:

Use class StrikethroughSubscriptExtension in artifact flexmark-ext-gfm-strikethrough.

⚠️ Only one of these extensions can be included in the extension list. If you want both strikethrough and subscript use the StrikethroughSubscriptExtension.

The following options are available:

Defined in StrikethroughSubscriptExtension class:

  • STRIKETHROUGH_STYLE_HTML_OPEN : default (String)null override HTML to use for wrapping style, if null then no override
  • STRIKETHROUGH_STYLE_HTML_CLOSE : default (String)null override HTML to use for wrapping style, if null then no override
  • SUBSCRIPT_STYLE_HTML_OPEN : default (String)null override HTML to use for wrapping style, if null then no override
  • SUBSCRIPT_STYLE_HTML_CLOSE : default (String)null override HTML to use for wrapping style, if null then no override

Gfm-TaskList

Enables task lists based on list items whose text begins with: [ ], [x] or [X]

Use class TaskListExtension in artifact flexmark-ext-gfm-tasklist.

The following options are available:

Defined in TaskListExtension class:

Static Field Default Value Description
ITEM_DONE_MARKER <input
   type="checkbox"
   class="task-list-item-checkbox"
   checked="checked"
   disabled="disabled" />
string to use for the item done marker html.
ITEM_NOT_DONE_MARKER <input
   type="checkbox"
   class="task-list-item-checkbox"
   disabled="disabled" />
string to use for the item not done marker html.
ITEM_CLASS "task-list-item" tight list item class attribute
ITEM_ITEM_DONE_CLASS "" list item class for done task list item
ITEM_ITEM_NOT_DONE_CLASS "" list item class for not done task list item
LOOSE_ITEM_CLASS value of ITEM_CLASS loose list item class attribute, if not set then will use value of tight item class
PARAGRAPH_CLASS "" p tag class attribute, only applies to loose list items
FORMAT_LIST_ITEM_CASE TaskListItemCase.AS_IS formatting option see: Markdown Formatter
FORMAT_LIST_ITEM_PLACEMENT TaskListItemPlacement.AS_IS formatting option see: Markdown Formatter
  • TaskListItemCase

    • AS_IS: no change
    • LOWERCASE: change [X] to [x]
    • UPPERCASE: change [x] to [X]
  • TaskListItemPlacement

    • AS_IS: no change
    • INCOMPLETE_FIRST: sort all lists to put incomplete task items first
    • INCOMPLETE_NESTED_FIRST: sort all lists to put incomplete task items or items with incomplete task sub-items first
    • COMPLETE_TO_NON_TASK: sort all lists to put incomplete task items first and change complete task items to non-task items
    • COMPLETE_NESTED_TO_NON_TASK: sort all lists to put incomplete task items or items with incomplete task sub-items first and change complete task items to non-task items
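The effect of the simpler formatting values can be modelled on plain strings. The sketch below is hypothetical and does not use the Formatter: it treats each item as text starting with its task marker, handles only lists made up entirely of task items, and assumes the relative order within each group is preserved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of TaskListItemCase and INCOMPLETE_FIRST; it does not call the Formatter.
final class TaskListFormatSketch {
    // LOWERCASE: [X] -> [x]; UPPERCASE: [x] -> [X]; anything else (AS_IS): unchanged.
    static String applyCase(String item, String mode) {
        switch (mode) {
            case "LOWERCASE":
                return item.startsWith("[X]") ? "[x]" + item.substring(3) : item;
            case "UPPERCASE":
                return item.startsWith("[x]") ? "[X]" + item.substring(3) : item;
            default:
                return item;
        }
    }

    static boolean isIncomplete(String item) {
        return item.startsWith("[ ]");
    }

    // INCOMPLETE_FIRST: incomplete items first, order within each group kept (assumption).
    static List<String> incompleteFirst(List<String> items) {
        List<String> result = new ArrayList<>();
        for (String item : items) if (isIncomplete(item)) result.add(item);
        for (String item : items) if (!isIncomplete(item)) result.add(item);
        return result;
    }
}
```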

Gfm-Users

Enables Gfm user reference parsing in the form of @username

Use class GfmUsersExtension in artifact flexmark-ext-gfm-users.

The following options are available:

Defined in GfmUsersExtension class:

  • GIT_HUB_USERS_URL_ROOT : default "https://github.com" override root used for generating the URL
  • GIT_HUB_USER_URL_PREFIX : default "/" override prefix used for appending the user name to URL
  • GIT_HUB_USER_URL_SUFFIX : default "" override suffix used for appending the user name to URL
  • GIT_HUB_USER_HTML_PREFIX : default "<strong>" override HTML to use as prefix for the user name text
  • GIT_HUB_USER_HTML_SUFFIX : default "</strong>" override HTML to use as suffix for the user name text

GitLab Flavoured Markdown

Parses and renders GitLab Flavoured Markdown.

  • Add: video link renderer to convert links to video files to embedded content. The valid video extensions are .mp4, .m4v, .mov, .webm, and .ogv.

    <div class="video-container">
    <video src="video.mp4" width="400" controls="true"></video>
    <p><a href="video.mp4" target="_blank" rel="noopener noreferrer" title="Download 'Sample Video'">Sample Video</a></p>
    </div>
    
  • Multiline Block quote delimiters >>>

  • Deleted text markers {- -} or [- -]

  • Inserted text markers {+ +} or [+ +]

  • Math, inline via $``$ or as fenced code with math info string requiring inclusion of Katex in the rendered HTML page.

  • Graphing via Mermaid as fenced code with mermaid info string, via Mermaid inclusion similar to Math solution above.

Use class GitLabExtension in artifact flexmark-ext-gitlab.

The following options are available:

Defined in GitLabExtension class:

  • INS_PARSER : default true, enable ins parser

  • DEL_PARSER : default true, enable del parser

  • BLOCK_QUOTE_PARSER : default true, enable multi-line block quote parser

  • INLINE_MATH_PARSER : default true, enable inline math parser

  • RENDER_BLOCK_MATH : default true, enable rendering math fenced code blocks as math elements

  • RENDER_BLOCK_MERMAID : default true, enable rendering mermaid fenced code blocks as mermaid elements

  • RENDER_VIDEO_IMAGES : default true, enable rendering image links with video extensions as video elements

  • RENDER_VIDEO_LINK : default true, enable rendering video link with video images

  • INLINE_MATH_CLASS : default "katex", inline math class attribute

  • BLOCK_MATH_CLASS : default "katex" block math class attribute

  • BLOCK_MERMAID_CLASS : default "mermaid" block mermaid class attribute

  • VIDEO_IMAGE_CLASS : default "video-container" class attribute

  • VIDEO_IMAGE_LINK_TEXT_FORMAT : default "Download '%s'" format for video link text, %s replaced with image alt text

  • BLOCK_INFO_DELIMITERS : default " ", delimiters used for extracting math and mermaid strings from fenced code info string

  • VIDEO_IMAGE_EXTENSIONS : default "mp4,m4v,mov,webm,ogv", extensions of video files which to render as video
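To make the two video options concrete, the sketch below checks a link against the default extension list and builds the link text from the default format string. It is a stand-alone illustration with hypothetical names. Case-insensitive extension matching and the use of `String.format` for the `%s` substitution are assumptions, not confirmed library behaviour.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical model of VIDEO_IMAGE_EXTENSIONS and VIDEO_IMAGE_LINK_TEXT_FORMAT; not flexmark code.
final class VideoLinkSketch {
    static final String EXTENSIONS = "mp4,m4v,mov,webm,ogv"; // documented default
    static final String TEXT_FORMAT = "Download '%s'";       // documented default

    // True when the URL ends in one of the listed extensions (case-insensitive by assumption).
    static boolean isVideo(String url) {
        int dot = url.lastIndexOf('.');
        if (dot < 0) return false;
        String ext = url.substring(dot + 1).toLowerCase(Locale.ROOT);
        List<String> allowed = Arrays.asList(EXTENSIONS.split(","));
        return allowed.contains(ext);
    }

    // Substitutes the image alt text for %s in the format string.
    static String linkTitle(String altText) {
        return String.format(TEXT_FORMAT, altText);
    }
}
```

For the sample HTML shown earlier, `linkTitle("Sample Video")` produces the same `Download 'Sample Video'` text that appears in the title attribute.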

ℹ️ to have Math and Mermaid properly rendered requires inclusion of Katex and Mermaid scripts in the HTML page.

If you have the files in the same directory as the HTML page, somewhere between the <head> and </head> tags, you need to include:

<link rel="stylesheet" href="katex.min.css">
<script src="katex.min.js"></script>
<script src="mermaid.min.js"></script>

In addition to the Katex script you need to add JavaScript to the bottom of the page body to convert math elements when the DOM is loaded:

<script>
    (function () {
        document.addEventListener("DOMContentLoaded", function () {
            // Copy the live collection into an array before rendering.
            var mathElems = document.getElementsByClassName("katex");
            var elems = [];
            for (const i in mathElems) {
                if (mathElems.hasOwnProperty(i)) elems.push(mathElems[i]);
            }

            // SPAN elements hold inline math; anything else is rendered in display mode.
            elems.forEach(elem => {
                katex.render(elem.textContent, elem, { throwOnError: false, displayMode: elem.nodeName !== 'SPAN', });
            });
        });
    })();
</script>

Html To Markdown

Not really an extension but useful if you need to convert HTML to Markdown.

Converts HTML to Markdown, assumes all non-application specific extensions are going to be used:

  • abbreviations
  • aside
  • block quotes
  • bold, italic, inline code
  • bullet and numbered lists
  • definition
  • fenced code
  • strike through
  • subscript
  • superscript
  • tables
  • Gfm Task list item
  • will also handle conversion for multi-line URL images

This converter has an extension API to allow custom HTML tag to Markdown conversion, link URL replacement and options.

Use class FlexmarkHtmlConverter in artifact flexmark-html2md-converter.

Use FlexmarkHtmlConverter.builder(options).build().convert(htmlString) to get equivalent to old FlexmarkHtmlParser.parse(htmlString, options)

The following options are available:

Defined in FlexmarkHtmlConverter class:

  • BR_AS_EXTRA_BLANK_LINES default true, when true <br> encountered after a blank line is already in output or current output ends in <br> then will insert an inline HTML <br> into output to create extra blank lines in rendered result.

  • BR_AS_PARA_BREAKS default true, when true <br> encountered at the beginning of a new line will be treated as a paragraph break

  • CODE_INDENT, default 4 spaces, indent to use for indented code

  • DEFINITION_MARKER_SPACES, default 3, min spaces after : for definitions

  • DIV_AS_PARAGRAPH default false, when true will treat <div> wrapped text as if it was <p> wrapped by adding a blank line after the text.

  • DOT_ONLY_NUMERIC_LISTS, default true. When set to false closing parenthesis as a list delimiter will be used in Markdown if present in MS-Word style list. Otherwise parenthesis delimited list will be converted to dot ..

  • EOL_IN_TITLE_ATTRIBUTE, default " ", string to use in place of EOL in image and link title attribute.

  • LISTS_END_ON_DOUBLE_BLANK default false, when set to true consecutive lists are separated by double blank line, otherwise by an empty HTML comment line.

  • LIST_CONTENT_INDENT, default true, continuation lines of list items and definitions indent to content column otherwise 4 spaces

  • MIN_SETEXT_HEADING_MARKER_LENGTH, default 3, min 3, minimum setext heading marker length

  • MIN_TABLE_SEPARATOR_COLUMN_WIDTH, default 1, min 1, minimum number of - in separator column, excluding alignment colons :

  • MIN_TABLE_SEPARATOR_DASHES, default 3, min 3, minimum separator column width, including alignment colons :

  • NBSP_TEXT, default " ", string to use in place of non-break-space

  • ORDERED_LIST_DELIMITER, default '.', delimiter for ordered items

  • OUTPUT_UNKNOWN_TAGS, default false, when true unprocessed tags will be output, otherwise they are ignored

  • RENDER_COMMENTS, default false. When set to true HTML comments will be rendered in the Markdown.

  • SETEXT_HEADINGS, default true, if true then use Setext headings for h1 and h2

  • THEMATIC_BREAK, default "*** ** * ** ***", <hr> replacement

  • UNORDERED_LIST_DELIMITER, default '*', delimiter for unordered list items

  • OUTPUT_ATTRIBUTES_ID, default true, if true id attribute will be rendered in Attributes syntax

  • OUTPUT_ID_ATTRIBUTE_REGEX, default "^user-content-(.*)", regex to use for processing id attributes, if matched then will concatenate all groups which are not empty, if result string is empty after trimming then no id will be generated. If value empty then no processing is done and id will be rendered as in the HTML.

    ℹ️ Default value will strip out GitHub rendered HTML heading id prefix of user-content-.

  • OUTPUT_ATTRIBUTES_NAMES_REGEX default "", if not empty then will output attributes extension syntax for attribute names that are matched by the regex.

  • ADD_TRAILING_EOL, default false. Will add trailing EOL to generated markdown text.

  • to convert some markdown formatting elements to their inner text, default for all is false:

    • SKIP_HEADING_1 - heading 1
    • SKIP_HEADING_2 - heading 2
    • SKIP_HEADING_3 - heading 3
    • SKIP_HEADING_4 - heading 4
    • SKIP_HEADING_5 - heading 5
    • SKIP_HEADING_6 - heading 6
    • SKIP_ATTRIBUTES - attribute extension conversion
    • SKIP_FENCED_CODE - skip fenced code extension conversion
    • SKIP_LINKS - convert links to plain text
    • SKIP_CHAR_ESCAPE - no escaped characters conversion
  • to control conversion of inline elements, default is ExtensionConversion.MARKDOWN

    • EXT_INLINE_STRONG - strong
    • EXT_INLINE_EMPHASIS - emphasis
    • EXT_INLINE_CODE - code
    • EXT_INLINE_DEL - del
    • EXT_INLINE_INS - ins
    • EXT_INLINE_SUB - sub
    • EXT_INLINE_SUP - sup

    Available settings:

    • ExtensionConversion.MARKDOWN - convert to markdown
    • ExtensionConversion.TEXT - convert to inner text
    • ExtensionConversion.HTML - leave HTML as is
  • UNWRAPPED_TAGS, default new String[] { "article", "address", "frameset", "section", "small", "iframe", }, defines tags whose inner html content should be rendered

  • WRAPPED_TAGS, default new String[] { "kbd", "var" }, defines tags which should render as outer HTML. Inner text will be converted to markdown.

  • TYPOGRAPHIC_REPLACEMENT_MAP option key taking a Map<String,String>, if not empty will be used instead of the bundled map. Any typographic characters or HTML entities missing from the map will be output without conversion. If you want to suppress a typographic character so it is not output, map it to "".

    The bundled map is filled as follows:

    class HtmlParser {
        static Map<String, String> specialCharsMap = new HashMap<>();
        static {
            specialCharsMap.put("“", "\"");
            specialCharsMap.put("”", "\"");
            specialCharsMap.put("&ldquo;", "\"");
            specialCharsMap.put("&rdquo;", "\"");
            specialCharsMap.put("‘", "'");
            specialCharsMap.put("’", "'");
            specialCharsMap.put("&lsquo;", "'");
            specialCharsMap.put("&rsquo;", "'");
            specialCharsMap.put("&apos;", "'");
            specialCharsMap.put("«", "<<");
            specialCharsMap.put("&laquo;", "<<");
            specialCharsMap.put("»", ">>");
            specialCharsMap.put("&raquo;", ">>");
            specialCharsMap.put("…", "...");
            specialCharsMap.put("&hellip;", "...");
            specialCharsMap.put("–", "--");
            specialCharsMap.put("&endash;", "--");
            specialCharsMap.put("—", "---");
            specialCharsMap.put("&emdash;", "---");
        }
    }
  • For controlling how the converted HTML tables are rendered in markdown you can use any of the Formatter table options since formatter is used by the HTML parser to render tables.
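The processing described for OUTPUT_ID_ATTRIBUTE_REGEX above can be sketched with `java.util.regex`. This is a hypothetical stand-alone model, not the converter's code. Whole-string matching and passing unmatched ids through unchanged are assumptions, since the option description does not spell them out.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical model of OUTPUT_ID_ATTRIBUTE_REGEX processing; it does not call the converter.
final class IdRegexSketch {
    // Returns the id to output, or null when no id should be generated.
    static String processId(String id, String regex) {
        if (regex.isEmpty()) return id;   // empty option: id is rendered as in the HTML
        Matcher m = Pattern.compile(regex).matcher(id);
        if (!m.matches()) return id;      // assumption: unmatched ids pass through unchanged
        StringBuilder sb = new StringBuilder();
        for (int i = 1; i <= m.groupCount(); i++) {
            String group = m.group(i);
            if (group != null && !group.isEmpty()) sb.append(group); // concatenate non-empty groups
        }
        String result = sb.toString().trim();
        return result.isEmpty() ? null : result;
    }
}
```

With the default expression, an id of `user-content-intro` becomes `intro`, which is the GitHub prefix stripping mentioned in the note on that option.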

Uses the excellent jsoup HTML parsing library and emoji shortcuts using Emoji-Cheat-Sheet.com

Ins

Enables ins tagging of text by enclosing it in ++. For example, in hey ++you++, you will be rendered as inserted text.

Use class InsExtension in artifact flexmark-ext-ins.

The following options are available:

Defined in InsExtension class:

  • INS_STYLE_HTML_CLOSE : default (String)null override HTML to use for wrapping style, if null then no override
  • INS_STYLE_HTML_OPEN : default (String)null override HTML to use for wrapping style, if null then no override

Jekyll Tags

Allows rendering of Jekyll {% tag %} with and without parameters.

Use class JekyllTagExtension in artifact flexmark-ext-jekyll-tag.

ℹ️ you can process the include tags and add the content of these files to the INCLUDED_HTML map so that the content is rendered. If the included content starts off as Markdown and may contain references that are used by the document which includes the file, then you can transfer the references from the included document to the including document using the Parser.transferReferences(Document, Document). See: JekyllIncludeFileSample.java

The following options are available:

Defined in JekyllTagExtension class:

Static Field Default Value Description
ENABLE_INLINE_TAGS true parse inline tags.
ENABLE_BLOCK_TAGS true parse block tags.
ENABLE_RENDERING false render tags as text
INCLUDED_HTML (Map<String, String>)null map of include tag parameter string to HTML content to be used to replace the tag element in rendering
LIST_INCLUDES_ONLY true only add includes tags to TAG_LIST if true, else add all tags
TAG_LIST new ArrayList<JekyllTag>() list of all jekyll tags in the document

Jira-Converter

Allows rendering of the markdown AST as JIRA formatted text

Use class JiraConverterExtension in artifact flexmark-jira-converter.

No options are defined. Extensions that do not support JIRA formatting will not generate any output for their corresponding nodes.

Macros

Macro Definitions are block elements which can contain any markdown element(s) but can be expanded in a block or inline context, allowing block elements to be used where only inline elements are permitted by the syntax. Macros Extension

Use class MacrosExtension in artifact flexmark-ext-macros.

Media Tags

Media link transformer extension courtesy Cornelia Schultz (GitHub @CorneliaXaos) transforms links using custom prefixes to Audio, Embed, Picture, and Video HTML5 tags.

  • !A[Text](link|orLinks) - audio
  • !E[Text](link|orLinks) - embed
  • !P[Text](link|orLinks) - picture
  • !V[Text](link|orLinks) - video

Use class MediaTagsExtension in artifact flexmark-ext-media-tags.

No options are defined.

Spec Example

Parses the flexmark extended spec examples into example nodes with particulars broken out and many rendering options. Otherwise these parse as fenced code. Flexmark Spec Example Extension

This extension is used by Markdown Navigator plugin for JetBrains IDEs to facilitate work with flexmark-java test spec files.

Superscript

Enables superscript of text by enclosing it in ^. For example, in hey ^you^, you will be rendered as superscript text.

Use class SuperscriptExtension in artifact flexmark-ext-superscript.

The following options are available:

Defined in SuperscriptExtension class:

  • SUPERSCRIPT_STYLE_HTML_CLOSE : default (String)null override HTML to use for wrapping style, if null then no override
  • SUPERSCRIPT_STYLE_HTML_OPEN : default (String)null override HTML to use for wrapping style, if null then no override

Tables

Enables tables using pipes as in GitHub Flavored Markdown, with added options to handle column span syntax, more than one header row, disparate column numbers between rows, etc. Tables Extension

Table of Contents

Table of contents extension is really two extensions in one: a [TOC] element which renders a table of contents, and a simulated [TOC]:# element which also renders a table of contents but is intended for post processors that will append a table of contents after the element. The result is a source file with a table of contents element which will render on any markdown processor.

Use class TocExtension or SimTocExtension in artifact flexmark-ext-toc.

For details see Table of Contents Extension

Typographic

Converts plain text to typographic characters. Typographic Extension

  • ' to apostrophe &rsquo;
  • ... and . . . to ellipsis &hellip;
  • -- to en dash &ndash;
  • --- to em dash &mdash;
  • single quoted 'some text' to &lsquo;some text&rsquo; ‘some text’
  • double quoted "some text" to &ldquo;some text&rdquo; “some text”
  • double angle quoted <<some text>> to &laquo;some text&raquo; «some text»
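The dash and ellipsis conversions listed above can be approximated with ordered string replacement. This sketch is only an illustration with a hypothetical class name. The extension itself works on the parsed document rather than on raw strings, and quote handling is left out because it depends on context.

```java
// Hypothetical approximation of the dash and ellipsis conversions; not the extension's implementation.
final class TypographicSketch {
    static String convert(String text) {
        return text
                .replace("---", "&mdash;")    // longest dash sequence first, so it is not split into "--" + "-"
                .replace("--", "&ndash;")
                .replace(". . .", "&hellip;") // spaced ellipsis
                .replace("...", "&hellip;");
    }
}
```

The ordering matters: replacing `--` before `---` would turn an em dash sequence into an en dash entity followed by a stray hyphen.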

WikiLinks

Enables wiki links [[page reference]] and optionally wiki images ![[image reference]]

To properly resolve wiki link to URL you will most likely need to implement a custom link resolver to handle the logic of your project. Please see: PegdownCustomLinkResolverOptions

Use class WikiLinkExtension in artifact flexmark-ext-wikilink.

The following options are available:

Defined in WikiLinkExtension class:

Static Field Default Value Description
ALLOW_INLINES false to allow delimiter processing in text part when `
DISABLE_RENDERING false disables wiki link rendering if true wiki links render the text of the node. Used to parse WikiLinks into the AST but not render them in the HTML
IMAGE_LINKS false enables wiki image link format ![[]] same as wiki links but with a ! prefix, alt text is same as wiki link text and affected by LINK_FIRST_SYNTAX
IMAGE_PREFIX "" Prefix to add to the generated link URL
IMAGE_PREFIX_ABSOLUTE value of IMAGE_PREFIX Prefix to add to the generated link URL for absolute wiki links (starting with /)
IMAGE_FILE_EXTENSION "" Extension to append to generated link URL
LINK_FIRST_SYNTAX false When two part syntax is used `[[first
LINK_PREFIX "" Prefix to add to the generated link URL
LINK_PREFIX_ABSOLUTE value of LINK_PREFIX Prefix to add to the generated link URL for absolute wiki links (starting with /)
LINK_FILE_EXTENSION "" Extension to append to generated link URL
LINK_ESCAPE_CHARS " +/<>" characters to replace in page ref by replacing them with corresponding characters in LINK_REPLACE_CHARS
LINK_REPLACE_CHARS "-----" characters to replace in page ref by replacing corresponding characters in LINK_ESCAPE_CHARS
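How the prefix, extension and character replacement options might combine into a URL is sketched below. This is a hypothetical helper rather than the extension's link resolver. It assumes each character found in LINK_ESCAPE_CHARS is replaced by the character at the same index in LINK_REPLACE_CHARS, and it ignores the separate prefix for absolute links.

```java
// Hypothetical model of wiki link URL generation; it does not call the extension.
final class WikiLinkUrlSketch {
    static String toUrl(String pageRef, String prefix, String extension,
                        String escapeChars, String replaceChars) {
        StringBuilder sb = new StringBuilder(prefix);
        for (int i = 0; i < pageRef.length(); i++) {
            char c = pageRef.charAt(i);
            int pos = escapeChars.indexOf(c);
            // Replace only when a counterpart exists at the same index (assumption).
            boolean replace = pos >= 0 && pos < replaceChars.length();
            sb.append(replace ? replaceChars.charAt(pos) : c);
        }
        return sb.append(extension).toString();
    }
}
```

With the default escape and replace strings, a page reference such as `My Page` would map to `My-Page` before any prefix or extension is added.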

XWiki Macro Extension

Application provided macros of the form {{macroName}}content{{/macroName}} or the block macro form:

{{macroName}}
content
{{/macroName}}

Use class MacroExtension in artifact flexmark-ext-xwiki-macros.

The following options are available:

Defined in MacroExtension class:

Static Field Default Value Description
ENABLE_INLINE_MACROS true enable inline macro processing
ENABLE_BLOCK_MACROS true enable block macro processing
ENABLE_RENDERING false enable rendering of macro nodes as text

YAML front matter

Adds support for metadata through a YAML front matter block. This extension only supports a subset of YAML syntax. Here's an example of what's supported:

---
key: value
list:
  - value 1
  - value 2
literal: |
  this is literal value.

  literal values 2
---

document starts here

Use class YamlFrontMatterExtension in artifact flexmark-ext-yaml-front-matter. To fetch metadata, use YamlFrontMatterVisitor.

YouTrack-Converter

Allows rendering of the markdown AST as YouTrack formatted text

Use class YouTrackConverterExtension in artifact flexmark-youtrack-converter.

No options are defined. Extensions that do not support YouTrack formatting will not generate any output for their corresponding nodes.

YouTube Embedded Link Transformer

YouTube link transformer extension courtesy of Vyacheslav N. Boyko (GitHub @bvn13) transforms simple links to YouTube videos into embedded video iframes.

Use class YouTubeLinkExtension in artifact flexmark-ext-youtube-embedded.

No options are defined.

Converts simple links to youtube videos to embedded video HTML code.