Flexndex - a document index generator

1. Introduction

Most documentation tools let you generate an index of terms in the document, but often it is useful to be able to generate indices for special purposes. For example, if you are documenting software, an index to functions, an index to types etc may be useful.

Flexndex is a flexible tool to generate multiple indices as document internal links, making it usable with most XML-based document formats, eg xhtml, docbook or Open Document Format.

2. Terminology

index

a list of index entries organised by category

index entry

a link to a target in the document where there is content relevant to the categories of the entry

target

the point in the document that you want an index entry to point to with a list of terms for categorisation

terms

hierarchical set of text items to categorize the entry within the index

3. Operation

Flexndex takes index target information encoded in XML comments, generates the link anchor at that position and records the specified index terms. It also takes requests for index inclusion encoded in XML comments and generates lists of links to the relevant targets at that position.

The markup of anchors and link lists is configurable and selected by name then document type, so documents in several formats can be accommodated.

Flexndex does not specify how the formatted XML comments are included in the document, but most document tools allow their inclusion. See Using With Asciidoc for an example of configuring the Asciidoc tool.

3.1. Index Generation Process

The index content is created by the following steps:

  • selection of entries to include,

  • sorting of entries,

  • generation of missing entries,

  • iteration through the entries performing:

    • filtering of entries, and

    • substitution of terms and attributes of the entry into output markup templates from the configuration settings.

3.1.1. Selection of Entries

Selection is by the name of the index and a set of terms that must be a prefix of the terms in the entries. Specifying no terms means no selection: all entries for the index are shown.
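The prefix rule can be sketched in a few lines of Python. This is only an illustration, not Flexndex's own code: the function name select_entries and the representation of an entry as a tuple of terms are assumptions.

```python
def select_entries(entries, prefix):
    """Keep the entries whose leading terms equal the requested prefix terms."""
    prefix = tuple(prefix)
    return [e for e in entries if tuple(e[:len(prefix)]) == prefix]


# Hypothetical entries for one index, each a tuple of hierarchical terms.
entries = [
    ("functions", "io", "open"),
    ("functions", "io", "close"),
    ("types", "File"),
]

print(select_entries(entries, ["functions", "io"]))
print(len(select_entries(entries, [])))  # no terms given, so nothing is excluded
```

An empty prefix matches every entry, which corresponds to the "no terms means no selection" case.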

3.1.2. Sorting of entries

The selected entries are sorted by each level in order comparing the terms by Unicode code point.
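As a rough illustration of level-by-level code point ordering, Python's default comparison of tuples of strings behaves the same way. The entries below are made up, and this is a sketch rather than the tool's implementation.

```python
# Python compares str values by Unicode code point and tuples element by element,
# so sorting tuples of terms orders by level 1 first, then level 2, and so on.
entries = [("b", "x"), ("a", "z"), ("B", "y"), ("a", "Z")]

sorted_entries = sorted(entries)
print(sorted_entries)
# [('B', 'y'), ('a', 'Z'), ('a', 'z'), ('b', 'x')]
```

Note that upper case letters sort before lower case ones under this ordering, because their code points are smaller.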

3.1.3. Generation

Only the leaf nodes of the hierarchy of terms are specified by the ix comment. However some display styles require the internal levels for display. Missing levels are generated during the iteration process.

See Configuration Settings for settings controlling generation.
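One way to picture the generation of missing levels is the sketch below. It assumes entries are tuples of terms already in sorted order; the function name with_missing_levels is hypothetical and the approach is not taken from the Flexndex source.

```python
def with_missing_levels(sorted_entries):
    """Return the entries with any unseen ancestor prefixes inserted before them."""
    seen = set()
    result = []
    for terms in sorted_entries:
        # Walk from the top level down to the entry itself.
        for depth in range(1, len(terms) + 1):
            prefix = tuple(terms[:depth])
            if prefix not in seen:
                seen.add(prefix)
                result.append(prefix)
    return result


print(with_missing_levels([("a", "b", "c"), ("a", "b", "d")]))
# [('a',), ('a', 'b'), ('a', 'b', 'c'), ('a', 'b', 'd')]
```

Here the internal levels a and a,b are produced once, ahead of the first leaf that needs them.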

3.1.4. Iteration

The default iteration of entries is simply in sorted order start to finish, but options outlined in Index Inclusion Request allow iteration orders suitable for multiple column output.

For each entry the terms in the entry are iterated in increasing level order.

Filtering

In some use-cases not all entries need to be displayed, but they are needed for generation of missing entries. The display filters specify which entries to display.

Filtering can also limit which terms within entries are displayed.

See Index Inclusion Request for filter control attributes.

Substitution

The substitution templates are chosen using the iteration components (eg level). See Configuration Settings for the components supported.

The attributes are substituted as defined in Substitutions.

4. Comment Format

4.1. Index Target Definitions

The format of the comments specifying index terms is:

<!-- ix indexname <attribute_list> -->

Note that the spaces shown above are part of the format; use exactly one space in each position.

The indexname is the name of the index this target is part of.

The attribute_list is a comma separated list of:

  • index terms in order of primary, secondary, tertiary etc, followed by

  • named attributes in name=value form.

To include a comma or an equals sign in an attribute value or name, write it twice.

Flexndex does not restrict how many levels an index has.

Named attributes other than the predefined ones will be available for substitution into the output markup, either in the anchor or the entry in the index.

The predefined named attributes are:

text

the text to use in the index entry for links to this target.
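The comment format and the doubling rule can be illustrated with a small Python sketch. It is not the parser Flexndex uses: the names COMMENT and parse_attribute_list are hypothetical, and stripping whitespace around items is an assumption.

```python
import re

# Matches "<!-- ix indexname <attribute_list> -->" and the ixhere variant.
COMMENT = re.compile(r"<!-- (ixhere|ix) (\S+) <(.*?)> -->")


def parse_attribute_list(text):
    """Split 'term1,term2,name=value' into (list of terms, dict of named attributes)."""
    terms, named = [], {}
    name, buf = None, []

    def finish():
        nonlocal name, buf
        item = "".join(buf).strip()   # assumption: surrounding whitespace is dropped
        if name is not None:
            named[name] = item
        elif item:
            terms.append(item)
        name, buf = None, []

    i = 0
    while i < len(text):
        ch = text[i]
        if ch in ",=" and text[i + 1:i + 2] == ch:
            buf.append(ch)            # a doubled comma or equals is a literal
            i += 2
            continue
        if ch == ",":
            finish()                  # single comma ends the current item
        elif ch == "=" and name is None:
            name = "".join(buf).strip()
            buf = []
        else:
            buf.append(ch)
        i += 1
    finish()
    return terms, named


m = COMMENT.search("<p><!-- ix functions <io,read,,write,text=a==b> --></p>")
kind, index_name, attrs = m.groups()
print(kind, index_name, parse_attribute_list(attrs))
# ix functions (['io', 'read,write'], {'text': 'a=b'})
```

The example target has two terms, io and "read,write", and a text attribute whose value is "a=b".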

4.2. Index Inclusion Request

The format of the comments requesting inclusion of an index is:

<!-- ixhere indexname <attribute_list> -->

Note that the spaces shown above are part of the format; use exactly one space in each position.

The indexname is the name of the index to put here.

The attribute list has the same format as the index term definition. Any initial index terms select only entries that are prefixed by the specified terms.

To include a comma or an equals sign in an attribute value or name, write it twice.

Named attributes other than the predefined ones will be available for substitution into the output markup of the index.

The predefined named attributes are:

cols

specify collimation parameters, format is nnidbbb where:

nn

is the number of columns (decimal number)

id

is the iteration and direction control, l = linear or i = interlaced and r = by row or c = by column. Note: Only lc is implemented in v0.1.

bbb

is the column break control, default (no bbb) = make columns as close to same length as possible, lnn = break at level nn (or less), but won’t change column length by more than 20%

indents

how much to indent each column by, level*this value is put in {ixindent} attribute, default 0

levels

filter the output to entries whose length falls in the range specified and limit display to these levels.

n1-n2

from level n1 to level n2 inclusive

n1-

from level n1 up inclusive

-n1, n1

up to level n1 inclusive

where n1 and n2 are decimal numbers counting from one. Missing levels attribute means all. Terms for levels below n1 are not displayed.

sort

sort as specified, default is all levels by increasing Unicode code point. The attribute value is an optionally whitespace separated list of the following:

levels

limit sort to the specified levels, levels is followed by:

n1-n2

from level n1 to level n2 inclusive

n1-

from level n1 up inclusive

-n1, n1

up to level n1 inclusive

where n1 and n2 are decimal numbers counting from one.

style

name of the style of index to output, see Predefined Index Styles
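The levels and cols values described above could be decoded along the following lines. This is a sketch under assumptions: the function names are hypothetical, the id part of cols is assumed to be required, and the break control is assumed to be the letter l followed directly by a decimal level.

```python
import re


def parse_levels(value):
    """Return (first, last) levels counting from one; last is None when open-ended."""
    value = value.strip()
    if "-" not in value:
        return 1, int(value)              # "n1" means up to level n1
    first, last = value.split("-", 1)
    return (int(first) if first else 1), (int(last) if last else None)


# Assumed reading of nnidbbb: digits, l or i, r or c, then an optional "l" + digits.
COLS = re.compile(r"(\d+)([li])([rc])(?:l(\d+))?")


def parse_cols(value):
    """Return a dict describing a cols value, or raise ValueError if it does not fit."""
    m = COLS.fullmatch(value.strip())
    if m is None:
        raise ValueError("unrecognised cols value: %r" % value)
    count, iteration, direction, break_level = m.groups()
    return {
        "columns": int(count),
        "iteration": {"l": "linear", "i": "interlaced"}[iteration],
        "direction": {"r": "by row", "c": "by column"}[direction],
        "break_level": int(break_level) if break_level else None,
    }


print(parse_levels("2-4"))   # (2, 4)
print(parse_levels("-3"))    # (1, 3)
print(parse_cols("2lc"))
```

With these assumptions, cols=2lc means two columns, linear iteration, filled by column, with columns balanced in length.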

5. Configuration

5.1. Basic Configuration file format

The basic setting file format is a list of key=value entries, each starting on a new line.

Keys are a dot separated list of keys for each level in the hierarchy.

Values are all text following the equals (=) not including leading or trailing whitespace. Values extend to include the next line(s) if the last non-whitespace character of the line is a backslash (\\).

Lines beginning with # are comments and ignored.

For typing convenience lines beginning with open square bracket specify a starting position in the hierarchy for all following keys until another square bracket line. The line consists of a dot separated list of keys for each level followed by a close square bracket.
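The file format rules above can be summarised by a small reader written in Python. It is a sketch only: the function name parse_config, the style name mystyle and the markup in the sample are hypothetical, and joining continuation lines with the backslash removed is an assumption.

```python
def parse_config(text):
    """Parse key=value settings into a flat dict keyed by the full dotted name."""
    settings = {}
    section = ""      # dotted prefix set by the most recent [section] line
    key = None        # key whose value is still being continued, if any
    for raw in text.splitlines():
        line = raw.strip()
        if key is None:
            if not line or line.startswith("#"):
                continue                      # blank line or comment
            if line.startswith("["):
                section = line[1:].rstrip("]").strip()
                continue
            name, _, line = line.partition("=")
            key = ".".join(p for p in (section, name.strip()) if p)
            settings[key] = ""
            line = line.strip()
        continued = line.endswith("\\")
        # Assumption: the trailing backslash is dropped and the next line appended.
        settings[key] += line[:-1].rstrip() if continued else line
        if not continued:
            key = None
    return settings


SAMPLE = """\
# a hypothetical configuration
default_style = simple-grouped
[styles.mystyle.xhtml11]
prefix = <div class="index">
postfix = </div>
[styles.mystyle.xhtml11.levels.1]
link_last = <a href="#{ixtgt}">\\
    {ixtext}</a>{nl}
"""

for key, value in parse_config(SAMPLE).items():
    print(key, "=>", value)
```

The square bracket lines save repeating the styles.mystyle.xhtml11 prefix on every key beneath them.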

5.2. Configuration Settings

In the following table items shown like <this> are placeholders to be filled by the user with the appropriate values as explained after the table. All other characters in the setting are expected verbatim.

The substitutions column identifies which built-in attributes are substituted, see the following table.

Setting             Optional, default   Substitutions             Use
default_style       yes, simple-dotted  nothing                   Name of default style if not specified in ixhere comment
attribute.<name>    yes, nothing        nothing                   Value to substitute for occurrences of {name}
anchors.<backend>   yes, nothing        std tgt target            Markup to output after the ix comment, usually defines a link anchor

[styles.<style_name>.<backend>]
complete            yes, "none"         nothing                   Set to start with "e" to generate entries for the complete term hierarchy, ie if a,b,c were the first terms then a and a,b would also be generated. Set to "t" to expand multi targets to entries as well.
prefix              yes, nothing        std here                  Markup to output before the index
postfix             yes, nothing        std here                  Markup to output after the index
empty_message       yes, "Empty Index"  std here                  Markup to output if the index has no contents, prefix and postfix not output
entry_start         yes, nothing        std here                  Markup to output before each entry, if uncollimated
entry_end           yes, nothing        std here                  Markup to output after each entry, if uncollimated
col_start.<col_no>  yes, nothing        std here                  Markup to output before specified column if multiple column
col_end.<col_no>    yes, nothing        std here                  Markup to output after specified column if multiple column
row_start.<row_no>  yes, nothing        std here                  Markup to output before specified row if multiple column
row_end.<row_no>    yes, nothing        std here                  Markup to output after specified row if multiple column

[styles.<style_name>.<backend>.levels.<level_no>]
text_internal       yes, nothing        std here term             The markup to output if this term is not the last one for the target entry
text_last           yes, nothing        std here term             The markup to output for the last term if it cannot be a link, ie it has more than one target
link_last           yes, nothing        std here term tgt target  The markup to output for the last term if it can be a link
multi_target        yes, nothing        std here term tgt target  The markup to output for each of multiple targets

std

Built-in and configured attributes

here

Keyword attributes from the ixhere comment

term

Computed attributes in the term group

pref

Computed attributes in the pref group

tgt

Computed attributes in the tgt group

target

The attributes from the ix comment

The meanings of the placeholders are:

backend

is the name of the backend that the setting applies to

col_no

selects the settings to use for each column in multi-column output. See note below.

level_no

is the level of the term that this setting applies to. See note below.

row_no

selects the settings to use for each row in multi-column output. See note below.

style_name

is the name of a style being defined

[NOTE] Caution: col_no, level_no and row_no are text, not numbers. They define an order that is the same as the text sort order. The convention is to use "1", "2" etc, but beware that "10" sorts before "2".
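The ordering pitfall is easy to reproduce with plain string sorting in Python. The zero-padded names in the second half are only a possible workaround suggested by text ordering, not something this document states Flexndex supports.

```python
keys = ["1", "2", "10"]
print(sorted(keys))      # ['1', '10', '2']  text order, not numeric order

# Zero-padding to a fixed width makes text order agree with numeric order.
padded = [k.zfill(2) for k in keys]
print(sorted(padded))    # ['01', '02', '10']
```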

5.2.1. Substitutions

Attributes are substituted into output text by enclosing the attribute name in {}. Attribute names should be alphanumerics and underscore only. This is not checked.

Substitutions are not recursive, ie {attr} in an attribute value is not substituted except in conditional substitutions.
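A single-pass replacement shows what non-recursive substitution means in practice. This Python sketch is illustrative only: the function name substitute is hypothetical, and leaving undefined names untouched is an assumption.

```python
import re


def substitute(template, attributes):
    """Replace each {name} once; text produced by a replacement is not rescanned."""
    return re.sub(
        r"\{(\w+)\}",
        lambda m: attributes.get(m.group(1), m.group(0)),  # unknown names stay as-is
        template,
    )


attributes = {"ixtext": "{ixterm}", "ixterm": "open"}
print(substitute("<b>{ixtext}</b> / {ixterm}", attributes))
# <b>{ixterm}</b> / open
```

The {ixterm} that arrives through the value of ixtext is left alone, while the one written in the template is replaced.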

Conditional Substitutions

[NOTE] Only name?| is implemented in v0.1.

Conditional substitutions are performed before other attributes so values conditionally substituted may contain other attributes.

Inside values the characters question mark (?), colon (:) and close brace (}) must occur twice to distinguish them from the meta characters.

Substitute                    Condition to substitute
{name?*value}                 name is defined substitute value
{name?#value}                 name is not defined substitute value
{name?|value}                 if name is defined substitute the attribute, else value
{name?=value1?value2:value3}  name = value1 then value2 else value3
{name?!value1?value2:value3}  name != value1 then value2 else value3

The :value3 may be omitted if value3 is nothing.
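The following Python sketch handles only the name?| form, the one noted as implemented in v0.1, and then applies plain substitution. The names are hypothetical, and emitting {name} for later replacement when the attribute is defined is an assumption about how "substitute the attribute" works.

```python
import re

# {name?|value} where ?, : and } inside value are written twice.
CONDITIONAL = re.compile(r"\{(\w+)\?\|((?:[^}?:]|\}\}|\?\?|::)*)\}")
DOUBLED = re.compile(r"([}?:])\1")
PLAIN = re.compile(r"\{(\w+)\}")


def expand(template, attributes):
    """Resolve {name?|value} first, then ordinary {name} attributes."""
    def conditional(m):
        name, value = m.group(1), m.group(2)
        if name in attributes:
            return "{" + name + "}"        # resolved by the plain pass below
        return DOUBLED.sub(r"\1", value)   # undouble the meta characters

    first_pass = CONDITIONAL.sub(conditional, template)
    return PLAIN.sub(lambda m: attributes.get(m.group(1), m.group(0)), first_pass)


template = "[{text?|{ixterm}}}]"           # fall back to {ixterm} when text is unset
print(expand(template, {"ixterm": "open"}))                         # [open]
print(expand(template, {"ixterm": "open", "text": "Open a file"}))  # [Open a file]
```

Because the conditional pass runs first, the fallback value may itself contain an attribute, written with its closing brace doubled.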

Computed Substitutions

The following substitutions are computed in context of the term or the target.

Attribute  Group  Value
ixterm     term   the current level term
ixindent   term   the indent * current level
ixtgt      tgt    is a unique numeric piece of text identified with the target, use it to make link targets
ixtext     tgt    is either the text attribute from the ix comment, if it exists, otherwise the last term.

Built-in Substitutions

The following attributes are built-in, but can be overridden by config files:

Name  Value  Use
sp    ' '    Use for leading or trailing spaces where they would otherwise be stripped off
nl    \n     Newline

6. Command Reference

flexndex [options] infile outfile

Note that as the outfile is the same type as the infile there is no obvious way of generating an output filename automatically, so both infile and outfile are required.

Options are:

-b, --backend

specify the backend format to generate output in. Built-in options are xhtml11 and docbook45, which are aliased as html and docbook respectively. Default is xhtml11. Note: docbook is not supported in v0.1.

-c, --config

specify configuration files to load. This option can be specified multiple times; settings in files to the right override those in files to the left and the built-in configuration. No files are loaded by default.

-h, --help

print this reference and exit

--version

print version and exit

7. Predefined Index Styles

dotted

a simple (no CSS) built-in non-grouped style that shows each entry as:

term1.term2.term3
term1.term2.term4 [target1] [target2]

where the term3 has only one target and term4 has multiple targets each shown in []. The text term3, target1 and target2 are links.

simple-grouped

a simple (no CSS) built-in grouped style shows as a traditional grouped index as:

term1
    term2
        term3
        term4 [target1] [target2]

where term3, target1 and target2 are links.
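To show how sorted entries map onto the grouped layout, here is a plain-text Python sketch. It ignores markup and links entirely; the function name render_grouped and the (terms, targets) representation are assumptions, not the built-in style's definition.

```python
def render_grouped(entries, indent="    "):
    """entries: sorted list of (terms, targets) pairs. Returns the output lines."""
    lines, previous = [], ()
    for terms, targets in entries:
        # Count the leading terms shared with the previous entry; these were
        # already printed and are not repeated.
        common = 0
        while (common < min(len(previous), len(terms) - 1)
               and previous[common] == terms[common]):
            common += 1
        for level in range(common, len(terms)):
            line = indent * level + terms[level]
            if level == len(terms) - 1 and len(targets) > 1:
                # Several targets: list each one in brackets after the term.
                line += "".join(" [%s]" % t for t in targets)
            lines.append(line)
        previous = terms
    return lines


entries = [
    (("term1", "term2", "term3"), ["only"]),
    (("term1", "term2", "term4"), ["target1", "target2"]),
]
print("\n".join(render_grouped(entries)))
```

For these two entries the sketch prints the same four lines as the simple-grouped example above.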

8. Using With Asciidoc

Flexndex can be used with the xml generated by the Asciidoc tool in xhtml11, docbook and ODT backends.

The easiest way of inserting Flexndex comments is to define two macros:

ix:indexname[attribute_list]
ixhere::indexname[attribute_list]

to generate the index target comments and index comments respectively. Note the ixhere macro is a block macro and ix is inline.

Put the following in an appropriate asciidoc.conf file:

[macros]
(?su)(?<!\w)[\\]?(?P<name>ix):(?P<target>\S*?)\[(?P<attrlist>.*?)\]:
(?u)(?<!\w)[\\]?(?P<name>ixhere)::(?P<target>\S*?)\[(?P<attrlist>.*?)\]: #

[ix-inlinemacro]
<!-- ix {target} <{attrlist}> -->

[ixhere-blockmacro]
<!-- ixhere {target} <{attrlist=}> -->

Alternatively, if using an Asciidoc release after 8.6.7, or an hg revision newer than c715f6c96481 (June 10 2012), you can place:

:macros.(?su)(?<!\w)[\\]?(?P<name>ix):(?P<target>\S*?)\[(?P<attrlist>.*?)\]:
:macros.(?u)(?<!\w)[\\]?(?P<name>ixhere)::(?P<target>\S*?)\[(?P<attrlist>.*?)\]: #
:ix-inlinemacro.: <!-- ix {target} <{attrlist}> -->
:ixhere-blockmacro.: <!-- ixhere {target} <{attrlist=}> -->

in the header of the document and avoid the need for a separate asciidoc.conf.

Run asciidoc to create the .html or .xml file then run flexndex. The xml can then be processed further by a2x as normal by specifying the .xml file output from flexndex as input to a2x.

9. Futures

9.1. Probable

  • separate levels filter and levels display specification

  • docbook built-ins

  • multi-column indices (alpha)

  • inverse indices

  • CSS stylable standard styles

9.2. Possible

  • switch built-ins to a standard config file

  • other backends as standard