Conversion Approach

Eliot Kimber edited this page Oct 7, 2015 · 9 revisions

The transform generates a DITA map and a set of topics, where the topics mirror the HTML page organization produced by the normal Doxygen HTML generation process. You can see this organization on the current Oculus site: https://developer.oculus.com/doc/0.7.0.0-libovr/index.html

There are two main categories of information:

  • Files
  • Data structures

The Files area describes the main .h files that define the API. The Data structures area describes the data structures and classes of the API.

The input to the DITA generation process is the raw Doxygen XML output produced by the Doxygen tool.

Processing The Doxygen XML

The input to the transform is the top-level index.xml file, which then has pointers to all the other files.

The top-level XSLT file is doxygen2dita.xsl, which is a shell that imports the real main file, doxygen2ditaImpl.xsl. (This organization makes the code extensible via Open Toolkit plugins, either by adding an OT extension point to the top-level shell XSLT or by creating a different shell that includes additional custom modules.)

The direct output of the XSLT is the root DITA map, which will have references to all the generated topics (and submaps, if any). The topics are generated in specific modes using xsl:result-document.

The Doxygen XML is processed in three modes:

  • generateKeyDefinitions: Currently does nothing. It reflects the normal pattern for map generation and acts as a placeholder for functionality that may be needed later.
  • generateTopicrefs: Generates topicrefs within the main map for each of the generated topics. Implemented in generateTopicrefs.xsl.
  • generateTopics: Generates the topics from the Doxygen XML. Most of the data processing happens here.

The Doxygen XML

The index.xml file consists of a set of compound elements, representing the top-level constructs in the code being documented.

Each compound element contains references to any member components that contribute to the compound object (e.g., methods of a class, members of a data structure, or includes referenced from a file).

Each compound and member element is a reference to a separate file via the @refid attribute. The value of the @refid attribute is the filename of the target file without an extension, e.g. refid="unionovr_d3_d11_texture" points to the file "unionovr_d3_d11_texture.xml".
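The filename rule can be illustrated with a short sketch. This is Python rather than the XSLT the transform actually uses, and it only covers compound elements; the sample index content, the `xml` directory name, and the function name `compound_file` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from pathlib import Path

# Hypothetical, heavily trimmed index.xml content for illustration only.
SAMPLE_INDEX = """<doxygenindex>
  <compound refid="unionovr_d3_d11_texture" kind="union">
    <name>exampleName</name>
  </compound>
</doxygenindex>"""


def compound_file(compound, xml_dir):
    """Return the path of the file a compound's @refid points to.

    The @refid value is the target filename without its extension,
    so the path is simply <xml_dir>/<refid>.xml.
    """
    return Path(xml_dir) / (compound.get("refid") + ".xml")


index = ET.fromstring(SAMPLE_INDEX)
paths = [compound_file(c, "xml") for c in index.findall("compound")]
for path in paths:
    print(path.as_posix())
```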

Note that the order in which compounds occur in the index.xml file is not necessarily the order in which they should occur in the resulting DITA map.

Each of the other files contains a single compounddef element, which defines a single compound construct of a specific kind (as indicated by the @kind attribute). Each kind requires different processing to generate the output, but in general each compounddef results in a top-level topic (meaning a topic that is the root of its containing XML document) and a corresponding topicref in the generated DITA map.

Each compounddef contains some number of memberdef elements, which define each of the distinct members.

The compounddef will have a compoundname, which serves as the title of the generated topic and possibly as a label or navigation title in other contexts.

Each memberdef will have children of various types depending on the @kind value. In general, the purpose of each component of a memberdef is evident from its element name.

Every memberdef should have a briefdescription element (although in some cases it may be empty).

In addition, any memberdef may have a separate detaileddescription.

As a general rule, any member that does not have a detaileddescription will not result in a standalone topic, but members that do have non-empty detaileddescription elements need to result in separate standalone topics, in addition to topics nested within the topic for the compounddef that contains them.
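A minimal sketch of this rule, written in Python for illustration (the transform itself is XSLT). The sample compounddef, its IDs and member names, and the helper `has_detailed_description` are all made up for the example, and "non-empty" is assumed here to mean "contains some non-whitespace text".

```python
import xml.etree.ElementTree as ET

# Hypothetical compounddef with one brief-only member and one detailed member.
SAMPLE = """<compounddef id="structfoo" kind="struct">
  <compoundname>foo</compoundname>
  <sectiondef kind="public-attrib">
    <memberdef kind="variable" id="structfoo_1a01">
      <name>width</name>
      <briefdescription><para>Width in pixels.</para></briefdescription>
      <detaileddescription>
      </detaileddescription>
    </memberdef>
    <memberdef kind="variable" id="structfoo_1a02">
      <name>flags</name>
      <briefdescription><para>Option flags.</para></briefdescription>
      <detaileddescription><para>Bitwise OR of the option values.</para></detaileddescription>
    </memberdef>
  </sectiondef>
</compounddef>"""


def has_detailed_description(memberdef):
    """True if the member has a detaileddescription with non-whitespace text."""
    detailed = memberdef.find("detaileddescription")
    if detailed is None:
        return False
    return "".join(detailed.itertext()).strip() != ""


compounddef = ET.fromstring(SAMPLE)
# Members that would get a standalone topic in addition to the nested one.
standalone = [
    m.findtext("name")
    for m in compounddef.iter("memberdef")
    if has_detailed_description(m)
]
print(standalone)
```

A detaileddescription that holds only non-text content would count as empty under this assumption, so a real implementation might need a broader test.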

Mode generateKeyDefinitions

The generated DITA uses keys and key references for all cross references.

The "generateKeyDefinitions" mode currently doesn't do anything--it's a placeholder for later generation of keys as needed.

All the topicrefs that refer to topics are given keys that correspond to the ID or filename of the Doxygen construct the topic reflects, but that is part of the generateTopicrefs mode process. Because Doxygen generates unique IDs and filenames for all the things, it makes it easy to use those IDs as DITA keys (which need to be globally unique within the scope of the root map).
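As a rough illustration of reusing a Doxygen ID as a DITA key, the following Python sketch builds a topicref whose @keys value is the ID. It is not the project's XSLT; the `topics/` directory, the `.dita` extension, and the function name `make_topicref` are assumptions.

```python
import xml.etree.ElementTree as ET


def make_topicref(refid, topics_dir="topics"):
    """Build a topicref that uses the Doxygen ID as its key.

    The output location (<topics_dir>/<refid>.dita) is an assumption
    for this sketch, not necessarily what the transform produces.
    """
    return ET.Element(
        "topicref",
        {"keys": refid, "href": f"{topics_dir}/{refid}.dita"},
    )


topicref = make_topicref("unionovr_d3_d11_texture")
print(ET.tostring(topicref, encoding="unicode"))
```

A cross reference elsewhere in the generated content can then point at the same construct with keyref="unionovr_d3_d11_texture" instead of a file path.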

Mode generateTopicrefs

This mode walks the Doxygen XML and generates a topicref for each Doxygen construct that will result in a topic.

This mode sets up the navigation organization of the topics, matching the Doxygen-generated HTML organization.

The Doxygen XML consists of "compound" elements, representing compound constructs such as files, data structures, and classes. Each compound element has a @kind value that indicates what kind of thing it is. The XSLT variable $compoundKindsToUse defines the set of kinds used to generate the top-level topics: page, file, and struct.

Within the index.xml, every compound element with a kind listed in the $compoundKindsToUse variable results in a topicref.
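The kind filter can be sketched as follows. This is an illustrative Python equivalent, not the XSLT itself; the sample index entries and their refid values are invented, and only the three kinds (page, file, struct) come from the description above.

```python
import xml.etree.ElementTree as ET

# Mirrors the kinds listed for $compoundKindsToUse.
COMPOUND_KINDS_TO_USE = {"page", "file", "struct"}

# Hypothetical index.xml content for illustration only.
SAMPLE_INDEX = """<doxygenindex>
  <compound refid="structexample" kind="struct"><name>example</name></compound>
  <compound refid="example_8h" kind="file"><name>example.h</name></compound>
  <compound refid="dir_abc" kind="dir"><name>Include</name></compound>
  <compound refid="indexpage" kind="page"><name>index</name></compound>
</doxygenindex>"""

index = ET.fromstring(SAMPLE_INDEX)

# Keep only compounds whose @kind is in the configured set,
# preserving the order in which they occur in index.xml.
selected = [
    c.get("refid")
    for c in index.findall("compound")
    if c.get("kind") in COMPOUND_KINDS_TO_USE
]
print(selected)
```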

In addition, any memberdef that points to a member with a non-empty detaileddescription also needs to generate a topicref. (NOTE: The details of how these member-specific topicrefs should be organized are to be determined.)

Mode generateTopics

The generateTopics mode is implemented in the module generateTopics.xsl.

The index.xml file is processed in the mode generateTopics.

For each compound element, the reference is resolved to the referenced Doxygen XML document ($sourceDoc). The body content of the document is generated into a variable ($sourceDocBodyText). If $sourceDocBodyText is not empty (that is, it is neither an empty string nor only whitespace), a topic is generated for the compound.
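The emptiness test on the body text amounts to a whitespace-insensitive check. A minimal Python sketch of that decision, with a hypothetical function name (the actual check lives in the XSLT):

```python
def should_generate_topic(source_doc_body_text: str) -> bool:
    """Return True only if the body text has some non-whitespace content.

    An empty string and a whitespace-only string are both treated as
    "empty", so no topic is generated for them.
    """
    return source_doc_body_text.strip() != ""


for body in ["", "  \n\t ", "Defines the texture union."]:
    print(repr(body), should_generate_topic(body))
```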

TODO: Also generate separate nested topics for each member of the compound for which detaileddescription is not empty.

The topic for a compound is composed of simple descriptions for members that do not have a detailed description and, for members that do, summaries with a link to the member's full topic.

There is of course complexity in mapping the Doxygen XML for a given member to DITA but for the most part it follows the same pattern.

In order to maintain the semantics from the Doxygen XML without creating a new DITA specialization specifically for Doxygen output (which would be useful but is currently out of scope for this project), the code makes heavy use of @outputclass on generic elements (p, section, sectiondiv, etc.).

The general output pattern is:

  • Elements that are direct children of the topic body and have titles become section elements.
  • Elements within sections become sectiondiv elements.
  • Elements that should be paragraphs become p elements.
  • Elements that correspond to existing DITA list types become lists.
  • Inline elements map to either the corresponding highlight type or to ph with @outputclass.
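The paragraph and inline rules from the list above can be sketched in Python as shown below. This is an illustration of the pattern, not the project's XSLT: the specific mapping table (bold to b, emphasis to i, computeroutput to tt) and the function names are assumptions, and nested inline markup is ignored for brevity.

```python
import xml.etree.ElementTree as ET

# Assumed mapping from Doxygen inline elements to DITA highlight elements.
HIGHLIGHT = {"bold": "b", "emphasis": "i", "computeroutput": "tt"}


def to_dita_inline(elem):
    """Map an inline element to a highlight element, or to ph with @outputclass."""
    tag = HIGHLIGHT.get(elem.tag)
    if tag is not None:
        out = ET.Element(tag)
    else:
        # No direct DITA equivalent: keep the Doxygen name in @outputclass.
        out = ET.Element("ph", {"outputclass": elem.tag})
    out.text = elem.text
    return out


def to_dita_para(para):
    """Convert a Doxygen para to a DITA p, mapping each direct inline child."""
    p = ET.Element("p")
    p.text = para.text
    for child in para:
        new = to_dita_inline(child)
        new.tail = child.tail  # preserve the text that follows the inline
        p.append(new)
    return p


para = ET.fromstring(
    "<para>Call <computeroutput>someFunction</computeroutput> "
    "(<small>see notes</small>).</para>"
)
result = ET.tostring(to_dita_para(para), encoding="unicode")
print(result)
```

Carrying the original element name in @outputclass keeps the Doxygen semantics available to later styling or processing without requiring a specialization.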

The markup tries to preserve the structure and organization of the Doxygen XML as much as possible, rather than trying, for example, to map things to tables where the HTML presentation is a table. The intent is to preserve the original Doxygen structure and semantics as much as possible to keep presentation options as open as possible.