
Workflow to transform Express into Metanorma

Nick Nicholas edited this page Nov 27, 2023 · 1 revision

There are three stages of transforming a repository with Express markup, such as https://github.com/metanorma/iso-10303-stepmod-wg12, into Metanorma source code ready for compilation as a collection.

stepmod-utils

https://github.com/metanorma/stepmod-utils transforms Express source annotations into Metanorma Asciidoctor markup, but otherwise leaves the file structure of the underlying repository intact. The transformation happens by isolating the annotations in Express source code, and translating them, while leaving the Express source code itself alone.

The transformation has been set up online to run nightly: it takes the repository https://github.com/metanorma/iso-10303-stepmod-wg12 as input, and populates the repository https://github.com/metanorma/iso-10303-stepmod as output.

I don’t know what else it does, because I am neither Hassan nor Ronald.

stepmod2mn

https://github.com/metanorma/stepmod2mn transforms the Express source files, with annotations in Metanorma Asciidoctor markup, into files that can be compiled by Metanorma. This involves creating the root document and included sections in Metanorma Asciidoctor for each of the resource docs in the repository.

The included sections populate the schema details through a shell clause, sections_common/04-schemas.adoc, which iterates through each of the relevant schemas for the resource doc and generates a clause from its content.

The code also generates a shell for iterating through each of the schemas to generate attachment files.

It also generates a collection manifest, which describes the contents of each resource_doc as a collection of files: the attachment files and the document itself, with a directive to split the document in rendering into one HTML page per clause.

To prepare a local download of iso-10303-stepmod for compilation, run the following commands, where $STEPMOD2MN is the local installation of the stepmod2mn Java executable (…/stepmod2mn/target/stepmod2mn-*.jar):

java -Xss5m -jar $STEPMOD2MN iso-10303-stepmod/data -svg
java -Xss5m -jar $STEPMOD2MN iso-10303-stepmod

This generates the source files for Metanorma document compilation, and the SVGs that they reference.
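The two commands above can be wrapped in a small shell function. This is only a sketch, not part of stepmod2mn: the names prepare_stepmod, run_or_echo and DRY_RUN are hypothetical and introduced here for illustration. With DRY_RUN=1 the commands are printed instead of executed, which is useful for checking paths before a long run.

```shell
# Sketch only: prepare_stepmod, run_or_echo and DRY_RUN are hypothetical names,
# not part of stepmod2mn itself.

# run_or_echo CMD...: print the command when DRY_RUN=1, otherwise execute it.
run_or_echo() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# prepare_stepmod [REPO_DIR]: generate the SVGs, then the Metanorma source files.
prepare_stepmod() {
  repo="${1:-iso-10303-stepmod}"
  if [ -z "${STEPMOD2MN:-}" ]; then
    echo "STEPMOD2MN is not set" >&2
    return 1
  fi
  # Same two invocations as above, in the same order.
  run_or_echo java -Xss5m -jar "$STEPMOD2MN" "$repo/data" -svg &&
    run_or_echo java -Xss5m -jar "$STEPMOD2MN" "$repo"
}
```

After sourcing the function, calling prepare_stepmod iso-10303-stepmod with STEPMOD2MN set runs both steps, and stops after the first if it fails.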

A collection is currently generated from a Publication Index file. The following example generates a collection of ISO 10303-41 through ISO 10303-45:

<!DOCTYPE part1000.publication_index SYSTEM "../../dtd/p1000_publication_index.dtd">

<part1000.publication_index
        name="CR_part42"
        collector_bug="7876"
        wg.number.publication_set="10924"
        wg.number.publication_set_comments="Change request to support AP242 Ed2 IS corrections"
        date.iso_submission="2022"
        date.iso_publication="2022"
        sc4.working_group="12">

  <description>
    The modules and/or resources and/or BO/domain Models that are to be added to Part1000 as part of change request CR_part42.
    The STEPmod CVS repository's files should be tagged as: CR_part42
  </description>

  <contacts>
    <projlead ref="wg12convener"/>
    <editor ref="benurick"/>
  </contacts>

  <precedent_changes>
    <change current_id="CR_part42" precedent_id="SMRLv8"/>
  </precedent_changes>

  <resource_docs>
    <resource_doc name="fundamentals_of_product_description_and_support" version="7" checklist.internal_review="" checklist.project_leader="" checklist.convener="" number="41" wg_number="8433"/>
    <resource_doc name="geometric_and_topological_representation" version="9" sc4.working_group="12" checklist.internal_review="7629" checklist.project_leader="7630" checklist.convener="7631" number="42" wg_number="8457"/>
    <resource_doc name="representation_structures" version="5" wg.number.supersedes="4827" wg.number.express.supersedes="4828" checklist.internal_review="6136" checklist.project_leader="6137" checklist.convener="6138" number="43" wg_number="6134"/>
    <resource_doc name="product_structure_configuration" version="5" sc4.working_group="12" checklist.internal_review="" checklist.project_leader="" checklist.convener="" number="44" wg_number="8374"/>
    <resource_doc name="material_and_other_engineering_properties" version="3" checklist.internal_review="5101" checklist.project_leader="5102" checklist.convener="5103" number="45" wg_number="8317"/>
  </resource_docs>
</part1000.publication_index>
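To see at a glance which resource docs a Publication Index names, a quick text filter is often enough. The sketch below is not a real XML parser: it assumes each resource_doc element sits on a single line, with name as its first attribute and a separate number attribute, as in the example above. The function name list_resource_docs is hypothetical.

```shell
# Sketch only: list_resource_docs is a hypothetical helper, not part of stepmod2mn.
# Assumes one <resource_doc .../> element per line, with name="..." first.

# list_resource_docs FILE: print "NUMBER NAME" for every resource_doc element.
list_resource_docs() {
  # The trailing space in the grep pattern skips the <resource_docs> wrapper.
  # The leading space in ' number=' skips wg_number and wg.number.* attributes.
  grep '<resource_doc ' "$1" |
    sed -n 's/.*<resource_doc name="\([^"]*\)".* number="\([^"]*\)".*/\2 \1/p'
}
```

Run against the example above, this would print one line per resource doc, such as 41 fundamentals_of_product_description_and_support.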

In order to generate the needed files for a collection described in publication_index.xml, use the command:

java -Xss5m -jar $STEPMOD2MN publication_index.xml --output iso-10303-stepmod/$OUTPUT_LOCATION

This generates, in $OUTPUT_LOCATION, a duplicate of all the source files required to compile the collection.

It also generates, at the root of iso-10303-stepmod, the configuration files needed for the collection:

  • The collection manifest, collection.yml, containing the locations of all the source resource docs, as generated in $OUTPUT_LOCATION

  • The collection generation script, collection.sh, which is described below

  • cover.html, the cover page for the collection. (This is currently quite bare bones, and contains only a Liquid directive to link to the index of each resource doc.)

  • The output directory iso10303-output, which is currently hardcoded to have that name in collection.sh, and contains the rendered output of the collection
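A quick way to confirm that the configuration files listed above were generated is to test for each one at the repository root. The following is a sketch under stated assumptions: check_collection_files is a hypothetical helper, and it checks only the three files. It does not check the iso10303-output directory, which may not exist before rendering.

```shell
# Sketch only: check_collection_files is a hypothetical helper.

# check_collection_files [ROOT]: report missing collection configuration files.
# Returns 0 when all three files exist, 1 otherwise.
check_collection_files() {
  root="${1:-iso-10303-stepmod}"
  missing=0
  for f in collection.yml collection.sh cover.html; do
    if [ ! -f "$root/$f" ]; then
      echo "missing: $root/$f" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_collection_files iso-10303-stepmod after the stepmod2mn run prints any missing file to standard error and returns a non-zero status in that case.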

I am vague on what else it does, because I am neither Alex nor Ronald. This needs a lot more description, so that we can trace files being processed by Metanorma back to their Express source.

Metanorma

Metanorma itself generates a collection of documents once the publication_index.xml file has been processed. To do this, Metanorma relies heavily on Lutaml, a plugin that populates Liquid templates in Metanorma Asciidoctor with content from external sources. In this instance, Lutaml in turn depends on Expressir, which parses Express source code and generates structs reflecting the Express source code and its annotations; these structs can then be used to populate Metanorma documents.

It does so by going through the following steps:

  • For each resource doc specified at the top of collection.sh (and extracted from publication_index.xml)

    • Generate the HTML attachments for that resource doc. This is done by iterating through schemas.yaml (containing the Express source locations for each schema), and populating a standalone Metanorma document for each schema, using a Lutaml template. This document is then compiled, in the same folder as the source resource doc, to generate an HTML document; the document needs to be compiled in that location for all the links in schemas.yaml to work.

      • TODO: the attachment documents are currently then moved to the subdirectory schemadocs, but that means that all the links break; besides that, the links in attachments are currently pointing to Express definitions in the main documents, and not to other attachments. Substantial work will need to be done to get the desired linking behaviour.

    • Generate the Metanorma Semantic XML for each resource document, under $OUTPUT_LOCATION

      • The Semantic XML for each resource document is generated by compiling the document.adoc root document, which includes a separate sections/*.adoc document for each clause.

      • The SVG images for each resource doc have their Express anchor links incorporated in a form that Metanorma can process and resolve: they are treated identically to all other cross-references in Metanorma.

      • sections_common/04-schemas.adoc is fed Express schema files from schemas.yml through Lutaml, to generate the complete annotated clause description for each schema, out of Express annotations and source code.

      • Tracking down the Express content that gives rise to a schema description involves: looking up the schema file in schemas.yml, then perusing its contents, and if needed tracing that content back to the corresponding Express source file in iso-10303-stepmod-wg12. Somehow. I just end up doing string searches: stepmod2mn completely rearranges the files, and it needs to.

      • Lutaml does not output the Metanorma Asciidoctor it populates, so debugging is quite difficult at the moment: we have the bits of Asciidoctor in the source schema annotations, and we have the Metanorma Semantic XML that is generated based on them, but we don’t have a copy saved to disk of a unitary Asciidoctor file that we can scrutinise.

    • Render the collection of all the resource documents, using the collection manifest (collection.yml), collection output directory (iso10303-output), and cover page (cover.html) hardcoded currently in collection.sh.

      • A Metanorma collection is a set of Metanorma documents, with all links between documents resolved as hyperlinks to other documents within the collection. (This overrides the Metanorma default, which is to resolve links to external documents as references to the document bibliography.)

      • Metanorma collection processing has been enhanced, to look for Express anchors in the entire set of documents in the collection, and resolve them to point to the document containing that anchor, if present.

      • The documents are rendered in HTML. There is a directive in the collection manifest, to break up each resource doc so that each clause is given a separate HTML page. All links are updated in collection processing to point to the right clause HTML page. An index file is generated for each resource doc, linking to each clause HTML page.

      • The cover.html file is used to generate an index.html file in iso10303-output, which points to the index HTML files for each resource doc.

      • The output directory also contains the individual Metanorma Presentation XML file for each HTML page. That gives some traceability of the compilation.

      • The SVG files are encoded as Data URIs inside the Metanorma Presentation XML and the HTML: there are no external image files.
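As noted in the steps above, tracing a schema description back to its Express source currently comes down to string searches. A minimal sketch of such a search follows. The function name find_express_source is hypothetical, and the default directory assumes a local clone of iso-10303-stepmod-wg12 in the current folder.

```shell
# Sketch only: find_express_source is a hypothetical helper.

# find_express_source TEXT [REPO_DIR]: list the files under REPO_DIR that
# contain TEXT, matched as a fixed string (-F) rather than a regular expression.
find_express_source() {
  grep -rlF -- "$1" "${2:-iso-10303-stepmod-wg12}"
}
```

The search text would typically be an entity or type name copied from the generated schema description. The function returns a non-zero status when nothing matches.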
