USMTF XML SCHEMA DESIGN
Background. The US Message Text Format (MTF) Configuration Control Board (CCB) has adopted National Information Exchange Model (NIEM) naming and design rules for USMTF XML Schema. This document provides a complete overview of the resulting product and of how it serves configuration management and implementation purposes.
Overview. Reference (1) provides NIEM control measures for human understanding. Reference (2) provides the same information in a machine-readable format for use in the testing and verification required for secure information exchange, as described in Reference (3). References (4) through (6) provide a sample XML Schema, an annotations schema, and local terminology definitions. These resources contain the basis for XML Schema design decisions to support secure information exchange using USMTF.
Scope. This document describes features that accurately define data models to support MTF in a manner that includes security tagging capabilities and can be independently tested using NIEM rules. Efforts that are independent of NIEM will have to meet these testing requirements independently, or assume risk without them. Either option creates additional cost for the government.
MTF Data Design. The logical design of MTF data objects is retained in the XML Schema using the application information elements in annotations. Every data object in USMTF is defined using a globally scoped XML Schema SimpleType for content, a ComplexType to add security tags, and an Element so that it can be referenced by other ComplexTypes. Substitution Groups are used for alternative content at the namespace level, but Choice is used for implementation. Structural Rules are provided in the annotations application information node, and implemented using Schematron XPath expressions.
a. Single Schema Concept. NIEM incorporates the W3C best practice of defining each namespace using a single schema. This has many advantages, not the least of which is the ability to use older XML parsers that are not namespace aware.
(1) Namespaces. When an XML Schema imports an XML Schema with a different namespace, the items it uses from that namespace are validated against the imported Schema. When an XML Schema is extended, all of its content is included, along with additional structures. When an XML Schema is restricted, content is removed or adjusted, but it remains valid against the source Schema, so a new namespace is not required.
(2) Extension Schema. An extension schema will become the Reference Schema for a new namespace. This must then be restricted for implementation.
(3) Restriction Schema. Restriction Schema are expected to be dynamically generated based on operational requirements in order to meet network capacity and information security requirements. They will retain the namespace of the Reference or Extension Schema that they restrict.
b. Annotations. In accordance with NIEM, all annotations must be defined using a separate XML Schema. This is provided in Appendix (2).
c. Alternative Content. In accordance with NIEM design rules, USMTF XML uses Substitution Groups to model Choice in the Reference Schema, because this allows extension. For normative implementation, using Restriction Schema, extension of alternative content is not desired, so the non-extensible XML Schema Choice model is preferred.
(1) SubstitutionGroup. This involves the creation of an "Abstract" Element, which never appears in an instance but serves as the context for choice items. Any global element becomes one of the alternatives by carrying a substitutionGroup attribute whose value is the name of the abstract element.
(2) Choice. Substitution Groups make it possible to add additional alternatives, whereas the W3C XML Schema Choice element (xs:choice) cannot be extended. For implementation, when extension of alternative content is prohibited, it is expected that Substitution Groups will be converted to Choice elements.
(3) MTF Considerations. When alternative content is represented within an XML Element definition and has a "Position Name" this structure is preserved with an Alternative Content type and element. Unnamed alternatives are represented using Abstract Substitution Groups.
(a) Named Alternatives. In the Reference Schema, the Alternative Content contains a reference to the Abstract Substitution Group. For implementation the Alternative Content type will contain the W3C Choice structure.
(b) Unnamed Alternatives. In the Reference Schema, the Alternative Content contains a reference to the Abstract Substitution Group. For implementation the W3C Choice structure replaces this reference.
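The conversion from Substitution Groups to Choice described above can be sketched with the Python standard library. This is a minimal illustration, not the MTF tooling: the schema fragment, the element names (LocationAbstract, GridText, PlaceNameText, ReportType) and the function name are hypothetical, and names are assumed to be unprefixed so that a substitutionGroup value can be compared directly with a global element name.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

# Hypothetical reference-schema fragment: one abstract head and two members.
REFERENCE_XSD = f"""
<xs:schema xmlns:xs="{XS}">
  <xs:element name="LocationAbstract" abstract="true"/>
  <xs:element name="GridText" type="xs:string" substitutionGroup="LocationAbstract"/>
  <xs:element name="PlaceNameText" type="xs:string" substitutionGroup="LocationAbstract"/>
  <xs:complexType name="ReportType">
    <xs:sequence>
      <xs:element ref="LocationAbstract" minOccurs="1" maxOccurs="1"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def substitution_groups_to_choice(schema: ET.Element) -> ET.Element:
    """Replace each reference to an abstract head with an xs:choice of its members."""
    global_elements = schema.findall(f"{{{XS}}}element")
    heads = {e.get("name") for e in global_elements if e.get("abstract") == "true"}
    members = {head: [] for head in heads}
    for element in global_elements:
        head = element.get("substitutionGroup")
        if head in members:
            members[head].append(element.get("name"))

    # Snapshot the tree first so that replacing children does not disturb iteration.
    for parent in list(schema.iter()):
        for index, child in enumerate(list(parent)):
            ref = child.get("ref")
            if child.tag == f"{{{XS}}}element" and ref in heads:
                choice = ET.Element(f"{{{XS}}}choice")
                # Occurrence limits move from the reference to the choice.
                for attr in ("minOccurs", "maxOccurs"):
                    if child.get(attr) is not None:
                        choice.set(attr, child.get(attr))
                for name in members[ref]:
                    ET.SubElement(choice, f"{{{XS}}}element", ref=name)
                parent[index] = choice
    return schema


converted = substitution_groups_to_choice(ET.fromstring(REFERENCE_XSD))
print(ET.tostring(converted, encoding="unicode"))
```

The sketch leaves the abstract element and the substitutionGroup attributes in place; a real restriction transform would also decide whether to drop them.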
d. Fields. Fields are the data objects in which information is stored. All other data structures are collections of Fields.
(1) Enumerations. Data items with values that can be expressed as a selection of codes are modeled as enumerations. This model does not support selection of multiple values.
(2) Regular Expressions. A regular expression is a sequence of characters that defines a pattern, which can be used to restrict a field entry to desired content.
(3) Value Limits. Numeric values are assigned minimum and maximum values.
(4) Length Limits. Text fields are defined with minimum and maximum lengths. If numeric values must have lengths that are not reflected in numeric formats, such as leading zeros, a combination of value limits and a regular expression must be used.
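The four kinds of Field constraint can be illustrated with a small validator. This is a sketch under stated assumptions: the FieldRule class, the rule names and every limit below are hypothetical and are not taken from the USMTF data model. Python regular expression syntax also differs in places from XML Schema pattern syntax, so only simple character classes are used.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FieldRule:
    """Hypothetical stand-in for the facets of one Field definition."""
    enumeration: Optional[frozenset] = None
    pattern: Optional[str] = None
    min_value: Optional[int] = None
    max_value: Optional[int] = None
    min_length: Optional[int] = None
    max_length: Optional[int] = None

    def is_valid(self, text: str) -> bool:
        if self.enumeration is not None and text not in self.enumeration:
            return False
        # fullmatch mirrors XML Schema patterns, which are implicitly anchored.
        if self.pattern is not None and re.fullmatch(self.pattern, text) is None:
            return False
        if self.min_length is not None and len(text) < self.min_length:
            return False
        if self.max_length is not None and len(text) > self.max_length:
            return False
        if self.min_value is not None or self.max_value is not None:
            try:
                number = int(text)
            except ValueError:
                return False
            if self.min_value is not None and number < self.min_value:
                return False
            if self.max_value is not None and number > self.max_value:
                return False
        return True


# Illustrative rules only; the names and limits are not taken from USMTF.
PRECEDENCE_CODE = FieldRule(enumeration=frozenset({"R", "P", "O", "Z"}))
CALL_SIGN_TEXT = FieldRule(pattern=r"[A-Z0-9]+", min_length=3, max_length=8)
# A two-digit day of month: value limits alone would accept "7", so a
# pattern is added to require the leading zero.
DAY_NUMERIC = FieldRule(pattern=r"[0-9]{2}", min_value=1, max_value=31)
```

With these rules, `DAY_NUMERIC.is_valid("07")` is accepted while `"7"` and `"32"` are rejected, which is the leading-zero case that needs both value limits and a regular expression.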
e. Composites. Composites are collections of references to globally defined Fields in a specific sequence, with occurrence information provided.
(1) Occurrence. Used to specify optional components, with minimum occurrence zero, required components, with minimum occurrence greater than zero, and to specify limits for repeating items.
(2) References. All Fields, Composites, Sets, Segments and Messages are defined using global Elements, and included in context using references to these global elements. This supports re-use of data objects by implementing software.
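A sequence of references with occurrence limits can be checked against an instance as sketched below. The Ref type, the matches_sequence function and the DateTimeGroup example are hypothetical. The matching is greedy, which is sufficient only when adjacent references have different names.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple, Optional


class Ref(NamedTuple):
    name: str                       # name of a globally defined element
    min_occurs: int = 1
    max_occurs: Optional[int] = 1   # None means unbounded


def matches_sequence(parent: ET.Element, refs: list) -> bool:
    """Check that parent's children follow refs in order, within occurrence limits."""
    children = [child.tag for child in parent]
    position = 0
    for ref in refs:
        count = 0
        while (position < len(children)
               and children[position] == ref.name
               and (ref.max_occurs is None or count < ref.max_occurs)):
            position += 1
            count += 1
        if count < ref.min_occurs:
            return False
    # Any leftover child is either out of order or over its limit.
    return position == len(children)


# Hypothetical Composite: two required Fields, one optional, one repeating.
DATE_TIME_GROUP = [
    Ref("DayNumeric"),
    Ref("TimeText"),
    Ref("TimeZoneCode", min_occurs=0),
    Ref("RemarkText", min_occurs=0, max_occurs=None),
]

instance = ET.fromstring(
    "<DateTimeGroup><DayNumeric>07</DayNumeric><TimeText>1200</TimeText>"
    "<RemarkText>a</RemarkText><RemarkText>b</RemarkText></DateTimeGroup>"
)
print(matches_sequence(instance, DATE_TIME_GROUP))
```

The same check applies unchanged to Sets, Segments and Messages, since each is also an ordered list of references with occurrence information.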
f. Sets. Sets are collections of references to globally defined Fields, Composites, and other Sets in a specific sequence, with occurrence information provided.
g. Segments. Segments are collections of references to globally defined Sets, Composites, Fields and other Segments in a specific sequence, with occurrence information provided.
h. Messages. Messages are collections of references to globally defined Sets, Composites, Fields and Segments in a specific sequence, with occurrence information provided.
(1) Structural Rules. These are used to define required values, and occurrence of specific objects based on the occurrence or values of other Fields, Composites, Sets, or Segments.
(2) Restriction. The normative definitions for MTF messages are necessarily broad to allow the further application of required mission specific restrictions for implementation. Subset and Restriction Schema are required for implementation of MTF Messages in a consistent and testable manner.
XML Schema Validation
a. Reference Schema. All instances of a Restriction Schema must be valid against the restricted Reference Schema. Nodes of an Extension Schema that are imported from a Reference Schema must also be valid.
(1) Subset Schema. For validation purposes Subset Schema are equivalent to the parent Reference Schema, and must be Restricted for implementation. Individual messages are represented using Subset Schema.
(2) Extension Schema. Any addition or adjustment that will render an instance invalid against the Reference Schema requires the creation of an Extension Schema with a new namespace. This Extension Schema must then be Restricted for implementation.
b. Restriction Schema. Instances of a Restriction Schema must be valid against both the Restriction Schema and the Reference, Subset, or Extension Schema from which it is derived. Restriction Schema are not expected to be further restricted or extended.
c. Default Values. When required fields are not used in Restriction Schema, fixed values are specified in the Restriction Schema.
d. Required Nodes. When required fields cannot be used due to restrictive Distribution Statements, or other reasons, an Extension Schema will be created without the restricted node.
e. Structural Relationship Rules. Conditional rules are provided using XPath statements formatted using the Schematron namespace, and evaluated using XSLT. This method is used to verify rules applying to any XML document, to include XML Schema, XML instances, and other XSLT products.
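A conditional rule of this kind pairs a context, which selects the nodes the rule applies to, with an assertion that must hold for each selected node. The sketch below is a simplified stand-in for Schematron and does not use XSLT: it relies on the limited XPath subset supported by Python's xml.etree.ElementTree, and the Rule type, element names and rule text are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import NamedTuple


class Rule(NamedTuple):
    context: str   # path selecting the elements the rule applies to
    require: str   # path that must select at least one node under each context
    message: str   # text reported when the assertion fails


def failed_assertions(root: ET.Element, rules: list) -> list:
    """Return the message of every rule that fails, once per failing context node."""
    failures = []
    for rule in rules:
        for node in root.findall(rule.context):
            if node.find(rule.require) is None:
                failures.append(rule.message)
    return failures


# Hypothetical rule: a remark is required whenever the status code is OTHER.
RULES = [
    Rule(
        context=".//Report[StatusCode='OTHER']",
        require="RemarkText",
        message="RemarkText is required when StatusCode is OTHER",
    )
]

message = ET.fromstring(
    "<Message>"
    "<Report><StatusCode>OTHER</StatusCode></Report>"
    "<Report><StatusCode>OK</StatusCode></Report>"
    "</Message>"
)
print(failed_assertions(message, RULES))
```

Because the rules are plain path expressions over an XML tree, the same function can be pointed at an XML Schema document as easily as at a message instance.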
XSLT Processing. Extensible Stylesheet Language Transformations (XSLT) is an XML-defined language that is used to process conditional rules associated with messages and security tags. XSLT processing is required for validation of MTF XML Schema, and is also used to create XML extension and restriction schema from the reference schema.
XML Schema Restriction. This is a derivation that is achieved by removing or altering XML Schema nodes in such a way as to create subsets of the original definition. This is considered a Normative Approach to implementing MTF using XML.
XML Schema Extension. This is a derivation that adds or alters information in a way that will support instances that will not be valid against the Extended Schema. This is appropriate for special use-cases. These include backward compatible XML Schema, and alterations required for security purposes.
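Restriction by removal can be sketched as a tree transformation that keeps the target namespace unchanged. The document states that XSLT is used for this step; the Python below is only an illustration of the idea, and the schema fragment, the urn:example:mtf namespace and the restrict function are hypothetical. Only optional references are removed, because dropping a required one would produce instances that are invalid against the source schema.

```python
import copy
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

# Hypothetical reference-schema fragment with one required and one optional reference.
REFERENCE_XSD = f"""
<xs:schema xmlns:xs="{XS}" xmlns:mtf="urn:example:mtf" targetNamespace="urn:example:mtf">
  <xs:complexType name="ReportType">
    <xs:sequence>
      <xs:element ref="mtf:SubjectText"/>
      <xs:element ref="mtf:RemarkText" minOccurs="0"/>
    </xs:sequence>
  </xs:complexType>
</xs:schema>
"""


def restrict(reference: ET.Element, unused: set) -> ET.Element:
    """Return a copy of the schema without the optional references named in unused."""
    restricted = copy.deepcopy(reference)
    # Snapshot the tree first so that removing children does not disturb iteration.
    for parent in list(restricted.iter()):
        for child in list(parent):
            if child.tag == f"{{{XS}}}element" and child.get("ref") in unused:
                if child.get("minOccurs") != "0":
                    raise ValueError(
                        f"{child.get('ref')} is required and cannot be removed"
                    )
                parent.remove(child)
    return restricted


reference = ET.fromstring(REFERENCE_XSD)
restricted = restrict(reference, {"mtf:RemarkText"})
# The restricted schema keeps the namespace of the schema it restricts.
print(restricted.get("targetNamespace"))
```

Every instance that is valid against the restricted copy is also valid against the reference fragment, since the only change is that an optional element can no longer appear.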
Implementation
a. Code Generation. The machine-readable functionality of XML Schema is applied primarily to the auto-generation of software artifacts for implementation. These include data structures, and tests to verify that valid messages can be generated and parsed.
b. Authority To Operate. All MTF implementations are expected to implement security tags. Proper evaluation and handling of these security tags is required for deployment. The potential for spillage is reduced by restricting message content in accordance with mission specifications. Implementations must support continuous, independent evaluation of functional parameters using external test data.
Appendix A: MTF XML Schema Design. A representative example schema is provided in Reference (4) to demonstrate the design of the MTF data model. Samples of the Restriction schema format are provided inline for comparison.
a. Annotations. The top level annotation demonstrates the inclusion of structural relationship rules. Annotations throughout the document reflect the items defined in Reference (5).
b. SimpleTypes
(1) FieldTextSimpleType. This example shows the use of a Regular Expression, as well as minimum and maximum lengths. Length restrictions can be included in the Regular Expression, but to prevent the need to parse the expression to get the values, MTF fields will always specify length limits explicitly. All text names will have a naming pattern that includes "Text."
(2) EnumerationCodeSimpleType. This example shows the use of enumerations, which are termed "Code Lists" by NIEM. All enumeration names will have a naming pattern that includes "Code."
(3) IntegerSimpleType. This example shows the definition of integer data. A Regular Expression can be used to enforce leading zeroes or other formatting requirements. All integer names will have a naming pattern that includes "Numeric."
(4) DecimalSimpleType. This example shows the definition of decimal data. The totalDigits and fractionDigits facets are used to determine the position of the decimal point. All decimal names will have a naming pattern that includes "Numeric."
c. ComplexTypes. Each ComplexType extends a SimpleType by adding the "structures:SimpleObjectAttributeGroup." This NIEM resource contains a reference to ISM XML Schema for security tags. Restriction Schema used for IEPDs replace this broad option with a specific reference to ISM.
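The four SimpleType shapes listed above can be illustrated by generating them with the Python standard library. This is a sketch, not the content of Reference (4): the type names, base types and limits are hypothetical and were chosen only to follow the "Text", "Code" and "Numeric" naming patterns. The facet names (pattern, minLength, maxLength, enumeration, minInclusive, maxInclusive, totalDigits, fractionDigits) are standard W3C XML Schema facets.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)


def simple_type(name: str, base: str, facets: list) -> ET.Element:
    """Build an xs:simpleType that restricts base with the given (facet, value) pairs."""
    node = ET.Element(f"{{{XS}}}simpleType", name=name)
    restriction = ET.SubElement(node, f"{{{XS}}}restriction", base=base)
    for facet, value in facets:
        ET.SubElement(restriction, f"{{{XS}}}{facet}", value=str(value))
    return node


# Hypothetical names and limits that follow the naming patterns described above.
EXAMPLES = [
    # Text: the length limits are stated explicitly, not only inside the pattern.
    simple_type("CallSignTextSimpleType", "xs:string",
                [("pattern", "[A-Z0-9]+"), ("minLength", 3), ("maxLength", 8)]),
    # Code list: one enumeration facet per permitted code.
    simple_type("PrecedenceCodeSimpleType", "xs:string",
                [("enumeration", code) for code in ("R", "P", "O", "Z")]),
    # Integer: the pattern enforces the leading zero that value limits cannot.
    simple_type("DayNumericSimpleType", "xs:integer",
                [("pattern", "[0-9]{2}"), ("minInclusive", 1), ("maxInclusive", 31)]),
    # Decimal: at most 8 digits in total, at most 3 of them after the point.
    simple_type("FrequencyNumericSimpleType", "xs:decimal",
                [("totalDigits", 8), ("fractionDigits", 3)]),
]

for example in EXAMPLES:
    print(ET.tostring(example, encoding="unicode"))
```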
(1) FieldTextType
(2) EnumerationCodeType
(3) IntegerType
(4) DecimalType
(5) CompositeType
(6) SetType
(7) SegmentType
(8) MessageType
d. AttributeGroups
e. Elements
(1) FieldText
(2) EnumerationCode
(3) IntegerNumeric
(4) DecimalNumeric
f. Attributes
g. Conditional Rules
h. Security Rules
References
(1) National Information Exchange Model (NIEM) Naming and Design Rules. https://reference.niem.gov/niem/specification/naming-and-design-rules/4.0/niem-ndr-4.0.html
(2) NIEM NDR Machine-Readable Rules. https://github.com/NIEM/NIEM-NDR
(3) Secure Information Exchange
(4) MTF Sample Schema Structure
(5) MTF Annotations Schema
(6) MTF Local Terminology