JSON Schema base URI collision #902

davaya · 2021-04-19T19:17:23Z

Describe the bug

The JSON Schema files for multiple layers all have the same base URI (value of the root $id keyword). This indicates a bug in the schema generation tools, since the URI is intended to (uniquely) identify schema resources.

How do we replicate the issue?

Examine JSON schemas in:

Observe that they begin with:

{ "$schema" : "http://json-schema.org/draft-07/schema#",
  "$id" : "http://csrc.nist.gov/ns/oscal/1.0-schema.json",
  "$comment" : "OSCAL Control Catalog Model: JSON Schema",

{ "$schema" : "http://json-schema.org/draft-07/schema#",
  "$id" : "http://csrc.nist.gov/ns/oscal/1.0-schema.json",
  "$comment" : "OSCAL Profile Model: JSON Schema",

{ "$schema" : "http://json-schema.org/draft-07/schema#",
  "$id" : "http://csrc.nist.gov/ns/oscal/1.0-schema.json",
  "$comment" : "OSCAL Component Definition Model: JSON Schema",

Expected behavior (i.e. solution)

The base URI of each distinct schema should identify no other schema. For example:

"$id" : "http://csrc.nist.gov/ns/oscal/1.0-schema/catalog.json"

"$id" : "http://csrc.nist.gov/ns/oscal/1.0-schema/profile.json",

"$id" : "http://csrc.nist.gov/ns/oscal/1.0-schema/component.json",
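
One way to catch this kind of collision is to group schema documents by their root $id and flag any URI claimed by more than one of them. The following is a minimal sketch, assuming the schemas are already parsed into Python dicts; the function name and the short labels are hypothetical and not part of the OSCAL tooling:

```python
from collections import defaultdict


def find_id_collisions(schemas):
    """Return {root $id: sorted labels} for every $id used by more than one schema.

    `schemas` maps a label (for example a file name) to a parsed JSON Schema dict.
    Schemas without a root $id are ignored.
    """
    by_id = defaultdict(list)
    for label, schema in schemas.items():
        if "$id" in schema:
            by_id[schema["$id"]].append(label)
    return {uri: sorted(labels) for uri, labels in by_id.items() if len(labels) > 1}


# Labels are illustrative; the $id and $comment values are the ones quoted above.
shared = "http://csrc.nist.gov/ns/oscal/1.0-schema.json"
schemas = {
    "catalog": {"$id": shared, "$comment": "OSCAL Control Catalog Model: JSON Schema"},
    "profile": {"$id": shared, "$comment": "OSCAL Profile Model: JSON Schema"},
    "component": {"$id": shared, "$comment": "OSCAL Component Definition Model: JSON Schema"},
}

collisions = find_id_collisions(schemas)
print(collisions)
```

With the three headers quoted in this report, the shared URI is reported once with all three labels; with the distinct $id values proposed above, the result would be empty.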


david-waltermire · 2021-04-30T13:38:27Z

We are working to produce a single JSON and XML schema for all of OSCAL. This will address this issue once deployed.

davaya · 2021-05-07T13:17:38Z

At first glance restructuring OSCAL from modular to monolithic seems like a step in the wrong direction. Loose coupling using namespaces would be the natural approach - is there a rationale or pros and cons for using a monolithic JSON and XML schema for all of OSCAL?

GaryGapinski · 2021-05-07T16:28:26Z

I thought there was an aversion to using more than one namespace (if that is what @davaya means — i.e., one namespace per sub-schema).¹

At the moment, there are multiple OSCAL schemas, each within the same namespace, which makes the following not possible:

* NVDL (http://nvdl.org/)
* XML Catalogs (https://www.oasis-open.org/committees/download.php/14809/xml-catalogs.html)

thus requiring the use of explicit schema association per instance document using:

* xml-model (https://www.w3.org/TR/xml-model/), or
* xsi:schemaLocation (https://www.w3.org/TR/xmlschema-1/#schema-loc), or
* an association scheme separate from the instance document

¹ OVAL made profligate use of namespaces, which IMO markedly decreased its usability by increasing its complexity.

davaya · 2021-06-03T18:05:47Z

BLUF: Schema namespaces yes. Data namespaces no.

After reading the OSCAL metaschema paper https://www.balisage.net/Proceedings/vol23/print/Piez01/BalisageVol23-Piez01.html the motivation for a single namespace becomes clearer. But when discussing the "OSCALizable subset of XML", the distinction between schema and data namespaces is, or appears to be, lost.

JSON data has no namespaces, but JSON Schema does: the root $id of each schema file gives that file's namespace. Namespacing enables reuse of definitions. There's no need for OSCAL to re-invent SI units for length, mass and temperature, no need to re-invent GPS coordinates, etc. Those types can be created by experts and referenced when needed. JSON Schema facilitates cross-namespace referencing using $ref, but the resulting data has no trace of namespacing because the data format explicitly does not support it.
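
To illustrate the referencing described above: a relative $ref is resolved against the base URI given by the enclosing schema's $id, following ordinary URI reference resolution. The following is a rough sketch using only the Python standard library. The units.json file and the definition names are hypothetical, and the catalog $id is the per-model form proposed at the top of this issue:

```python
from urllib.parse import urldefrag, urljoin

# Per-model $id in the form proposed at the top of this issue.
catalog_id = "http://csrc.nist.gov/ns/oscal/1.0-schema/catalog.json"

# Hypothetical reference to a definition kept in a separate, reusable schema file.
ref = "units.json#/definitions/length"

# Resolve the reference against the base URI, then split off the JSON Pointer.
absolute = urljoin(catalog_id, ref)
target_schema, pointer = urldefrag(absolute)
print(target_schema)  # URI identifying the schema that holds the definition
print(pointer)        # location of the definition inside that schema

# With the shared $id from the bug report, a local reference written in the
# catalog schema and one written in the profile schema resolve to the same
# absolute URI, so the URI alone cannot tell the two definitions apart.
shared_id = "http://csrc.nist.gov/ns/oscal/1.0-schema.json"
local_ref = "#/definitions/part"  # hypothetical definition name
from_catalog = urljoin(shared_id, local_ref)
from_profile = urljoin(shared_id, local_ref)
print(from_catalog == from_profile)
```

The instance data validated by either schema is unaffected by any of this; only the schema files carry the URIs.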

I think it would be appropriate for each of the OSCAL schema/model layers to have its own namespace - there is no danger of namespace proliferation because the number of layers might grow from 7 to 8 or 9, but not to thousands. It might also be appropriate for the OSCAL XML data to emulate JSON data and be constructed without element prefixes. Data structure provides namespace separation the way filesystem paths ensure that there is no collision between files of the same name in different folders:

<markup>
    <table>
        <head/>
        <body/>
    </table>
</markup>

{
  "markup": {
    "table": {
      "head": [],
      "body": []
}}}

is not confused with:

<furniture>
   <table>
       <material/>
       <weight/>
   </table>
</furniture>

{
  "furniture": {
    "table": {
      "material": "oak",
      "weight": 52
}}}
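
The filesystem analogy can be made concrete by listing the full key path to each "table". This is a small sketch over the two JSON examples above; the helper name is hypothetical:

```python
def key_paths(value, prefix=""):
    """Yield the slash-separated path of every object key in a parsed JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}/{key}"
            yield path
            yield from key_paths(child, path)
    elif isinstance(value, list):
        # Array items share the path of the array that contains them.
        for item in value:
            yield from key_paths(item, prefix)


markup = {"markup": {"table": {"head": [], "body": []}}}
furniture = {"furniture": {"table": {"material": "oak", "weight": 52}}}

markup_tables = [p for p in key_paths(markup) if p.endswith("/table")]
furniture_tables = [p for p in key_paths(furniture) if p.endswith("/table")]
print(markup_tables, furniture_tables)
```

Both documents use the key "table", but the paths differ, which is the sense in which structure alone keeps the two apart.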

david-waltermire · 2021-06-03T20:20:14Z

@davaya A JSON schema does not have a namespace. It has a unique schema identifier expressed as a canonical URI. This is not the same as a namespace.

FWIW, we made an early decision that all of OSCAL will be in the same XML namespace, which I think at this point we need to keep for OSCAL v1. This allows us to reuse common information items across the OSCAL models (and schemas). Since in OSCAL XML all information items are in the same namespace, we can avoid having to alternate namespaces, which has been a confusing and problematic issue for users of other efforts that do this (i.e., OVAL, etc.).

wendellpiez · 2021-06-04T13:48:26Z

Additional note: the Metaschema back end gives us a great deal of flexibility in this, for generating schemas (both XML and JSON) with specialized namespaces as well as with unified namespaces when/as appropriate. I am not sure everyone will regard this approach as a solution so much as (again) moving the problem. But it might offer options going forward.

davaya · 2021-06-04T16:17:13Z

@david-waltermire-nist:
"A Package is a namespace for its members, which comprise those elements associated via packagedElement (which are said to be owned or contained), and those imported." -- https://www.omg.org/spec/UML/2.5.1/PDF Section 12.2.3.1

"Namespace is an abstract named element that contains (or owns) a set of named elements that can be identified by name. In other words, namespace is a container for named elements." -- https://www.uml-diagrams.org/namespace.html#:~:text=UML%20Common%20Structure,package

"When writing computer programs of even moderate complexity, it’s commonly accepted that “structuring” the program into reusable functions is better than copying-and-pasting duplicate bits of code everywhere they are used." -- https://json-schema.org/understanding-json-schema/structuring.html

A JSON schema file with a root $id acts like a package with a namespace and is used like a namespace, so if there is some terminological technicality that says it is not, the distinction will have to be articulated with much greater precision.

How to structure OSCAL is a design decision, and using a single namespace is certainly a valid option. It does require close coupling between the layers, and since they were apparently developed assuming loose coupling, any name collisions will need to be resolved before the single namespace can be realized. That's easily doable, but I would have favored loose coupling.

Cheers.

david-waltermire · 2021-06-04T16:47:32Z

All name collisions within the OSCAL domain are handled by the Metaschema XML and JSON schema processing. The draft JSON and XML schemas produced should not have naming collisions.

FYI: the JSON definition IDs used in the "complete" schema are the same as the JSON definition IDs used in each "model" schema. The same applies to the XML types used in the "complete" vs. "model" schemas. This allows the common information items to be easily identified.
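
As a rough illustration of that property, the definition keys of two model schemas can be intersected to find the common items, and checked against the "complete" schema. This sketch assumes draft-07 style schemas with a top-level "definitions" object; the definition names below are invented for the example and are not taken from the generated OSCAL schemas:

```python
def definition_ids(schema):
    """Return the set of keys under the top-level "definitions" object."""
    return set(schema.get("definitions", {}))


# Hypothetical, heavily trimmed stand-ins for the generated schemas.
complete = {"definitions": {"common-metadata": {}, "catalog-control": {}, "profile-import": {}}}
catalog_model = {"definitions": {"common-metadata": {}, "catalog-control": {}}}
profile_model = {"definitions": {"common-metadata": {}, "profile-import": {}}}

# Items that appear under the same ID in both model schemas.
common = definition_ids(catalog_model) & definition_ids(profile_model)

# Model definition IDs that have no counterpart in the complete schema.
unmatched = (definition_ids(catalog_model) | definition_ids(profile_model)) - definition_ids(complete)

print(sorted(common), sorted(unmatched))
```

If the IDs line up as described, the second set is empty.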

* Rework of docs focusing on JSON docs and model pipeline
* Improvements to composition toolchain
* Fixed a few small bugs in the metaschema-check. Improved performance of the compose pruning using an accumulator.
* Moved edge-case samples into testing directory
* Made shadowing warning a warning
* Initial commit of an Oxygen Metaschema framework.
* Creation of new compose schematron unit tests.
* Cross-linking XML and JSON syntax pages and other improvements to links
* Now building XML and JSON indexes to reference pages, with links to steps
* Reconfigured docs pipeline (XSLT entry points); adding new files including pipeline steps
* Migrating schema generation tools to new/improved composition pipeline
* Addressing usnistgov/OSCAL#902 thanks for finding this bug
* Enhancements to JSON Schema definition (with better performance too)
* Adding support for json-base-uri as a metaschema property
* Updated JSON schema $id; factoring out common docs XSLT
* Fixing IDs in JSON schema per issue usnistgov/OSCAL#933.
* Addressing datatype validation issues: whitespace collapsing; non-empty values; ncname-workalike in JSON Schema - see usnistgov/OSCAL#911 usnistgov/OSCAL#805 also #33 #67 #68
* Improvements to XSD production; fully aligning 'token' datatype across XSD and JSON Schema implementations.
* Updating bidirectional XML/JSON converter generators (#143)
* Committing a version that handles test data correctly (so far) from rebuilt metaschema composition addressing #51 #53 #76
* Now displaying constraints in documentation at point of definition;
* Docs generation revamp Reworked reference and other pages to sketch - #128 and others

Co-authored-by: Wendell Piez <wendell.piez@nist.gov>

david-waltermire · 2021-06-07T04:00:31Z

The "complete" XML and JSON schemas have been integrated in PR #948. These will be released in OSCAL 1.0.0.


davaya added the bug label Apr 19, 2021

david-waltermire added this to the OSCAL 1.0.0 (Full Release) milestone Apr 30, 2021

david-waltermire assigned wendellpiez Apr 30, 2021

wendellpiez mentioned this issue May 4, 2021

Add version property to Explicitly Version JSON Schemas like XML Schemas #920

Closed

3 tasks

david-waltermire self-assigned this May 18, 2021

wendellpiez added a commit to wendellpiez/metaschema that referenced this issue May 18, 2021

Addressing usnistgov/OSCAL#902 thanks for finding this bug

be2bb74

wendellpiez added a commit to wendellpiez/metaschema that referenced this issue May 18, 2021

Addressing usnistgov/OSCAL#902 thanks for finding this bug

580c871

david-waltermire pushed a commit to usnistgov/metaschema that referenced this issue May 19, 2021

Addressing usnistgov/OSCAL#902 thanks for finding this bug

4b520a5

david-waltermire closed this as completed Jun 7, 2021
