Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JSON Schema Generation Yields Undesired Allowed Values #240

Closed
aj-stein-nist opened this issue Sep 22, 2022 · 16 comments · Fixed by #253
Closed

JSON Schema Generation Yields Undesired Allowed Values #240

aj-stein-nist opened this issue Sep 22, 2022 · 16 comments · Fixed by #253
Assignees
Labels
bug Something isn't working XSLT Implementation Issue relates to the XSLT implementation of Metaschema.

Comments

@aj-stein-nist
Copy link
Collaborator

aj-stein-nist commented Sep 22, 2022

Describe the bug

Currently, the JSON Schema generator, for metaschema definitions with allowed-values where allow-other="no", where its target does not have a predicate (target=".") or the target is not the default OSCAL namespace (for any target that is not target=".[has-oscal-namespace('http://csrc.nist.gov/ns/oscal')]"), the JSON generator should not emit the allowed values as part of the JSON Schema. If allow-other="yes", then the current enum style of JSON Schema validation, when needed in such cases like it, is not correctly implemented.

It currently gets emitted like this:

              { "type" : "string" } },
            "required" : 
            [ "type" ],
            "additionalProperties" : false,
            "enum" : 
            [ "equivalent-to",
              "equal-to",
              "subset-of",
              "superset-of",
              "intersects-with" ] }

We need it to be adapted to emit JSON Schema like this:

              "type" : 
              { "anyOf":
              [
                { "type": "string"},
                { 
                  "enum" : [
                    "equivalent-to",
                    "equal-to",
                    "subset-of",
                    "superset-of",
                    "intersects-with" 
                  ]
                }
              ] } }

Who is the bug affecting?

OSCAL tool developers want to convert mappings from OSCAL XML to OSCAL JSON and must have the latter validated with JSON Schema after the fact.

What is affected by this bug?

Schema validation with JSON schemas generated from the scheme generator(s) in this repo.

When does this occur?

Consistently.

How do we replicate the issue?

When debugging issues with use of NIST OSCAL automation from community stakeholders in CIS, we were asked to troubleshoot and debug issues validating the emitted JSON file via ajv -c ajv-formats -s oscal_mapping_schema.json -d CCM_CISControlsv8_Mapping-min.json after converting the mapping in OSCAL XML to OSCAL JSON in CI/CD with GitHub Actions. An example run can be found here with current commit aj-stein-nist/CISControls_OSCAL@cd22c23#diff-b7c9b89f382154d95ec71f7a04dd1b62a11804cd8a3ed0c2d908ad97fba7e66e.

The following errors for this seemingly valid OSCAL JSON document instance are below, despite all the correct values being defined for the given fields and assemblies in the document.

/home/runner/work/CISControls_OSCAL/CISControls_OSCAL/git-content/mappings/json/CCM_CISControlsv8_Mapping-min.json invalid
[
  {
    keyword: 'enum',
    dataPath: '/mapping-collection/mappings/maps/0/relationship',
    schemaPath: '#/properties/relationship/enum',
    params: { allowedValues: [Array] },
    message: 'should be equal to one of the allowed values'
  },
  {
    keyword: 'type',
    dataPath: '/mapping-collection/mappings',
    schemaPath: '#/properties/mappings/anyOf/1/type',
    params: { type: 'array' },
    message: 'should be array'
  },
  {
    keyword: 'anyOf',
    dataPath: '/mapping-collection/mappings',
    schemaPath: '#/properties/mappings/anyOf',
    params: {},
    message: 'should match some schema in anyOf'
  }
]
/home/runner/work/CISControls_OSCAL/CISControls_OSCAL/git-content/mappings/json/CCM_CISControlsv8_Mapping-min.json is invalid.

Expected behavior (i.e. solution)

JSON Schema validates properly.

Other Comments

@aj-stein-nist
Copy link
Collaborator Author

@wendellpiez asked that @aj-stein-nist and others help with testing a potential fix after .5 (approximated 4 hours) of developing a bug to mitigate the fix.

@aj-stein-nist
Copy link
Collaborator Author

Similar behavior in this GHA job run when validating and converting content and this Gitter chat indicate there is a high likelihood this reflects the same bug. I am looking into this to potentially unblock myself of usnistgov/oscal-content#139.

@aj-stein-nist aj-stein-nist self-assigned this Oct 6, 2022
@aj-stein-nist aj-stein-nist moved this from Todo to In Progress in NIST OSCAL Work Board Oct 7, 2022
@aj-stein-nist
Copy link
Collaborator Author

I likely will have to sync up with Wendell as it turns out some of the <json-value-key> stuff is more difficult than I thought and this isn't going to be a small cosmetic change.

@aj-stein-nist
Copy link
Collaborator Author

@wendellpiez, the serialized XML form of the example as it should look based on transforming with the basic JSON->XML serializer in XPath 3.0 (as we discussed on the call):

                            <map key="type">
                                <array key="anyOf">
                                    <map>
                                        <string key="type">string</string>
                                    </map>
                                    <map>
                                        <array key="enum">
                                            <string>equivalent-to</string>
                                            <string>equal-to</string>
                                            <string>subset-of</string>
                                            <string>superset-of</string>
                                            <string>intersects-with</string>
                                        </array>
                                    </map>
                                </array>
                            </map>

@wendellpiez
Copy link
Collaborator

Noting that even a value key is not required for data types allowing an empty string, the XML representation is underspecified: <relationship type="x"/> is schema valid since the token` data type supports an empty string. (In 2.0 we can make token require a character but that's not backward compatible for 1.x.)

At a minimum we need to relax the requirement for the nominal field value property in the JSON to avoid properties with no values (or empty string values).

@aj-stein-nist
Copy link
Collaborator Author

Wendell and I paired for this afternoon. We agreed that I would follow up with a minimal Metaschema definition and help stub out tests outside of the more complex definition examples that make it hard to exercise the possible bug, different solutions, and how certain JSON Schema fields should and should not be handled (additionalProperties and required, specifically).

@aj-stein-nist
Copy link
Collaborator Author

aj-stein-nist commented Oct 13, 2022

Wendell and I had met a few times and paired and he agreed it is best he work on moving forward the Metaschema "mini" (minimally viable model) using the current version to exposed different allowed values sets and write relevant XSpec tests. I got a little ahead of myself and attempted last night to figure out how I can revive the Metaschema test-suite but it is pretty badly broken for now, as I learned in aj-stein-nist@2e67d27 last night. I will unassign myself for now, but sync up and assist accordingly and support Wendell as-is.

@aj-stein-nist aj-stein-nist removed their assignment Oct 13, 2022
wendellpiez added a commit to wendellpiez/metaschema that referenced this issue Oct 13, 2022
aj-stein-nist added a commit to usnistgov/OSCAL that referenced this issue Oct 14, 2022
Per request from @wendellpiez for debugging, I am going to set this up
to add to a test submodule and see emmitted JSON Schema result and
validation against catalogs in usnistgov/oscal-content examples.
aj-stein-nist added a commit to aj-stein-nist/oscal-content-forked that referenced this issue Oct 14, 2022
Per request from @wendellpiez for debugging, I am going to set this up to add to a test submodule and see emmitted JSON Schema result and validation against catalogs in usnistgov/oscal-content examples.
@wendellpiez
Copy link
Collaborator

Now in a working branch we have the beginnings of XSpec, with analysis:

https://github.com/wendellpiez/metaschema/tree/issue240-jsonschemaBugA/test-suite/metaschema-xspec/json-schema-gen

The schema marked CORRECTED3 is both correct and complete, testing both the token datatype and applicable enumerations (i.e., any enumerations not to be suppressed by virtue of a narrowing @target or by allow-other='yes'). Instance documents marked 'valid' or 'invalid' demonstrate validations over data for testing. The draft XSpec casts two of the four examples into their intended form. Running the XSpec shows the bugs in the current process.

Next up:

  • discuss: do we have the correct target model for JSON Schema (does the corrected 3 version look okay);
  • complete XSpec; repair bugs;
  • test out
  • return to functional validations for generated schemas (do they validate the correct cross sections of instances)

@aj-stein-nist
Copy link
Collaborator Author

Looking good so far, just some quick feedback because it is a branch and not a PR yet (note to self: this was reviewed in the above branch at commit wendellpiez@d045cc8):

  • Most look good. I call out the ones that could use improvements below.
  • Sub-schema with "$id" of "#field_oscal-value-testing-mini_constrained-closed" is close but likely needs a slight modification: allOf should be switched to oneOf or you would need to a token-value to match all four enum values of "one", "two", "three", "four" at the same time (because of presumed allow-other being set to no in Metaschema).
  • Sub-schema for "#field_oscal-value-testing-mini_constrained-open" can work with anyOf so long as the source Metaschema implicitly or explicitly resolves to true() for has-oscal-namespace('http://csrc.nist.gov/ns/oscal') does seem to work. I took some time to re-read the JSON Schema spec this evening and the MVP does appear to work correctly.

image

https://gist.github.com/aj-stein-nist/cff37e5580da4cce0d8b689ddd28f486
https://gist.github.com/aj-stein-nist/1b9a14c074980128b9bcb26dc8e070df

I will review again in the morning to make sure I didn't miss anything.

@aj-stein-nist
Copy link
Collaborator Author

aj-stein-nist commented Nov 1, 2022

Looking good so far, just some quick feedback because it is a branch and not a PR yet (note to self: this was reviewed in the above branch at commit wendellpiez@d045cc8):

* Most look good. I call out the ones that could use improvements below.

* Sub-schema with `"$id"` of `"#field_oscal-value-testing-mini_constrained-closed"` is close but likely needs a slight modification: `allOf` should be switched to `oneOf` or you would need to a `token-value` to match all four `enum` values of `"one"`, `"two"`, `"three"`, `"four"` at the same time (because of presumed `allow-other` being set to `no` in Metaschema).

* Sub-schema for `"#field_oscal-value-testing-mini_constrained-open"` can work with `anyOf` so long as the source Metaschema implicitly or explicitly resolves to `true()` for `has-oscal-namespace('http://csrc.nist.gov/ns/oscal')` does seem to work. I took some time to re-read the JSON Schema spec this evening and the MVP does appear to work correctly.

I will review again in the morning to make sure I didn't miss anything.

@wendellpiez, ping ping. I know we discussed via Gitter some limited conversation with Dave, but had you looked at my feedback? It appears the branch (at least remote in github.com/wendellpiez) is still as-is, and I want to make sure we get to this eventually. Lemme know!

aj-stein-nist added a commit to aj-stein-nist/metaschema that referenced this issue Dec 28, 2022
@aj-stein-nist aj-stein-nist linked a pull request Dec 28, 2022 that will close this issue
7 tasks
@aj-stein-nist aj-stein-nist moved this from In Progress to Under Review in NIST OSCAL Work Board Dec 29, 2022
@david-waltermire
Copy link
Collaborator

david-waltermire commented Jan 5, 2023

Based on our conversation today, the following should guide the use of anyOf or allOf.

  • Use anyOf if the allowed values are open. i.e. allow-other=yes
  • Use allOf if the allowed values are closed. i.e. allow-other=no

aj-stein-nist added a commit to aj-stein-nist/metaschema that referenced this issue Jan 5, 2023
Reflect Dave's feedback and adjust code per the comment in:
usnistgov#240 (comment)
aj-stein-nist added a commit to aj-stein-nist/metaschema that referenced this issue Jan 6, 2023
@david-waltermire david-waltermire moved this from Under Review to In Progress in NIST OSCAL Work Board Jan 12, 2023
aj-stein-nist added a commit to aj-stein-nist/metaschema that referenced this issue Jan 17, 2023
aj-stein-nist added a commit to aj-stein-nist/metaschema that referenced this issue Jan 19, 2023
@aj-stein-nist aj-stein-nist moved this from In Progress to Under Review in NIST OSCAL Work Board Jan 19, 2023
@aj-stein-nist
Copy link
Collaborator Author

Hopefully wrapping this up before the end of the week.

aj-stein-nist added a commit to aj-stein-nist/metaschema that referenced this issue Jan 24, 2023
aj-stein-nist added a commit that referenced this issue Jan 25, 2023
* Fix JSON Schema generation for #240 and add tests.

* Adapt test suite approach per PR feedback.

* Final cleanup of unused parameter.

The shift in test approach, with the shell entrypoint at
ist-metaschema-MAKE-JSON-SCHEMA.xsl and not with the previous
make-json-schema-metamap.xsl replaced means that parameter, which
was not working particularly well in the previous approach.
@aj-stein-nist aj-stein-nist moved this from Under Review to Reviewer Approved in NIST OSCAL Work Board Jan 25, 2023
@aj-stein-nist aj-stein-nist moved this from Reviewer Approved to Done in NIST OSCAL Work Board Jan 25, 2023
aj-stein-nist added a commit to aj-stein-nist/OSCAL-forked that referenced this issue Jan 25, 2023
aj-stein-nist added a commit to aj-stein-nist/oscal-content-forked that referenced this issue Jan 30, 2023
aj-stein-nist added a commit to aj-stein-nist/OSCAL-forked that referenced this issue Jan 31, 2023
 issue.

Determine if this submodule update fixes CI/CD breakage with JSON schema
or if other bugs and development issues leave it broken.
@david-waltermire david-waltermire moved this from Todo to Under Review in NIST OSCAL Work Board Feb 7, 2023
david-waltermire pushed a commit that referenced this issue Mar 9, 2023
* Fix JSON Schema generation for #240 and add tests.

* Adapt test suite approach per PR feedback.

* Final cleanup of unused parameter.

The shift in test approach, with the shell entrypoint at
ist-metaschema-MAKE-JSON-SCHEMA.xsl and not with the previous
make-json-schema-metamap.xsl replaced means that parameter, which
was not working particularly well in the previous approach.
nikitawootten-nist pushed a commit to nikitawootten-nist/metaschema-xslt that referenced this issue Jul 21, 2023
* Fix JSON Schema generation for usnistgov/metaschema#240 and add tests.

* Adapt test suite approach per PR feedback.

* Final cleanup of unused parameter.

The shift in test approach, with the shell entrypoint at
ist-metaschema-MAKE-JSON-SCHEMA.xsl and not with the previous
make-json-schema-metamap.xsl replaced means that parameter, which
was not working particularly well in the previous approach.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working XSLT Implementation Issue relates to the XSLT implementation of Metaschema.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants