Whitespace collapse on datatypes in XSD foils bi-directional conversion (Metaschema M3) #67
Comments
wendellpiez
added a commit
to wendellpiez/metaschema
that referenced
this issue
May 21, 2021
…ty values; ncname-workalike in JSON Schema - see usnistgov/OSCAL#911 usnistgov/OSCAL#805 also usnistgov#33 usnistgov#67 usnistgov#68
david-waltermire
pushed a commit
that referenced
this issue
May 21, 2021
* Addressing datatype validation issues: whitespace collapsing; non-empty values; ncname-workalike in JSON Schema - see usnistgov/OSCAL#911 usnistgov/OSCAL#805 also #33 #67 #68 * Improvements to XSD production; fully aligning 'token' datatype across XSD and JSON Schema implementations.
david-waltermire
added a commit
that referenced
this issue
Jun 6, 2021
* Rework of docs focusing on JSON docs and model pipeline * Improvements to composition toolchain * Fixed a few small bugs in the metaschema-check. Improved performance of the compose pruning using an accumulator. * Moved edge-case samples into testing directory * Made shadowing warning a warning * Initial commit of an Oxygen Metaschema framework. * Creation of new compose schematron unit tests. * Cross-linking XML and JSON syntax pages and other improvements to links * Now building XML and JSON indexes to reference pages, with links to steps * Reconfigured docs pipeline (XSLT entry points); adding new files including pipeline steps * Migrating schema generation tools to new/improved composition pipeline * Addressing usnistgov/OSCAL#902 thanks for finding this bug * Enhancements to JSON Schema definition (with better performance too) * Adding support for json-base-uri as a metaschema property * Updated JSON schema $id; factoring out common docs XSLT * Fixing IDs in JSON schema per issue usnistgov/OSCAL#933. * Addressing datatype validation issues: whitespace collapsing; non-empty values; ncname-workalike in JSON Schema - see usnistgov/OSCAL#911 usnistgov/OSCAL#805 also #33 #67 #68 * Improvements to XSD production; fully aligning 'token' datatype across XSD and JSON Schema implementations. * Updating bidirectional XML/JSON converter generators (#143) * Committing a version that handles test data correctly (so far) from rebuilt metaschema composition addressing #51 #53 #76 * Now displaying constraints in documentation at point of definition; * Docs generation revamp Reworked reference and other pages to sketch - #128 and others Co-authored-by: Wendell Piez <wendell.piez@nist.gov>
See #68. This will be fixed against the M4 implementation.
Describe the bug
Probably related to the bug described in #66.
In the generated XSDs, the whiteSpace facet is set to collapse on datatyped values. This normalizes whitespace before validation, which (among other things) lets values match patterns they would otherwise fail because of whitespace. This was detected on a UUID value that passes XSD validation even when it has a trailing space.
Since the conversion utility does not collapse whitespace, a value with whitespace that is not actually valid against the datatype's lexical pattern appears in the result. This causes problems downstream, for example when the data is cast into JSON and validation against the JSON Schema reports it as invalid.
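The mismatch described above can be sketched in a few lines of Python. This is an illustration under assumptions, not the project's implementation: the UUID pattern and the function names are hypothetical, and the collapse step follows the XSD definition of whiteSpace="collapse" (tab, newline and carriage return become spaces, runs of spaces become one space, leading and trailing spaces are removed).

```python
import re

# Hypothetical UUID pattern for illustration; the pattern used in the
# generated schemas may be stricter or otherwise different.
UUID_PATTERN = re.compile(
    r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
)


def xsd_collapse(value: str) -> str:
    """Apply XSD whiteSpace='collapse' normalization to a lexical value."""
    replaced = re.sub(r"[\t\n\r]", " ", value)       # 'replace' step
    return re.sub(r" +", " ", replaced).strip(" ")   # collapse runs, trim ends


def valid_with_collapse(value: str) -> bool:
    """Pattern check after collapsing, as an XSD validator would do."""
    return UUID_PATTERN.fullmatch(xsd_collapse(value)) is not None


def valid_strict(value: str) -> bool:
    """Anchored pattern check on the raw value, with no normalization."""
    return UUID_PATTERN.fullmatch(value) is not None


sloppy = "8a2c1f3e-5b6d-4e7f-9a0b-1c2d3e4f5a6b "  # note the trailing space
print(valid_with_collapse(sloppy))  # True
print(valid_strict(sloppy))         # False
```

The same raw value is accepted on one side and rejected on the other, which is the gap a converter that copies values verbatim will expose.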
Who is the bug affecting?
Edge cases with sloppy data in datatyped values - which makes it especially annoying since it only happens sometimes.
What is affected by this bug?
Potentially anyone, especially anyone relying on bidirectional conversion.
When does this occur?
In any datatype now marked with <whiteSpace value='collapse'/> in its definition.
Note however that for the markup-line datatype, which converts to Markdown, we may wish to provide whitespace normalization -- although the problem there might be the reverse, if whitespace that persists in XML mixed content is stripped in Markdown. More testing is in order.
How do we replicate the issue?
Validate an XML document containing a datatyped value (such as a UUID or boolean) with extra whitespace. It should be valid against the XSD, but when converted to JSON, it won't be valid against the corresponding JSON Schema (which does not provide for collapsing).
Expected behavior (i.e. solution)
Data with extraneous whitespace should not validate against datatypes whose patterns do not permit it, in XSD.
Bidirectional conversion of all data, including pathological cases, should work.
Other Comments
This needs to be addressed in both Metaschema M3 and M4.
Also, unit tests for schema validation and data conversion should provide some edge cases of data invalid only due to whitespace anomalies (extra whitespace not permitted by the given pattern).
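One way to produce such edge cases is to derive whitespace-damaged variants from a known-good value. The sketch below is only a suggestion: the boolean pattern, the helper name whitespace_variants, and the choice of variants are all assumptions for illustration, not part of the existing test suite.

```python
import re

# Hypothetical lexical pattern for a boolean datatype, for illustration only.
BOOLEAN_PATTERN = re.compile(r"true|false|1|0")


def whitespace_variants(value: str) -> dict[str, str]:
    """Return copies of a valid value that differ only by outer whitespace."""
    return {
        "leading space": " " + value,
        "trailing space": value + " ",
        "trailing newline": value + "\n",
        "leading tab": "\t" + value,
        "surrounding spaces": "  " + value + "  ",
    }


cases = whitespace_variants("true")
for label, candidate in cases.items():
    # fullmatch is used instead of a trailing '$', because '$' would also
    # match just before a final newline and hide the "trailing newline" case.
    strict_ok = BOOLEAN_PATTERN.fullmatch(candidate) is not None
    print(f"{label}: {candidate!r} strictly valid = {strict_ok}")
```

Each variant should be rejected by schema validation on both the XSD and JSON Schema sides once the datatypes are aligned, and each is a useful input for round-trip conversion tests.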