format as a annotation by default - separate vocabs (option 3) #1027

Merged
merged 5 commits into from
Nov 24, 2020
Changes from 3 commits
116 changes: 49 additions & 67 deletions jsonschema-validation.xml
@@ -514,18 +514,12 @@
<section title="Foreword">
<t>
Structural validation alone may be insufficient to allow an application to correctly
utilize certain values. The "format" annotation keyword is defined to allow schema
utilize certain values. The "format" annotation keyword is defined to allow schema
Member

can we keep unrelated whitespace changes out of the PR? it adds confusion and often conflicts with other things in flight.

Member Author

This was requested by @Relequestual in Slack.

authors to convey semantic information for a fixed subset of values which are
accurately described by authoritative resources, be they RFCs or other external
specifications.
</t>

<t>
Implementations MAY treat "format" as an assertion in addition to an annotation,
and attempt to validate the value's conformance to the specified semantics.
See the Implementation Requirements below for details.
</t>

<t>
The value of this keyword is called a format attribute. It MUST be a string. A
format attribute can generally only validate a given set of instance types. If
@@ -536,84 +530,66 @@
<xref target="json-schema">core JSON Schema.</xref>
<cref>
Note that the "type" keyword in this specification defines an "integer" type
which is not part of the data model. Therefore a format attribute can be
limited to numbers, but not specifically to integers. However, a numeric
which is not part of the data model. Therefore a format attribute can be
limited to numbers, but not specifically to integers. However, a numeric
format can be used alongside the "type" keyword with a value of "integer",
or could be explicitly defined to always pass if the number is not an integer,
which produces essentially the same behavior as only applying to integers.
</cref>
</t>

<t>
Meta-schemas that do not use "$vocabulary" SHOULD be considered to
utilize this vocabulary as if its URI were present with a value of false.
See the Implementation Requirements below for details.
The current URI for this vocabulary, known as the Format-Annotation vocabulary, is:
&lt;https://json-schema.org/draft/2020-11/vocab/format-annotation&gt;. This vocabulary
is required by this specification.
</t>
<t>
The current URI for this vocabulary, known as the Format vocabulary, is:
&lt;https://json-schema.org/draft/2020-11/vocab/format&gt;.
In addition to the Format-Annotation vocabulary, a secondary vocabulary is available
for custom meta-schemas that defines "format" as an assertion. The URI for the
Format-Assertion vocabulary, is:
Member

remove comma

&lt;https://json-schema.org/draft/2020-11/vocab/format-assertion&gt;.
</t>
<t>
The current URI for the corresponding meta-schema is:
<eref target="https://json-schema.org/draft/2020-11/meta/format"/>.
<eref target="https://json-schema.org/draft/2020-11/meta/format"/>. Because the
syntactic requirements of "format" do not change between the annotation and assertion
vocabularies, the meta-schema is shared between them.
</t>
<t>
Specifying both the Format-Annotation and the Format-Assertion vocabularies is functionally
equivalent to specifying only the Format-Assertion vocabulary since its requirements
are a superset of the Format-Annotation vocabulary.
</t>

</section>

<section title="Implementation Requirements">
<t>
The "format" keyword functions as an annotation, and optionally as an assertion.
<cref>This is due to the keyword's history, and is not in line with current
keyword design principles.</cref> In order to manage this ambiguity, the
"format" keyword is defined in its own separate vocabulary, as noted above.
The true or false value of the vocabulary declaration governs the implementation
requirements necessary to process a schema that uses "format", and the
behaviors on which schema authors can rely.
The "format" keyword functions as defined by the vocabulary which is referenced.
</t>

<section title="As an annotation">
<section title="Format-Annotation Vocabulary">
<t>
The value of format MUST be collected as an annotation, if the implementation
supports annotation collection. This enables application-level validation when
supports annotation collection. This enables application-level validation when
schema validation is unavailable or inadequate.
</t>
<t>
This requirement is not affected by the boolean value of the vocabulary
declaration, nor by the configuration of "format"'s assertion
behavior described in the next section.
Implementations MAY still treat "format" as an assertion in addition to an
annotation and attempt to validate the value's conformance to the specified
semantics. The implementation MUST provide options to enable and disable such
Member

Why MUST they, if it can also be done via $vocabulary? I would soften this to a SHOULD. Ideally, the fewer knobs needed in an implementation outside of specifying a $schema, the better.

Member Author

This is solely for when the annotation vocabulary is used. If an implementation supports assertion (even partial assertion) they MAY choose to treat the keyword as such. If they do, they MUST allow this feature to be enabled and disabled.

The assertion vocabulary doesn't have this allowance, instead specifying that implementations MUST implement assertion fully in order to support the vocabulary.

Member

@karenetheridge, Nov 25, 2020

I'm still confused. It sounds like a schema (via the metaschema) can specify to use the format-annotation vocabulary, but it might still get assertion behaviour anyway (when using an implementation that supported validations)? That would force the setting of an out-of-band configuration "no, really, when I say annotations I mean only annotations" and there would be no other way to not get validation behaviour as well. If I'm correct so far, then the configuration value MUST default to false so validation behaviour doesn't happen accidentally.

Member Author

@gregsdennis, Nov 25, 2020

The schema can say, "You're getting annotations."

But this bit of the spec allows the client/application to say, "I know I'm supposed to get annotations, but this library I'm using supports asserting formats, and that'd save me some work, so I'd like to use that." It gives the choice to the client.

It also opens the door for implementations to provide partial validation, which is a pain point that @handrews had to deal with (pressure from implementers) and what the previous wording of this section tried so hard to allow.

However, if the schema says, "You're getting assertions," then the implementation MUST be able to provide validation or refuse to process the schema.

Member Author

@gregsdennis, Nov 25, 2020

The last scenario is <format-assertion>: false. This means, "If you support asserting these things, that's great, please do. If not, it's cool; they'll be annotations (because you don't understand them)." But this is enforced by the schema, not opted-in by the application.
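
As a rough illustration of how a processor might read these declarations, here is a minimal Python sketch. It assumes the meta-schema is already loaded as a dict. The two vocabulary URIs are the ones in the diff; the function name `declared_format_vocab` and the sample meta-schema are hypothetical and not taken from any implementation.

```python
from typing import Optional, Tuple

# URIs copied from the diff under review.
FORMAT_ANNOTATION = "https://json-schema.org/draft/2020-11/vocab/format-annotation"
FORMAT_ASSERTION = "https://json-schema.org/draft/2020-11/vocab/format-assertion"


def declared_format_vocab(meta_schema: dict) -> Optional[Tuple[str, bool]]:
    """Return (vocabulary kind, required flag) for "format".

    Returns None when neither format vocabulary is declared.
    Hypothetical helper, for illustration only.
    """
    vocab = meta_schema.get("$vocabulary", {})
    # Declaring both is treated like declaring only Format-Assertion,
    # since its requirements are a superset of Format-Annotation.
    if FORMAT_ASSERTION in vocab:
        return ("assertion", bool(vocab[FORMAT_ASSERTION]))
    if FORMAT_ANNOTATION in vocab:
        return ("annotation", bool(vocab[FORMAT_ANNOTATION]))
    return None


# Hypothetical custom meta-schema: assertion is requested but optional,
# i.e. the "<format-assertion>: false" scenario.
optional_assertion = {"$vocabulary": {FORMAT_ASSERTION: False}}
print(declared_format_vocab(optional_assertion))
```

The required flag is kept separate from the vocabulary kind because, in the scenarios described here, it only changes the outcome when the implementation cannot assert formats.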

Member Author

In the end, we have

| Annotation Vocab | Assertion Vocab | Impl Assert Support | App Config | Result |
| --- | --- | --- | --- | --- |
| false/true | n/a | no | n/a | annotation |
| false/true | n/a | yes | annotation | annotation |
| false/true | n/a | yes | assertion | assertion |
| n/a | false | no | n/a | annotation |
| n/a | false | yes | annotation/assertion | assertion |
| n/a | true | no | n/a | fail |
| n/a | true | yes | annotation/assertion | assertion |
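
The table above can also be read as a small decision function. The following is a minimal Python sketch under that reading; the function name `format_behavior`, its parameters, and the string labels are hypothetical, chosen only to mirror the table's columns.

```python
def format_behavior(
    vocab: str,                      # "annotation" or "assertion" vocabulary declared
    required: bool,                  # boolean value of the vocabulary declaration
    impl_can_assert: bool,           # does the implementation support asserting formats?
    app_config: str = "annotation",  # application opt-in; assertion is off by default
) -> str:
    """Return "annotation", "assertion", or "fail" for a row of the table."""
    if vocab == "annotation":
        # The declaration's boolean value does not change the outcome here.
        # Assertion happens only if the implementation supports it AND the
        # application explicitly opts in.
        if impl_can_assert and app_config == "assertion":
            return "assertion"
        return "annotation"
    if vocab == "assertion":
        # The application setting is irrelevant: a capable implementation asserts.
        if impl_can_assert:
            return "assertion"
        # An incapable implementation must refuse when the vocabulary is
        # required, and falls back to annotation when it is optional.
        return "fail" if required else "annotation"
    raise ValueError(f"unknown vocabulary kind: {vocab!r}")


# Walk the seven rows of the table.
rows = [
    ("annotation", True, False, "annotation"),
    ("annotation", True, True, "annotation"),
    ("annotation", True, True, "assertion"),
    ("assertion", False, False, "annotation"),
    ("assertion", False, True, "annotation"),
    ("assertion", True, False, "annotation"),
    ("assertion", True, True, "annotation"),
]
for row in rows:
    print(row, "->", format_behavior(*row))
```

The default of `app_config="annotation"` reflects the "MUST be disabled by default" wording in the diff: with the Format-Annotation vocabulary, assertion behavior is something the application has to ask for.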

Member

> But this bit of the spec allows the client/application to say, "I know I'm supposed to get annotations, but this library I'm using supports asserting formats, and that'd save me some work, so I'd like to use that." It gives the choice to the client.

My main point here is that if the implementation can do this, the setting MUST default to NOT doing it. Users must explicitly opt in. Otherwise, there is no way of a schema being processed as annotation only (with any combination of $schema and $vocabulary settings) with an implementation that can also handle assertions.

Member Author

> the setting MUST default to NOT doing it

Absolutely. That is covered on the next line: "... evaluation and MUST be disabled by default."

evaluation and MUST be disabled by default. Implementations SHOULD document
their level of support for such validation.
<cref>
Requiring annotation collection even when the vocabulary is declared with
a value of false is atypical, but necessary to ensure that the best
practice of performing application-level validation is possible even when
assertion evaluation is not implemented. Since "format" has always been
a part of this specification, requiring implementations to be aware of it
even with a false vocabulary declaration is deemed to not be a burden.
Specifying the Format-Annotation vocabulary and enabling validation in an
implementation should not be viewed as being equivalent to specifying
the Format-Assertion vocabulary since implementations are not required to
provide full validation support when the Format-Assertion vocabulary
is not specified.
</cref>
</t>
</section>

<section title="As an assertion">
<t>
Regardless of the boolean value of the vocabulary declaration,
an implementation that can evaluate "format" as an assertion MUST provide
options to enable and disable such evaluation. The assertion evaluation
behavior when the option is not explicitly specified depends on
the vocabulary declaration's boolean value.
</t>

<t>
When implementing this entire specification, this vocabulary MUST
be supported with a value of false (but see details below),
and MAY be supported with a value of true.
</t>

<t>
When the vocabulary is declared with a value of false, an implementation:
When the implementation is configured for assertion behavior, it:
<list>
<t>
MUST NOT evaluate "format" as an assertion unless it is explicitly
configured to do so;
</t>
<t>
SHOULD provide an implementation-specific best effort validation
for each format attribute defined below;
@@ -622,9 +598,6 @@
MAY choose to implement validation of any or all format attributes
as a no-op by always producing a validation result of true;
</t>
<t>
SHOULD document its level of support for validation.
</t>
</list>
<cref>
This matches the current reality of implementations, which provide
@@ -634,14 +607,24 @@
validation in the application, which is the recommended best practice.
</cref>
</t>
</section>

<section title="Format-Assertion Vocabulary">
<t>
When the Format-Assertion vocabulary is declared with a value of false,
implementations MUST provide full validation support for all of the formats
defined by this specificaion. Implementations that cannot provide full
Member

typo "specificaion"

validation support MUST refuse to process the schema.
</t>
<t>
When the vocabulary is declared with a value of true, an implementation
that supports this form of the vocabulary:
An implementation that supports the Format-Assertion vocabulary:
<list>
<t>
MUST evaluate "format" as an assertion unless it is explicitly
configured not to do so;
MUST still collect "format" as an annotation if the implementation
supports annotation collection;
</t>
<t>
MUST evaluate "format" as an assertion;
</t>
<t>
MUST implement syntactic validation for all format attributes defined
@@ -685,10 +668,9 @@
<t>
Implementations MAY support custom format attributes. Save for agreement between
parties, schema authors SHALL NOT expect a peer implementation to support such
custom format attributes. An implementation MUST NOT fail
validation or cease processing due to an unknown format attribute.
When treating "format" as an annotation, implementations SHOULD collect both
known and unknown format attribute values.
custom format attributes. An implementation MUST NOT fail to collect unknown formats
as annotations. When the Format-Assertion vocabulary is specified, implementations
MUST fail upon encountering unknown formats.
</t>
<t>
Vocabularies do not support specifically declaring different value sets for keywords.