Simplify the processing of "$vocabulary" #1281

handrews · 2022-08-22T18:38:58Z

This would not significantly change anything about the behavior of $vocabulary. While there are ambiguities in the current wording, I will be filing thoughts on clarifying those as separate issues.

Currently, $vocabulary requires cumbersome distinctions between schema vs meta-schema processing. This issue plus #1183 would allow removing such distinctions. $vocabulary also sits outside of the existing keyword classifications, despite not needing such special treatment. We can solve this by:

1. calling $vocabulary an annotation, which it is
2. stating that as an annotation, $vocabulary MUST be ignored (but still collected) if it is not from the first dynamic scope (defined as the dynamic scope with empty string as its evaluation path "")
3. stating that the semantics of $vocabulary are only defined when the instance is a JSON Schema, and that any other interpretation MUST NOT be considered interoperable (I'm leaving the possibility open that someone might figure out some other viable interpretation, perhaps with a media type that embeds JSON Schemas, and there's no point in forbidding it because that usage is outside the scope of JSON Schema)
4. stating a process for static use of $vocabulary: This was one of the main reasons for the weird description in the first place, which is that we don't want to mandate meta-schema validation as a prerequisite for figuring out the vocabularies. Since the schema object of the first dynamic scope is predictable (it's the object directly referenced by the (meta-)schema's identifier), implementations MAY inspect $vocabulary statically and consider it to have been applied as an annotation to the instance(-schema) root, and MAY cache this result just as they would have cached the annotation resulting from meta-schema evaluation.
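The static process in point 4 could be sketched roughly as follows. This is only an illustration under stated assumptions: the `REGISTRY` dictionary, the `https://example.com/meta/no-validation` meta-schema, and the `static_vocabularies` helper are all hypothetical names, and returning an empty mapping when $vocabulary is absent is a simplification rather than anything the specification mandates.

```python
from typing import Dict

CORE = "https://json-schema.org/draft/2020-12/vocab/core"
APPLICATOR = "https://json-schema.org/draft/2020-12/vocab/applicator"

# Hypothetical registry: meta-schema identifier -> already-loaded document.
REGISTRY = {
    "https://example.com/meta/no-validation": {
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "$id": "https://example.com/meta/no-validation",
        "$vocabulary": {CORE: True, APPLICATOR: True},
    }
}

# Cache of results, playing the role of a cached annotation.
_vocab_cache: Dict[str, Dict[str, bool]] = {}


def static_vocabularies(meta_schema_id: str) -> Dict[str, bool]:
    """Read $vocabulary from the object the identifier points at.

    No meta-schema evaluation happens here: only the root object of the
    identified meta-schema is inspected, never objects it references.
    """
    if meta_schema_id not in _vocab_cache:
        root = REGISTRY[meta_schema_id]
        # Assumption: a missing $vocabulary yields an empty mapping.
        _vocab_cache[meta_schema_id] = dict(root.get("$vocabulary", {}))
    return _vocab_cache[meta_schema_id]


schema = {"$schema": "https://example.com/meta/no-validation", "minLength": 3}
vocabs = static_vocabularies(schema["$schema"])
```

The second lookup for the same identifier returns the cached mapping, which mirrors the MAY-cache allowance in point 4.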

Point 4 is the only part of the above that is not already part of the JSON Schema processing model in some way, and it just explains what I'm pretty sure some implementations do already.

While there has been some discussion of removing the restriction in point 2, let's not discuss that here (if anyone feels strongly about it, feel free to file an issue).

This would involve the following changes:

1. Update the first paragraph of §8.1.2 to state that $vocabulary is an annotation.
2. Replace the last paragraph of §8.1.2 (about how $vocabulary MUST be ignored for non-schema instances) with point 2 (that only the annotation from the first dynamic scope can be used) and point 3 (that the annotation semantics are only defined for schemas). Most importantly, this removes the phrase "documents that are not being processed as a meta-schema" so that we can move away from special meta-schema processing. It also removes confusing language that implies that $vocabulary has any effect on the schema that contains $vocabulary (because of point 1 about annotations, there should not be any confusion on this point any more).
3. Add a new subsection before §8.1.2.1 on the static processing of $vocabulary (for avoiding the cost of meta-schema validation, and/or for implementations that do not support annotation collection).
4. Eliminate §8.1.2.2 "Non-inheritability of vocabularies", as it is now a clear and direct consequence of point 2, and the explanation of the "first dynamic scope" thing should be written in such a way that this is clear.
5. Eliminate §9.3.1 "Detecting a meta-schema", assuming #1183 (Remove the notion of "canonical URIs" in favour of boundaried schema resources) is resolved in a way that no longer requires this section (if it's not yet resolved, this part can be deferred until it is).

Any objections? The only practical impact is that implementations that collect annotations should now collect $vocabulary (from any/all dynamic scopes; the restriction on dynamic scope is only on the usage, not the collection). Since we don't have annotation tests yet, there is no impact on the existing test suite.
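To make the collection-versus-usage distinction concrete, here is a minimal sketch. It assumes a hypothetical `Annotation` record that stores the evaluation path of the dynamic scope that produced the annotation (with "" meaning the first dynamic scope); neither that record nor `usable_vocabulary` comes from the specification or from any particular implementation.

```python
from typing import Dict, List, NamedTuple, Optional

CORE = "https://json-schema.org/draft/2020-12/vocab/core"
APPLICATOR = "https://json-schema.org/draft/2020-12/vocab/applicator"
VALIDATION = "https://json-schema.org/draft/2020-12/vocab/validation"


class Annotation(NamedTuple):
    """Hypothetical record of one collected annotation."""
    keyword: str
    scope_path: str         # evaluation path of the dynamic scope; "" is the first one
    instance_location: str  # "" is the root of the instance (the schema being described)
    value: object


def usable_vocabulary(collected: List[Annotation]) -> Optional[Dict[str, bool]]:
    """Return the only $vocabulary annotation that may be used, if any.

    Every annotation stays in `collected`; this function only selects
    the one from the first dynamic scope and ignores the rest.
    """
    for ann in collected:
        if ann.keyword == "$vocabulary" and ann.scope_path == "":
            return ann.value
    return None


collected = [
    # Produced by a referenced meta-schema: collected, but not usable.
    Annotation("$vocabulary", "/allOf/0/$ref", "", {VALIDATION: True}),
    # Produced by the first dynamic scope: this is the one that counts.
    Annotation("$vocabulary", "", "", {CORE: True, APPLICATOR: True}),
    Annotation("title", "", "", "Example meta-schema"),
]
usable = usable_vocabulary(collected)
```

Both $vocabulary annotations remain in `collected`, but only the one with the empty scope path is returned for use.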

If no one objects in the next week I'll write a PR for this.

lud-wj · 2024-03-03T18:32:16Z

Hello,

I am trying to make sense of $vocabulary, but it does not feel like it is only an annotation. For instance, in the test suite, a vocabulary that is not declared in the meta-schema is expected not to be applied, even though the given data is not itself a schema (as it would be if we were validating a schema with a meta-schema) but mere data.

gregsdennis · 2024-03-03T19:33:20Z

@lud-wj we have a backlog item to create docs for vocabs in general. Until we get that on https://json-schema.org, please have a read through my docs on the subject. That should help.

The test suite does check that assertion keywords that are defined in unlisted vocabularies are not validated, yes, but that's not considered a meta-schema validation.

$vocabulary itself doesn't provide an assertion, and it doesn't contain subschemas (so it's not an applicator). It really is annotative only, but it doesn't create an annotation in the output. Its presence in the meta-schema tells the tooling what keywords are defined for the schema. Thus, if a meta-schema is used that doesn't list the Validation vocabulary, then none of those keywords (e.g. maximum, minLength, etc.) should be processed, and those keywords become "unknown."

Note: since 2020-12 provides annotations for unknown keywords' values, you will get annotations for those. This means that annotation-only keywords behave the same whether their vocabulary is listed or not.
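The behavior described above can be illustrated with a toy evaluator. This is a rough sketch, not how any real implementation is structured: `KEYWORD_VOCAB`, `ASSERTIONS`, and `evaluate` are hypothetical names, and only two Validation keywords are modelled.

```python
from typing import Any, Dict, Set, Tuple

VALIDATION = "https://json-schema.org/draft/2020-12/vocab/validation"

# Tiny, illustrative subset: which vocabulary defines which keyword.
KEYWORD_VOCAB = {"maximum": VALIDATION, "minLength": VALIDATION}


def check_maximum(limit: float, instance: Any) -> bool:
    # Non-numbers (including booleans) are not constrained by "maximum".
    if isinstance(instance, bool) or not isinstance(instance, (int, float)):
        return True
    return instance <= limit


def check_min_length(limit: int, instance: Any) -> bool:
    # Non-strings are not constrained by "minLength".
    return not isinstance(instance, str) or len(instance) >= limit


ASSERTIONS = {"maximum": check_maximum, "minLength": check_min_length}


def evaluate(schema: Dict[str, Any], instance: Any,
             active_vocabs: Set[str]) -> Tuple[bool, Dict[str, Any]]:
    """Apply only keywords whose vocabulary is listed by the meta-schema."""
    valid = True
    annotations: Dict[str, Any] = {}
    for keyword, value in schema.items():
        vocab = KEYWORD_VOCAB.get(keyword)
        if vocab is not None and vocab in active_vocabs:
            valid = ASSERTIONS[keyword](value, instance) and valid
        else:
            # Keyword is "unknown" here: its value is kept as an annotation.
            annotations[keyword] = value
    return valid, annotations


schema = {"minLength": 3}
with_validation = evaluate(schema, "ab", {VALIDATION})
without_validation = evaluate(schema, "ab", set())
```

With the Validation vocabulary listed, "ab" fails minLength; without it, the keyword is not processed and its value shows up as an annotation instead.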

lud-wj · 2024-03-03T20:38:14Z

Thanks @gregsdennis, your website is really helpful. I think I now have a better understanding of where the "split" is between what capabilities a vocabulary declares and the part that it declares to validate schemas using those new keywords.

gregsdennis · 2024-06-18T10:26:30Z

These items need to be incorporated into whatever vocabularies end up being. Regardless, it's being extracted into the feature life cycle, so it'll need to be worked out there.

handrews mentioned this issue Aug 22, 2022

Restriction of processing $vocabulary to meta-schemas is unnecessary and confusing #1098

Closed

handrews added the core label Aug 22, 2022

handrews mentioned this issue Aug 22, 2022

Remove the notion of "canonical URIs" in favour of boundaried schema resources #1183

Open

handrews mentioned this issue Sep 18, 2022

Consider whether "$schema" should allow fragments #1292

Closed

gregsdennis closed this as completed Jun 18, 2024
