Why is `$schema` restricted to root schemas? #431

handrews · 2017-10-02T18:25:43Z

In PR #248 we forbade the use of $schema in subschemas. I can't remember why I approved this.

One use case we've noted over and over is that of "packing" multiple schema files into one. It's the main justification for the base-uri-changing functionality of "$id". What that means is that the only time something being a root schema matters is if it is the root schema of the entry point file for processing.

Once you are using multiple files, then whether you $ref to another file (with no fragment or "#" as the fragment), or whether you pre-process and "pack" that file into the original file, the result is the same. But in the former, the referenced schema is a root schema. In the latter case, it is a subschema.

That means that using $ref, you can reference a draft-04 schema from a draft-06 schema. But if you pack it, suddenly that is illegal, because you can't use $schema in the packed subschema to switch the processing rules.
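To make the contrast concrete, here is a minimal Python sketch of the two layouts. The `example.com` URIs, the `definitions/legacy` location, and the helper `nested_schema_keywords` are made up for illustration. The walker naively treats every nested object as a possible subschema, which a real implementation would not do.

```python
# Hypothetical example documents; the example.com URIs are placeholders.
DRAFT_04 = "http://json-schema.org/draft-04/schema#"
DRAFT_06 = "http://json-schema.org/draft-06/schema#"

# Variant 1: the draft-06 entry point uses $ref to reach a separate draft-04 file.
referencing = {
    "$schema": DRAFT_06,
    "$id": "http://example.com/entry.json",
    "properties": {"legacy": {"$ref": "http://example.com/legacy.json"}},
}
legacy_file = {
    "$schema": DRAFT_04,
    "id": "http://example.com/legacy.json",
    "type": "object",
}

# Variant 2: the same draft-04 schema "packed" into the entry point file.
packed = {
    "$schema": DRAFT_06,
    "$id": "http://example.com/entry.json",
    "properties": {"legacy": {"$ref": "http://example.com/legacy.json"}},
    "definitions": {"legacy": dict(legacy_file)},
}


def nested_schema_keywords(node, path=""):
    """Return pointer-like paths of non-root objects that carry "$schema".

    Naive: walks every dict and list, not only real subschema locations.
    """
    found = []
    if isinstance(node, dict):
        if path and "$schema" in node:
            found.append(path)
        for key, value in node.items():
            found.extend(nested_schema_keywords(value, f"{path}/{key}"))
    elif isinstance(node, list):
        for index, value in enumerate(node):
            found.extend(nested_schema_keywords(value, f"{path}/{index}"))
    return found
```

In the first variant both files are root schemas and nothing is flagged. In the packed variant the same draft-04 schema shows up at `/definitions/legacy` with a `$schema` that the draft-06 wording forbids.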

This seems very wrong.

@awwright you wrote the PR- do you remember why it seemed correct? What am I missing?

I know @epoberezkin had some concerns about implementation, but I don't recall why that was compelling- the first thing you do on processing a schema is check $schema to set the rules for processing the rest of the schema (this is how my embryonic implementation handled things before I decided it was probably best to leave that to other libraries). Perhaps @Julian has thoughts on this as well?


handrews · 2017-10-02T18:27:08Z

I suppose we could redefine "root" schema as "a schema that sets a new base URI" or something, but that seems confusing.

epoberezkin · 2017-10-02T21:00:58Z

@handrews we had a very long conversation on the issue, covering both that $ref and inclusion are not the same and that the meta-schema is actually a JSON Schema that cannot be changed during validation.

I can point you to the relevant comments somewhat later but let's please keep it as it is.

Referencing a draft-04 schema from a draft-06 schema is fine, because $ref is not schema inclusion. Having $schema in the middle of the schema is not fine, as there is no validation process defined that allows changing the schema (in this case, the meta-schema) on the fly.

epoberezkin · 2017-10-02T21:03:31Z

Root schema is the top level of a separate JSON instance, not any inner schema that changes base URI. So we can't redefine what root schema means.

epoberezkin · 2017-10-02T21:06:27Z

@handrews I am happy to have this discussion again, as long as it stays the same for draft-07, it's not seen as either critical or a bug, and you re-read our previous conversation on the subject.

Let's get draft-07 out as is and then we can talk again about it.

epoberezkin · 2017-10-02T21:12:11Z

With the current spec, you can pack multiple schemas into a single file that is a collection of schemas but not a JSON schema, in the general case.

handrews · 2017-10-02T21:17:11Z

> Having $schema in the middle of the schema is not fine, as there is no validation process defined that allows changing the schema (in this case, the meta-schema) on the fly.

That doesn't make any sense. Why would you need a special process? The process for handling a schema is:

1. Check for $schema, use it to set further rules
2. Check for a change in base URI (either id or $id depending on $schema)
3. Process the other keywords

Recurse and repeat as needed. It doesn't matter whether the schema is at the top of the file, inside the file, the target of a $ref, or anything else. A schema is a schema, wherever it is. It "inherits" the parent's $schema value and base URI if those are not changed, but otherwise it's all processed the same way no matter how you got there.
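The three steps and the inheritance rule can be sketched in Python roughly as follows. This illustrates the processing model argued for in this comment, not what draft-06 permits. The function name `walk`, the example document, and the choice to recurse only into `properties` and `definitions` are assumptions made to keep the sketch short.

```python
from urllib.parse import urljoin

DRAFT_04 = "http://json-schema.org/draft-04/schema#"
DRAFT_06 = "http://json-schema.org/draft-06/schema#"


def walk(schema, parent_dialect, parent_base, path="", out=None):
    """Record the effective ($schema, base URI) pair for each schema object."""
    if out is None:
        out = {}
    # Step 1: $schema sets the rules; inherit the parent's value if absent.
    dialect = schema.get("$schema", parent_dialect)
    # Step 2: the base-URI keyword depends on the rules chosen in step 1.
    id_keyword = "id" if dialect == DRAFT_04 else "$id"
    base = parent_base
    if isinstance(schema.get(id_keyword), str):
        base = urljoin(parent_base, schema[id_keyword])
    out[path or "#"] = (dialect, base)
    # Step 3: process the other keywords. Only two subschema containers are
    # followed here; a real implementation would handle every applicator.
    for keyword in ("properties", "definitions"):
        for name, sub in schema.get(keyword, {}).items():
            if isinstance(sub, dict):
                walk(sub, dialect, base, f"{path}/{keyword}/{name}", out)
    return out


# Hypothetical packed document mixing two drafts.
example = {
    "$schema": DRAFT_06,
    "$id": "http://example.com/entry.json",
    "definitions": {
        "legacy": {"$schema": DRAFT_04, "id": "legacy.json", "type": "object"},
        "modern": {"type": "string"},
    },
}
result = walk(example, DRAFT_06, "")
```

Here `result["/definitions/legacy"]` carries the draft-04 rules and its own base URI, while `result["/definitions/modern"]` inherits both values from the root.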

What am I missing?

handrews · 2017-10-02T21:17:42Z

> Let's get draft-07 out as is and then we can talk again about it.

I likely need this resolved for my usage, so I'm not interested in deferring it.

handrews · 2017-10-02T21:20:17Z

Plus, everything about draft-07 is blocked on other people anyway, so it's not like it's being held up for this.

handrews · 2017-10-02T23:31:24Z

@epoberezkin also, while I want it resolved for draft-07, that just literally means resolved. It doesn't mean any specific resolution. If you or @awwright can explain how/why this is supposed to work, that's fine. Right now I see the requirements as contradictory, and your statements so far have not cleared that up.

handrews · 2017-10-03T02:34:58Z

@epoberezkin OK I went back and read the whole thing (issue #244, not the PR) again. I didn't slog through it before b/c I assumed this was a simple error, which was incorrect. Anyway... I think I figured it out, including why we see this so differently (both views actually make sense).

The whole discussion mostly just reminds me how much I hate $schema, which is not new.

The issue is actually pretty inconclusive, and there is a CREF explaining that the behavior might change, because I was never entirely sold and @awwright also had some concerns (I think; he proposed including the CREF, anyway).

Anyway, you pushed this off further in the name of shipping a draft last time. Which was totally reasonable, I'm not complaining! But I am putting my foot down this time (I'm doing the vast majority of the non-PR-review work on this draft so I feel entitled).

Our opposing views are easily explained by the two totally different purposes assigned to $schema: indicating which schema to use to perform validation on the containing schema as an instance, and declaring the vocabulary within the local schema object. The former is, by nature, an assertion across the entire file- no more, no less. The latter is local for each schema object, but inherited from parent schemas when no $schema is present.

You are primarily concerned with its impact on validating the schema as an instance, because you wrote and maintain a validator. Sensible!

I am primarily concerned with declaring vocabulary, because I look at JSON Schema as a system for defining and using numerous vocabularies. My main interest in validation is as a hook for applying other vocabularies. That means using several vocabularies in the same file (and even using multiple concurrently in one schema object: hyper-schema plus UI generation, for instance). See also #314 for more details on how difficult this is right now.

So what do we really need from this keyword? I'm going to argue that:

1. We actually don't need it, although we should keep it in draft-06 form (or very close to it) for compatibility, at least for now
2. We do need something else that behaves differently

Validating schemas as instances

We don't really need $schema to declare how to validate the schema as an instance. There are already mechanisms for doing that, and they are the only mechanisms available to other instances. And somehow non-schema instances get validated just fine :-)

Declaring vocabulary on a schema object-by-schema object basis

This is separate from declaring what to use to validate the whole document. If a schema needs to mix vocabularies, then the validating meta-schema must support all vocabularies. This avoids the whole "validating a schema becomes a special case" problem.

So if for some reason I have a frankenstein schema that switches back and forth between draft-04 and draft-06 (say, because different teams maintain different schemas, but they all need to be shipped in one file), I would need to validate that against a meta-schema that recursively anyOf'd the draft-04 and draft-06 meta-schemas.

And then I'd need to use $schema all over, except that actually that doesn't work very well for the reasons explained in #314. So clearly a different solution is called for. Specifically, I'll propose $vocabularies, which takes an array of schema URIs. It declares the vocabularies that are in use by the local schema object, with the semantics being that an implementation that recognizes a vocabulary can make use of that schema as that one vocabulary defines, and can ignore keywords from unrecognized vocabularies.

So I might have:

{
    "$vocabularies": [
        "http://json-schema.org/draft-07/hyper-schema",
        "http://example.com/custom-ui-generation-schema"
    ],
    "links": [...],
    "someUiGenKeyword": {...}
}

A hyperschema implementation can use this as a hyperschema. An implementation of the custom UI vocabulary could use it for that. An implementation understanding both could in theory use them together in some way (but that would probably not be good vocab design).
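A rough Python sketch of those semantics, assuming each implementation knows which keywords belong to which vocabulary. The `KEYWORDS_BY_VOCABULARY` table and the `usable_keywords` helper are hypothetical, and the empty `links` and `someUiGenKeyword` values are placeholders for the contents elided in the example above.

```python
HYPER = "http://json-schema.org/draft-07/hyper-schema"
CUSTOM_UI = "http://example.com/custom-ui-generation-schema"

# Hypothetical table of which vocabulary owns which keyword.
KEYWORDS_BY_VOCABULARY = {
    HYPER: {"links"},
    CUSTOM_UI: {"someUiGenKeyword"},
}

# The schema from the example above, with placeholder values.
schema = {
    "$vocabularies": [HYPER, CUSTOM_UI],
    "links": [],
    "someUiGenKeyword": {},
}


def usable_keywords(schema, supported):
    """Map each declared and recognized vocabulary to the keywords it owns.

    Keywords from vocabularies outside `supported` are ignored.
    """
    result = {}
    for vocabulary in schema.get("$vocabularies", []):
        if vocabulary not in supported:
            continue  # unrecognized vocabulary: skip its keywords
        owned = KEYWORDS_BY_VOCABULARY.get(vocabulary, set())
        result[vocabulary] = {k: v for k, v in schema.items() if k in owned}
    return result
```

A hyper-schema-only implementation would call `usable_keywords(schema, {HYPER})` and see just `links`. One that understands both vocabularies would see both keyword groups.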

So for draft-07 I recommend that we:

1. Leave $schema specified as it is (which, as of draft-06, is restricted to declaring a single validating meta-schema), but flag it for deprecation and document that the best practice is to associate or choose the validating schema the same way you would for any other instance document
2. Introduce $vocabularies, probably after some more debate (here or in #314, "Understanding extended meta-schemas")

epoberezkin · 2017-10-03T06:41:43Z

There is no way, in the general case, to define which schema should be used with a JSON instance. It is defined, in most cases, at the application level. Am I missing something? For JSON Schema it is convenient to have the meta-schema defined at the top level of the JSON instance.

I like the idea of $vocabularies because of extension of meta-schema. It should complement rather than replace $schema. You also need to review the conversation we had in email.

Given that different vocabularies usually require different meta-schemas to validate a schema (as a JSON instance), allowing vocabulary change in the middle of JSON instance will make validating a schema, in a standard way, impossible.

My concern has little to do with my implementation. The change you propose, that would only solve YOUR narrow problem, would destroy a fundamental principle of the JSON schema specification - that we can validate JSON schema as JSON instance against meta-schema.

You have a particular problem - to be able to ship all your schemas as a single file. Why do you need this file to be a schema? Why is it not good enough to have a collection of schemas as array (you can even define such collection in the spec)? Will there be any library that is able to correctly process ALL vocabularies in a single schema file?
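As a sketch of that alternative, a bundle could be a plain array whose entries are each root schemas, so every entry may carry its own $schema. The bundle layout, the `example.com` URIs, and the `index_bundle` helper below are assumptions for illustration, not anything defined by the spec.

```python
DRAFT_04 = "http://json-schema.org/draft-04/schema#"
DRAFT_06 = "http://json-schema.org/draft-06/schema#"

# Hypothetical bundle: a JSON array of independent root schemas.
bundle = [
    {
        "$schema": DRAFT_06,
        "$id": "http://example.com/entry.json",
        "properties": {"legacy": {"$ref": "http://example.com/legacy.json"}},
    },
    {
        "$schema": DRAFT_04,
        "id": "http://example.com/legacy.json",
        "type": "object",
    },
]


def index_bundle(bundle):
    """Index each root schema in the bundle by its identifier.

    The identifier keyword is "id" for draft-04 and "$id" otherwise.
    """
    index = {}
    for schema in bundle:
        dialect = schema.get("$schema")
        id_keyword = "id" if dialect == DRAFT_04 else "$id"
        index[schema[id_keyword]] = (dialect, schema)
    return index


index = index_bundle(bundle)
```

Each entry is validated against its own meta-schema, and a $ref is resolved by looking up the target URI in `index`.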

handrews · 2017-10-03T07:01:21Z

> The change you propose, that would only solve YOUR narrow problem, would destroy a fundamental principle of the JSON schema specification - that we can validate JSON schema as JSON instance against meta-schema.

Did you miss the part where I said:

> This avoids the whole "validating a schema becomes a special case" problem.

I specifically said in my proposal to leave $schema as it is in draft-06. I just said to warn of possible future deprecation (I realize "flag it" is not clear, but it should be clear enough that it does not mean "rip your implementation out and turn it sideways").

> Given that different vocabularies usually require different meta-schemas to validate a schema (as a JSON instance), allowing vocabulary change in the middle of JSON instance will make validating a schema, in a standard way, impossible.

To use a schema that declares multiple vocabularies, I would need to declare a meta-schema that supports all of those vocabularies. See again: this avoids the whole "validating a schema becomes a special case" problem.

It is the problem of the meta-schema author to come up with one that works. Implementations don't care.

> Will there be any library that is able to correctly process ALL vocabularies in a single schema file?

This is meaningless. What is "all"? Why would you even want to use them all at once? What even are they? The point of vocabularies is that anyone can make one. We're standardizing a few, but I would expect many non-standard / extended ones.

handrews · 2017-10-04T05:56:03Z

@epoberezkin: PR #432 implements what I wrote up last night. I went ahead with a PR in the hopes of proving to you that I am not destroying your entire implementation philosophy. Hopefully this is more clear in the PR.

handrews · 2017-10-08T19:29:43Z

We have a general agreement that $schema is correctly specified, even if the rationale is not clear in the spec. The vocabulary issue is already tracked by #314, so I am closing this issue.

handrews added core Priority: Critical Type: Bug labels Oct 2, 2017

handrews added this to the draft-07 (wright-*-02) milestone Oct 2, 2017

handrews mentioned this issue Oct 4, 2017

Add $vocabularies, clarify $schema and $ref #432

Closed

handrews added Priority: High and removed Priority: Critical Type: Bug labels Oct 4, 2017

handrews closed this as completed Oct 8, 2017

handrews mentioned this issue Dec 5, 2017

What is "$ref" and how does it work? #514

Closed
