First pass a list of principles. #856

Closed

handrews
Contributor

These inform how JSON Schema is designed as a system.

The section is placed after Definitions as otherwise the
principles don't have clear meaning.

This is a first pass at #848.

I definitely want real feedback here. This is a long-ish list. What should be cut? Can the wording be condensed? Does this feel like it serves a real purpose?

(Using the new "draft pull request" feature to ensure that it doesn't get merged too quickly :-)

</t>
<t>
Once schema URIs are resolved, the result of applying a schema object
to instance data is data is a function of that schema object and its
Member

"data is data is"

<t>
Keywords produce a boolean assertion result, and an annotation result that
can be of any type. Depending on the classification, the assertion result
might always be true, or the annotation result might always be empty.
Member

Annotations are keywords and they do not produce any assertion result, right?

Member

This defines that all keywords produce both, but "annotation" keywords always produce a true assertion, making the assertion result moot.

Contributor Author

@gregsdennis that is correct. Might need to also clarify what "empty" annotation results mean.

I put this in because in terms of generic extensions, just always returning the same sorts of things makes sense instead of "wait, what kind of keyword was this? does it return this kind of thing? yes? no?" If not otherwise specified, keywords return a true validation result.

This is open to debate, it just made the most sense as I was typing it.

(this is exactly why laying out principles is a good idea, though! I don't think we've ever really discussed this)
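
The uniform result shape discussed in this thread could be sketched as follows. This is only an illustration, not anything the spec defines: `KeywordResult`, `evaluate`, the two keyword functions, the use of `None` for an "empty" annotation, and the choice to drop annotations when validation fails are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class KeywordResult:
    valid: bool              # assertion result; True unless the keyword says otherwise
    annotation: Any = None   # assumption: None stands in for an "empty" annotation


def eval_max_length(value: int, instance: Any) -> KeywordResult:
    # Assertion keyword: may fail, never annotates.
    if isinstance(instance, str):
        return KeywordResult(valid=len(instance) <= value)
    return KeywordResult(valid=True)  # non-strings are not constrained


def eval_title(value: str, instance: Any) -> KeywordResult:
    # Annotation keyword: the assertion result is always True.
    return KeywordResult(valid=True, annotation=value)


KEYWORDS = {"maxLength": eval_max_length, "title": eval_title}


def evaluate(schema: dict, instance: Any):
    # Every keyword returns the same shape, so no per-kind branching is needed.
    results = {k: KEYWORDS[k](v, instance) for k, v in schema.items() if k in KEYWORDS}
    valid = all(r.valid for r in results.values())
    annotations = {}
    if valid:  # assumption: annotations are discarded when validation fails
        annotations = {k: r.annotation for k, r in results.items()
                       if r.annotation is not None}
    return valid, annotations
```

A generic extension keyword would only need to return a `KeywordResult`; the caller never asks what kind of keyword it was.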

Member

OK. Fair.

Member

Looking at the document, I can't see this defined explicitly. Are you saying it's implicit with the above wording? If so, could we ALSO make it explicit with requirement keywords? Tests would then have a requirement keyword to point to in the spec.

Contributor Author

@Relequestual I think we should make it explicit somewhere else. I specifically don't want any requirement keywords in the Principles section: They're principles, not requirements. They should drive the requirements. I will make a separate PR for clarifying the nature of results.

Contributor Author

(I'm taking the MAY out of one of the other principles and changing it to "can")

Comment on lines +480 to +481
identifiers only. Locating and retrieving URI-identified resources
is outside of the scope of this specification.
Member

@gregsdennis gregsdennis Feb 11, 2020

Locating and retrieving is out of scope, but loading is in scope? This is a fine line and maybe we should define "locating" and "retrieving" vs "loading".

Contributor Author

Good point. Will add to definitions section.

</t>
<t>
Keyword results MAY be defined in terms of the results of other keywords
in the same schema object, as long as no cyclic dependencies are produced.
Member

"in the same schema object or its subschemas"? Maybe explicitly call out that subschema results can be acted on?

Contributor Author

The results of a keyword include the results of any subschemas. I'd rather keep these things orthogonal. If we need to have a principle about subschema result roll-up we can? Not sure if that's needed?

Member

I don't think it's a design principle, more of a consequence of processing.
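
One way to read the "no cyclic dependencies" condition from the quoted principle is that an implementation can always find an evaluation order for the keywords in a schema object. A minimal sketch, assuming a hypothetical `DEPENDS_ON` table (not taken from the spec text) and Python 3.9+ for `graphlib`:

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical table: keyword -> keywords whose results it reads.
DEPENDS_ON = {
    "properties": set(),
    "patternProperties": set(),
    "additionalProperties": {"properties", "patternProperties"},
}


def evaluation_order(depends_on):
    # static_order() yields each keyword after everything it depends on,
    # and raises CycleError if the dependencies form a cycle.
    return list(TopologicalSorter(depends_on).static_order())
```

Calling `evaluation_order({"a": {"b"}, "b": {"a"}})` raises `CycleError`, which is the situation the principle rules out.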

<t>
JSON Schema assertions form a constraint system. The empty schema,
<spanx style="verb">{}</spanx>, allows everything and constrains nothing.
Each assertion keyword adds constraints; keywords MUST NOT remove constraints.
Member

"MUST NOT remove or alter constraints."

Also, I still think that we should mention that this does not conform to inheritance models such as polymorphism.

Contributor Author

What sort of "alter" do you have in mind? 99% sure I agree, just want to be clear if something has come up in practice.

And yeah, probably worth a comment.

Member

I think that's me drifting off into an inheritance mindset where you can override members.

The case I can think of is redefining something like maximum alongside a $ref to something that has maximum defined within. But in that case, the lower one would win out because the instance would just have to satisfy both maximums. Similar for property definitions: the instance would have to validate against both definitions.

This is fine, but it doesn't do what people thinking about inheritance would think it should do.
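
The two-maximums case can be sketched like this. The function names and the limits 10 and 7 are made up for illustration; the point is only that both constraints apply and nothing is overridden.

```python
def check_maximum(limit, instance):
    # "maximum" only constrains numbers; anything else passes.
    if isinstance(instance, bool) or not isinstance(instance, (int, float)):
        return True
    return instance <= limit


def validate(instance, local_maximum, referenced_maximum):
    # The instance must satisfy the local keyword AND the $ref target.
    return (check_maximum(local_maximum, instance)
            and check_maximum(referenced_maximum, instance))
```

With a local maximum of 10 and a referenced maximum of 7, the value 8 is rejected: the effective limit is the lower of the two, regardless of which schema object declares it. Someone expecting inheritance-style override would expect the local 10 to win.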

Contributor Author

Hmm.... yeah, that's an important point, I'll see if I can come up with concise wording.

Member

The trouble is, we DO have keywords which alter the behaviour of other keywords, so I don't think you can say that.

Contributor Author

There might be something here about applicators vs assertions. You can use applicators to change which subschema(s) apply to a location in the instance. But adding assertions to a given schema object always further constrains. The only way to effectively lift a constraint is to use applicators to change which schema object is used.

Member

but because that carves foo out of additionalProperties, from the point of view of that instance property the constraints are changed...

I don't consider this as additionalProperties changing its behavior. additionalProperties is defined to depend on the existence of properties, so the behavior of seeing foo only when it's not handled by properties is consistent with that definition.

Maybe "override" would be a better word here?

Contributor Author

I intentionally didn't say (or at least didn't mean to say) that additionalProperties changed its behavior. Rather, from the black box validation point of view, a constraint appeared to be lifted when a keyword was added. In a proper constraint layering system, the set of valid instances after adding any keyword would be a subset of the set of valid instances before adding the keyword.

However, this instance: {"foo": "hello", "bar": 2} fails validation against my first example schema, because "hello" is not an integer. But it passes validation against my second example schema, because adding properties caused a different subschema (true) to apply to "hello".
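
The two example schemas are not reproduced in this thread, so the sketch below uses a guessed reconstruction that matches the behavior described: `FIRST` and `SECOND` are assumptions, and `is_valid` is a toy validator covering only the keywords involved (it also ignores that `"type": "integer"` accepts values like 2.0).

```python
# Assumed reconstruction of the two example schemas.
FIRST = {"additionalProperties": {"type": "integer"}}
SECOND = {"properties": {"foo": True},
          "additionalProperties": {"type": "integer"}}


def is_valid(schema, instance):
    if schema is True:
        return True
    if schema is False:
        return False
    if schema.get("type") == "integer":
        if isinstance(instance, bool) or not isinstance(instance, int):
            return False
    if isinstance(instance, dict):
        props = schema.get("properties", {})
        for name, value in instance.items():
            if name in props:
                # Handled by "properties": only that subschema applies.
                if not is_valid(props[name], value):
                    return False
            elif "additionalProperties" in schema:
                # Not handled by "properties": falls through to here.
                if not is_valid(schema["additionalProperties"], value):
                    return False
    return True


INSTANCE = {"foo": "hello", "bar": 2}
```

Under these assumed schemas, `is_valid(FIRST, INSTANCE)` is `False` and `is_valid(SECOND, INSTANCE)` is `True`: adding `properties` changed which subschema applies to `"hello"`, so the set of valid instances grew.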

I'm 99% certain that if we just look at assertions, adding an assertion always reduces the set of valid instances (ignoring complex situations like "not", where adding to its subschema reduces that subschema's valid set, which expands the inverted set accepted by "not"). Hmm... that's an applicator behavior, too: not the "which subschema applies to what" part, but the "perform boolean logic on the subschema results to produce the applicator result" part.

I think adding a pure assertion keyword (one that is not also an applicator and/or annotation) to a schema object always reduces the set of valid instances according to that schema object (including its subschemas, because subschemas are never affected by parent schemas once you resolve URI references; that's one of the other principles).

Applicators allow much more complex behavior, but it's still just changing how the subschema results are combined. This is why there is never any override or merge or replacement. The operations are performed on subschema results, never on individual keywords. Aside from information returned in annotation results, subschemas are black boxes; the parent schema has no idea what keywords they contain.
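
A rough model of that black-box view: each subschema is just a function from instance to boolean, and an applicator only combines those booleans. The helper names (`all_of`, `not_`, `at_least`, `at_most`, `accepted`) and the sample range are invented for this sketch.

```python
from typing import Any, Callable

Subschema = Callable[[Any], bool]  # a black box: instance in, result out


def all_of(*subs: Subschema) -> Subschema:
    return lambda inst: all(s(inst) for s in subs)


def not_(sub: Subschema) -> Subschema:
    return lambda inst: not sub(inst)


def at_least(n) -> Subschema:
    return lambda x: x >= n


def at_most(n) -> Subschema:
    return lambda x: x <= n


def accepted(schema: Subschema, sample) -> set:
    return {x for x in sample if schema(x)}


SAMPLE = range(0, 20)
loose = not_(at_least(5))                       # inner subschema: x >= 5
tight = not_(all_of(at_least(5), at_most(10)))  # inner subschema gains a constraint
```

Here `accepted(loose, SAMPLE)` is 0 through 4, while `accepted(tight, SAMPLE)` is 0 through 4 plus 11 through 19. Constraining the subschema under `not_` widened what the parent accepts, yet no keyword was overridden; only the boolean result was inverted.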

Member

Could the issue here be it's "assertions and annotations must not remove constraints" but applicator keywords will alter how constraints are applied?

properties and additionalProperties are both applicators.
An argument could be made that when using if/then/else, the instance data actually modifies how the constraints are applied also. BUT, they are not removed, just applied differently.

Maybe it's worth noting that while constraints cannot be removed, the application of the schema they belong to may be altered as a side effect of the annotation results of other keywords?

Contributor Author

@Relequestual yeah what you're saying here is getting very close to it, I think.

</t>
<t>
Schema URIs referenced or defined in JSON Schema's Core vocabulary MUST be
resolvable without instance data.
Member

What does "resolvable without instance data" mean?

Contributor Author

Pretty much what it literally says- don't overthink it ;-)

A URI reference in $ref or $id is resolved against the base URI, which is the nearest parent $id. If there isn't one, the schema resource's retrieval URI is used (per RFC 3986 generic rules for establishing base URIs).

What this point was trying to establish is that you can safely pre-process a schema so that all of those URI references are converted to full URIs (starting with a scheme) before you're given any instance data. This makes applying the schema to any instance faster and simpler, because you don't have to keep track of base URIs and resolve references when you encounter them.
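
The pre-processing step described above might look roughly like this. `resolve_uris` and the example URIs are hypothetical, and the walk is deliberately naive: it treats every nested object as a schema, so it would also rewrite `$ref`-looking data inside keywords such as `enum` or `const`.

```python
from urllib.parse import urljoin


def resolve_uris(schema, base_uri):
    """Return a copy of `schema` with every $id and $ref made absolute."""
    if isinstance(schema, list):
        return [resolve_uris(item, base_uri) for item in schema]
    if not isinstance(schema, dict):
        return schema
    resolved = {}
    if isinstance(schema.get("$id"), str):
        # A nested $id becomes the base URI for everything beneath it.
        base_uri = urljoin(base_uri, schema["$id"])
        resolved["$id"] = base_uri
    if isinstance(schema.get("$ref"), str):
        resolved["$ref"] = urljoin(base_uri, schema["$ref"])
    for key, value in schema.items():
        if key not in resolved:
            resolved[key] = resolve_uris(value, base_uri)
    return resolved


schema = {
    "$id": "schemas/root.json",
    "properties": {
        "a": {"$ref": "#/definitions/x"},
        "b": {"$id": "nested/", "$ref": "item.json"},
    },
}
resolved = resolve_uris(schema, "https://example.com/")
```

The second argument plays the role of the retrieval URI. No instance data is involved anywhere, which is the property the principle is meant to guarantee.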


However, this point doesn't quite work in 2019-09: the URI in $recursiveRef can't, based on the implementation mechanism described, be resolved like this, because what $recursiveAnchor does is replace the original base URI (from $id or the retrieval URI) with the base URI of the dynamic-scope outermost schema object containing "$recursiveAnchor": true, if any such schema object exists in the dynamic scope.

We could actually change the internal mechanism for this to be simply "replace the entire resolved reference from $recursiveRef with the URI of the dynamic-scope outermost etc.", because right now the only legal value of $recursiveRef is "#", and both of these mechanisms produce the same result with that URI reference.

If we were to lift that restriction and allow any URI reference as a $recursiveRef value, then the two mechanisms produce different results, and require keeping track of base URIs at runtime.
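
To make the difference concrete, here is a toy comparison of the two mechanisms. It assumes a dynamic-scope outermost schema with "$recursiveAnchor": true exists; the function names and URIs are invented, and fragments are stripped so only the target resource is compared.

```python
from urllib.parse import urldefrag, urljoin


def swap_base(reference, static_base, dynamic_base):
    # Mechanism as described for 2019-09: replace the base URI with the
    # dynamic-scope outermost one, then resolve the reference against it.
    return urldefrag(urljoin(dynamic_base, reference)).url


def swap_target(reference, static_base, dynamic_base):
    # Alternative: the reference is resolved statically (ahead of time),
    # and at runtime the whole resolved URI is replaced.
    _static_target = urljoin(static_base, reference)  # computable without an instance
    return urldefrag(dynamic_base).url


STATIC = "https://example.com/tree"
DYNAMIC = "https://example.com/strict-tree"
```

For `"#"`, both functions return `https://example.com/strict-tree`. For a hypothetical non-`"#"` value such as `"other#/definitions/node"`, `swap_base` yields `https://example.com/other` while `swap_target` still yields `https://example.com/strict-tree`, so the choice of mechanism would start to matter.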

See #868 for more details and hopefully eventually a decision on whether to require static (without instance) resolution for all keywords and a change to $recursive*, or whether to change this architectural principle.

handrews added 2 commits May 22, 2020 12:49
These inform how JSON Schema is designed as a system.

The section is placed after Definitions as otherwise the
principles don't have clear meaning.
</t>
<t>
Locations in schemas and instances are always identified by URIs,
JSON Pointers, or Relative JSON Pointers. JSON Schema treats URIs as
Member

Sorry, I only just noticed this now. Are relative json pointers now fair game to be used inside $refs? I think this is the first time that's been mentioned explicitly in the spec. (The metaschema itself just uses 'format: uri-reference' for the $ref syntax, and there are no examples of relative JSON pointers used.) It would be good to clarify this explicitly (probably in a separate patch).

Contributor Author

@karenetheridge ignore this PR, it's way outdated. I should really just take it down. I may or may not revisit it, or this info may go on the web site, or something.

@handrews handrews closed this May 24, 2020

4 participants