Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove internal/external reference terminology, expand use cases and examples #550

Merged
merged 7 commits into from
Feb 28, 2018

Conversation

handrews
Copy link
Contributor

*NOTE: This PR is broken into several commits for easier review. Some commits relocate large blocks of text, others rework and expand existing content. No commits do both. It is probably worth looking at individual commits for diffs, but building the branch locally to look at the complete flow of the reworked section to get a feel for how it reads as a whole.

This addresses #545.

The "Internal References" concept and explanation was confusing, as noted in #545. The "External References" was really just a discussion of schema identification uniqueness and whether or not you can load remote schemas.

It felt odd to me to introduce $ref before $id, as you need to understand identifying schemas before you can reference them by identifiers. So I re-ordered that.

In the $id section, made subsections for the different use cases and expanded on the implications, in particular for how JSON Pointer fragments can be used with any base from any parent in the same schema document (this was another aspect brought up in #545 or related discussions). I expanded the example to show this clearly.

Finally, I switched the order of the external and internal sections, and made them about loading and resolving references, respectively. The same example is present, but rather than talking about exceptions to RFC 3986 and "in the same document", it phrases things in terms of whether the implementation has seen the identified schema or not. This matches the discussion in the loading section (formerly external references).

Finally, added a cref note to get us to consider guidance on previously unknown schema identifiers. In the case of Hyper-Schema in particular, forbidding network dereferencing makes the entire concept unusable with dynamic, evolving hypermedia. Some further guidance is required (although the actual guidance should be developed separately- I will file an issue for it).

Also paging @johandorland and @spenced in addition to the requested reviewers.

This may go into a bugfix update before we move to a draft-08 meta-schema, as it is not intended to change any behavior, and definitely does not involve a meta-schema change.

Talking about references and dereferencing makes more sense after
we establish base URIs and subschema identfication.

This commit just moves the section, subsequent commits will adjust
the text and titles accordingly.
This clarifies that JSON Pointers work from any base URI within
a document to any location underneath that base, and shows that
this can produce several different valid URIs for a subschema.
This re-organizes and expands the existing text for "$id" into
subsections based on the three primary use cases:

* Identifying the root schema in a schema document
* Changing the base URI (with implications for JSON Pointer fragments)
* Defining a location-independent name

All of these are demonstrated in the subsequent example section.
Move it up under "$ref", and change the title to be about loading
referenced schemas.  Some further wording tweaks will be required
to the contents, this just moves the paragraph without changing it.
Note the case of identifying schemas during processing, as shown
in the earlier example.  This replaces the explicitly called-out
concept of "Internal Referencing" which will be removed.
a JSON Pointer or their URI given directly by "$id". In all cases,
dereferencing a "$ref" reference involves first resolving its value as a
URI reference against the current base URI per
<xref target="RFC3986">RFC 3985</xref>.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

which is it? 3985 or 3986

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed.

We know from feedback that the "internal referencing" concept
and in particular the "same document" wording is confusing and
for some people misleading.  Remove the term and focus the wording
on normal RFC 3986 URI reference resolution.

Cover the same logical ground as "internal referencing" by discussing
whether and how the implementation knows of the identified schema,
using the same general ideas as covered in the previous section
on loading (or not loading) referenced schemas.
Copy link
Member

@awwright awwright left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a pretty good rework.

I included some notes I can take care of in a different PR.

I don't think it's a problem if $ref comes before $id, since you can use either without the other, but I see what you're getting at.

<t>
Using JSON Pointer fragments requires knowledge of the structure of the schema.
When writing schema documents with the intention to provide re-usable
schemas, it is preferable to use a plain name fragment that is not tied to
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say "it may be preferable" to avoid assuming the motives of document authors. Alternatively, if we want to encourage the behavior, I'd go all the way and say SHOULD.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For now I'll use "may be preferable" and we can think about the SHOULD.

</t>
<t>
To name subschemas in a JSON Schema document,
subschemas can use "$id" to give themselves a document-local identifier.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rereading this language I know I wrote, it may be clearer to add a qualifier to '"$id"' like

To name subschemas in a JSON Schema document,
subschemas can use a bare fragment for "$id" to give themselves a document-local identifier.

Maybe I can PR for this after merge.

subschemas can use "$id" to give themselves a document-local identifier.
This is done by setting "$id" to a URI reference consisting
only of a plain name fragment (not a JSON Pointer fragment).
The fragment identifier MUST begin with a letter ([A-Za-z]), followed by
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also rereading this language, it might not be clear that the property value as a whole still begins with #.

I can PR for this after merge if you like.

The fragment identifier MUST begin with a letter ([A-Za-z]), followed by
any number of letters, digits ([0-9]), hyphens ("-"), underscores ("_"), colons
(":"), or periods (".").
</t>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also I thought I had a comment in here explaining I just copy-pasted the criteria from XML element IDs.

shown in section <xref target="idExamples" format="counter"></xref>.
</t>
</section>
<section title="Location-independent identifiers">
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized RFC3986 has a term for what we're doing, "Same-Document Reference"

But it's not quite the same concept, and I think I like this title better. (Nice pick.)

<t>
The root schema of a JSON Schema document SHOULD contain an "$id" keyword with
a URI (containing a scheme). This URI SHOULD either not have a fragment, or
have one that is an empty string.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is probably better suited for a follow-up PR, but this is sort of confusing to parse, maybe

This URI SHOULD have no fragment, or an empty fragment.

Maybe we simplify the entire paragraph down to:

The root schema of a JSON Schema document SHOULD contain an "$id" keyword with a base URI [RFC3986] (a full URI with a scheme but no fragment), or a full URI with an empty fragment.

@handrews
Copy link
Contributor Author

@awwright thanks! I'll make the one fix and merge since @philsturgeon approved and you seem generally fine with it. If you can go ahead and write a PR for the other things soon that would be great- I would like to make a "bugfix" release of core (and validation) like I did for hyper-schema within the next few days to a week.

Feedback from PR that this should be more of a suggestion.
We may strengthen later.
@handrews handrews merged commit 0b1a062 into json-schema-org:master Feb 28, 2018
@handrews
Copy link
Contributor Author

@johandorland and @spenced please feel free to comment further if you spot anything of concern, or perhaps better yet open new issues and we will keep working on this if needed. Since @awwright has some further ideas I did not want to block on this step.

@handrews handrews deleted the ref-id branch February 28, 2018 22:34
@gregsdennis gregsdennis added clarification Items that need to be clarified in the specification and removed Type: Maintenance labels Jul 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clarification Items that need to be clarified in the specification core Priority: High
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants