-
-
Notifications
You must be signed in to change notification settings - Fork 306
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Rewrite the Hyper-Schema spec almost entirely #427
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a couple of remarks spotted while reading the whole text for the first time a couple of days ago.
Overall, I see no obvious problem; but I certainly need to reread it more thoroughly (hopefully soon).
Also I agree with the idea of moving things to the web site. Not sure this should be done with the scope of this PR.
We need other reviewers to comment on!
jsonschema-hyperschema.xml
Outdated
|
||
<section title="Schema keywords"> | ||
<section title="base"> | ||
<section title="Streaming parsers"> | ||
<t> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get the point of this new "streaming parsers" section, can you explain?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dlax that comes in part from @awwright talking about writing a streaming parser. When working out the testable requirements finding links, it occurred to me that it is not possible to find all links without processing the entirety of the instance document.
This is especially true for locating links based on context, because anchor
and anchorPointer
can adjust the context. There's no way to tell, without searching the entire document, whether a link defined later on sets the current location as its context.
In the case of a streamed instance, the definition of "entire document" may be difficult to determine. There may not even be a fixed-scope document, as the source of the stream may be adding to it based on other events.
So, if I wrote the requirement around find-by-context in terms of processing the entire document, then you could end up blocking indefinitely in a streaming parser situation. I felt like that warranted an alternate consideration to provide some information while acknowledging that it is technically incomplete.
@awwright can comment on whether the section is accurate to what he had in mind, and whether it is really necessary. But that's why it's there.
jsonschema-hyperschema.xml
Outdated
All properties in this section are advisory only. While keywords such | ||
as "title" and "description" are used primarily to present the link | ||
to users, those keywords that predict the nature of a link interaction | ||
or response MUST NOT |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The sentence is incomplete.
jsonschema-hyperschema.xml
Outdated
This property defines a title for the link. | ||
The value MUST be a string. | ||
</t> | ||
<section t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, this looks strange. The default should not be to constrain target behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the line you meant to attach this comment to? I don't follow.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No... The actual line is https://github.com/json-schema-org/json-schema-spec/pull/427/files#diff-6f6cc4b69f2e1ecfca964501e7a07e2eR922 (the cref after the paragraph saying "Omitting "submissionSchema" or setting the entire schema to "false" prevents any user agent data from being accepted.").
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dlax oh, cool, yeah that makes a lot more sense :-)
I'm continuing to lean towards saying that omitting "submissionSchema" is the same as true
rather than false
.
"hrefSchema" needs to default to behaving like false
because of how it interacts with the rest of the target URI resolution process. I couldn't get it to work with a default behavior of true
without needing another keyword to indicate whether the link accepts input at all. Or schema authors would have to explicitly set it to false to opt out of accepting input, which feels odd. But maybe that's just because I've been staring at it for too long.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another set of comments while re-reading until section 5 (included). Still no blocking problem. I'll keep on reading.
jsonschema-hyperschema.xml
Outdated
can make it easier to support multiple representation media types and | ||
linking mechanisms. It is even possible to support features such as JSON HAL's | ||
embedded resources by automatically following links on the server and | ||
inlining the results. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really get the point of this paragraph and references to JSON HAL are a bit odd. Maybe it'd be better elsewhere than in the introduction. (Otherwise, the content of the introduction is nice.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dlax I can see how this would be a little confusing.
JSON Hyper-Schema is a separate document from the response (or request) document. Instead of sending the links in the document, we use this separate document as instructions for building them.
To some people, that is a benefit. The responses take up less space on the wire, and people who want to treat the system as a "traditional" non-HATEOAS HTTP API can easily do so.
To others, it is less desirable. They want self-contained responses. Additionally, given the variety of hypermedia-ish formats based on JSON, application developers may prefer various representations. This paragraph is explaining how a JSON Hyper-Schema system can fulfill those use cases alongside of its traditional, client-side role.
When giving examples, I like to give at least two. HTTP Link headers are one way to generate links server-side and send them in the response. JSON HAL is a simple and at least somewhat popular format that would be easy to generate as well. So you could offer both hyper-schema and HAL representations quite cheaply. And unlike Siren or JSON API, HAL does not significantly constrain the representation structure. It would be hard to use Siren or JSON API in this way, and impossible to do both as their structural restrictions are not compatible with each other. So JSON HAL seemed the best 2nd example.
This paragraph is definitely not essential, but I do feel like it's an important use case for Hyper-Schema that is often overlooked.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand all that and definitely agree it's worth mentioning as it highlights the nice "separation" of JSON Hyper-Schema. Just wondering if the introduction is the best place.
Maybe something to be moved to the web site, in particular since people may first read information there before getting into the spec.
In general, about moving things from the spec to the web site, it's probably not a priority at this point and we can leave everything in the spec during the rewrite time while spotting "candidates for move".
jsonschema-hyperschema.xml
Outdated
in <xref target="json-schema-validation">Section 10.1 of the JSON Schema | ||
validation specification</xref>. Links that cannot fulfill | ||
their "templateRequired" keyword based on the instance and/or potential | ||
client input MUST be excluded. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I'm not sure it's worth mentioning the templateRequired
keyword. Maybe just say that "An implementation MUST find and make the valid links in all applicable (sub)schemas available to clients." and then refer to link validity section as you do for "schema applicability".
The idea is to keep this "General implementation requirements" section independent of keywords as much as possible. It's already quite well achieved, just mentioning this particular point that might be improved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dlax I hadn't thought about it that way. Definitely an interesting idea, let me think a bit more about how to do that clearly. I do like calling out the concept of link validity.
jsonschema-hyperschema.xml
Outdated
<t> | ||
For a given link, an implementation MUST be able to construct the | ||
target IRI or URI based on the instance and, if "hrefSchema" is present, | ||
client input. When working with client input, the input data MUST be |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same as above about hrefSchema
which might be dropped while keeping things understandable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dlax yeah, makes sense. Trying to think of how to describe it other than "and, if allowed, client input". I'm not even sure "allowed" is right. "requested" might be more accurate, as "hrefSchema" is a pretty strong hint that the client should at least provide the option of input, whatever that means in the context of the client application.
I want to avoid the "form" terminology, though, as it has tended to cause confusion.
jsonschema-hyperschema.xml
Outdated
identifiers, implementations MAY opt to hide those fields that | ||
are only used in URI construction ("href", "templatePointers", | ||
"templateRequired", "anchor", and "anchorPointer") when making | ||
the remainder of the LDO available to clients. Implementations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless I missed it before, this is the first occurrence of LDO.
As reading from the top, I wonder if the introduction text about LDO being "a serialization of the abstract link model defined in RFC 5988bis" (currently in section 7, just the first paragraph) wouldn't be better before (in particular before this section about implementation requirements).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops! Yeah, at one point LDO got introduced very early on, possibly in the terminology section. But clearly I dropped that at some point and forgot to put it somewhere else up front.
And yeah, the link serialization bit is probably introductory / overview material.
@dlax I have added several separate commits addressing each of your comments to some degree or another. Let me know if I missed anything or if these introduce new confusion :-) |
@handrews The new commits look fine, I think you addressed all of my comments. |
The new Overview is exceptional. Great job.
Who what whats being discovered by who. Section 5 comes at you like a ten tonne truck. It's all making sense and things are really exciting, then i dunno whats happening and im sad. Can warnings to implementations maybe be bumped down a little bit so general humans can get more information on what/how/when instead of getting their brains rammed full of information they probably don't need? On the whole this reads a hundred times better than the previous versions. It feels consistent, and like humans are actually expected to look at it. Well done. Could do with some of those awesome examples added in. |
@philsturgeon thanks! This is very helpful :-)
I used the word "discovered" because "discoverability". But I just mean that an implementation that's looking at some instance data has to do something to figure out if any links apply to that location in the instance, and if so, which links do. Not sure how to better word that but open to suggestion. Maybe it's because that's the opening line in the implementation requirements part, and I didn't establish what the implementation is well enough yet?
Heh. Now that you mention it I did kind of write it with a "YOU WILL UNDERSTAND HOW TO IMPLEMENT THIS!" attitude which was probably driven by my frustrations more than empathy with the reader :-D What I want to accomplish is to at least outline the responsibilities of an implementation to provide some context for the keywords to follow. I dislike reading the keywords without some indication of why they are there and what an implementation does with them as a whole. Diving straight into keywords leaves you with things like "this keyword affects the URI construction process" and I'm like "wtf, what process is this, why does it need this particular keyword and behavior", etc. I think at least the intro bit (the part before §5.1) should stay where it is for that reason. For the remainder, would it help to have a single example (schema, instance, output) that minimally illustrates what the 5.x sections cover before diving in? Or as a §5.3 immediately after and before the keywords? I could see that helping without making the later example section (with the build-up from simple to complex) unnecessary. This example wouldn't explain everything that goes into producing the output, just point to it and say "this is what we're talking about, details in the next major section." Alternatively, can you get more specific about what is overwhelming? The topic of implementation requirements at all, or is it just the wrong level of detail?
Working on it, as you saw in the next PR. I'll be adding more during this week :-) |
@philsturgeon I think what I'm going to try to do about section 5 is split it up into a general discussion of the required concepts (link discovery/applicability/validity, and the basic outlines of usage), and then a section that is very focused on exactly what output should be produced. I'll leave the former in section five, and move the latter down with the URI Templating section after the keywords are described. I think that puts all of the implementation details together (e.g. with the pseudocode) which probably makes for a more gradual build. Stuff like the notes on streaming parsers would go at the end of the implementation details section. This would produce an outline like:
This actually seems like a much stronger flow to me now that I've had a chance to reflect on it. The examples are much closer to the main specification requirements, which had been worrying me before. |
@philsturgeon @dlax I have taken the high-level aspects of the old section 5 and moved them up into Overview, along with a more thorough and clear section on terminology. The remainder of the implementation section has been moved but not yet updated or merged with the URI Templating implementation guidance. I will keep working on that and continue to add commits. But this latest commit is worth reviewing for the new Overview material, I think. |
@handrews the latest restructuring is spot on. It gives people a good understanding of what i actually being discussed without diving into impls. A+. |
@philsturgeon thanks! I'm also finding that moving the remaining parts of what used to be section 5 down to the new location, I can actually cut a lot of it. So overall I think this streamlines things a lot. Thanks for the feedback! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also like the new outline. A couple of remarks while reading (quickly).
<t hangText="hyper-schema"> | ||
Within this document, the term "hyper-schema" always refers to | ||
a JSON Hyper-Schema | ||
</t> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The two definitions above are now redundant with the second paragraph of the introduction which can now be removed I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Information in the introduction is redundant almost by definition. I think it's fundamental enough to stay in the introduction, and it's also necessary to include them in this definition list for completeness. Otherwise it would be an odd omission if you just jumped into the definitions section without paying much attention to introduction.
My view of the introduction is that it should hit the major high points but not be essential.
I'm open to further debate on this, but unless you feel strongly I'd like to leave this as-is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a big deal, let's leave it as is.
A user agent which is only aware of standardized link relations, | ||
media types, URI schemes, and protocols. It may have an extensible | ||
architecture to allow adding support for standards beyond the core | ||
set of which it is aware. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if using the term "hyper-schema user agent" wouldn't be more appropriate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, the user agent glues together many different standards. It is not specific to hyper-schema.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered changing this, but it's really a chicken and egg thing. We need to explain some stuff in the overview before diving into definitions, but it's the definitions that clarify things. As with the introduction concern, I'd like to leave this as-is unless you feel really strongly about it. I think that if a reader continues on to the definitions this should be very clear.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, let's leave as is.
jsonschema-hyperschema.xml
Outdated
describing that format is | ||
<eref target="http://json-schema.org/draft-7-wip/hyper-schema-output#"> | ||
http://json-schema.org/draft-7-wip/hyper-schema-output#</eref> | ||
(draft-07 work-in-progress version). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is first time I hear about "output format", and I don't really get what it is so far. So it's probably worth adding a note (and a link) to say that this is detailed in the "Implementation requirements" section.
Also, does the "hyper-schema-ouput" document exist already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I added it but I didn't. Will fix and introduce more clearly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additional details about the "output format" and its schema look good to me, thanks.
} | ||
}, | ||
"hrefPrepopulatedInput": { | ||
"$comment": "The initial data set to be presented with the input form when URI Tempalte input is accepted.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo: Tempalte
{ "$ref": "#/definitions/noRequiredFields" } | ||
], | ||
"definitions": { | ||
"noRequiredFields": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This noRequiredFields
name seems a bit odd, especially as it's used in "hyper-schema-output.json". Maybe "ldo" would be more meaningful but on the other hand it does not convey the fact that "required" is absent on purpose so it wouldn't be a real LDO. I have no better idea to suggest at the moment, just mentioning in case somebody has an idea...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm not particularly fond of the name but nothing else was coming to mind so I went with it for now. It's at least descriptive. But hopefully someone will have a better idea :-)
The last commits look good to me. It's probably time to merge this and follow-up with more specific PR. However, by looking at the "files" tab, I don't understand the diff for |
Here are the main changes: * Set context in the overview and explain purpose. This should probably be trimmed back down some. * Start with a general, high-level definition of what it means to implement hyper-schema. This provides a framework for understanding the individual keywords. * Due to high-level requirements up front, and details of the whole URI Template resolution process, HTTP usage, security concerns, and API usage factored out to their own sections or appendices, the individual keyword sections focus on the key basic definitions * Schema keywords discussed in terms of applicability. Clarified requirements on how to handle "base". * LDO keywords restructured around the abstract link model * Context identification keywords first * Link relations next * Target identification rounds out the required concepts * Template process keywords, which apply to context and target * Target attributes (emphasize advisory nature in intro paragraph) * Input keywords, grouped by usage * links.json re-ordered to match * URI Template resolution process factored out to its own section, with pseudocode. Probably needs more explanatory text. * Security Considerations section instead of interrupting the flow of keywords as they are defined. * Consolidated HTTP usage notes into an appendix. This thoroughly covers how to use JSON Hyper-Schema with HTTP. * Added an appendix on how to use hyper-schema in APIs. This is probably the most likely to move out to the web site. I also was going to write more with analogies between API user agents and web browsers, but that got kind of bogged down so I stopped where it is now. * Added an appending on static analysis of hyper-schemas. This is where a lot of usage goes wrong in terms of expectations, so briefly addressing it in the spec seems wise.
Now that we have "targetHints" which, among other things, can provide information about which HTTP methods are supported, there is no longer a clear need to prevent submission by default. It is more in line with the rest of JSON Schema's philosophy to allow unconstrained input by default. Note that "hrefSchema" remains false by default due to its interaction with resolving URI Templates from instance data.
and note that it is a serialization of the abstract link model from RFC 5988bis
It does not need as much detail, and I had left out the important application towards server push.
When discovering links for a hyper-schema, it does not make sense to examine the schema keywords within the discovered links. While this was already the case as there are no rules for determining the applicability of those keywords to an instance, that is not obvious so make it explicit.
Define the condition for using hrefSchema more clearly (I hope). Use consistent language around "locations" in a schema rather than introduce a new term "data sources" which is not defined. Attempt a better explanation of how to find values in the instance. I'm not entirely sure it's successful but at least it's different.
Minor tidying and removal of an unnecessary reference.
Re-organize the first few sections a bit (based on looking at how several other relatively recent RFCs organize things). Have an explicit list of referenced and defined terms (at least one OAuth 2.0 RFC does this and it seems useful). With the terms defined, cover the most important implementation requirements at a *very* high level, without getting into more confusing cases such as how to handle streaming data. Move the old section down after the LDO section and before URI Templating. The contents have not been changed aside from chopping off the introduction which was used as the starting point for the remaining early section. URI Templating will need to move into the implementation section, which will need substantial reworking to accomodate that and its new position in the flow of the schema. This will be done in subsequent commits.
Now that the implementation requirements have been moved down below the keywords, they can be substantially trimmed down and the URI Templating section can be brought in with the other implementation requirements.
A lot of this should go on some sort of best practices guide on the web site. This cuts quite a bit of text, and merges the static analysis appendix into the API appendex as API-wide processing is the main use case for such analysis. There is likely more to cut, and perhaps the entire appendix should be removed, but I'd like to get some feedback on this level of information first.
Per discussions with other project members given the rewrite effort.
Also improve the wording introducing the output format. Split the "links" fields off from their "required" value, as we may use the format in environments that do not require the same set of fields. For instance, the output format is allowed to omit "href", which is required in an LDO proper. Finally, update the URIs to draft-07-wip which is less confusing than having them say draft-06 (which is blatantly wrong) or draft-07 (which is misleading as it's not yet finalized, much less published).
While it is technically not a primary part of the specification, but rather an application of the spec to a specific protocol, it is important enough to explaining the motivation and usage of Hyper-Schema to discuss immediately after the general implementation requirements but before the examples. This allows the examples to reference and illustrate HTTP usage.
Target attributes are advisory, and MUST NOT be considered authoritative.
Needed to add application/schema-instance+json and related changes.
…ints" section Add the example as it was in 1865631 (that is, just before merge of json-schema-org#427 "Rewrite the Hyper-Schema spec almost entirely").
Add the example as it was in 1865631 (that is, just before merge of json-schema-org#427 "Rewrite the Hyper-Schema spec almost entirely"). Only added a top-level "headerSchema" key to make it symmetrical with the "targetHints" above.
HOW TO REVIEW THIS PR:
I recommend checking out master and building the hyper-schema spec to have for reference, particularly if you want to check the examples. I removed all examples from the rewrite to cut down on text that was in my way.
Then clone or set up a remote for my repository, and check out the "rewrite" branch. Build it, and read the hyper-schema spec there (the validation spec also has the "applicability" changes in it on this branch, so you might want to refer to it as well)
THE jsonschema-hyperschema.xml DIFF IS INCOMPREHENSIBLE
Don't bother trying to read it. GitHub won't even load it by default it's so large.
This more or less includes all open PRs, including the one for
moving the applicability concept over to the validation spec.
This is a ROUGH DRAFT. The goals are:
The following feedback is NOT WANTED at this time:
Grammar, wording, and flow will be addressed after I make
another pass on this, but I don't want to do that unless the
direction is generally acceptable.
I ripped out all of the examples because it made it about 5000%
easier for me to work with the document, and we know we want
to write them anyway. Keep a copy of the version from master
around if you want to reference the existing examples while
reviewing this new draft.
DON'T FORGET TO READ THE CREFs!
There are several places where I opted to write a cref note
instead of continuing to try to hammer out the proper text.
Here are the main changes, aside from those that have also been
submitted as separate PRs:
Set context in the overview and explain purpose. This should
probably be trimmed back down some.
Start with a general, high-level definition of what it means
to implement hyper-schema. This provides a framework for
understanding the individual keywords.
Due to high-level requirements up front, and details of the
whole URI Template resolution process, HTTP usage, security
concerns, and API usage factored out to their own sections
or appendices, the individual keyword sections focus on the
key basic definitions
Schema keywords discussed in terms of applicability. Clarified
requirements on how to handle "base".
LDO keywords restructured around the abstract link model
URI Template resolution process factored out to its own section,
with pseudocode. Probably needs more explanatory text.
Security Considerations section instead of interrupting the flow
of keywords as they are defined.
Consolidated HTTP usage notes into an appendix. This thoroughly
covers how to use JSON Hyper-Schema with HTTP.
Added an appendix on how to use hyper-schema in APIs. This is
probably the most likely to move out to the web site. I also
was going to write more with analogies between API user agents
and web browsers, but that got kind of bogged down so I stopped
where it is now.
Added an appending on static analysis of hyper-schemas. This is
where a lot of usage goes wrong in terms of expectations, so
briefly addressing it in the spec seems wise.
I'm probably missing something, but I'm pretty sure those are
all of the really big things.