Feature for defining data sources/relationships #26
Both this and To me this feels like an even larger change than How does the principle of least power apply to these use cases? |
Good points. This does allow any data source to be referenced, and I think that flexibility is desirable; I don't want to encourage people to shoehorn more JSON data into their document than would otherwise be wise, just so they can use this feature. I don't think it's too much more or less powerful than $data except for a few points:
|
It should be noted that separating data from logic (even if that logic is data, e.g. a schema document) is a common practice. This strategy would allow the data to be updated without having to update the schema and risk changing the logic. A notable example in the .NET world is Jon Skeet's NodaTime library. About a year ago, Jon moved to publishing NodaTime in multiple NuGet packages, one for logic and one for calendar data. The one for logic would be updated only for bug fixes and such, while the calendar data one would be updated for data accuracy. |
@awwright Thanks for the responses, they've been very helpful and thought-provoking, and I'm really glad that you're tackling this. @gregsdennis that's a good point about data vs logic separation, also very helpful for setting a wider context. Sometimes I forget to look past JSON Schema and particular aspects of hypermedia. Your comment made me re-think some things. I'm generally sympathetic to arguments based in widespread best practices. I think there are two orthogonal concerns here:
For the first point, I'm still concerned over allowing arbitrary external data sources. Currently, processing a schema and an instance is a function of:
An implementation can be supplied the schema(s) and instance pre-parsed into the data model, so technically an implementation need not handle any sort of parsing. I don't think it makes sense to require validators to handle arbitrary connections that return arbitrary output. However, I think we can look to I could support saying that keywords can rely on URI-identified external data sources as long as those data sources supply data in the JSON Schema data model. This could either be in the form of an This puts the burden of translation onto the data source, and not on the JSON Schema implementation. As with I'll need to think a bit more on the second point, about purposeful vs generic keywords. In particular, I don't follow your |
@awwright any thoughts on this? Or on how this might address #20 or json-schema-org/json-schema-spec#541? In particular, how would these keywords use instance data, or is the intention specifically that they cannot do so? In which case this proposal actually would not have any overlap with My first impulse would be to say that the URI of the instance is the base URI for any URI-reference values (similar to how many hyper-schema keywords work). |
Bringing over some commentary from json-schema-org/json-schema-spec#541: I'm proposing that all of the possible @awwright I'm assuming from your thumbs-up on that comment in json-schema-org/json-schema-spec#541 that you're OK with this approach. I'm working on the vocabulary support PRs now, so we can make sure that the vocabulary concepts will support doing this (I can't see any problems with that right now). |
It seems to me that there is a general use case of 'As a schema author, I want to make an assertion that this data conforms to something I can refer to but do not necessarily want to write out in full in the JSON Schema I share with everyone', for instance because one or more of the following apply:
You can cut this use case various ways - where does the data come from? does it require computation? But I am not sure the distinction is actually helpful, as at most it allows you to squeeze some of the use cases into some HTTP + JSON Schema or something, whether or not those are the right implementation solutions. I reckon getting data is a red herring. Most of the time what you actually want to do is apply a piece of application code to the value being validated (and maybe some other arguments), and get a boolean value as @awwright demonstrated:
I suspect, furthermore, that there is a need to partially validate schemas, wherein one validator can make qualified affirmations of validity where other validators can confirm total validity, e.g. 'This document is valid for the things I know how to test' vs 'This document is valid and I have tested everything in the schema'. Here I am thinking of a client/server use case. The client only troubles itself to check that the customer_uid is an integer; however, the server also checks that the customer_uid is that of a valid customer (and maybe that the order belongs to the customer as well). It strikes me that the first step should be to create a syntax which will allow schema authors to refer to 'custom' assertions as extensions in a way which will allow vocabulary and conventions to develop outside the spec, with a view to formalising groups of related vocabularies into standards which validators could then support natively and advertise support for. Incidentally, I would like to see the URI in this case indicate not so much the extension as implemented (which might be language specific), but the feature spec, e.g. a test suite which validates that extension. This means that new features can be developed, tested, and demonstrated outside of the core. Users can provide their own implementations without having to get their code into the validation engine for their language, which can be kept minimal. If the features are niche, those who want them get to keep them. If they are useful to lots of people, the tests and implementations can be shared as part of an ecosystem. Really useful features might eventually make it into the core. Is this the problem space which vocabularies would solve @handrews? |
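The client/server split described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the function `validateWithExtensions`, the two assertion URIs, and the `{ valid, complete, skipped }` result shape are all invented here and are not part of any JSON Schema draft or validator library.

```javascript
// Hypothetical sketch: every custom assertion is identified by a URI, and a
// validator reports which assertions it could not evaluate.
function validateWithExtensions(assertionUris, value, handlers) {
  let valid = true;
  const skipped = [];
  for (const uri of assertionUris) {
    const handler = handlers.get(uri);
    if (!handler) {
      // This validator does not know the assertion: record it, do not fail.
      skipped.push(uri);
      continue;
    }
    if (!handler(value)) valid = false;
  }
  // "complete" distinguishes "valid as far as I can test" from "fully tested".
  return { valid, complete: skipped.length === 0, skipped };
}

// Assumed URIs, for illustration only.
const IS_INTEGER = "http://example.com/assert/integer";
const IS_CUSTOMER = "http://example.com/assert/known-customer";
const assertions = [IS_INTEGER, IS_CUSTOMER];

// The client only knows how to check the type.
const clientHandlers = new Map([[IS_INTEGER, (v) => Number.isInteger(v)]]);

// The server also knows which customers exist (a Set stands in for a database).
const customers = new Set([1, 2, 3]);
const serverHandlers = new Map([
  [IS_INTEGER, (v) => Number.isInteger(v)],
  [IS_CUSTOMER, (v) => customers.has(v)],
]);

const clientResult = validateWithExtensions(assertions, 99, clientHandlers);
const serverResult = validateWithExtensions(assertions, 99, serverHandlers);
console.log(clientResult); // valid: true, complete: false
console.log(serverResult); // valid: false, complete: true
```

Here the client gives a qualified "valid" for customer_uid 99, while the server, which can evaluate every assertion, rejects it.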
I was working on an application and wondering whether there are a few standard ways of looking up values that we could describe, and then allow implementations to offer optimizations/alternative versions. For example, derive a URL from a URI Template like Then, implementations could allow optimizations to the HTTP request process. So now, our hypothetical API might go like:
validator.register("http://example.com/user{?uid}", function(uid){
  return db.select("SELECT * FROM user WHERE uid=@0", uid).count() > 0;
}); |
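To make the idea above a little more concrete, here is a minimal, self-contained sketch of such a registry. It is an assumption-laden illustration, not an existing validator API: the class `LookupRegistry`, its `check` method, and the result shape are invented here, the handler receives an object of template variables instead of a bare `uid`, and only the single-variable form-style query expression `{?name}` from URI Templates is expanded.

```javascript
// Hypothetical sketch: a registry that maps URI Templates to local handlers,
// so an implementation can skip the HTTP request when it has a shortcut.
class LookupRegistry {
  constructor() {
    this.handlers = new Map(); // URI Template string -> function(vars) => boolean
  }

  register(template, handler) {
    this.handlers.set(template, handler);
  }

  // Expand "{?name}" into "?name=value" (single-variable case only).
  static expand(template, vars) {
    return template.replace(/\{\?(\w+)\}/g, (_, name) =>
      `?${name}=${encodeURIComponent(String(vars[name]))}`);
  }

  // Prefer a registered handler; otherwise report the URL that a generic
  // implementation would have to dereference over HTTP.
  check(template, vars) {
    const handler = this.handlers.get(template);
    if (handler) {
      return { exists: handler(vars), via: "handler" };
    }
    return { exists: undefined, via: "http", url: LookupRegistry.expand(template, vars) };
  }
}

// An in-memory array stands in for the database in the comment above.
const users = [{ uid: 3, name: "Alice" }];
const registry = new LookupRegistry();
registry.register("http://example.com/user{?uid}", ({ uid }) =>
  users.some((u) => u.uid === uid));

const hit = registry.check("http://example.com/user{?uid}", { uid: 3 });
const miss = registry.check("http://example.com/user{?uid}", { uid: 4 });
const remote = registry.check("http://example.com/order{?id}", { id: 7 });
console.log(hit.exists);  // true
console.log(miss.exists); // false
console.log(remote.url);  // http://example.com/order?id=7
```

The schema would still only mention the URI Template; whether the lookup is served by a handler or by a real request stays an implementation detail.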
@awwright I was literally just thinking that for an implementation I'm looking for here!! In this library, you can generate UI forms using a JSON Schema as of the newest beta. I would also like to be able to generate validations in a standard way. Right now, I'd have to rely on an extra "validation.json" with my predefined rules to generate that and also share that rule set and parser with the server. In my case, I have a server with arbitrary validations that we want to apply across fields or across internal systems. For that, I want the call to simply return 2XX for a valid value and 400 for a value that is not valid. On the client, I want to be able to define things like lastName requires firstName. The dependencies keyword is very good for "directly requiring a value exists" but as you know it is not good for defining that a property with a given value exists. Is there something I can do here to help with this? Do we just need ideas at this point? I am very interested in this feature set and can most likely convince my team to allow me to offer some regular sprint time towards helping. |
@kenisteward you may be interested in json-schema-org/json-schema-spec#643 as well. |
@gregsdennis Thanks for the insight! json-schema-org/json-schema-spec#643 is not actually in the standard yet, right? Will it be soon? This will be very helpful for error reporting on our APIs. I wasn't able to find the corresponding PR to hopefully add this as an extension for a validator for our usage. |
It's planned for draft-08, which is slated for this month/year/🤞. There's not a PR on it yet. I typed up the issue based on conversations in another issue, but I'm not the spec author type. |
@awwright Since this is proposing a set of new keywords, I'm going to move it to the vocabularies repo. If you think that is in error, please feel free to move it back. From reading back through this, I think that json-schema-org/json-schema-spec#855 covers the general underlying issues, such as whether/how a keyword can access the instance or external data sources. There's a lot of great discussion here and I don't want to close it. But I think continuing the effort in the vocabulary repo is the right move. Again, please move back if you disagree. |
That makes sense. We should get into the habit of developing proposals like this as vocabularies. |
This issue proposes a set of keywords that would let schemas reference external assertions about the permitted range of instances/values.
It is a different way of solving many of the same problems that $data and similar keywords hope to address, but in a more flexible fashion.
The basis of the proposal is a keyword that lets authors define the relationship of the value to other data:
This specifies that the "customer_id" property must be an integer, and must be within some externally defined range of legal values (identified by a URI).
How this is implemented would vary between validators. Validators might allow coders to implement custom handlers, and do something along the lines of:
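A minimal sketch of what such a custom handler could look like follows. Everything in it is an assumption made for illustration: `rangeRef` is a placeholder keyword name (not the proposal's keyword), `RangeValidator` and `registerRange` are invented, and the range URI is an example value.

```javascript
// Hypothetical sketch: "rangeRef" is a placeholder keyword name and
// RangeValidator is not a real library API.
class RangeValidator {
  constructor() {
    this.handlers = new Map(); // range URI -> function(value) => boolean
  }

  // Register application code that decides membership in a named range.
  registerRange(uri, handler) {
    this.handlers.set(uri, handler);
  }

  // Validate a single value against a (very small) property schema.
  validate(schema, value) {
    if (schema.type === "integer" && !Number.isInteger(value)) {
      return false;
    }
    if (schema.rangeRef !== undefined) {
      const handler = this.handlers.get(schema.rangeRef);
      // No handler registered for this range: this sketch fails closed.
      if (!handler) return false;
      return handler(value) === true;
    }
    return true;
  }
}

// A Set stands in for whatever external system knows the legal values.
const knownCustomers = new Set([1, 2, 3]);
const validator = new RangeValidator();
validator.registerRange("http://example.com/customers", (id) => knownCustomers.has(id));

const customerIdSchema = { type: "integer", rangeRef: "http://example.com/customers" };
console.log(validator.validate(customerIdSchema, 3));   // true
console.log(validator.validate(customerIdSchema, 42));  // false
console.log(validator.validate(customerIdSchema, "3")); // false
```

Failing closed when no handler is registered is one possible design choice; a validator could equally report the assertion as unevaluated.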
Other keywords would automatically keep track of which values become legal to use:
Here, the valueDefine keyword registers the instance value as legal, when other instances assert a value must be within the given range.
Similarly, the valueUniqueKeys keyword specifies that any values defined under the range name can be used to uniquely identify the whole, larger object they're found within. With these two keywords, I can ask the validator "Find me the document where uid = 3" and it returns an entire instance which has the property {uid: 3, name:"Alice"}.
I'm going to continue refining this, and maybe take some inspiration from JSON-LD; however I think this is in a good state to ask for feedback.
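The bookkeeping described for valueDefine and valueUniqueKeys could be sketched as a small in-memory index. This is an illustration only: the class `RangeIndex`, its method names, and the range URI are assumptions invented here, and a real validator would populate the index while walking instances instead of through explicit calls.

```javascript
// Hypothetical sketch: an index of values registered under a range name,
// each pointing back at the larger object it was found in.
class RangeIndex {
  constructor() {
    this.byRange = new Map(); // range URI -> Map(value -> owning instance)
  }

  // Called when a value under a valueDefine keyword is seen in `instance`.
  define(rangeUri, value, instance) {
    if (!this.byRange.has(rangeUri)) this.byRange.set(rangeUri, new Map());
    const values = this.byRange.get(rangeUri);
    // Unique-key behavior: one key must not identify two different objects.
    if (values.has(value) && values.get(value) !== instance) {
      throw new Error(`duplicate key ${JSON.stringify(value)} in ${rangeUri}`);
    }
    values.set(value, instance);
  }

  // Is this value legal for instances that must fall within the range?
  isLegal(rangeUri, value) {
    const values = this.byRange.get(rangeUri);
    return values !== undefined && values.has(value);
  }

  // "Find me the document where uid = 3".
  find(rangeUri, value) {
    const values = this.byRange.get(rangeUri);
    return values ? values.get(value) : undefined;
  }
}

const USERS = "http://example.com/users#uid"; // assumed range name
const index = new RangeIndex();
const alice = { uid: 3, name: "Alice" };
index.define(USERS, alice.uid, alice);

console.log(index.isLegal(USERS, 3)); // true
console.log(index.isLegal(USERS, 4)); // false
console.log(index.find(USERS, 3));    // { uid: 3, name: 'Alice' }
```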
Related issues: json-schema-org/json-schema-spec#340