-
-
Notifications
You must be signed in to change notification settings - Fork 266
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Add propertyDependencies keyword (aka discriminator) #1082
Comments
Even though this is "just" syntax sugar, the I agree that this is one of the most frequently asked questions lately, and adding this construct should help both schema authors and various tools that parse schemas and generate code and/or UIs. (Note: I originally misread the placement of |
I'm confused. Are |
|
There could be. If this is intended to replace the |
@gregsdennis It's important to keep in mind that That said, you have a good point that this would only work for properties with string values and it would be nice if it could support numbers and booleans without compromising the simplicity Another limitation that came up in Slack discussion is that there's no way to set a default schema to use if there are no matches. This could be supported by adding a sibling keyword called |
Maybe with these things in mind, this would be better suited to an independent vocabulary and not in the out-of-the-box (avoiding the word "core" here) set of keywords. (🤔 Actually, also moving other "sugar" keywords into a separate vocabulary might not be a bad idea...) |
I disagree. We want JSON Schema to be reasonably easy use for the most common use cases without having to be extended. That means that having a few sugar keywords around to make very common patterns not such a horrible experience. The use case Moving sugar keywords out to an independent vocabulary is probably a more slippery slope than you expect. Here are some examples that could be considered sugar keywords.
|
I'm not suggesting the existing sugar keywords are moved to an independent vocab, just a spec-defined sugar vocab, akin to the annotations or metadata vocabs. If they just make authoring easier, then they're not really basic functionality. It was just a thought. But I do think this specific proposal belongs in an independent vocab unless we can make it work for all value types. |
Whether this goes into a new vocab or not is more an implementation question, as long as it's included in the main meta-schema (that is, it is enabled for schemas that declare |
I see. I'm not sure that's necessary, but it's a little off topic, so let's discuss that later.
It seems a bit extreme to exclude a keyword that solves a real and common problem because it doesn't support a use case that no one is asking for. I think it would help if you could share why you think it's important for this keyword to support all value types. I'll write up some examples of alternatives so we can more easily see what the trade-offs are. I think that might help with decision making process. |
I think this should be the approach for new keywords going forward. If we see that it has a fair amount of implementation support and end-user use, then it can be pulled into the main vocabs with relative ease. I also think that keywords that are in the main vocabs should be as generic as possible, supporting all possible values. Examples of this are You might argue that keywords like |
Here are what some alternatives might look like that support all value types. All examples are functionality equivalent to the examples in the original post. This version uses an array of objects. Each object is a collection of the variables needed to express a property dependency. This doesn't fit the style of JSON Schema. There aren't any keywords remotely like this. It's also still too verbose. It's a little more intuitive than {
"propertyDependencies": [
{
"propertyName": "foo",
"propertySchema": { "const": "aaa" },
"apply": { "$ref": "#/$defs/foo-aaa" }
},
{
"propertyName": "foo",
"propertySchema": { "const": "bbb" },
"apply": { "$ref": "#/$defs/foo-bbb" }
}
]
} A slight variation on that example is to make it a map of keyword to dependency object. But, I don't think that makes a significant difference. It's still too verbose and has the same efficiency problems in most cases. {
"propertyDependencies": {
"foo": [
{
"propertySchema": { "const": "aaa" },
"apply": { "$ref": "#/$defs/foo-aaa" }
},
{
"propertySchema": { "const": "bbb" },
"apply": { "$ref": "#/$defs/foo-bbb" }
}
]
}
} This one is a little more consistent with the JSON Schema style (poor keyword naming aside), but otherwise has all the same problems as the other examples. {
"allOf": [
{
"propertyDependencyName": "foo",
"propertyDependencySchema": { "const": "aaa" },
"propertyDependencyApply": { "$ref": "#/$defs/foo-aaa" }
},
{
"propertyDependencyName": "foo",
"propertyDependencySchema": { "const": "bbb" },
"propertyDependencyApply": { "$ref": "#/$defs/foo-bbb" }
}
]
} This one is a variation of {
"allOf": [
{
"ifProperties": {
"foo": { "const": "aaa" }
},
"then": { "$ref": "#/$defs/foo-aaa" }
},
{
"ifProperties": {
"foo": { "const": "bbb" }
},
"then": { "$ref": "#/$defs/foo-aaa" }
}
] I read all the |
It most certainly does, but
I don't think our vocabulary system is equipped to support that goal right now. It's something I want to bring up in more detail at some point, but I'll try to explain briefly for now. Vocabularies are well equipped for creating dialects. It's good for when someone wants to include a version of JSON Schema in their own specification and provide their own tooling for that specialized use of JSON Schema. OpenAPI is a great example of who it would work well for. However, vocabularies are not good for individual schema authors who want to add a new capability to their schemas. Schema authors would need to create their own custom dialect every time they want to use a non-standard keyword in one of their schemas. I don't see people doing that until we can make changes to the vocabulary system to better support schema authors as opposed to specification authors.
I agree that they should be as generic as is reasonable. When making it more generic means that it no longer solves the problem it's intended to solve, then you've gone too far. Hopefully the alternative examples I posted help illustrate that supporting any value type is a step too far. |
I'd advise separating the question of extension vs in-spec vocabulary from the discussion of the keyword itself. These are largely orthogonal problems and can clutter the issue for readers not as deeply versed in the project. Also, keywords can always be pulled into the spec- initially releasing it as an extension vocabulary for use with 2020-12 does not preclude pulling it into the next draft (or a later one). While I have a knee-jerk "what about other types" reaction, I agree that a keyword that is fundamentally about making a common use case easier should focus on that use case. It's already possible to handle arbitrarily complex conditions, so we don't really need another system for arbitrary complexity. I think boolean and numeric coercion is a reasonable approach for those simple cases, but don't have strong feelings on it. You would technically lose the ability to distinguish between Since I haven't been commenting for a while, let me just note that I'll probably start dropping the occasional comment on issues, but do not intend to dive back in at anything resembling my previous levels of involvement anytime soon. Please just treat my comments as offered opinions, and don't wait on me if I don't reply to something promptly. I'm thrilled to see all of the work that is going on and do not want to disrupt the flow you all have going! |
It's good to hear from you Henry.
This is what I've been trying to say, but I think you said it more succinctly. Thank you. |
I implemented Note: Things in my |
I still don't like that this only supports cases where values are strings. I think that one aspect makes this too specialized to include in JSON Schema proper. |
@gregsdennis What would it take to convince you? We have the generalized solution in |
The way I understand this it's a hard choice between getting O(1) validation vs. supporting values other than strings. The only way to get O(1) choice of a schema is with objects and well, the names have to be strings. Doing type-coercion here seems like a huge nono to me, that stuff is a big part of why I develop Python instead of JS these days. I'm quite surprised that people are so concerned with getting this validation with O(1) vs. O(n) as described in @jdesrosiers post here, but he certainly knows better than me. On the other Hand I really understand @handrews and @gregsdennis aversion to only support string values. I for one have at least one case in mind where I use integers to discern the "type" of object. Couldn't the logic from the original proposal be applied if |
@NiklasBeierl Thank you for your feedback! I really want to hear more voices on this issue.
O(1) is not the property I'm trying to optimize for. We already have a solution that supports all values:
I'm certainly open to a hybrid solution and/or other alternatives, but I don't think alternative 2 is a good solution. I think it's too verbose. But, mostly it's going to be a hard sell because it requires a keyword with sub-keywords, which is something that has been aggressively avoided in JSON Schema in the past. I personally think that the use case for values other than strings is rare enough that it's ok to have people use |
Please consider including {
"type": "object",
"properties":
{
"type":
{
"enum":
[
"A",
"B",
"C",
"D"
]
}
},
"required": ["type"],
"propertyDependencies":
{
"type":
{
"A": { "$ref" : "..." },
"B": { "$ref" : "..." },
"C": { "$ref" : "..." },
"D": { "$ref" : "..." }
}
}
} vs. {
"type": "object",
"required": ["type"],
"propertyDependencies":
{
"type":
{
"A": { "$ref" : "..." },
"B": { "$ref" : "..." },
"C": { "$ref" : "..." },
"D": { "$ref" : "..." }
}
},
"defaultPropertyDependency":
{
"type": false
}
} I'm assuming In general, having to repeat the discriminator values in |
I think I see the usefulness of this, but my concern about non-string values remains. That said, I think if we were to change the name from Perhaps something like "selector," following the "discriminator' example. Or "switch" following in the "if/then/else" theme ("select" works here, too, since that's the word that's used in some languages for the same construct). You'd end up with {
"switch": {
"foo": {
"aaa": { "$ref": "#/$defs/foo-aaa" },
"bbb": { "$ref": "#/$defs/foo-bbb" }
}
}
} which to me actually reads very similar to switch (foo) {
case "aaa":
return Validate("#/$defs/foo-aaa");
case "bbb":
return Validate("#/$defs/foo-bbb");
default: // with @letmaik's suggestion 👍
return false;
} With this, it's obvious that we're switching out the schema we're using to validate the object based on the value in one of its properties. One thing that stands out, though is a limitation on using a property at the (local) instance root. What if the discriminating property is nested inside the instance, for example, in some meta-data container property? Could we use pointers (not URIs) to indicate the instance path? {
"switch": {
"/meta/foo": {
"aaa": { "$ref": "#/$defs/foo-aaa" },
"bbb": { "$ref": "#/$defs/foo-bbb" }
}
}
} |
@letmaik Thanks for calling this out. It's been really helpful to think through some of the use cases.
I understand the case you're calling out. If you need to define a closed set of dependencies, the property values would have to be kept in sync between the I see how
Actually, I was thinking |
I'm more than happy to come up with a better name. I chose the name because it works exactly like
I don't follow this suggestion. "selector" is not a term used with
My concern with "switch" is that switch statements have fall-through semantics, which is why you need to put
I disagree that switching schemas is a good description of what this is doing or even what it should be doing. Switching, to me, sounds like something is being swapped out for something else or a path is being chosen from a set of alternatives. Nothing is being replaced and these aren't exactly alternatives. Schemas are applied if they match a condition. Sometimes multiple schemas will apply and other times none will apply.
We could do that. It slightly complicates the normal case, but only slightly. However, I think I'd prefer to just have people fall back to |
I think I just phrased that in a misleading way. It's not three repetitions, but three occurences. The one I haven't explicitly written down would be in the subschemas. For example, the schema referenced in {
"properties": {
"type": { "const": "A" },
...
} If you were able to use
Interesting, I guess that would work and probably be what is needed in most cases. I'm all for making it less verbose. |
How could multiple apply? The options are the value of a property, and they're explicitly stated. A If you're uncomfortable with
To me that's exactly what's going on. You're making a selection of which subschema to apply to an object based on one of its property values. |
Random idea: |
Nah, I'll take it back. This may imply you want to define a schema for the actual value, not that something should happen based on a value. |
Haskell has case expressions and they look like this: case type of
"A" -> ...
"B" -> ...
"C" -> ... Maybe that's another candidate: "case": {
"type": {
"A": { "$ref": ... },
"B": { "$ref": ... },
"C": { "$ref": ... }
}
} |
I see what you mean. As you say, the one in the sub-schemas isn't necessary. If you want to include it, then, yes, that would be another occurrence.
Yes, you could have multiple properties apply. That's all I meant.
I still don't think that selecting is the right metaphor, but it's not worse than
I had considered
That's a possibility. Or, something with the word "case" in it. This type of feature is generally referred to as pattern matching, so I had considered something with those words as well, but "pattern" isn't quite right here. Maybe something like |
I've been following this issue for a while due to interest and need and thought I'd contribute an idea I've had given the recent comments {
"caseOf": {
"properties": [
"p1",
"p2"
],
"cases": [
{
"case": [1, 2],
"is": {"$ref": "..."}
}
]
}
} What's cool about
wdyt? |
My first hesitation on I'm not opposed to the idea, but I'd want a really good reason why such a construct would be necessary instead of constructs like we already have. |
@bsless Thanks for sharing your ideas. It's really appreciated. We need more people thinking about these problems. My main concern about this approach is that there are is a lot going on. One of my main goals for this keyword is to keep it simple enough that people will use it rather than solve every possible use-case. The ability to validate this kind of thing already exists with I think your proposal is essentially the same as the alternative I discussed in #1082 (comment). Your example would look like this. {
"allOf": [
{
"ifProperties": {
"p1": { "const": 1 }
"p2": { "const": 2 }
},
"then": { "$ref": "..." }
}
]
} I think this is just as powerful, but more ideomatic JSON Schema. In any case, I'm afraid people won't want to reach for either of these for the most common case because it's too verbose and use |
Thank you @jdesrosiers I see a slight difference between #1082 (comment) and my suggestion, and it relates to two elements in my rationale:
What makes It's possible these two converge and I'm just not seeing it, this isn't a simple subject. |
@bsless I'm not following everything you're saying. I think it's because you're coming from a perspective that's a bit out of scope for JSON Schema ...
Keep in mind that JSON Schema is a validation language, not a type definition language. Mapping to type systems is out of scope. But even so, I don't see how
Maybe I don't understand your example, but I don't see how
There is no limitation.
They are just sugar over |
For now, causes an uncertainty from removeAdditional / useDefaults, but that could be resolved/optimized in some certain situations later. Refs: json-schema-org/json-schema-spec#1082 Refs: json-schema-org/json-schema-spec#1143
@jdesrosiers @gregsdennis @karenetheridge Can we use something like this: {
"propertyDiscriminator": {
"foo": [
[
["aaa", 2],
{
"$ref": "#/$defs/foo-aaa"
}
],
[
["bbb", false],
{
"$ref": "#/$defs/foo-bbb"
}
]
]
}
} If Or just simplify {
"propertyDiscriminator": {
"foo": [
[
"aaa",
{
"$ref": "#/$defs/foo-aaa"
}
],
[
2333,
{
"$ref": "#/$defs/foo-bbb"
}
]
]
}
} And this is O(1) selection. For example: const schema ={
"propertyDiscriminator": {
"foo": [
[
"aaa",
{
"$ref": "#/$defs/foo-aaa"
}
],
[
2333,
{
"$ref": "#/$defs/foo-bbb"
}
]
]
}
}
const fooMap = new Map()
for (const [key, ref] of schema.propertyDiscriminator.foo) {
fooMap.set(key,ref)
}
// then validate obj
const obj = { foo: 2333, bar: 1234 }
const ref = fooMap.get(obj.foo) // O(1) selection And then we can support both numeric support is important because some company always use numeric enum as discriminator. |
@xiaoxiangmoe The second version is an interesting option. One of my main design goals for this keyword has always been to keep it as simple as possible or people won't use it. This is a little more complex, but I don't think it's too much. However, this isn't actually an O(1) process. The lookup from the Map is O(1), but the cost of building the Map is O(n). Implementations that have a compile step could produce the Map in the compile step and then evaluate instances using that Map in O(1), but most implementations don't have a compile step. Overall, I think you've presented a reasonable alternative for this keyword. It makes it more versatile without adding too much complexity. It's technically a performance regression, but it can be optimized to O(1) in a lot of cases. In any case, we're talking about very small values for "n", so O(n) isn't much worse that O(1). Since this issue is closed, could you please create a new issue or discussion proposing this change? |
This also address my concern about only supporting string values. I think we could technically support any JSON value. I'd be open to exploring this. |
The pattern of using
oneOf
to describe a choice between two types has become ubiquitous.However, this pattern is also notorious for its many shortcomings. There is a better way to describe this kind of constraint that doesn't have all the problems of the
oneOf
pattern, but it's verbose, error prone, and not particularly intuitive.OpenAPI addresses this problem with the
discriminator
keyword. However their approach is more oriented toward code generation concerns and is poorly specified when it comes to validation. I don't think we should adoptdiscriminator
, but I do think we need something like it. I believe this is the thing that is generating the most questions in our community right now.Right now, we have the
dependentSchemas
keyword that is very close to what is needed except it checks for the presence of a property rather than it's value. ThepropertyDependencies
keyword builds on that concept to solve the problem.If the instance is an object, then for every property (name) and value of that property (value), if
/propertyDependencies/{name}/{value}
is defined, then the instance must be valid against the schema at that location. Compared todiscriminator
, this is more consistent with the style of JSON Schema keywords because it doesn't use sub-keywords likepropertyName
andmappings
. It's also more powerful because it allows you to discriminate using more than one property.Because of the parallels to
dependencies
, I chose to name it in similar way. However,dependencies
is the least well known and understood keyword, so it might not be beneficial to build on that naming convention. Either way, I don't think we should call itdiscriminator
to avoid confusion with what OpenAPI has specified.The text was updated successfully, but these errors were encountered: