-
-
Notifications
You must be signed in to change notification settings - Fork 279
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
$combine
: Re-use with "additionalProperties": false
#119
Comments
@handrews thank you for that, it makes your proposal quite clear. I do like your idea that "schema algebra" can be outside of the JSON-Schema standard. As you noted, the complexity of consistent implementation is a concern, and I believe that consistency is very important. I'd rather it's very strictly specified in a very detailed way how it is supposed to work than leaving it to implementations. Inconsistency leads to a limited usability and in difficulties with migration between implementations. My bigger concern about "schema algebra", leaving complexity aside, is that it leads to more performance-expensive schemas in many cases. See your anyOf example, e.g., where A will have to be validated twice, combined with B and combined with C. For many real life applications it is a critical concern. The argument "too powerful" applies here as well I believe. I'd say it's more powerful than $merge, judging by the extent of changes it inflicts on schemas. Also, it is much more complex than generic extension mechanism that would be introduced by $merge and even by $use (which is as I wrote has a more complex definition than $merge). I am cool with $merge and $use staying out of JSON-Schema standard as well. Maybe @fge would be interested to collaborate on a separate I-D for $merge/$patch (that can be defined independent of JSON schema and instead rely on RFC6901 (JSON pointer), RFC3986 (URI) and JSON reference (draft 3?) - is there RFC already? ). And you can make a separate I-D(s) for $use and/or for "schema algebra". Then validator authors can decide what to implement and after some time we can decide if any of them should be merged (or used :) in the standard based on the number of users that use the features. How does it sound? |
As I said, I'm fine with none of these things making it into the standard, so that's fine with me. All I really care about is keeping both I don't care if
@awwright killed JSON Reference as a separate RFC. I'm still not sure I agree, but it's very low on my list of concerns. Although if you do make a |
It might be fun and useful to write out the schema algebra rules and put them on the JSON Schema web site. They don't need to be an I-D of their own, they all follow directly from the JSON Schema specification. If there's interest in that I can track it with a bug against the web site project. |
I'm working with large standardization group and we have quite some JSON Schema code base, where code reuse is top goal, so problems mentioned in google group post I have encountered multiple times. While I'm totally convinced JSON Schema just needs some extension keywords in order to allow re-use and extension, I can offer slightly different approach:
I slightly disagree on sample given:
Here obviously 'classical' merge/patch/combine is attempted and |
@DimDragon how would your |
In my mind
Now suppose I want to use this schema as base and just add In our library we had to deal with exact issue you sampled and while we keep ```"additionalProperties": false" appearances very low, we created something like a 'convention' to deal with this problem:
I'm not advocating strongly for such extension. I only wanted to point out that for me the core of the problem lies in using go to https://jschema.com and try this schema:
you will see kind of 'combine' in action, if you however attempt:
you will see the 'incompatibility' error. Hope this is helpful to demonstrate the 'abuse' of |
VOTE-A-RAMA!!! It's time to gauge community support for all re-use/extension/additionalProperties proposals and actually make some decisions for the next draft. Please use emoji reactions ON THIS COMMENT to indicate your support.
This is not a binding majority-rule vote, but it will be a very significant input into discussions. Here are the meanings for the emojis:
If you want to explain in more detail, feel free to add another comment, but please also vote on this comment. The vote should stay open for several weeks- I'll update this comment with specifics once we see how much feedback we are getting and whether any really obvious patterns emerge. |
With six against and two for (one person voted twice :-P ) I'm pretty comfortable closing this as non-viable even though it's only been a week and a halr. No one was close to volunteering to advocate for this, so unless someone comes along to step into that role, this is not going to be accepted. Also, it's my idea, and I think it's a mess :-) But if someone wants to champion this, please comment and ask to re-open it. |
fwiw, I'm not confident enough to advocate for the specific proposal, but this problem absolutely needs to be solved, and I'm happy to advocate for any solution that solves it. |
[Filing at the request of @epoberezkin in #15]
Background
This is a proposed solution for one of the ways people want to re-use schemas that declare
"additionalProperties": false
. It could be made part of the JSON Schema specification, either as a required or optional feature, or it could simply be published to illustrate how a completely separate tool could be built to work around the limitation. This separability hinges on the proposal being in the form of a preprocessing step.The reason to have the separability is that there is no other way to address this use case without violating at least one desirable property of JSON Schema validation, or introducing arbitrary JSON transformation capabilities into a standard that should be focused on schemas.
Allowing for partial implementation may also be a good idea, as any full solution will be complex to implement. A useful but limited solution is quite easy.
Keywords
$combine
and$combinable
There are two keywords in this proposal, both marked with a
$
to indicate that they (like$ref
) are structural elements outside of validation proper, and can be preprocessed out (either entirely, or lazily as needed in the case of$ref
cycles).$combine
implements the oft-requested re-use pattern of applyingadditionalProperties
only after inventorying allproperties
andpatternProperties
in all constituent schemas.$combinable
is a boolean keyword. Iffalse
, then$combine
has no special effect on the schema- it behaves likeallOf
(explained in detail below). This preserves an the use case of usingadditionalProperties
to intentionally close a schema off to this sort of "consider all properties" extension.If
$combinable
defaults totrue
, then current schemas must opt-out of being usable with$combine
. If it defaults tofalse
, then current schemas must opt-in. Opting in would be compatible with the current draft, so I recommend that$combinable
default tofalse
to avoid automatically circumventing any intent to close a schema to re-use.Formal description of the limitation of
additionalProperties
Note: This assumes that #101 (allow
true
andfalse
for all child schemas) is implemented, as it makes this whole thing about ten times more clear).For nearly all validation keywords k1, k2, k3... the following (which @awwright referred to as "linearity" in issue #65 ) holds true:
is equivalent to
There are only a few exceptions (see also issue #77 ), and they tend to be vexing.
additionalProperties
is the one that trips people up the most. Here are its properties in terms of "schema algebra" (for lack of a better term):is equivalent to
You can separate each entry in
properties
andpatternProperties
, and handle the child validation in each of those separate entries, but the only simplification that can be done to a factored-outadditionalProperties
is to reduce theproperties
andpatternProperties
down to just validating the instance object property names (by giving blank ortrue
child schemas) and leaving the child validation to the separated schemas.This is because
additionalProperties
determines how to behave by taking the set of instance properties, removing all properties that match eitherproperties
orpatternProperties
, and applying its child schema to the value of any properties that remain. So the child schemas inproperties
andpatternProperties
have no effect onadditionalProperties
and can be factored out, but the property names and patterns do.This is fundamentally why re-using schemas that use
additionalProperties
is so problematic for many people. It comes up most when it's set tofalse
, but the underlying problem affects any use of a non-defaultadditionalProperties
.Note that
required
,minProperties
, andmaxProperties
do not have to be carried into the factored-outadditionalProperties
schema, as they do not impact the behavior ofadditionalProperties
. Issue #65 proposes also removing any property named inrequired
from the set of properties affected byadditionalProperties
, which would mean thatrequired
would also need to be carried along in the factored-out schema. This is why I oppose #65.Re-use and
additionalProperties
In the thread about use cases for JSON Schema re-use, we agreed that there have been several use cases motivating various attempts to get around this algebraic limitation. A simple example of what's desired is that people want this:
to be equivalent to
Instead, that
allOf
combination can never validate- the foo schema forbids any property other than "foo", while the bar schema requires the property "bar".Note that an analogous problem occurs if
additionalProperties
is set to any non-blank schema. For instance, if it was set to{"type": "string"}
it would still be an impossible schema as that would be applied to "bar" which already required its value to be a number.patternProperties
behaves in an analogous way, so I mostly will ignore it to keep things (relatively) simple.Solving the problem with a pre-processing step
I propose to solve this with a keyword,
$combine
, indicating a pre-processing step:$
to indicate that, like$ref
, there is something structural going onadditionalProperties
shown at the beginning of this proposalExample conversion
Here is the conversion, replacing
allOf
with$combine
. After preprocessing, this:becomes this:
The only change to the first two entries is removing
additionalProperties
. The new third entry just collects the variousproperties
(andpatternProperties
if they were present) with blank schemas, and includes them with the factored-out"additionalProperties": false
.The same thing would happen when considering our alternate example with
"additionalProperties": {"type": "string"}
. The first two would look exactly as they do above, and the third would look the same except for{"type": "string"}
in place offalse
.Difficulty and optional-ness
Making these conversions is easy in the simple case. Even when there are child schemas, it's pretty easy to implement:
is equivalent to:
$combine
gets challenging when combining complex schemas that themselves involve boolean keywords. Implementing$combine
in such a situation requires implementing a kind of "schema algebra". Requiring this of all implementations seems excessively burdensome. As withformat
, I'd recommend specifying this but leaving it up to implementations whether to support all or part of the feature. In particular, an implementation may want to implement$combine
except not support combining schemas that contain boolean keywords.Here are the challenges for the boolean keywords:
allOf
Since we are converting to an
allOf
already, and boolean AND is associative, this is straightforward. It's just more branches to combine but the mechanism is the same. So{"$combine": [A, "allOf": [B, C]]}
is the same as{"$combine": [A, B, C]}
anyOf
An
anyOf
needs to be factored out, as the collection ofproperties
andpatternProperties
should not draw from all branches, but only from one at a time. So{"$combine": [A, "anyOf": [B, C]]}
becomes{"anyOf": [{"$combine": [A, B]}, {"$combine": [A, C]}]}
, which reduces the problem to our existing rules.not
This gets a little tricky. The fundamental principle with
not
is that:is equivalent to
which, by DeMorgan's law, is equivalent to:
The
not
doesn't affect the validation of the presence or absence of the properties that are named or matched by a pattern. It only affects the validation of child values.NOTE: Fully covering all of the things that can happen with
not
is complicated, and may require adding some new keywords likeminAdditionalProperties
to make possible in all cases. I'll elaborate on these issues if there is ever momentum behind this proposal. They are all solvable.oneOf
As
{"oneOf": [A, B, C]}
is equivalent to:It can be handled from this transformed point with the already established rules.
dependencies
Name dependencies are unaffected, but when dealing with schema dependencies we have to handle the conditional nature of how the dependent schemas are applied. This involves observing that:
Where Y is the rest of the schema containing
dependencies
, is equivalent to:This reduces the problem to a combination of
oneOf
andallOf
, which can therefore be reduced to a combination ofallOf
,anyOf
, andnot
, which gets us back to our known rules.Conclusion
This is, as far as I can tell, the only way to get this behavior while staying within the philosophical boundaries of JSON Schema.
I am not personally convinced that JSON Schema should directly incorporate any of the proposed "solutions" for the
"additionalProperties": false
"problem", but @epoberezkin requested in #15 that this be filed for the record.I strongly prefer this (plus #98 ) over introducing arbitrary JSON transformations as is done by #15 , but I would also be fine with rejecting both proposals (or even all three proposals including #98 ). All three proposals could be implemented separately and used either as a build step, or as an extended vocabulary indicated by another meta-schema.
The text was updated successfully, but these errors were encountered: