Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of oneOf error reporting #31

Closed
awwright opened this issue May 24, 2016 · 20 comments
Closed

Improve handling of oneOf error reporting #31

awwright opened this issue May 24, 2016 · 20 comments

Comments

@awwright
Copy link
Member

In cases where oneOf is used for multiple mutually exclusive options, it is frequently the case that the option to pick is a single key within the instance. E.g. the "type" property will be from the enum "Animal, Vegetable, Mineral" and the appropriate schema to apply is picked based on the value of this property.

Right now, all schemas must be tested (similar to an O(n) operation). Only one schema corresponding with an e.g. "type" property need be tested (similar to an O(log n) or O(1) operation), and only errors against that schema need be reported. Otherwise we get bizarre, non-helpful errors like so:

(1 of 1) instance.o3 is not exactly one from <http://example.org/Animal>,<http://example.org/Vegetable>,<http://example.org/Mineral>
@awwright
Copy link
Member Author

Upon further investigation, this is probably a mis-use of JSON Schema.

If there's multiple different, but similar, schemas, they should each get a different schema URI.

@handrews
Copy link
Contributor

This is addressed in much more detail in #64 because I did not realize that you had re-filed the bounding proposal from the old wiki here in this issue under a different name.

This is why I am so bothered by all of the open issues and pages in the old repo. Leaving them open without any link to this repo makes it look like they're not tracked here, which means duplicate filings.

Would you be willing to close this one? Or I could close the other and import all of the info from it, but the title of this issue is much more narrow.

@handrews
Copy link
Contributor

Now that we have const it is easy to check that keyword first and short-circuit any evaluation. I don't think there's anything else needed to specifically address this issue, and #64 covers other related concerns. Can we close this one?

@awwright
Copy link
Member Author

I think we can still offer some guidance on how validators can expose these cases to developers.

They can produce output like:

(1 of 2) instance.o3 is not exactly one from http://example.org/Animal,http://example.org/Vegetable,http://example.org/Mineral
(2 of 2) (if instance.o3.type=="animal") Missing expected property "species"

@handrews
Copy link
Contributor

@awwright true, but does that go in the spec or on the web site? An implementation notes section perhaps? I'm asking because I don't know- I do not have an opinion on where it should go, I'm just trying to learn what the appropriate scope for this sort of thing is in a spec.

@awwright
Copy link
Member Author

The spec is probably an appropriate place for a limited amount of advice on how to write the implementation.

JSON (RFC7159) might be a good example, while all it is is a media type; the RFC also has a brief section on parsers.

@handrews
Copy link
Contributor

handrews commented Dec 4, 2016

Right now, all schemas must be tested (similar to an O(n) operation). Only one schema corresponding with an e.g. "type" property need be tested (similar to an O(log n) or O(1) operation), and only errors against that schema need be reported.

@awwright The selectWith proposal in #64 fixes exactly this, without changing validation at all. It just enables an optimization and better error reporting.

@epoberezkin
Copy link
Member

I summary of our conversation with @handrews: I do not understand how the value that selectWith points to maps into any subschema inside oneOf/anyOf without using any complex algorythms.

The alternative idea is to have another validation keyword, e.g. selectCases that would have a map of subschemas.

@handrews
Copy link
Contributor

handrews commented Dec 5, 2016

@epoberezkin I'm not at all attached to the instance pointer version of selectWith. It could work with schema pointers you need to be able to extract the schema from each branch and apply it. Honestly, I'm becoming less supportive of this option. It fits the declarative approach but doesn't make it that much easier to understand.

In my experience, engineers needed to ramp up a bit on how to use "oneOf" but settled into it pretty well once they tried it out a bit. That's my main reason for continuing to object to imperative approaches. It's a dramatic departure to solve a "problem" that in my experience just needs a bit of explanation and best practices documentation.

The web site needs a schema patterns cookbook section, really.

@handrews
Copy link
Contributor

handrews commented Dec 7, 2016

So there seems to be a consensus developing around @epoberezkin if/then/else proposal in #180 (which avoids the complexity of switch that was my strongest objection to that option).

@epoberezkin
Copy link
Member

epoberezkin commented Dec 7, 2016

select could solve a related problem which although can be expressed with if/then/else looks quite clunky with it.

The thing I don't like about it is that it also uses a pointer to the data in a way similar to proposed $data reference without using it explicitly. So I think select can be considered, but only if we adopt $data.

With it we could:

{
  "select": { "$data": "0/kind" },
  "selectCases": {
    "value1": { "$ref": "schema1" },
    "value2": { "$ref": "schema2" }
  },
  "selectDefault": { "$ref": "defaultSchema" }
}

and you can "selectDefault": false.

The same can be expressed with conditional:

{
  "anyOf": [
    {
      "if": { "properties": { "kind": { "const": "value1" } } },
      "then": { "$ref": "schema1" },
      "else": false
    },
    {
      "if": { "properties": { "kind": { "const": "value2" } } },
      "then": { "$ref": "schema2" },
      "else": false
    },
    {
      "if": {
        "not": {
          "required": ["kind"],
          "properties": { "kind": { "enum": [ "value1", "value2" ] } }
        }
      },
      "then": { "$ref": "defaultSchema" }
    }
  ]
}

It's nowhere near as bad as if/then/else expressed with "anyOf" etc. But given how common this use case is (and usually you have much more than 2 options) I would very much like to see a shorter and much more expressive select form.

But let's think about it. The key for it is $data, introducing select without it is a bad idea I think.

@handrews
Copy link
Contributor

handrews commented Dec 7, 2016

@epoberezkin yeah, with $data it starts to make sense. I'm not entirely sold on it as I dislike having too many different ways to do things (I always hated Perl for that- and I've written tens of thousands of lines of it so I know what I'm talking about :-) But it's probably worth coming back to once we see how people use if/then/else in the wild, assuming it goes into Draft 7. I don't find the anyOf + if/then/else to be to hard to read, but then again I'm fine reading "conditionals" with oneOf so I'm probably not the best person to say here.

@epoberezkin
Copy link
Member

I don't find the anyOf + if/then/else to be to hard to read

I agree. But I've just realised that handling the default case makes it particularly ugly, as you have to repeat all the conditions in the negated form (because there is no order defined for anyOf validation)

@handrews
Copy link
Contributor

handrews commented Dec 7, 2016

But I've just realised that handling the default case makes it particularly ugly

Hmm.... let's wait and see how much demand there really is for the default case.

@epoberezkin
Copy link
Member

I agree

@handrews handrews added format and removed format labels May 12, 2017
@handrews handrews added this to the draft-07 (wright-*-02) milestone May 16, 2017
@handrews
Copy link
Contributor

I'm not sure why I put this in the draft-07 milestone. I guess because we were putting if/then/else in this milestone? I'm going to bump it out to future. We'll either implement this or close it unimplemented depending on feedback on if/then/else.

Note also that this all somewhat like OpenAPI's discriminator keyword.

@awwright
Copy link
Member Author

@handrews
Copy link
Contributor

@awwright I think this is addressed by the output guidance in the latest draft? Plus if/then/`else being usable to increase the performance by prioritizing the most important tests.

Since this issues ended up with a couple of related ideas in it, and the discussion is by now quite outdated, I'm going to close it and if there are still outstanding concerns can you file them as new separate issues?

@colevscode
Copy link

I've searched through several threads here. There's no clarity for how I should implement an array with oneOf containing several distinct subschemas. I'm using python jsonschema for validation which supports draft7. I started with just an array of subschemas, but it gives me back meaningless errors when none of the subschemas match. I tried the above if/then/else:false pattern which caused the validator to fail in cases where it should have passed. I'm resorting to a bunch of nested if/then/else statements, but as I add more cases this will become ugly.

I'm here to +1 a switch statement. Barring that, is there a way to provide a mechanism for resolving oneOf ambiguity that can be guaranteed by the spec, as opposed to a gray area left up to implementors?

Here's what my code looks like today:

    "items": {
        "if": {"properties": {"type": {"const": "email"}}},
        "then": {"$ref": "#/definitions/email"},
        "else": {
            "if": {
                "properties": {"type": {"const": "webhook"}}
            },
            "then": {"$ref": "#/definitions/webhook"},
            "else": {
                "if": {
                    "properties": {
                        "app": {"const": "mailchimp"},
                        "type": {"const": "addOrUpdateContact"},
                    }
                },
                "then": {"$ref": "#/definitions/mailchimp"},
                "else": {
                    "if": {
                        "properties": {
                            "app": {"const": "convertkit"},
                            "type": {"const": "applyTags"},
                        }
                    },
                    "then": {
                        "$ref": "#/definitions/convertkit",
                    },
                    "else": False,
                },
            },
        },
    },
    "type": "array",

@karenetheridge
Copy link
Member

karenetheridge commented Jul 18, 2020

Instead of nested {if: .., then: .., else: { if: .., then: .. }} statements, you can bring all the cases up to the same level with allOf and if/then (with no else):

allOf: [
  {
    if: { case1 },
    then: { .. }
  },
  {
    if: { case2 },
    then: { .. }
  },
  {
    if: { case3 },
    then: { .. }
  }
]

Bonus: this also allows you to have overlapping conditions.

If you don't have overlapping conditions (i.e. they are mutually exclusive) and your implementation supports short-circuiting, then you could change the allOf to an anyOf and then the first if/then that passes will result in the other conditions not being checked (which could be a big boon to speed if you have a lot of subschemas to check).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Development

No branches or pull requests

5 participants