Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes the currently failing json-schema-test-suite tests #370

Closed
wants to merge 3 commits into from

Conversation

bsmithers
Copy link
Contributor

Currently using jsonschema at work and interested in getting draft6 support finalised. To jump in, I've provided a couple of quick fixes for the previously failing tests.

Before:
FAILED (skips=47, failures=3, errors=10, successes=1209)
After:
PASSED (skips=47, successes=1254)

Note that I pulled in the latest tests from the JSON schema test suit repo.

I was hesitant to have RefResolver needing to know about id vs $id, but I could see no other way without removing RefResolver.from_schema() from the interface. I also briefly looked at the tests being skipped, which appear to require RefResolver to be able to parse the id attribute, but I'll comment on that separately.

- Add custom type validator for draft6 (a la draft3), in order to fix
  the change in floats without fractional component (now considered
  ints)
- Add a 'scope_key' property to validators, to handle the difference in
  id vs $id.
@bsmithers bsmithers mentioned this pull request Nov 5, 2017
- uritemplate package latest version is 3.0.0, updated extra_requires to
  >=
- Fixed jsonpointer test where instance is not str
@Julian
Copy link
Member

Julian commented Nov 5, 2017

Hey! Awesome, thanks so much for trying to move this along, these 3 things (draft 6, ref v2, and perf) have all hit little walls which has left them hanging there :/, so definitely definitely appreciate some help moving them to completion.

Gonna have to leave some detailed comments, for now just doing a first pass.

@@ -155,6 +155,7 @@ for more information.
### PostgreSQL ###

* [postgres-json-schema](https://github.com/gavinwahl/postgres-json-schema)
* [is_jsonb_valid](https://github.com/furstenheim/is_jsonb_valid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm gonna just merge this immediately (the squash of the upstream changes)

@@ -56,7 +56,7 @@ def _validates(cls):
return _validates


def create(meta_schema, validators=(), version=None, default_types=None):
def create(meta_schema, validators=(), version=None, default_types=None, scope_key="id"):
Copy link
Member

@Julian Julian Nov 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this makes me a bit uneasy (which is why I didn't do it immediately myself, I was trying to convince myself what the right thing was...).

I think to be safe (future-proof) we need to make the interface a bit wider -- i.e., rather than just specifying the name of a key to use for scoping, we should provide something like a function here that the validator will use to detect scope (and yeah to make the current implementations just be lookups). I'm worried about the potential for some more complicated process, and once we already need an argument to do it, might as well have it be a generic one. Lemme know if that makes sense.

We also need to have direct tests for this (i.e. separate from any specific draft support).

Copy link
Contributor Author

@bsmithers bsmithers Nov 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lemme know if that makes sense.

You mean specifying a validator for id or $id to handle the scope change?

I did consider that, however I wasn't sure the corresponding architectural change would be nice and wondered if a simpler solution is better at least until such time as there are future differences. But I see your point.

We also need to have direct tests for this (i.e. separate from any specific draft support).

Agreed on that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be slightly narrower than a validator -- basically it is a function to replace the calls just to someschema.get("id") -- so, scope_of(schema), say.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And really, yeah this is tightly related to #346 unfortunately, which is to say, I think the even better thing is to come up with the right interface for that, provide the RefResolver class you want to use to this function, and have a function on that interface that is "scope of". I just can't decide whether it's obvious enough how that looks quite yet, and how much work it is to change it :/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

provide the RefResolver class you want to use to this function

That was kinda where my initial thought was with this too. Unless I'm misunderstood, my concern is that a user may specify a RefResolver instance when calling .validate() -- this would mean a user is then specifying the draft number twice, i.e. Draft4Validator and Draft4RefResovler, which seemed unpleasant? I'm not up to speed enough to determine if that's a narrow enough issue though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right so I think the idea there would be to stash what is passed in here as the resolver callable onto the validator interface -- i.e., the newer way to construct resolvers would become DraftSomethingValidator(resolver=DraftSomethingValidator.Resolver(...))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright well I think I've got a bit of a better picture of the direction you'd like this to go in. I'll try and take a look at the related issues and then see if I can work towards something. Of course I shan't be offended if you're actively working on these things and prefer effort elsewhere.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideas I have, time, much less so :(, especially given that the ideas are often only partially baked and need refinement or experimentation to see whether they fully fit -- so definitely definitely would appreciate the help!

I think the one thing that seems most clear at the moment is the stuff I mentioned about type, so I might recommend that one as the first self-contained area to poke at -- widen the type interface, get that merged, and then come back here and wiggle this onto the new one.

But there are many many paths forward and yeah I super appreciate any help.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've taken a look at the type stuff [1] - it's WIP and not yet ready for a PR, but if you have the time to give some feedback before I take it further it'd be really appreciated.

In particular:

  • Rather than the single is_type injected function discussed in WIP: Support namedtuple validation #365, I've taken the approach of a function per type. I figure this is more composable for extensions
  • I presume a need to maintain backwards compatibility with the old default_types and per-instance types properties. I believe this does so (tests using this pass) - but ideas appreciated
  • I feel this works well for 'simple' types, for example handling the change in integer in draft6 and e.g. a hypothetical scenario where strings encoding ints are valid under integer [2]. However, some difficulty remains for something like Support namedtuple #364 to extend object to allow namedtuple - a user would have to provide custom validators for all object-based checks, e.g. in my test case [3]. Wether or not this is an acceptable level of work for a custom type or we should look to facilitate this I'm not sure.

[1] draft6...bsmithers:types
[2] https://github.com/bsmithers/jsonschema/blob/types/jsonschema/tests/test_types.py#L56
[3] https://github.com/bsmithers/jsonschema/blob/types/jsonschema/tests/test_types.py#L95

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So! I think that looks like a great start -- if I can inject one idea there (which is only 80% baked I think, so lemme know if I go off a rail) -- I think the right thing here is to take a cue from what's happening on the ref resolver and format checker sides -- to introduce a TypeChecker class, one that you associate an instance of with a validator class in create and extend.

The one major mistake I'd love to correct would be to not have mutable APIs for that -- so, to talk about the specific namedtuple case, the API I'd expect would look roughly like:

def object_or_namedtuple(value):
    return Draft4Validator.type_checker.is_type("object") or getattr(value, "_replace", None) is not None

say, and then Draft4Validator.type_checker.redefine(object=object_or_namedtuple) which should return a new instance (which you'd use via Draft4Validator(type_checker=that) where the appropriate redefinition has happened (attr.evolve + a pyrsistent.pmap of checkers are the pieces we'd use to make that happen).

You can then also have type_checker.redefine(**types) to convert from the old format, and I think the above does seem like a reasonable amount of work to do the namedtuple case.

How does that sound to you (and as usual apologies for spitting information out slowly here after you've already gotten started :), as usual with these things what seems right only starts to come out and evolve after progress has begun).

if schema is True:
resolver = RefResolver.from_schema({})
_schema = {}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here is that now you will mis-report schemas.

One major gap in draft 6 support is writing some tests for the new schema coersion behavior (of bools) -- specifically, this will misreport what the schema was in ValidationError.schema.

@@ -391,6 +391,15 @@ def type(validator, types, instance, schema):
yield ValidationError(_utils.types_msg(instance, types))


def type_draft6(validator, types, instance, schema):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here too I think draft 6 taught us we need to widen the type interface. I put some notes in #364, maybe have a look there if you don't mind?

@@ -356,7 +358,7 @@ def __init__(
self._remote_cache = remote_cache

@classmethod
def from_schema(cls, schema, *args, **kwargs):
def from_schema(cls, schema, scope_key="id", *args, **kwargs):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this also makes me uneasy -- up until now, ref resolvers were completely separate from drafts (because the behavior was the same across them).

I don't love this idea -- now a caller needs to know implementation details of the draft they're using.

I'd much rather encapsulate that, even if that means that we now need to have separate ref resolvers per draft (and we'd also need to preserve exactly the old interface for a class called just RefResolver). Annoying, but I think the whole change is annoying :/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this also makes me uneasy -- up until now, ref resolvers were completely separate from drafts (because the behavior was the same across them).

The problem is that they're not, because that from_schema method needs to understand id. My initial thought was actually to drop that method completely, but of course that's extremely unpleasant for users in the wild.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right but what I mean is -- as far as RefResolver is concerned, when presented with any schema, there is an algorithm you can use to unambiguously know this (but now this introduces the possibility of error).

Even now, you have this algorithm:

  • Look for a $schema key
  • If you find one, you (RefResolver) know what the correct scope key is
  • If not, you use the "package" default (which I'm not too happy with how we're managing currently :/, but it's a thing)

You can take that algorithm, and layer it on top of an "explicit" one -- say a Draft3Draft4RefResolver and a Draft6RefResolver alongside a resolver_for_schema that calls the right thing as above.

(I know I'm being terse still, lemme know if that's unclear)

@@ -28,7 +28,7 @@
"jsonpointer>1.13",
"rfc3987",
"strict-rfc3339",
"uritemplate>3.0.0",
"uritemplate>=3.0.0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know that the > was failing here, but that's because the patch with this wasn't merged yet (and in fact 3.0.0 doesn't have what we need).

So I think it was correct as-is [and probably even needs some more work once the upstream PR is merged].

Maybe you can chime in upstream with any comments? The PR is python-hyper/uritemplate#36

@@ -296,6 +296,8 @@ def is_css3_color(instance):
draft6="json-pointer", raises=jsonpointer.JsonPointerException,
)
def is_json_pointer(instance):
if not isinstance(instance, str_types):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a test :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It already does - the test fail was what caused me to fix :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah cool. Hm, I thought I'd ran these after updating. Thanks for fixing.

@bsmithers bsmithers mentioned this pull request Nov 15, 2017
@manycoding
Copy link

@Julian Hi, could this be merged? It should help to resolve other pull requests, I reckon.

@Julian
Copy link
Member

Julian commented Aug 8, 2018

@manycoding will have another look now but I'm not even sure anything in this PR is relevant anymore, all the tests now pass already and a lot of the specific fixes here went in separately.

What specifically from this one are you looking at/for?

@manycoding
Copy link

@Julian I was hoping this would fix failing tests for my pr #425

@Julian Julian closed this Sep 3, 2018
Julian added a commit that referenced this pull request Jun 9, 2020
817b724 add perl implementation and test suite to the user list
ca14e01 Merge branch 'pull/382'
3dabf55 move non-format tests out of the format directory, while keeping all ECMA regex tests together
4121aa5 move format tests to their own directory
4bae8aa Add more idn-hostname tests to draft-2019-09 and draft-07
6d91158 [325] Add some more hostname tests
e593057 Merge pull request #389 from ssilverman/readme-drafts
fb3766d README: Improve issue/PR links
79bef22 README: Update language around drafts
ade47e4 README: Add Snow to the list of supporting Java validators
fc0c14e README: Update simple JSON example
1167669 README: Update structure, consistency, spacing, and wrapping
9514122 Merge pull request #388 from json-schema-org/ether/maxProperties=0
7646490 test that maxProperties = 0 means the object is empty
c3f4319 Merge pull request #384 from ChALkeR/chalker/unique
7766f48 Improve uniqueItems validation
7555d41 Add unnormalized $id tests
11f70eb [300] Add tests for valid use of empty fragments in "$id"
b106ff0 [299] Add tests for invalid use of fragments in "$id"
4a74d45 Fix "tilde" spelling
3eca41b Merge pull request #379 from json-schema-org/ether/remove-wrapped-refs
d61bae8 remove wrapped refs
536ec07 [359] Add unevaluatedProperties/unevaluatedItems cousin tests
ac63eb7 Small README update that introduces the concept of directories
697944e Merge pull request #374 from karenetheridge/ether/allOf-anyOf-oneOf
33f8549 test all the *Of keywords together
44b99ed Merge pull request #373 from karenetheridge/ether/items-and-contains
4a2b52f some tests of items + contains
7f00cc8 add test that was present for other drafts but not 2019-09
a3f9e2e Merge pull request #371 from karenetheridge/ether/if-then-else-boolean
aeeaecc some tests with if/then/else and boolean schemas
b8a083c Merge pull request #372 from nomnoms12/unique-false-and-zero
85728f1 Add tests for uniqueness [1] and [true]
fd01a60 Add tests for uniqueness [1] and [true]
0a8823c Merge pull request #370 from karenetheridge/ether/nul-char
fa6f4dd add tests for the NUL character in strings
8bf2f7f Merge pull request #369 from ssilverman/data-desc
2ba7a76 Add data description
4f66078 Merge pull request #367 from karenetheridge/ether/additionalItems
283da7c some more tests for additionalItems
7ba95f3 add tests for keywords maxContains, minContains
2f2e7cf Merge pull request #365 from karenetheridge/ether/move-ecma-regex
8388f27 move ECMA regex tests under format/

git-subtree-dir: json
git-subtree-split: 817b724b7a64d7c18a8232aa32b5f1cc1d6dd153
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants