Add date format validation to test_extract_from_text_properly_implemented on test_ScraperExtractFromTextTest.py #838
Comments
I'm going to go ahead and just push the update/fix for BAP1, but I like the idea of implementing the correct format. Thanks for tackling this.
I put this in the PR, but I think the proper fix for this is to do a JSON schema for our outputs. It'd help folks understand the code too, if we had schemas for all our scrapers that had to pass all tests before PRs were merged.
Another related error from lack of validation. As you say, @mlissner, a JSON schema to validate would definitely help.
Yeah, let's get that prioritized. It shouldn't be terribly hard. Maybe a day or two, I'd guess.
I have been trying this implementation (docs), which seems like a healthy project. There is a small sample schema for the scrapers here.
Some nice things:
In the end, I think it will be faster to write these schemas by hand since, at least for the scrapers, the scraped field names that CourtListener expects are different from the model names proper, so changing that on the scraping side would require changes on the CL side. This schema validation could replace a part of
Looks great to me.
Add tests for required properties, for types and formats, and for additional properties, to ensure the validator and the schemas work as expected. Related to freelawproject#838
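For readers following along, here is a rough sketch of the three kinds of checks that commit message names: required properties, types and formats, and additional properties. It is a stdlib-only stand-in, not the JSON Schema library discussed in this thread, and the schema layout, the `validate` helper, and the field names are all hypothetical, chosen only for illustration.

```python
import re

# Hypothetical hand-written schema for one scraped item.
# Field names and the schema layout are illustrative assumptions only.
SCHEMA = {
    "required": ["case_name", "date_filed"],
    "properties": {
        "case_name": {"type": str},
        "date_filed": {"type": str, "pattern": r"\d{4}-\d{2}-\d{2}"},
        "docket_number": {"type": str},
    },
    "additional_properties": False,
}


def validate(item, schema):
    """Return a list of error strings; an empty list means the item passed."""
    errors = []
    props = schema["properties"]

    # 1. Required properties must be present.
    for name in schema.get("required", []):
        if name not in item:
            errors.append(f"missing required property: {name}")

    for name, value in item.items():
        # 2. Unknown properties are rejected when the schema forbids them.
        if name not in props:
            if not schema.get("additional_properties", True):
                errors.append(f"unexpected property: {name}")
            continue
        rule = props[name]
        # 3. Type first, then format (the pattern must match the whole value).
        if not isinstance(value, rule["type"]):
            errors.append(f"wrong type for {name}")
        elif "pattern" in rule and not re.fullmatch(rule["pattern"], value):
            errors.append(f"bad format for {name}: {value!r}")
    return errors
```

Calling `validate(item, SCHEMA)` on a scraped item returns an empty list when it conforms, so a test can simply assert on that. A real JSON Schema validator would cover far more (nested objects, enums, true calendar validation of dates), which is the argument for adopting one rather than growing a helper like this.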
We had an error on CourtListener when extracting date_filed using extract_from_text from the recently added bap1.

The function test_extract_from_text_properly_implemented on test_ScraperExtractFromTextTest.py should force the user to use the proper format when dealing with date fields.
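A minimal sketch of what such a check might look like, under some assumptions: that the result of extract_from_text is a dict of model names mapping to dicts of fields, and that the expected date format is YYYY-MM-DD. Neither is confirmed in this issue, and `find_bad_dates` is a hypothetical helper, not part of the existing test.

```python
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d"  # assumed target format (ISO 8601 calendar date)


def find_bad_dates(extracted, date_fields=("date_filed",)):
    """Return (model, field, value) triples whose value is not a YYYY-MM-DD string.

    `extracted` is assumed to look like {"OpinionCluster": {"date_filed": "..."}}.
    """
    bad = []
    for model, fields in extracted.items():
        for name, value in fields.items():
            if name not in date_fields:
                continue
            try:
                if not isinstance(value, str):
                    raise ValueError("date fields must be strings")
                parsed = datetime.strptime(value, DATE_FORMAT)
                # strptime tolerates unpadded input such as "2023-1-5",
                # so require the value to round-trip exactly.
                if parsed.strftime(DATE_FORMAT) != value:
                    raise ValueError("date is not zero-padded")
            except ValueError:
                bad.append((model, name, value))
    return bad
```

A test could then assert that `find_bad_dates(result)` is empty for every example a scraper provides, which would have flagged a value such as "01/05/2023" before it reached CourtListener.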