
Validation tests #256

Closed
will-moore opened this issue Nov 10, 2021 · 5 comments
Labels
bug (something isn't working), question (further information is requested)

Comments

@will-moore

Hi,
I read in the README https://github.com/HumanBrainProject/openMINDS_core#tests that "In the tests directory you can find JSON-LDs designed to test the validation behaviour of each schema", but I don't see a tests/ directory?

I'd like to understand how OpenMINDS does validation of JSON data against a schema.
My initial investigation is described at ome/ngff#75

Thanks,
Will

@lzehl
Member

lzehl commented Nov 10, 2021

Hi Will, thanks a lot for raising this issue! Here some comments from my side, but I hope @skoehnen & @olinux will comment as well:

The tests/ directories are indeed missing at the moment. They are not used for the actual validation; they are meant for testing the validation behaviour itself. They are missing, and partially out of date, because we currently lack the manpower. I hope we can start tackling these open issues at the beginning of the new year at the latest.

The schemas themselves are validated by the openMINDS pipeline. Here I would refer you to @olinux for details.

As for the general validation of JSON-LDs (metadata instances) against the schemas: technically, you can use any JSON-Schema validator against the formal JSON-Schemas of openMINDS (not the *.schema.tpl.json files; see note below).
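To make this concrete, here is a minimal sketch using the Python `jsonschema` package. The schema below is a hypothetical stand-in for illustration only, not an actual openMINDS JSON-Schema:

```python
# Sketch: validating a metadata instance against a minimal, hypothetical
# JSON-Schema with the `jsonschema` package (pip install jsonschema).
# This is NOT an actual openMINDS schema.
from jsonschema import Draft7Validator

schema = {
    "type": "object",
    "required": ["@type", "fullName"],
    "properties": {
        "@type": {"const": "https://openminds.ebrains.eu/core/Person"},
        "fullName": {"type": "string"},
    },
}
validator = Draft7Validator(schema)

good = {"@type": "https://openminds.ebrains.eu/core/Person", "fullName": "Jane Doe"}
bad = {"fullName": "Jane Doe"}  # missing the required @type

# A valid instance yields no errors; an invalid one yields error messages.
assert not list(validator.iter_errors(good))
errors = [e.message for e in validator.iter_errors(bad)]
print(errors)  # one error about the missing '@type' property
```

`iter_errors` collects all violations instead of stopping at the first one, which is convenient for producing a full validation report.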

@olinux you can maybe explain how the validation for instances is done in the EBRAINS KG.

If you use the openMINDS Python library, validation should be done by the package itself, but this apparently does not work / is not implemented yet (@skoehnen, please comment).

Please note that openMINDS Python is released as an alpha version; not all features are fully implemented yet.
Validation is one feature we have not yet tested/discussed in detail, so I'm grateful that you brought this issue up!

I tested it myself, and the instances are indeed not fully validated yet (specifically, the expected value type of an instance does not seem to be checked correctly). This is a bit tricky because JSON-Schema cannot handle JSON-LD linkages well (as far as I know), which might be why this validation was not implemented yet.
@skoehnen can we correct / tackle this rather soonish?

Some notes/questions:

  • you can always look at the openMINDS schema template (*.schema.tpl.json), but the instances are validated against the respective formal JSON-Schemas.
  • what version did you select for the openMINDS collection?
  • can you explain a bit ngff for me/us?
  • let us/me know if you want to set up a meeting for further discussion.

🙂 Lyuba

@will-moore
Author

Hi Lyuba, thanks for your reply.
We at OME are working on a "Next generation file format" for (bio)-imaging data, which is based on the Zarr format, with the metadata being stored in JSON (maybe moving to JSON-LD in future). It's early days, but you can read the current ngff spec at https://ngff.openmicroscopy.org/latest/.

We are starting to look at how we can validate the JSON metadata in ngff data, since we expect these files to be produced by many different users and tools and we want to be able to quickly validate the output.
We've been looking at various options, e.g. SALAD and SHACL, and also openMINDS.

For the testing on ome/ngff#75 I had openMINDS==0.0.9.
I'll look more at JSON-Schema validation...

Will.

@lzehl
Member

lzehl commented Nov 11, 2021

Ah I understand. So for openMINDS we make use of existing validators.

The current target format for openMINDS is JSON-Schema for the schemas and JSON-LD for the metadata instances. Apart from the LD part, our instances can be tested with any JSON-Schema validator. The LD part is trickier: depth 1 (parent/child) is fine and can be covered by JSON-Schema with some tweaks, but actual graph validation across multiple nodes is not covered.
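To illustrate the "depth 1" point, here is a small pure-Python sketch (the helper name and IRI pattern are mine, not openMINDS API) of the only linkage check a flat validator can perform: confirm that a linked child is an object of the form `{"@id": "<IRI>"}`, without following the IRI to verify the child's actual @type:

```python
import re

# Hypothetical helper illustrating the depth-1 link check: JSON-Schema can
# require that a linked child looks like {"@id": "<IRI>"}, but it cannot
# dereference the IRI to validate the child node itself (the "graph" part).
IRI = re.compile(r"^https?://\S+$")

def check_link(value):
    """Return True if `value` is shaped like a JSON-LD link: {"@id": "<IRI>"}."""
    return (
        isinstance(value, dict)
        and set(value) == {"@id"}
        and isinstance(value["@id"], str)
        and bool(IRI.match(value["@id"]))
    )

assert check_link({"@id": "https://example.org/instances/person-1"})
assert not check_link({"@id": 42})          # @id must be a string IRI
assert not check_link({"name": "no link"})  # not a link object at all
```

Anything beyond this shape check (e.g. "the linked node must itself be a Person") requires resolving the link, which is exactly what plain JSON-Schema validators do not do.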

The openMINDS syntax (.schema.tpl.json) does not have its own instance validator at the moment (I also do not think we aim for one, but that may change). This means openMINDS instances are always validated against the schemas after translation to the JSON-Schema target format.

In the long run, we will probably also support SHACL as a target format for the openMINDS schemas.

Closer to your approach is probably the validator of BIDS (Brain Imaging Data Structure); not sure if you have checked that out already?

Thanks for raising the issue though. We definitely need to fix the validation of value types in the Python library 😉
I hope my comments clarified a bit more how we do our validation.

@lzehl lzehl added the bug (something isn't working) and question (further information is requested) labels Nov 11, 2021
@skoehnen
Collaborator

To add some links: we use this validator for JSON-Schema files: https://pypi.org/project/jsonschema/
The validation of JSON-LDs across links is definitely more complicated.

@lzehl
Member

lzehl commented Jul 26, 2022

@skoehnen just as a reminder that we could implement this as a feature for the new version of the Python library. I suggest implementing the validation as a separate function that a user can call when ready (with warning/error reports):

  • validation of individual instance, e.g., my_instance.validate()
  • validation of collection, e.g., my_collection.validate()

Validation should include simple value constraints (including the type of a linked child) for individual instances.
In the future, this can be complemented with model validation based on model constraints (which are currently not yet formulated in the schemas).
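A minimal sketch of what such an opt-in `validate()` API could look like (entirely hypothetical; the class names, fields, and checks below are illustrative, not the actual openMINDS Python implementation):

```python
# Hypothetical sketch of the proposed validate() API: each instance and each
# collection exposes a validate() method returning an error report (a list of
# messages; empty means valid). Not the actual openMINDS Python library.
from dataclasses import dataclass


@dataclass
class Instance:
    data: dict
    required: tuple = ("@type",)  # illustrative constraint only

    def validate(self) -> list:
        """Check simple value constraints; return a list of error messages."""
        errors = []
        for key in self.required:
            if key not in self.data:
                errors.append(f"missing required property '{key}'")
        return errors


@dataclass
class Collection:
    instances: list

    def validate(self) -> list:
        """Validate every instance and aggregate the error reports."""
        errors = []
        for i, inst in enumerate(self.instances):
            errors += [f"instance {i}: {msg}" for msg in inst.validate()]
        return errors


ok = Instance({"@type": "https://openminds.ebrains.eu/core/Person"})
bad = Instance({"fullName": "Jane Doe"})  # missing @type
report = Collection([ok, bad]).validate()
print(report)
```

Returning a report instead of raising on the first problem matches the "warning / error reports" idea above: the caller decides whether any given message is fatal.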

@lzehl lzehl closed this as completed Jul 26, 2022