Important: This repository is no longer under active development.
Please see our newer repository agr_curation_schema for any and all updates regarding Alliance identifier processing and schema.
This directory contains JSON schemas used to define data for integration into the Alliance of Genome Resources and defines the data dictionary of the Alliance data store. All information is used internally by the Alliance and is not meant to be interpreted as public-facing or an approach towards identifier standardization.
The python script "agr_validate.py" can be used to validate a JSON entry against a schema for testing/development purposes.
Usage is as follows:
agr_validate.py -d test_data.json -s base_schema.json
For the basic Gene info file run
./agr_validate.py -d <your_new_gene_file.json> -s gene/geneMetaData.json
for the disease info file run
./agr_validate.py -d <your_new_disease_file.json> -s disease/diseaseMetaDataDefinition.json
and for the allele info file run
./agr_validate.py -d <your_new_allele_file.json> -s allele/alleleMetaData.json
The java script "agr_validate_schema.sh" can be used to validate that the schema file itself conforms to the draft-4 version of the JSON schema spec and will run on PR into master.
For validating all schema files in a branch:
./agr_validate_schema.sh
JSON schema files are validated by the Continuous Integration / Deployment System (GOCD) through a JAVA JSON validator made available through the Docker base container. This is because the JAVA implementation is much more strict than the Python implementation. This validation is called by the default command in the Docker files. The way the java works is that the Dockerfile points to a directory full of JSON files. Java opens each file and checks for the "$schema": "http://json-schema.org/draft-04/schema#", property if its found then it validates it, otherwise ignores the file.
In order to run these validations locally install Docker and then run the command:
make run
The Alliance Data Dictionary is a set of information that describes the content, format and structure of the nodes and relations in the Alliance data store. metaschema.yaml defines the potential content of each sub-schema. Each file is named according to its corresponding node label.