Merge master #144

Open
wants to merge 97 commits into
base: RO-3698
9f34b12
Update fall-back files to latest
aliciaaevans Apr 13, 2023
33c1d3b
remove backup that is not needed
aliciaaevans Apr 13, 2023
cde9c0a
Merge pull request #121 from rcsb/RO-3751
aliciaaevans Apr 19, 2023
2313064
Remove internal VRPT_DICT_LOCATOR and corresponding categories from s…
brindakv Apr 19, 2023
dbf442a
Remove internal VRPT_DICT_LOCATOR and corresponding categories from s…
brindakv Apr 19, 2023
8e61d99
Updated schema
brindakv Apr 19, 2023
829b728
Merge pull request #122 from rcsb/RO-3698-vrpt
brindakv Apr 21, 2023
4b419c6
import 20230607 data
zkzfng Jul 11, 2023
7a2f98a
Merge pull request #123 from rcsb/dev-2023-score
piehld Jul 12, 2023
625f24d
Schema improvements to support reference sequence coverage loading an…
brindakv Jul 26, 2023
132e774
Updated schema
brindakv Jul 26, 2023
2c41508
Add UI metadata for Pfam protein family name and Improve indexing for…
brindakv Jul 26, 2023
b97fd9f
Udpdated schema
brindakv Jul 26, 2023
da6200d
Merge pull request #126 from rcsb/RO-3904-4002
brindakv Jul 26, 2023
cacffaf
Rollback RO-4002
brindakv Jul 26, 2023
e148064
Updated schema
brindakv Jul 26, 2023
9f703d0
Resolve conflicts in config yml file
brindakv Jul 26, 2023
bcf578c
Resolve conflicts
brindakv Jul 27, 2023
98c1aa7
Merge branch 'master' into RO-3979-4002
brindakv Aug 1, 2023
dfea8b9
Merge pull request #125 from rcsb/RO-3979-4002
brindakv Aug 1, 2023
822d995
RO-4015: Remove GO and InterPro context values from rcsb_nested_index…
piehld Aug 8, 2023
cad28a5
Merge pull request #127 from rcsb/RO-4015
piehld Aug 10, 2023
86cd5a0
Frontend related schema updates to support search of sequence coverag…
brindakv Sep 7, 2023
471f100
Update schema
brindakv Sep 7, 2023
870a13c
Merge pull request #128 from rcsb/RO-3979-FE
brindakv Sep 7, 2023
510fd6e
Add "COD" to _rcsb_chem_comp_related.resource_name enumerations
piehld Sep 14, 2023
bd953d3
Update mmcif_pdbx_v5_next.dic to version 5.387
brindakv Sep 23, 2023
0d3b705
Add support for mmCIF item_type pdb_id_u
brindakv Sep 23, 2023
29ce3be
Update schema
brindakv Sep 23, 2023
c221da8
Add new Pfam mapping fallback file
piehld Oct 3, 2023
ee0d4da
Merge pull request #129 from rcsb/RO-4051
brindakv Oct 3, 2023
b458b9d
Update drugbank_info category
brindakv Nov 17, 2023
66ec549
Update sub-category aggregates
brindakv Nov 17, 2023
491362a
Add support for list of dates in nested subcategories
brindakv Nov 18, 2023
c522f3e
Update config to support array of dates in nested subcategories
brindakv Nov 18, 2023
95b54b2
Add support for min items and unique items in config file
brindakv Nov 20, 2023
12eda2f
Update drugbank_products subcategory name
brindakv Nov 20, 2023
ce75344
Updated config
brindakv Nov 21, 2023
214e2fc
Add enums for drugbank_info.drug_products_country
brindakv Nov 21, 2023
5921cb6
Update schema
brindakv Nov 21, 2023
35a2a94
Merge pull request #130 from rcsb/RO-3975
brindakv Dec 1, 2023
b867e40
Update config file to expose unobserved residue and atom coverage
brindakv Dec 1, 2023
45eac04
Updated schema
brindakv Dec 1, 2023
e7ec955
Merge pull request #131 from rcsb/RO-4130
brindakv Dec 1, 2023
94aaa9a
Add definitions for deuterated water molecule counts at the entry and…
brindakv Feb 1, 2024
7269e61
Update mmcif_pdbx_v5_next.dic reference dictionary
brindakv Feb 1, 2024
6dff7ef
Add support for counts of deuterated water molecules at the entry and…
brindakv Feb 1, 2024
8fb717a
Updated schema
brindakv Feb 1, 2024
7edca99
Merge pull request #132 from rcsb/RO-3860
brindakv Feb 6, 2024
d1ef3f2
Update _rcsb_nonpolymer_instance_annotation.type to support ligands t…
brindakv Feb 14, 2024
442c150
Updated schema
brindakv Feb 14, 2024
53216e2
Update _rcsb_nonpolymer_instance_annotation.type to support ligands t…
brindakv Feb 19, 2024
0bf2e1e
Updated schema
brindakv Feb 19, 2024
ebdbcbb
Update examples
brindakv Feb 19, 2024
9097fda
Updated schema
brindakv Feb 19, 2024
67cb3ed
Address RO-4214: Add exact-match search context for entity_src_gen.pd…
brindakv Feb 20, 2024
1b860a4
Updated schema
brindakv Feb 20, 2024
d48853f
Merge pull request #133 from rcsb/RO-4212
brindakv Feb 28, 2024
5d98adb
RO-4169: Change rcsb_entity_source_organism.pdbx_src_id and rcsb_enti…
piehld Mar 4, 2024
86c350b
Merge pull request #134 from rcsb/RO-4169
piehld Mar 21, 2024
7bbeb67
Add GlyGen to rcsb_polymer_instance_annotation.type enumeration list;…
piehld Mar 26, 2024
760049f
Update version information in rcsb_polymer_instance_annotations.dic
piehld Mar 28, 2024
c87dd6d
Merge pull request #135 from rcsb/RO-3958
piehld Apr 1, 2024
a031998
Update mmCIF dictionary to version 5.398
brindakv Apr 3, 2024
5d7c8f7
Add int_list to data type mapping to support updates in mmCIF dictionary
brindakv Apr 3, 2024
0c19631
Updated config file
brindakv Apr 3, 2024
936bc9f
Updated schema
brindakv Apr 3, 2024
18c166a
Merge pull request #136 from rcsb/RO-4222
brindakv Apr 4, 2024
7b83dab
update SCOP2 fallback
piehld Apr 23, 2024
12a9ee9
Merge pull request #137 from rcsb/dev-dwp
piehld Apr 24, 2024
5cb0bbe
update enzyme-data fallback file
piehld May 8, 2024
0967d77
upload 2024 ligand score reference
zkzfng May 30, 2024
0de4518
Merge pull request #138 from rcsb/dev-2024-score
piehld May 30, 2024
1eb4ba6
Add pdb_pfam_mapping.tsv.gz
piehld Jun 3, 2024
0a1f2bf
Update ChemAxon data file cc-full-chemaxon-descriptors.json
piehld Jun 18, 2024
0a03ce3
Merge pull request #139 from rcsb/dev-dwp
shaochenghua Jun 18, 2024
e507ddf
Add ligand interaction enumerations to polymer/entity instance_featur…
piehld Jun 25, 2024
fce91f5
Update branched_entity_instance dictionary
piehld Jul 16, 2024
86de67b
revert branched_entity_instance update
piehld Jul 16, 2024
046917f
fix _rcsb_target_neighbors.connect_type definition
piehld Jul 24, 2024
6b2c2f3
minor updates
piehld Jul 25, 2024
ad01e92
redo
piehld Jul 25, 2024
b2a4677
Merge pull request #140 from rcsb/ro-4209
piehld Jul 25, 2024
7941c61
Revert "Merge pull request #140 from rcsb/ro-4209"
piehld Jul 25, 2024
0b6f0b0
RO-4209: Update polymer instance features with ligand interaction enu…
piehld Jul 30, 2024
97fb6de
Merge pull request #141 from rcsb/ro-4209-v2
piehld Aug 12, 2024
edc19a6
Add fruitfly and yeast glygen fallback files
piehld Aug 20, 2024
f4d9520
Update label under rcsb_description for em_software.name
brindakv Aug 23, 2024
6e5ce27
Updated schema
brindakv Aug 24, 2024
47a2403
Merge pull request #142 from rcsb/RO-4361
piehld Aug 27, 2024
14e5870
RO-4377: Add new enumerations for repository_content_types
piehld Sep 9, 2024
43a4771
update JSON files
piehld Sep 9, 2024
01fe075
Merge pull request #143 from rcsb/ro-4377
piehld Sep 11, 2024
8ace30e
Update PDBx/mmCIF to latest version (5.406) and address RO-4372 & RO-…
brindakv Oct 23, 2024
bc9ca22
Upversion schemas
brindakv Oct 23, 2024
09c1f61
Updated schemas
brindakv Oct 23, 2024
dca0598
Merge pull request #146 from rcsb/RO-4372-4392
brindakv Oct 24, 2024
53 changes: 53 additions & 0 deletions README.md
@@ -5,3 +5,56 @@
## Introduction

This module contains a collection of configuration, dictionary and data assets supporting processing and loading of both PDB repository and derived data content using relational and document database stores.

## Updating and Generating Schemas

Updating schemas involves two main steps: (1) updating the relevant dictionary definitions, and (2) generating the schema files from the updated dictionary files.

### Updating Dictionary Files

All dictionary files used for schemas are found under the `dictionary_files` directory. The final "production-level" dictionary files are stored under `dictionary_files/dist` and are built by merging a set of smaller component dictionary files (e.g., those under `base` and `extensions`).

Accordingly, the general steps for updating the dictionary files are as follows:
1. Update the relevant dictionary file under `dictionary_files`. This can be either the `base` `core` dictionary file (`dictionary_files/base/rcsb_mmcif_ext-core.dic`) or an `extensions` dictionary file (`dictionary_files/extensions/*`).
   - Also be sure to update the version information at the top of the file.
2. Update the `base` `header` dictionary file: `dictionary_files/base/rcsb_mmcif_ext-header.dic`
   - This simply involves updating the version and the description of changes.
3. Generate the `dist` dictionary by running the `Build.sh` script:
   - From the `dictionary_files` directory, run: `./scripts/Build.sh rcsb_mmcif_ext local`

### Generating Schema Files

Once the dictionary files are updated, the schema files can be generated.

Before generating the schema files, set up the following:
1. Make sure the configuration file, `./config/exdb-config-example.yml`, exists and points to your local `py-rcsb_exdb_assets` directory tree.
   - If the config file is not present, copy it from https://github.com/rcsb/py-rcsb_db/blob/master/rcsb/db/config/exdb-config-example.yml
2. Make sure the `schema_update_cli` command is installed.
   - If not, get it by installing the `rcsb.db` package: `pip install rcsb.db`
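
Both prerequisites can be checked in one pass before attempting a schema build. The following is a minimal sketch rather than part of this repository: `check_prerequisites` is a hypothetical helper name, and it assumes it is given the path to the top of a local `py-rcsb_exdb_assets` checkout.

```python
import shutil
from pathlib import Path


def check_prerequisites(repo_root=".", cli_name="schema_update_cli"):
    """Return a list of problems found; an empty list means ready to proceed.

    Hypothetical helper: the file location and command name are taken from
    the steps above, everything else is illustrative.
    """
    problems = []

    # The config file is expected under ./config relative to the repo root.
    config_path = Path(repo_root) / "config" / "exdb-config-example.yml"
    if not config_path.is_file():
        problems.append(f"missing config file: {config_path}")

    # shutil.which returns None when the command is not found on PATH.
    if shutil.which(cli_name) is None:
        problems.append(
            f"command not found on PATH: {cli_name} (try: pip install rcsb.db)"
        )

    return problems
```

Calling `check_prerequisites()` from the top directory and printing each returned message gives a quick readiness report. Note that the sketch only checks that the config file exists; it does not verify that the paths inside it point to the local directory tree.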

Once the above are confirmed, the general steps for generating the schema files are as follows:
1. Make any necessary changes to the schema configuration file: `./config/exdb-config-schema.yml`
   - Even if no direct schema changes are necessary, be sure to update the version number (`major.minor.patch`) of each collection schema affected by the dictionary updates, e.g.:

```
pdbx_core:
  - NAME: pdbx_core_entry
    VERSION: 9.0.1
```

2. From the top directory of `py-rcsb_exdb_assets`, run the schema generation command (adjusting the arguments as necessary):
   - This creates a `CACHE` directory containing the new schema files under `CACHE/json_schema_definitions/` and `CACHE/schema_definitions/`

```
# First run it for 'pdbx_core' schemas:
schema_update_cli --encoding_types rcsb,json,bson --validation_levels full,min --update_pdbx_core --cache_path CACHE --config_path ./config/exdb-config-example.yml --config_name site_info_configuration

# Next run it for 'pdbx_comp_model_core' schemas:
schema_update_cli --encoding_types rcsb,json,bson --validation_levels full,min --update_pdbx_comp_model_core --cache_path CACHE --config_path ./config/exdb-config-example.yml --config_name site_info_configuration
```

3. Copy the newly generated schema files into the corresponding directories in `py-rcsb_exdb_assets`:
   - `cp CACHE/json_schema_definitions/* json_schema_definitions/`
   - `cp CACHE/schema_definitions/* schema_definitions/`
4. Create a PR to this repository (`py-rcsb_exdb_assets`) with the new dictionary and schema files.
   - Once approved, create a parallel PR to [rcsb-mojave-model](https://github.com/rcsb/rcsb-mojave-model), with the relevant files copied to their corresponding directories under [schemas/exchange](https://github.com/rcsb/rcsb-mojave-model/tree/master/schemas/exchange)
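
Steps 2 and 3 above (running the generator for both schema sets, then copying the results out of `CACHE`) could be wrapped in a small script. This is only a sketch under stated assumptions: the helper names `build_commands`, `copy_generated`, and `main` are hypothetical, the command-line flags are copied verbatim from the commands shown above, and the script is assumed to run from the top directory of `py-rcsb_exdb_assets`.

```python
import shutil
import subprocess
from pathlib import Path

# One update flag per schema set, as in the two commands shown above.
UPDATE_FLAGS = ["--update_pdbx_core", "--update_pdbx_comp_model_core"]


def build_commands(cache_path="CACHE",
                   config_path="./config/exdb-config-example.yml"):
    """Return one schema_update_cli argument list per schema set."""
    return [
        [
            "schema_update_cli",
            "--encoding_types", "rcsb,json,bson",
            "--validation_levels", "full,min",
            flag,
            "--cache_path", cache_path,
            "--config_path", config_path,
            "--config_name", "site_info_configuration",
        ]
        for flag in UPDATE_FLAGS
    ]


def copy_generated(cache_dir="CACHE", dest_root="."):
    """Copy generated files from <cache_dir>/<name>/ into <dest_root>/<name>/."""
    copied = []
    for name in ("json_schema_definitions", "schema_definitions"):
        dest = Path(dest_root) / name
        dest.mkdir(parents=True, exist_ok=True)
        for src in sorted((Path(cache_dir) / name).glob("*")):
            if src.is_file():  # like plain `cp`, skip subdirectories
                shutil.copy2(src, dest / src.name)
                copied.append(dest / src.name)
    return copied


def main():
    for command in build_commands():
        # check=True stops at the first failing run instead of copying
        # a partially generated CACHE.
        subprocess.run(command, check=True)
    for path in copy_generated():
        print(f"copied {path}")
```

Calling `main()` runs both generation commands in order and then performs the copy. Keeping the command construction separate from execution makes it easy to print and review the exact arguments before anything is run.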