Reorganize repository structure (#1103)
This PR does the following:

1. Consolidates the external registry getters (in
`bioregistry.external`), the external registry alignment classes (in
`bioregistry.align`), the data artifacts (in `bioregistry.data.external`),
and three configuration files (in `bioregistry.data`) into a single
hierarchy in `bioregistry.external`.
2. Moves the metaregistry curation sheets and the raw data from the
repositories out of the `src/` structure. They're now in the
`/exports/alignment/` and `/exports/raw/` folders, respectively. The
point of this is to reduce the size of the package that gets sent to
PyPI (related to #1100).
3. Bumps the minor version to the 0.11.X series.

In theory, this shouldn't affect any downstream uses, since the
`bioregistry.align` submodule isn't really for external users.
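
For scripts that still point at the old curation sheet locations, the move in item 2 can be sketched as a path mapping. This is a minimal illustration, not part of the commit: the helper name `new_curation_path` is hypothetical, and the old and new layouts are assumed from the `docs/curation.md` link change and the `bartoc` rename shown in this diff. It covers only the curation sheets, since the layout of `/exports/raw/` is not shown here.

```python
from pathlib import PurePosixPath

# Old and new roots, as they appear in the renamed files of this commit.
OLD_ROOT = PurePosixPath("src/bioregistry/data/external")
NEW_ROOT = PurePosixPath("exports/alignment")


def new_curation_path(old_path: str) -> str:
    """Map an old curation sheet path to its assumed new location.

    Hypothetical helper: expects `<OLD_ROOT>/<metaprefix>/curation.tsv`
    and returns `<NEW_ROOT>/<metaprefix>.tsv`.
    """
    # relative_to raises ValueError if old_path is not under OLD_ROOT.
    relative = PurePosixPath(old_path).relative_to(OLD_ROOT)
    if len(relative.parts) != 2 or relative.name != "curation.tsv":
        raise ValueError(f"not a curation sheet path: {old_path}")
    metaprefix = relative.parts[0]
    return str(NEW_ROOT / f"{metaprefix}.tsv")


print(new_curation_path("src/bioregistry/data/external/bartoc/curation.tsv"))
```

Using `PurePosixPath` keeps the mapping independent of the host operating system, which matters because these are repository-relative paths rather than local filesystem paths.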
cthoyt authored Apr 18, 2024
1 parent f7ce129 commit 02ae7c2
Showing 150 changed files with 8,265 additions and 7,157 deletions.
2 changes: 1 addition & 1 deletion .bumpversion.cfg
@@ -1,5 +1,5 @@
[bumpversion]
current_version = 0.10.204-dev
current_version = 0.11.0-dev
commit = True
tag = False
parse = (?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)(?:-(?P<release>[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?(?:\+(?P<build>[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?
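
As a rough check of how the `parse` pattern above splits the bumped version string, the same regular expression can be exercised directly in Python. This is a sketch only: the `parse_version` helper is hypothetical, and using `fullmatch` is a choice made here for strictness, not necessarily how the bumpversion tool applies the pattern.

```python
import re

# The pattern from the `parse` setting, split across lines for readability.
VERSION_RE = re.compile(
    r"(?P<major>\d+)\.(?P<minor>\d+)\.(?P<patch>\d+)"
    r"(?:-(?P<release>[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"
    r"(?:\+(?P<build>[0-9A-Za-z-]+(?:\.[0-9A-Za-z-]+)*))?"
)


def parse_version(version: str) -> dict:
    """Split a version string into its named parts (hypothetical helper)."""
    match = VERSION_RE.fullmatch(version)
    if match is None:
        raise ValueError(f"unparseable version: {version}")
    # Optional groups that did not participate in the match are None.
    return match.groupdict()


print(parse_version("0.11.0-dev"))
```

The optional `release` and `build` groups come back as `None` when absent, so a plain `0.11.0` would parse with only the three numeric parts set.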
2 changes: 1 addition & 1 deletion docs/CONTRIBUTING.md
@@ -291,7 +291,7 @@ file for inspiration. Entries in this file should follow the schema defined by t
See also the corresponding entry in the Bioregistry's [JSON schema](https://github.com/biopragmatics/bioregistry/blob/main/src/bioregistry/schema/schema.json)

While not strictly required, it's also useful for each registry to add a corresponding getter script and aligner
class in `bioregistry.external` and `bioregistry.align`, respectively. See examples there, or get in touch on the
class in `bioregistry.external`. See examples there, or get in touch on the
issue tracker for help.

## Code Contribution
2 changes: 1 addition & 1 deletion docs/curation.md
@@ -67,7 +67,7 @@ that show relevant metadata to help curate each record as one of the following:
{% for entry in site.data.curation["prefix_xrefs"] %}
<tr>
<td>{{ entry.metaprefix }}</td>
<td><a href="https://github.com/biopragmatics/bioregistry/blob/main/src/bioregistry/data/external/{{ entry.metaprefix }}/curation.tsv">{{ entry.name }}</a></td>
<td><a href="https://github.com/biopragmatics/bioregistry/blob/main/exports/alignment{{ entry.metaprefix }}.tsv">{{ entry.name }}</a></td>
</tr>
{% endfor %}
</tbody>
3 changes: 0 additions & 3 deletions docs/source/alignment.rst
@@ -1,6 +1,3 @@
Alignment
=========
.. automodapi:: bioregistry.align
:no-inheritance-diagram:

.. automodapi:: bioregistry.external
2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -27,7 +27,7 @@
author = "Charles Tapley Hoyt"

# The full version, including alpha/beta/rc tags.
release = "0.10.204-dev"
release = "0.11.0-dev"

# The short X.Y version.
parsed_version = re.match(
4 changes: 3 additions & 1 deletion exports/README.md
@@ -20,9 +20,11 @@ basis using GitHub Actions as a continuous integration server.
| [`rdf`](rdf) | Build of an RDF triple-store representing the registry, metaregistry, and collections |
| [`sssom`](sssom) | An export of prefix mappings in the Simple Standard for Sharing Ontology Mappings (SSSOM) format |
| [`contexts`](contexts) | Fit-for-purpose exports of JSON-LD contexts constructed from the Bioregistry |
| [`alignment`](alignment) | Curation sheets for aligning the metaregistry |
| [`raw`](raw) | Raw data from select external registries |

## PURLs

The Bioregistry uses https://w3id.org to create persistent uniform resource locators (PURLs) for various
resources. These are configured on GitHub in the .htaccess file
resources. These are configured on GitHub in the `.htaccess` file
in https://github.com/perma-id/w3id.org/tree/master/biopragmatics.

Large diffs are not rendered by default.

3,088 changes: 510 additions & 2,578 deletions ...egistry/data/external/bartoc/curation.tsv → exports/alignment/bartoc.tsv

Large diffs are not rendered by default.

File renamed without changes.
File renamed without changes.
@@ -327,7 +327,7 @@ IBO Imaging Biomarker Ontology The Imaging Biomarker Ontology describes radiolo
ICD-O-3 International Classification of Diseases for Oncology, 3rd edition https://data.jrc.ec.europa.eu/dataset/88ff4ec5-1832-403e-abe1-64928592568f Ontology used for oncology research derived from the European Cancer Information System (ECIS)
ICD11-BODYSYSTEM Body System Terms from ICD11 This is a set of body-system terms used in the ICD 11 revision
ICDO International Classification of Diseases Ontology https://github.com/icdo/ICDO A biomedical ontology for logical representation of the terms and relations related to the International Classification of Diseases (ICD)
ICECI International Classification of External Causes of Injuries The International Classification of External Causes of Injury (ICECI) is a system of classifications to enable systematic description of how injuries occur. It is designed especially to assist injury prevention. The ICECI was originally designed for use in settings in which information is recorded in a way that allows statistical reporting--for example, injury surveillance based on collection of information about cases attending a sample of hospital emergency departments. It has also been found useful for other purposes. For example, it has been used as a reference classification during revision of another classification, to record risk-factor exposure of children in a cohort study, as the basis for special-purpose classifications and in a growing number of other ways.
ICECI International Classification of External Causes of Injuries The International Classification of External Causes of Injury (ICECI) is a system of classifications to enable systematic description of how injuries occur. It is designed especially to assist injury prevention. The ICECI was originally designed for use in settings in which information is recorded in a way that allows statistical reporting--for example, injury surveillance based on collection of information about cases attending a sample of hospital emergency departments. It has also been found useful for other purposes. For example, it has been used as a reference classification during revision of another classification, to record risk-factor exposure of children in a cohort study, as the basis for special-purpose classifications and in a growing number of other ways.
ICHOM-PROMS-PCB ICHOM Set Pregnancy and Childbirth The ICHOM Set of Patient-Centered Outcome Measures for Pregnancy And Childbirth is the result of hard work by a group of leading physicians, measurement experts and patients. It is a recommendation of the outcomes that matter most to persons experiencing Pregnancy And Childbirth.
ICNP International Classification for Nursing Practice https://www.icn.ch/what-we-do/projects/ehealth-icnp International Classification for Nursing Practice
ICPC2P International Classification of Primary Care - 2 PLUS http://www.fmrc.org.au/ ICPC-2 PLUS
@@ -549,7 +549,7 @@ ONTOKBCF Ontological Knowledge Base Model for Cystic Fibrosis OntoKBCF is an on
ONTOLURGENCES Emergency care ontology Emergency care ontology build during LERUDI project. http://www.ncbi.nlm.nih.gov/pubmed/25160343 (v4.0)
ONTOMA Ontology of Alternative Medicine, French www.lavima.org Common concepts for communication between traditional medicine and western medicine. (In French)
ONTOPARON Ontology of Amyotrophic Lateral Sclerosis, all modules Ontology of ALS (amyotrophic lateral sclerosis), social module + coordination module + medical module developped by neurologists (ICM) and knowledge engineers (LIMICS). This version (11/15/2018) is upload with inferred axioms but without equivalent classe axioms. It serves in this version in a Gate pipeline for a semantic annotation task. This version (03/20/2019) is upload with object properties, inferred axioms and equivalent classes axioms. This version (11/10/2020) is uploaded with small corrections.
ONTOPARON_SOCIAL Ontology of amyotrophic lateral sclerosis, social module Ontology of ALS (amyotrophic lateral sclerosis), social module developped by neurologist (ICM) and knowledge engineers (LIMICS). Complete version available at http://bioportal.bioontology.org/ontologies/ONTOPARON
ONTOPARON_SOCIAL Ontology of amyotrophic lateral sclerosis, social module Ontology of ALS (amyotrophic lateral sclerosis), social module developped by neurologist (ICM) and knowledge engineers (LIMICS). Complete version available at http://bioportal.bioontology.org/ontologies/ONTOPARON
ONTOPBM Ontology for Process-Based Modeling of Dynamical Systems (OntoPBM) OntoPBM is an ontology of core entities for process-based modeling of dynamical systems.
ONTOPNEUMO Ontology of Pneumology Ontology of pneumology (french version). The ONTOPNEUMO ontology was developped by Audrey Baneyx, under the direction of Jean Charlet about knowledge engineering expertise and by François-Xavier Blanc in collaboration with Bruno Housset about medical expertise. The OWL compliant ONTOPNEUMO ontology is available under Creative Commons license “Attribution-Non-Commercial-No Derivative Works 2.0 UK”. Details of this license are accessible at : http://creativecommons.org/licenses/ by-nc-nd/2.0/uk/.
ONTOPSYCHIA OntoPsychia, social module Ontology of social and environmental determinants for psychiatry
@@ -661,7 +661,7 @@ ROS Radiation Oncology Structures Ontology http://www.twitter.com/jebibault This
RPO Resource of Asian Primary Immunodeficiency Diseases (RAPID) Phenotype Ontology http://rapid.rcai.riken.jp/ontology/v1.0/phenomer.php RAPID phenotype ontology presents controlled vocabulary of ontology class structures and entities of observed phenotypic terms for primary immunodeficiency diseases (PIDs) that facilitate global sharing and free exchange of PID data with users’ communities
RSA Reference Sequence Annotation An ontology for sequence annotations and how to preserve them with reference sequences
RVO Research Variable Ontology http://w3id.org/rv-ontology/info RVO, Research Variable Ontology, proposes a schema that can be use to record empirical data analytics research and can be use as a knowledge-base to support knowledge exploration phase of a new analytics research to learn and get recommendation. RVO is designed around the research variables, which form the basis of the hypothesis that analysts test through building a model.
SARSMUTONTO Ontology for SARS-CoV-2 lineages and mutations https://github.com/jbakkas/SARSMutOnto The SARSMutOnto ontology provides a list of all SARS-CoV-2 Pango lineages while maintaining their hierarchy (lineage/sublineage) describing in detail all lineage mutations
SARSMUTONTO Ontology for SARS-CoV-2 lineages and mutations https://github.com/jbakkas/SARSMutOnto The SARSMutOnto ontology provides a list of all SARS-CoV-2 Pango lineages while maintaining their hierarchy (lineage/sublineage) describing in detail all lineage mutations
SATO SATO (IDEAS expAnded wiTh BCIO): workflow for designers of patient-centered mobile health behaviour change intervention applications Designing effective theory-driven digital behaviour change interventions (DBCI) is a challenging task. To ease the design process, and assist with knowledge sharing and evaluation of the DBCI, we propose the SATO (IDEASexpAnded wiTh BCIO) design workflow based on the IDEAS (Integrate, Design, Assess, and Share) framework and aligned with the Behaviour Change Intervention Ontology (BCIO). BCIO is a structural representation of the knowledge in behaviour change domain supporting evaluation of behaviour change interventions (BCIs) but it is not straightforward to utilise it during DBCI design. IDEAS (Integrate, Design, Assess, and Share) framework guides multi-disciplinary teams through the mobile health (mHealth) application development life-cycle but it is not aligned with BCIO entities. SATO couples BCIO entities with workflow steps and extends IDEAS Integrate stage with consideration of customisation and personalisation. The SATO ontology provides the extensions of BCIO along with examples of a BCI Scenario of Fatigue Reduction.
SBOL Synthetic Biology Open Language Visual Ontology www.sbolstandard.org/visual Synthetic Biology Open Language Visual (SBOLv) is an ontology to represent standardized graphical notation for synthetic biology.
SCIO Spinal Cord Injury Ontology http://www.psink.de Representation of pre-clinical studies in the domain of spinal cord injury therapies
File renamed without changes.
5 changes: 5 additions & 0 deletions exports/alignment/cheminf.tsv
@@ -0,0 +1,5 @@
prefix name description
000570 SwissLipids
000571 MolMeDB
000572 PDB ligand
000573 PDB structure
File renamed without changes.
@@ -20,6 +20,7 @@ EUROVOC EuroVoc Core Concepts https://op.europa.eu/s/y4TH EuroVoc is a multiling
FISHTRAITS Fish Traits Thesaurus The Fish Traits Thesaurus is the first initiative to deal with the semantics of fish functional traits. It has been developed by LifeWatch Italy, the Italian node of the e-science European infrastructure for biodiversity and ecosystem research (LifeWatch ERIC). FishTraits reflects the agreement of a scientific expert community to fix semantic properties (e.g. label, definition, relationships) of approximately 220 concepts.
I-ADOPT I-ADOPT Framework ontology "The I-ADOPT Framework is an ontology designed to facilitate interoperability between existing variable description models (including ontologies, taxonomy, and structured controlled vocabularies). One of the challenges in representing semantic descriptions of variables is getting people to agree about what they mean when describing the components that define the variables. The I-ADOPT ontology addresses this by providing core components and their relations that can be applied to define machine-interpretable variable descriptions that re-use FAIR vocabulary terms. It was developed by a core group of terminology experts and users from the Research Data Alliance (RDA) InteroperAble Descriptions of Observable Property Terminology (I-ADOPT) Working Group. The first published versions of the ontology up to v0.9.1 satisfied the basic cross-domain interoperability requirements. It defines four classes or ""concepts"" (Variable, Property, Entity, Constraint), and six object properties (hasProperty, hasObjectOfInterest, hasContextObject, hasMatrix, hasConstraint, constrains). The Variable is the top concept. It represents the description of something observed or mathematically derived. It minimally consists of one entity (the ObjectOfInterest) and its Property; a Property being a type of characteristic (i.e. a quantity or a quality). More complex variables can involve additional entities, for example an entity may have the role of Matrix and/or of ContextObject(s). The framework does not capture units, instruments, methods, and geographical location information; however its usage recommendation will make explicit reference to these by connecting the I-ADOPT framework to existing and complementary ontologies. This new version of the ontology (v1.0) adds one optional new class (VariableSet) and four optional new object properties (hasApplicableProperty, hasApplicableObjectOfInterest, hasApplicableMatrix, hasApplicableContextObject). 
This was necessary in order to enable flexibility in assigning optional and user-defined machine-interpretable categorizations of I-ADOPT variables under one or multiple coarser grouping concepts to facilitate dataset discovery and dataset aggregation. With the introduction of these concepts and properties, the framework enables different user communities or product developers to develop their own grouping criteria. While the Variable class must be connected to at least two classes via the mandatory properties hasProperty and hasObjectOIfInterest, the VariableSet class can have either of the new properties. Additionally, the VariableSet class can also be optionally connected to the Variable class using the property [ro:hasMember](http://purl.obolibrary.org/obo/RO_0002351) from the [OBO Relations Ontology](https://obofoundry.org/ontology/ro.html)."
LUPO LifeWatch ERIC Upper Ontology The LifeWatch ERIC Upper Ontology (LUPO) is the model defining the core set of LifeWatch ERIC elements (Actors, Services and Infrastructure) and describing their high-level arrangement. LUPO is extendable and will foster, connect and describe lower level models of its core elements, with the vision to provide functionalities that facilitate the uptake, re-use and enrichment of LifeWatch ERIC resources by an ever broader community.
LWT LifeWatch_test
MACROALGAETRAITS Macroalgae Traits Thesaurus The Macroalgae Traits Thesaurus contains several concepts on demographic and functional traits. It has been developed and published by LifeWatch Italy, the Italian node of the e-science European infrastructure for biodiversity and ecosystem research (LifeWatch ERIC). It reflects the agreement of a scientific expert community to fix semantic properties (e.g. label, definition) of approximately 100 traits.
MR Marine Regions ontology https://www.marineregions.org/ The Marine Regions ontology provides definitions for the classes and properties used in the Marine Regions dataset.
MRPTCODELIST Marine Regions PlaceTypes code list https://www.marineregions.org/ The Marine Regions PlaceType code list provides definitions for the PlaceTypes used in the Marine Regions dataset.
File renamed without changes.