Update imports documentation #2389

25 changes: 20 additions & 5 deletions docs/contributing.md
@@ -24,12 +24,27 @@ If you want a new term added, or want edits to a current term, or spot any mista
- Give a short summary of the pull request - that way we can find suitable reviewers much quicker. Say which terms you are adding or what kinds of changes you are proposing.
- It is most of the time a good idea to use `squash merge` rather than `merge` for your pull request, to keep the git history short and useful.

## Pull requests that require imports to be refreshed
## Contributions that use terms from other ontologies not yet referenced in CL

If your pull request references foreign terms from an external ontology that are not yet present in the import module for that ontology (for example, you’re adding a logical definition that makes use of a GO term for the first time), imports needs to be refreshed for the foreign terms to be available to use.
(in jargon, PRs that need imports to be refreshed)

If you have the technical skills and/or the required computer resources (refreshing imports can be a memory-intensive task), you may refresh the imports yourself before submitting the pull request, by following the [appropriate procedure](odk-workflows/UpdateImports.md).
Pull requests to the Cell Ontology often include terms from other external ontologies, such as [UBERON](https://github.com/obophenotype/uberon) or the [Gene Ontology](https://github.com/geneontology/go-ontology). If these terms do not yet exist, they need to be proposed, created, and released by the external source.

If you can’t apply the imports refreshing procedure for any reason, you may instead opt for using “bare IRIs” when editing the ontology, everywhere you need a reference to a foreign term. Then, when submitting your pull request, label it with the tag `update-imports-required` to ask that a member of the tech support group refresh the imports before the pull request can be merged.
If the foreign term exists but is not yet present in the import module of CL (for example, you’re adding a logical definition that makes use of a GO term for the first time), it must first be made available. This requires _refreshing the imports_, a technical task.
The easiest way to add the imports is to use what is called a Protégé-based declaration, or “bare IRIs” approach. Details on how to do so are available in the [OBO Training documentation](https://oboacademy.github.io/obook/howto/update-import/?h=import#protege-based-declaration). When submitting your pull request, you should label it with the tag `update-imports-required` to ask a member of the tech support group to refresh the imports before the pull request can be merged.

If you have the technical skills and/or the required computer resources (refreshing imports can be a memory-intensive task), you may refresh the imports yourself before submitting the pull request by following the [appropriate procedure](odk-workflows/UpdateImports.md). This approach is generally preferred, as it streamlines updates and reviews, but either is acceptable.

People reviewing pull requests must:
1. Make sure that if a pull request is referencing bare IRIs, the request is tagged with `update-imports-required`.
2. Make sure that imports have indeed been updated (either by the author of the pull request or by someone from the tech support group if requested) before allowing the request to be merged.

Additional details on imports are available in:
* the general CL guideline called ["Adding classes from another ontology"](https://obophenotype.github.io/cell-ontology/Adding_classes_from_another_ontology/).
* [OBO Training docs](https://oboacademy.github.io/obook/howto/update-import/)
* the [CL-specific ODK workflow documentation](odk-workflows/UpdateImports.md).

### Why does the Cell Ontology not pull all terms by default?

If the Cell Ontology pulled all terms by default (from UBERON or the Gene Ontology, for example), that would lead to a tremendous increase in ontology size and the resources needed to run it. Thus, it is necessary to import only a subset of terms from each foreign resource.

People reviewing pull requests must 1) make sure that if a pull request is referencing bare IRIs, the request is tagged with `update-imports-required` (adding the label themselves if needed); and 2) make sure that imports have indeed been updated (either by the author of the pull request, or by someone from the tech support group if requested) before allowing the request to be merged.
130 changes: 21 additions & 109 deletions docs/odk-workflows/UpdateImports.md
@@ -1,27 +1,28 @@
# Update Imports Workflow

This page discusses how to update the contents of your imports, like adding or removing terms. If you are looking to customise imports, like changing the module type, see [here](RepoManagement.md).
This page details the import workflows mostly used by CL editors.

## Importing a new term
There are several different ways of importing terms, though, and the details of the different approaches not covered here (such as the "Base Module approach") are available at the general [OBO Training Update Imports Workflow with ODK](https://oboacademy.github.io/obook/howto/update-import/).

There are also some notes available on the current practice of imports in CL at ["Adding classes from another ontology"](https://obophenotype.github.io/cell-ontology/Adding_classes_from_another_ontology/).

Note: some ontologies now use a merged-import system to manage dynamic imports, for these please follow instructions in the section title "Using the Base Module approach".
## Importing a new term

Importing a new term is split into two sub-phases:

1. Declaring the terms to be imported
2. Refreshing imports dynamically
1. Declaring the terms to be imported (always done by the author of the PR)
2. Refreshing imports dynamically (may be done by the author or _post-hoc_ by the tech team)

### Declaring terms to be imported
There are three ways to declare terms that are to be imported from an external ontology. Choose the appropriate one for your particular scenario (all three can be used in parallel if need be):
There are three ways to declare terms that are to be imported from an external ontology:

1. Protégé-based declaration
2. Using term files
3. Using the custom import template
3. Using the custom import template (described only in the [OBO general docs](https://oboacademy.github.io/obook/howto/update-import/))

#### Protégé-based declaration

This workflow is to be avoided, but may be appropriate if the editor _does not have access to the ODK docker container_.
This approach also applies to ontologies that use base module import approach.
This workflow is the simplest, but will require an update by the tech team.

1. Open your ontology (edit file) in Protégé (5.5+).
1. Select 'owl:Thing'
@@ -33,53 +34,31 @@ This approach also applies to ontologies that use base module import approach.

Now you can use this term, for example, to construct logical definitions. The next time the imports are refreshed (see how to refresh [here](#refresh-imports)), the metadata (labels, definitions, etc.) for this term are imported from the respective external source ontology and become visible in your ontology.

Make sure that if a pull request is using Protégé-based declarations and using bare IRIs, the request is tagged with `update-imports-required`.
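
A bare IRI for an OBO term is the term's persistent URL (PURL). The following is a minimal sketch, assuming the standard OBO PURL pattern `http://purl.obolibrary.org/obo/PREFIX_LOCALID`, of how a CURIE such as `GO:0008150` maps to the IRI used in a Protégé-based declaration and back. The helper names `curie_to_iri` and `iri_to_curie` are hypothetical and are not part of the ODK or of Protégé.

```python
import re

# Assumption: terms follow the standard OBO PURL pattern.
OBO_PURL_BASE = "http://purl.obolibrary.org/obo/"
# A prefix (letters, digits, underscores), a colon, then a numeric local ID.
CURIE_PATTERN = re.compile(r"^([A-Za-z][A-Za-z0-9_]*):(\d+)$")


def curie_to_iri(curie: str) -> str:
    """Convert a CURIE such as 'GO:0008150' into its OBO PURL."""
    match = CURIE_PATTERN.match(curie.strip())
    if match is None:
        raise ValueError(f"Not a recognised OBO CURIE: {curie!r}")
    prefix, local_id = match.groups()
    return f"{OBO_PURL_BASE}{prefix}_{local_id}"


def iri_to_curie(iri: str) -> str:
    """Convert an OBO PURL back into a CURIE."""
    if not iri.startswith(OBO_PURL_BASE):
        raise ValueError(f"Not an OBO PURL: {iri!r}")
    # Split on the last underscore so prefixes like APOLLO_SV survive.
    prefix, _, local_id = iri[len(OBO_PURL_BASE):].rpartition("_")
    if not prefix or not local_id.isdigit():
        raise ValueError(f"Cannot split OBO PURL into prefix and ID: {iri!r}")
    return f"{prefix}:{local_id}"


if __name__ == "__main__":
    print(curie_to_iri("GO:0008150"))
```

Keeping the conversion explicit makes it easier for a reviewer to spot which bare IRIs in a pull request still need an imports refresh.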


#### Using term files

Every import has, by default a term file associated with it, which can be found in the imports directory. For example, if you have a GO import in `src/ontology/go_import.owl`, you will also have an associated term file `src/ontology/go_terms.txt`. You can add terms in there simply as a list:
The Cell Ontology has several term files associated with each ontology it imports, which can be found in the imports directory ([`cell-ontology/src/ontology/imports/`](https://github.com/obophenotype/cell-ontology/tree/master/src/ontology/imports)).

For example, you may add a Gene Ontology term to the end of the list at `src/ontology/imports/go_terms.txt`:

```
GO:0008150
GO:0008151
GO:0004990
GO:0070278
```

Now you can run the [refresh imports workflow](#refresh-imports) and the two terms will be imported.

#### Using the custom import template

This workflow is appropriate if:

1. You prefer to manage all your imported terms in a single file (rather than multiple files like in the "Using term files" workflow above).
2. You wish to augment your imported ontologies with additional information. This requires a cautionary discussion.

To enable this workflow, you add the following to your ODK config file (`src/ontology/cl-odk.yaml`), and [update the repository](RepoManagement.md):

```
use_custom_import_module: TRUE
```

Now you can manage your imported terms directly in the custom external terms template, which is located at `src/templates/external_import.owl`. Note that this file is a [ROBOT template](http://robot.obolibrary.org/template), and can, in principle, be extended to include any axioms you like. Before extending the template, however, read the following carefully.
Now you can run the [refresh imports workflow](#refresh-imports) and the new terms will be imported.
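
Adding a term to a term file by hand is a one-line edit, but it can also be scripted. The sketch below assumes the term file is a plain list with one CURIE per line, as in the example above. The function name `add_term` is hypothetical, not an ODK command.

```python
import re
from pathlib import Path

# Assumption: one CURIE per line, e.g. GO:0008150.
CURIE_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9_]*:\d+$")


def add_term(term_file: Path, curie: str) -> bool:
    """Append a CURIE to a term file unless it is already listed.

    Returns True if the file was changed, False if the term was present.
    """
    curie = curie.strip()
    if not CURIE_PATTERN.match(curie):
        raise ValueError(f"Not a recognised CURIE: {curie!r}")

    text = term_file.read_text(encoding="utf-8") if term_file.exists() else ""
    existing = {line.strip() for line in text.splitlines()}
    if curie in existing:
        return False

    # Make sure the new term starts on its own line.
    separator = "" if text == "" or text.endswith("\n") else "\n"
    term_file.write_text(text + separator + curie + "\n", encoding="utf-8")
    return True


if __name__ == "__main__":
    changed = add_term(Path("src/ontology/imports/go_terms.txt"), "GO:0070278")
    print("added" if changed else "already present")
```

The duplicate check keeps the term file free of repeated entries when several contributors request the same term.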

The main purpose of the custom import template is to enable the management of all terms to be imported in a centralised place. To enable that, you do not have to do anything other than maintaining the template. So if you, say, currently import `APOLLO_SV:00000480`, and you wish to import `APOLLO_SV:00000532`, you simply add a row like this:

```
ID Entity Type
ID TYPE
APOLLO_SV:00000480 owl:Class
APOLLO_SV:00000532 owl:Class
```

When the imports are refreshed [see imports refresh workflow](#refresh-imports), the term(s) will simply be imported from the configured ontologies.

Now, if you wish to extend the Makefile (which is beyond these instructions) and add, say, synonyms to the imported terms, you can do that, but you need to (a) preserve the `ID` and `ENTITY` columns and (b) ensure that the ROBOT template is valid otherwise, [see here](http://robot.obolibrary.org/template).

_WARNING_. Note that doing this is a _widespread antipattern_ (see related [issue](https://github.com/OBOFoundry/OBOFoundry.github.io/issues/1443)). You should not change the axioms of terms that do not belong into your ontology unless necessary - such changes should always be pushed into the ontology where they belong. However, since people are doing it, whether the OBO Foundry likes it or not, at least using the _custom imports module_ as described here localises the changes to a single simple template and ensures that none of the annotations added this way are merged into the [base file](https://github.com/INCATools/ontology-development-kit/blob/master/docs/ReleaseArtefacts.md#release-artefact-1-base-required).

### Refresh imports

If you want to refresh the import yourself (this may be necessary to pass the travis tests), and you have the ODK installed, you can do the following (using go as an example):
If you want to refresh the import yourself, and you have the ODK installed, you can do the following (using GO as an example):

First, you navigate in your terminal to the ontology directory.

First, you navigate in your terminal to the ontology directory (underneath src in your hpo root directory).
```
cd src/ontology
```
@@ -107,70 +86,3 @@ If you wish to skip refreshing the mirror, i.e. skip downloading the latest vers
```
sh run.sh make IMP=true MIR=false PAT=false imports/go_import.owl -B
```
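
The refresh command differs between ontologies only in the import file name and in whether the mirror is refreshed. As a rough sketch, the command line can be assembled as follows. The `IMP`, `MIR` and `PAT` variables and the `-B` flag are taken from the command above. The function name `refresh_import_command` is hypothetical, and using `MIR=true` to request a mirror refresh is an assumption.

```python
def refresh_import_command(ontology_id: str,
                           refresh_mirror: bool = True,
                           force: bool = True) -> list[str]:
    """Build the argument list for refreshing a single import module.

    ontology_id is the lowercase ID used in the import file name,
    e.g. 'go' for imports/go_import.owl.
    """
    command = [
        "sh", "run.sh", "make",
        "IMP=true",
        # Assumption: MIR=true requests a fresh download of the mirror.
        f"MIR={'true' if refresh_mirror else 'false'}",
        "PAT=false",
        f"imports/{ontology_id}_import.owl",
    ]
    if force:
        # -B tells make to rebuild the target even if it looks up to date.
        command.append("-B")
    return command


if __name__ == "__main__":
    print(" ".join(refresh_import_command("go", refresh_mirror=False)))
```

The returned list can be passed to `subprocess.run(..., cwd="src/ontology", check=True)` so that the command runs from the ontology directory.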

## Using the Base Module approach

Since ODK 1.2.31, we support an entirely new approach to generate modules: Using base files.
The idea is to only import axioms from ontologies that _actually belong to it_.
A base file is a subset of the ontology that only contains those axioms that nominally
belong there. In other words, the base file does not contain any axioms that belong
to another ontology. An example would be this:

Imagine this being the full Uberon ontology:

```
Axiom 1: BFO:123 SubClassOf BFO:124
Axiom 2: UBERON:123 SubClassOf BFO:123
Axiom 3: UBERON:124 SubClassOf UBERON:123
```

The base file is the set of all axioms that are about UBERON terms:

```
Axiom 2: UBERON:123 SubClassOf BFO:123
Axiom 3: UBERON:124 SubClassOf UBERON:123
```

I.e.

```
Axiom 1: BFO:123 SubClassOf BFO:124
```

gets removed.
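
The filtering idea can be illustrated with a toy model of the three axioms above. This is only a sketch: it treats an axiom as belonging to an ontology when its subclass term carries that ontology's prefix, which is a simplification of how real base files are produced. The names `SubClassOf` and `base_axioms` are hypothetical.

```python
from typing import NamedTuple


class SubClassOf(NamedTuple):
    """Toy stand-in for an OWL SubClassOf axiom, using CURIEs."""
    sub: str
    sup: str


def base_axioms(axioms: list[SubClassOf], prefix: str) -> list[SubClassOf]:
    """Keep only axioms whose subject term belongs to the given prefix."""
    return [axiom for axiom in axioms if axiom.sub.startswith(prefix + ":")]


# The full toy ontology from the example above.
full_uberon = [
    SubClassOf("BFO:123", "BFO:124"),
    SubClassOf("UBERON:123", "BFO:123"),
    SubClassOf("UBERON:124", "UBERON:123"),
]

if __name__ == "__main__":
    for axiom in base_axioms(full_uberon, "UBERON"):
        print(f"{axiom.sub} SubClassOf {axiom.sup}")
```

Only the two axioms about UBERON terms are kept, and the BFO axiom is dropped because it belongs to BFO.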

The base file pipeline is a bit more complex than the normal pipelines, because
of the logical interactions between the imported ontologies. This is solved by _first_
merging all mirrors into one huge file and then extracting one mega module from it.

Example: Let's say we are importing terms from Uberon, GO and RO in our ontologies.
When we use the base pipelines, we

1) First obtain the base (usually by simply downloading it, but there is also an option now to create it with ROBOT)
2) We merge all base files into one big pile
3) Then we extract a single module `imports/merged_import.owl`

The first implementation of this pipeline is PATO, see https://github.com/pato-ontology/pato/blob/master/src/ontology/pato-odk.yaml.

To check if your ontology uses this method, check src/ontology/cl-odk.yaml to see if `use_base_merging: TRUE` is declared under `import_group`
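
That check can be approximated with a plain text scan, sketched below. It only looks for a `use_base_merging: TRUE` line anywhere in the file and does not verify that the line sits under `import_group`. A YAML parser would be needed for a stricter check. The function name `uses_base_merging` is hypothetical.

```python
import re

# Matches a line such as "  use_base_merging: TRUE" (any letter case),
# optionally followed by a trailing comment.
BASE_MERGING_PATTERN = re.compile(
    r"^\s*use_base_merging\s*:\s*true\s*(#.*)?$",
    re.IGNORECASE | re.MULTILINE,
)


def uses_base_merging(odk_config_text: str) -> bool:
    """Return True if the config text enables base merging.

    Rough check only: nesting under import_group is not verified.
    """
    return BASE_MERGING_PATTERN.search(odk_config_text) is not None


if __name__ == "__main__":
    with open("src/ontology/cl-odk.yaml", encoding="utf-8") as handle:
        print(uses_base_merging(handle.read()))
```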

If your ontology uses Base Module approach, please use the following steps:

First, add the term to be imported to the term file associated with it (see above "Using term files" section if this is not clear to you)

Next, you navigate in your terminal to the ontology directory (underneath src in your hpo root directory).
```
cd src/ontology
```

Then refresh imports by running

```
sh run.sh make imports/merged_import.owl
```
Note: if your mirrors are updated, you can run `sh run.sh make no-mirror-refresh-merged`

This requires quite a bit of memory on your local machine, so if you encounter an error, it might be a lack of memory on your computer. A solution would be to create a ticket in an issue tracker requesting for the term to be imported, and one of the local devs should pick this up and run the import for you.

Lastly, restart Protégé, and the term should be imported and ready to be used.
