add additional species names to gene labels #116
Species names are still not being displayed in Noctua. For example, Taxon:42789 & model: http://noctua.geneontology.org/workbench/noctua-visual-pathway-editor/?model_id=gomodel%3A6494e2e900000134 |
This seems to be controlled by the script https://github.com/geneontology/neo/blob/master/gpi2obo.pl. When it is called by the Makefile, a species code needs to be passed to be appended to the gene product name. There seems to be some unfinished work related to handling virus names here: Lines 41 to 68 in b1a1039
|
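To make the labelling step concrete, here is a minimal sketch of appending a species code to a gene product label. It is not the code from gpi2obo.pl: the function name and the fallback to an NCBITaxon CURIE are assumptions, based only on the label shapes quoted further down this thread ("P03305 FMDVO" and "P03305 NCBITaxon:73482").

```python
from typing import Optional


def label_with_species(symbol: str, species_code: Optional[str], taxon_id: str) -> str:
    """Build '<symbol> <species_code>' for display.

    Hypothetical helper: when no species code is available, fall back to the
    taxon CURIE so the label still says which organism it belongs to.
    """
    suffix = species_code if species_code else f"NCBITaxon:{taxon_id}"
    return f"{symbol} {suffix}"
```

With a code available the label ends in the code; without one it ends in the taxon CURIE, which is the form that shows up when a species has no code assigned.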
(Quiet shoutout to geneontology/project-management#52) |
Trying to quote @kltm How the data gets in:
|
Noting that this seems to be the "next step" referred to in #77. Essentially, we stopped there before getting to this point. |
Right now there is a single file per file - so this is complicated to load |
Pondering from the meeting earlier today when talking to Patrick and @vanaukenk. There are a few ways to deal with this. It seems that the codes mostly come from, by way of a JSON derivative, the metadata/datasets YAML files. I'm not sure there's much to do there for adding a bunch of additional species. It would be nice if we could just modify The most direct way, without redoing a bunch of what we're doing, might be to add GPI files for the species that we want and add the metadata for them in datasets. |
GPIs for the organisms requested by Patrick Masson and Paul Denny are now being generated by GOA at each release: |
From 2024-01-30 workbenches call: Once the new species are added to neo, we'll test on the next Noctua maintenance outage. |
The data needed for this is now being populated to the GO mirror of GOA data for our pipeline. |
Okay, I did a little testing of this NEO data load on amigo-staging (soon reverting), and it looks like there is a little more work to be done
|
Hi @kltm, are you limited to 4 characters? Ideally we would align to the UniProt 5-character species_code. |
@Pauldenny For clarification, would this be the canonical source for the 5-character code? |
5 letter uniprot code is good! |
I believe that's the source you should use, @kltm |
@Pauldenny The above file mostly works, but is missing two entries for our purposes: https://www.ncbi.nlm.nih.gov/datasets/taxonomy/36352/ I've used placeholders for the time being. |
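A rough sketch of the "placeholder for missing entries" approach, assuming the species codes have already been loaded into a dict keyed by taxon ID. The function name, the `X<taxon>` placeholder format, and the example mapping are all hypothetical; the thread does not say what the real placeholders look like.

```python
def resolve_species_codes(taxon_ids, known_codes, placeholder_prefix="X"):
    """Return (codes, missing) for the requested taxon IDs.

    codes maps every taxon ID to a species code; taxa absent from
    known_codes get a placeholder and are also listed in missing so they
    can be followed up with the code's maintainers.
    """
    codes = {}
    missing = []
    for taxon_id in taxon_ids:
        code = known_codes.get(taxon_id)
        if code is None:
            # Hypothetical placeholder format, not an official code.
            code = f"{placeholder_prefix}{taxon_id}"
            missing.append(taxon_id)
        codes[taxon_id] = code
    return codes, missing


# Illustrative data only: one taxon with a known code, one without.
codes, missing = resolve_species_codes(["73482", "36352"], {"73482": "FMDVO"})
```

Returning the list of missing taxa alongside the codes keeps the placeholders visible, so they are easier to replace once proper codes exist.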
Talking to @tmushayahama, we will need a release of the Pathway Viewer that is more clever at removing the "species" part of the label. This will need to be done before we release the new data. |
To properly lay out the issues and options here, it turns out that the Pathway Viewer widget uses a statically compiled file to filter out the species part of the label. The species part of the label is introduced by the NEO data load taken by minerva and propagates to the API and other locations by way of the model JSON.
|
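One way the filtering could work without a statically compiled list is to pass the set of species codes in at runtime and strip the last token of the label only when it matches. This is a sketch in Python for illustration, not the widget's actual code; `strip_species` and the idea of supplying the code set at runtime are assumptions.

```python
from typing import Set


def strip_species(label: str, species_codes: Set[str]) -> str:
    """Remove a trailing species code from a label, if one is present.

    Only the final space-separated token is considered, and only when it is
    in species_codes, so symbols that merely contain spaces are left alone.
    """
    head, sep, tail = label.rpartition(" ")
    if sep and tail in species_codes:
        return head
    return label
```

Matching against a known set rather than a pattern avoids stripping the end of a gene symbol that happens to look like a code, at the cost of needing the set to stay in sync with the NEO load.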
Hi @kltm, thanks for the explainer - I would prefer to get the new species in quickly, if possible, and work around the gene naming |
There is a collision between uniprot_reviewed.gpi.gz and taxon_12118.gpi.gz; it looks like they have different names for the same identifier (from @balhoff), likely around "name( P03305 FMDVO)" and "name( P03305 NCBITaxon:73482)". For expediency of testing, I'm going to remove taxon_12118.gpi.gz from the build for the moment to see if we can make more progress. |
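A pre-check along these lines could surface every such collision in one pass instead of stopping at the first. It is a sketch only: it assumes the labels have already been computed per record, and the function name, the `(identifier, label, source)` tuple shape, and the `file_a`/`file_b` source names are hypothetical.

```python
from collections import defaultdict


def find_label_collisions(records):
    """Return identifiers that were given more than one distinct label.

    records is an iterable of (identifier, label, source) tuples. The result
    maps each colliding identifier to {label: first source that used it}.
    """
    labels_by_id = defaultdict(dict)
    for identifier, label, source in records:
        # Remember only the first source seen for each distinct label.
        labels_by_id[identifier].setdefault(label, source)
    return {
        identifier: labels
        for identifier, labels in labels_by_id.items()
        if len(labels) > 1
    }


# Illustrative records; which file produced which label is not known here.
collisions = find_label_collisions([
    ("P03305", "P03305 FMDVO", "file_a"),
    ("P03305", "P03305 NCBITaxon:73482", "file_b"),
])
```

Because the whole input is scanned before anything is reported, the output can serve as the running list of problem identifiers rather than requiring one build attempt per collision.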
Okay, there seem to be multiple issues. Unfortunately, it stops when hitting the first rather than continuing, so we'll have to take a few passes at this. I will keep a list of issues as I find them here:
I will be eliminating the files as I go; then examine the problematic files individually. |
See geneontology/go-site#1955 (comment)