Replace default phylum controlled vocabulary with NCBI Phyla #58
Comments
Or, if not replace it, at least add these to it.
Or, thinking further, we won't be pushing taxonomy back to NCBI at all, but we should still include their phyla in our default controlled vocabulary.
Attached is a comparison of phyla between GEOME and NCBI. There is not as much agreement between the two as I would like to see!
Several options:
1. Add all the NCBI phyla to the GEOME list, which gives a list that is about 40 names longer.
2. Use only the NCBI phyla and require all future uploads into GEOME to adopt the new taxonomy (existing data will not change unless a user tries to reload it).
3. Try to reconcile the two taxonomies in some way.
[NCBI_Phyla.csv](https://github.com/biocodellc/geome-db/files/6824951/NCBI_Phyla.csv)
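
To make the options above concrete, here is a minimal R sketch of how the two phylum lists could be compared and merged with base set operations. The vectors `geome_phyla` and `ncbi_phyla` are hypothetical, abbreviated examples; their contents are made up for illustration and do not reflect the actual GEOME vocabulary or the attached NCBI_Phyla.csv.

```r
# Hypothetical, abbreviated lists for illustration only. Which names appear
# in which list is invented here and is NOT taken from GEOME or NCBI.
geome_phyla <- c("Annelida", "Arthropoda", "Chordata", "Mollusca", "Porifera")
ncbi_phyla  <- c("Annelida", "Arthropoda", "Chordata", "Mollusca",
                 "Nematoda", "Rotifera")

# Strip stray whitespace so formatting differences are not counted as mismatches.
geome_phyla <- unique(trimws(geome_phyla))
ncbi_phyla  <- unique(trimws(ncbi_phyla))

# Names the two lists agree on.
shared <- intersect(geome_phyla, ncbi_phyla)

# Names found in only one of the lists.
geome_only <- setdiff(geome_phyla, ncbi_phyla)
ncbi_only  <- setdiff(ncbi_phyla, geome_phyla)

# Option 1: keep every GEOME name and add the NCBI names that are missing.
merged <- sort(union(geome_phyla, ncbi_phyla))

cat("Shared:", length(shared), "\n")
cat("GEOME only:", paste(geome_only, collapse = ", "), "\n")
cat("NCBI only:", paste(ncbi_only, collapse = ", "), "\n")
cat("Merged list length:", length(merged), "\n")
```

Under option 1 the merged vector would become the controlled vocabulary; under option 2 only `ncbi_phyla` would be kept, and `geome_only` shows which existing names would no longer validate on reload.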
Chris probably knows more than I do, but it seems like Catalog of Life is trying to reconcile ITIS and GBIF and may be the best authority. What did Biocode use as the source of phyla originally?
But if we are pushing data to NCBI, then we really should include their taxonomy. For example, in the datathon we queried the NCBI taxonomy because we were adding metadata retrospectively to SRA projects. So I would favor option 1. In theory, the phyla could be reconciled later, right?
Eric
Yikes, next time I'll reply on GitHub.
As we are pushing metadata to NCBI now, it would make sense for the default controlled vocabulary to contain the phyla from the NCBI taxonomy. These can be obtained with the R taxize package using the command below, and the resulting list is attached.
library(taxize)
ncbi_phyla <- downstream(sci_id = "cellular organisms", db = "ncbi", downto = "phylum", intermediate = FALSE)
NCBI_Phyla.csv