Formatting RDP version 19 taxonomic database for DADA2 #2014

Closed
jheiman06 opened this issue Sep 5, 2024 · 3 comments

@jheiman06

I am processing bacterial 16S rRNA sequences with DADA2 and would like to classify them with the newest release of the RDP database (version 19), but I have not yet found a copy formatted for DADA2. I found code to format it myself, and the result works with the `assignTaxonomy` function, but I have not been able to format the species file correctly. I think this is because I cannot find the unaligned bacteria file: the RDP site at Michigan State is no longer working.
Has this database (RDP version 19) been formatted for DADA2? If not, do you know where I could find the unaligned bacteria file in order to format the species file correctly?

@benjjneb
Owner

benjjneb commented Sep 9, 2024

> Has this database (RDP version 19) been formatted for DADA2?

Nope, on the to-do list now though.

> If not, do you know where I could find the unaligned bacteria file in order to format the species file correctly?

Not sure, but I will look into this more later this month.

@nvwinsen

Is it possible to format the RDP v19 database including species instead of a separate species table?
This way bootstrapping can be included for species with RDP (which now is only possible with Silva).

@benjjneb
Owner

`assignTaxonomy`-compatible references for the RDP v19 release are now available: https://doi.org/10.5281/zenodo.14168770

> Is it possible to format the RDP v19 database including species instead of a separate species table?
> This way bootstrapping can be included for species with RDP (which now is only possible with Silva).

There are two versions of the reference available: one that goes only to the genus level as before (`rdp_19_toGenus_trainset.fa.gz`) and one that also has species-level information (`rdp_19_toSpecies_trainset.fa.gz`).
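
For anyone formatting a reference by hand, as in the original question, a quick check of the FASTA headers can catch formatting problems before classification. The sketch below is in Python rather than R and assumes the header conventions described in the DADA2 documentation: `assignTaxonomy` training files encode the taxonomy as semicolon-separated ranks in the id line, and species files use `>ID Genus species`. The function names and the sample records are hypothetical and are not taken from the RDP release.

```python
import gzip
import io


def read_headers(handle):
    """Yield FASTA header lines (without the leading '>') from a text handle."""
    for line in handle:
        if line.startswith(">"):
            yield line[1:].strip()


def taxonomy_depth(header):
    """Count the non-empty semicolon-separated ranks in a training-file header."""
    return len([rank for rank in header.split(";") if rank.strip()])


def looks_like_species_header(header):
    """True if the header has at least three whitespace-separated fields
    (sequence ID, genus, species), as assumed for a species file."""
    return len(header.split()) >= 3


def open_fasta(path):
    """Open a plain or gzip-compressed FASTA file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


# Hypothetical records for illustration only.
sample_training = io.StringIO(
    ">Bacteria;Firmicutes;Bacilli;Lactobacillales;Lactobacillaceae;Lactobacillus;\n"
    "ACGTACGT\n"
    ">Bacteria;Firmicutes;Bacilli;\n"
    "ACGTTTGA\n"
)
depths = [taxonomy_depth(h) for h in read_headers(sample_training)]
print(depths)  # [6, 3]
```

To run the same check on a downloaded file, pass the handle returned by `open_fasta(...)` to `read_headers`. The number of ranks to expect in each of the two v19 files is worth confirming against the release itself.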
