Sequences of interest pruned from input tree #6

Open
ncoots opened this issue Jul 21, 2021 · 4 comments

Comments

@ncoots

ncoots commented Jul 21, 2021

Hi @maclandrol,
I am having an issue while running CoreTracker.
I can get the program to run successfully; however, I noticed that it consistently filters out the sequences that interest me most. I input sequences from at least 70 different species, but the only ones missing from the output codon_data files are the ones I care most about. While running, CoreTracker reports "DEBUG:root:Non-uniform sequence in sequences and tree. The tree will be pruned.", so I assume my sequences of interest are being pruned. What constitutes "non-uniform", and is there a way to keep them from being removed? I'm aware that those sequences are particularly divergent from the rest, which is exactly why I want to use CoreTracker on them!

Thank you for your help,
Nicole

@maclandrol
Contributor

Hello @ncoots, this mostly happens because you have species in your DNA or protein sequences that are not in the phylogenetic tree. Make sure you keep the same names across all inputs and that every species is in the phylogenetic tree (only species found in the DNA, the protein, and the tree are kept).

If you are able to share your input data, I could have a look at that.
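
For anyone who wants to check this before running, here is a minimal sketch of a name comparison across the three inputs. It is not CoreTracker's own code; it only mirrors the rule stated above (names must appear in the DNA, the protein, and the tree). The helper names `fasta_ids`, `newick_leaves`, and `compare_names` are hypothetical, the FASTA ID is assumed to be the first token after `>`, and the Newick labels are assumed to be unquoted.

```python
import re


def fasta_ids(text):
    """Return the record IDs (first token after '>') found in FASTA text."""
    ids = set()
    for line in text.splitlines():
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                ids.add(tokens[0])
    return ids


def newick_leaves(newick):
    """Return leaf labels from a Newick string.

    Assumes unquoted labels. A leaf label directly follows '(' or ',',
    whereas internal-node labels and support values follow ')'.
    """
    return set(re.findall(r"[(,]\s*([^\s(),:;]+)", newick))


def compare_names(dna_text, protein_text, newick):
    """Return (shared, dropped): the names present in all three inputs,
    and, for every other name, the inputs it is absent from."""
    inputs = {
        "dna": fasta_ids(dna_text),
        "protein": fasta_ids(protein_text),
        "tree": newick_leaves(newick),
    }
    shared = set.intersection(*inputs.values())
    everything = set.union(*inputs.values())
    dropped = {
        name: [label for label, ids in inputs.items() if name not in ids]
        for name in sorted(everything - shared)
    }
    return shared, dropped


# Toy data (hypothetical): sp_C is absent from the tree, sp_D from the sequences.
dna = ">sp_A\nATGAAA\n>sp_B\nATGAAG\n>sp_C\nATGAAC\n"
protein = ">sp_A\nMK\n>sp_B\nMK\n>sp_C\nMN\n"
tree = "((sp_A:0.1,sp_B:0.2)95:0.05,sp_D:0.3);"

shared, dropped = compare_names(dna, protein, tree)
print(sorted(shared))  # ['sp_A', 'sp_B']
print(dropped)         # {'sp_C': ['tree'], 'sp_D': ['dna', 'protein']}
```

The comparison is exact and case-sensitive, so a trailing character or a different separator in one file would show up in `dropped`. If a name appears in all three inputs and is still removed, the cause is probably something other than naming.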

@ncoots
Author

ncoots commented Jul 21, 2021

Hello,
Thank you for your response! I have checked and double-checked that the 4 species of interest are in all 3 datasets with the exact same name; however, 2 continue to be removed. I am attaching the 3 datasets here for you to look at. The two species that are consistently removed are "Acanthometra_sp" and "Lithomelissa_sp".

Attachments:
output_p_aligned.txt
output.txt
Screen Shot 2021-07-21 at 8 38 35 AM

@maclandrol
Contributor

Indeed, they are present. Can you share the Newick file of the tree? I will try to run it to check and get back to you as soon as I can.

@ncoots
Author

ncoots commented Jul 21, 2021

concat.tree.zip

Absolutely! Thanks!
