Gubbins output: internal_5 / internal_6 node? #246
Comments
Actually, I just figured it out. Sorry.
Could you explain where the 0 SNPs issue came from? I am having the same issue. Any help would be really appreciated!
The results are reported per branch. There are zero base substitutions on the terminal branch because they are instead reconstructed as occurring on one of the internal branches. The node_labelled output file labels the internal nodes of the tree, so you can see which isolates each internal node subtends.
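To see how a labelled internal node relates to the isolates, it can help to list the tips beneath each one. The following is a minimal sketch, not part of Gubbins: it assumes the node-labelled tree is plain Newick with unquoted labels, and the tree string and all names in it (`s1`, `ref`, `internal_1`, and so on) are invented for illustration.

```python
def parse_newick(text):
    """Parse a simple Newick string into nested dicts.

    Assumes unquoted labels and no comments; this is a toy parser,
    not a full Newick implementation.
    """
    text = text.strip().rstrip(";")
    pos = 0

    def parse_node():
        nonlocal pos
        children = []
        if text[pos] == "(":
            pos += 1  # step past '('
            while True:
                children.append(parse_node())
                if text[pos] == ",":
                    pos += 1
                    continue
                if text[pos] == ")":
                    pos += 1
                    break
                raise ValueError(f"unexpected character at position {pos}")
        # Read "label:length" up to the next structural character.
        start = pos
        while pos < len(text) and text[pos] not in ",()":
            pos += 1
        name, _, length = text[start:pos].partition(":")
        return {
            "name": name.strip(),
            "length": float(length) if length else None,
            "children": children,
        }

    return parse_node()


def tips_under(node):
    """Return the names of all tips (leaves) below a node."""
    if not node["children"]:
        return [node["name"]]
    tips = []
    for child in node["children"]:
        tips.extend(tips_under(child))
    return tips


def internal_node_tips(root):
    """Map each labelled internal node to the sorted tips it subtends."""
    result = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node["children"]:
            if node["name"]:
                result[node["name"]] = sorted(tips_under(node))
            stack.extend(node["children"])
    return result


# Hypothetical toy tree; real labels come from your own node-labelled tree file.
toy_tree = "((s1:0.1,s2:0.2)internal_1:0.3,(s3:0.1,s4:0.1)internal_2:0.2,ref:0.5)internal_3;"
labels = internal_node_tips(parse_newick(toy_tree))
for node_name in sorted(labels):
    print(node_name, labels[node_name])
```

In this toy tree, substitutions shared by `s1` and `s2` would be placed on the branch leading to `internal_1` rather than on either terminal branch, which is why a tip can show zero SNPs. For real trees, a dedicated phylogenetics library is a safer choice than this hand-rolled parser.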
Hi @nickjcroucher, thank you for responding and for explaining this. I now realise that I was interpreting the "Total SNPs" wrongly. Just to sanity check, would a branch with "0 SNPs" be expected for an isolate that is divergent from all the other isolates? As in the image below. There were SNPs detected by snippy for that isolate, but my understanding is that when gubbins infers a separate clade with a single isolate, this will always lead to "0 SNPs" as there is only a terminal branch.
Good question. Zero mutations on a terminal branch normally indicates that the isolate has no private mutations, so it is closely related to at least one other isolate. You have highlighted the special case of an outgroup, which descends directly from the root. The branch to your outgroup is artificially split in two by the root. Gubbins puts all the mutations on one of these two components; otherwise they would be randomly split across the two branches, which leads to false negatives and false positives when inferring recombination. You can sum the events over both halves of the root to get the overall divergence of your outgroup from the rest of the tree.
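The summing step can be sketched as follows. This is only an illustration under stated assumptions: it assumes the per-branch statistics table is tab-delimited with columns named `Node` and `Total SNPs` (check the header of your own file), and every node name and count in the toy table is invented.

```python
import csv
import io


def read_total_snps(handle, node_col="Node", snp_col="Total SNPs"):
    """Read per-branch SNP totals into a {node_name: count} dict.

    The column names and tab delimiter are assumptions; adjust them
    to match the header of your own per-branch statistics file.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[node_col]: int(row[snp_col]) for row in reader}


def outgroup_divergence(total_snps, outgroup, other_root_child):
    """Sum the SNPs on the two branches that meet at the root.

    `outgroup` is the outgroup tip; `other_root_child` is the node on
    the other side of the root (the ancestor of all remaining isolates).
    """
    return total_snps[outgroup] + total_snps[other_root_child]


# Invented toy table: the outgroup branch carries 0 SNPs because all
# root-branch mutations were placed on the other half (internal_9 here).
toy_table = (
    "Node\tTotal SNPs\n"
    "Reference\t0\n"
    "internal_9\t120\n"
    "sample_A\t3\n"
)
counts = read_total_snps(io.StringIO(toy_table))
print(outgroup_divergence(counts, "Reference", "internal_9"))  # prints 120
```

With a real file, pass an open file handle instead of the `io.StringIO` wrapper, and look up the name of the root's other child in the node-labelled tree.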
I see, makes sense! Thanks so much for clarifying ⭐.
Hello,
I'm trying to use Gubbins to identify recombination regions between different strains of Streptococcus pneumoniae that belong to the same clone or single clonal complex. I performed hybrid assembly and used Snippy to generate a core alignment from the contigs. Since Gubbins requires the full genome alignment, I followed the advice of @tseemann and cleaned the full core alignment as follows:
snippy-clean_full_aln core.full.aln > clean.full.aln
Afterwards, I ran gubbins with the default command:
run_gubbins.py --outgroup Reference -t fasttree --prefix Gubbins_SLV clean.full.aln
And got the following result:
Based on my snippy results, I have 5,747 SNPs in strain CH-216 relative to the Reference, but Gubbins gives me 0 total SNPs for this strain. Why don't I have any SNPs here?
I'm also confused about the internal_5 and internal_6 nodes in my output file. Based on the Gubbins user manual, these are internal nodes subtended by the branch, but I don't really understand how they relate to my data. Is the internal_5 node the internal branch for the CH-216 strain, and internal_6 the one for the CH-266 strain? How can I interpret the SNPs and recombination at the internal nodes together with my data?
Apologies if these seem like pretty straightforward results, but as a non-evolutionary biologist, I really need some guidance here.
Many thanks in advance!