Gubbins - Ns in output alignment #192
Comments
If a particular site has SNPs somewhere in the tree and a recombination at a different point in the tree, the recombination will be masked out with Ns.
Thank you, @andrewjpage, for responding so quickly. I am sorry, but I am not sure I understand the answer. If I understand correctly, the alignment is shorter because some sites were removed due to recombination, but some other sites undergoing recombination were masked instead? I am not sure I understand the difference between the two. Thank you, Max
Taking these as input:
sample1
AAC
sample2
AAA
sample3
AGG
Gubbins is run and it detects a recombination in sample3 at coords 2-3.
sample1
AAC
sample2
AAA
sample3
ANN
This recombination then gets masked out. Because there is no longer any
variation at coordinate 2, it gets filtered out, but coordinate 3 still has
some variation (C/A), so it is kept, giving a site with an N:
sample1
AC
sample2
AA
sample3
AN
Thank you, have a nice day.
I reckon coord 1 in the example above should also be removed, leaving only coord 3 in the final alignment, since coord 1 is invariant.
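For anyone who wants to step through the toy example, here is a minimal Python sketch of the mask-then-filter idea described in this thread. It is not Gubbins' own code: the helper names (`mask_regions`, `variable_columns`, `filter_columns`) are made up for illustration, and the filter is assumed to keep only columns with at least two distinct non-N bases. Under that assumption coord 1 is dropped as well, as suggested in the comment above.

```python
MASK = "N"

def mask_regions(alignment, regions):
    """Replace each recombinant stretch with Ns.

    regions holds (sample_name, start, end) tuples, 1-based and inclusive.
    """
    masked = dict(alignment)
    for name, start, end in regions:
        seq = masked[name]
        masked[name] = seq[:start - 1] + MASK * (end - start + 1) + seq[end:]
    return masked

def variable_columns(alignment):
    """Return 0-based indices of columns with two or more distinct non-N bases."""
    seqs = list(alignment.values())
    keep = []
    for i in range(len(seqs[0])):
        bases = {seq[i] for seq in seqs} - {MASK}
        if len(bases) > 1:
            keep.append(i)
    return keep

def filter_columns(alignment, columns):
    """Keep only the given columns in every sequence."""
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}

# Toy input from this thread (hypothetical data, three sites).
alignment = {"sample1": "AAC", "sample2": "AAA", "sample3": "AGG"}

# Assume a recombination was detected in sample3 at coords 2-3.
masked = mask_regions(alignment, [("sample3", 2, 3)])
kept = variable_columns(masked)
filtered = filter_columns(masked, kept)

for name, seq in filtered.items():
    print(name, seq)
```

The surviving column still carries an N for sample3, because the masked base is kept as missing data while the C/A difference between the other two samples keeps the site polymorphic.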
Dear Gubbins Team,
I am trying to use Gubbins to scan a SNP alignment, which contains no missing data, for recombination. The input alignment has 21,134 sites, and the output file (my_output.filtered_polymorphic_sites.fasta) contains 20,174 sites. I initially thought that this was due to the potential recombinant sites being excluded; however, I noticed that there are 'Ns' in the output alignment. What are these due to?
I apologize in case this is a duplicate topic, as I found a similar question here: #182, but there is no answer yet.
Thank you for your kind attention,
Max Tagliamonte