Disabling BAQ in mpileup: better results? #1475
Comments
Thanks for the report. I think the bcftools documentation is misleading. I added an explanation to your Biostars post so others can see it. More broadly, the question of whether to BAQ or not to BAQ has been on my radar for the last few months. See #1474 for a shake-up of that idea, which adds a partial-BAQ mode. See also samtools/htslib#1273, which is vital for calling on amplicon sequencing. As an alternative, try |
Thank you @jkbonfield for your speedy and thorough response, I really appreciate it! |
@charbel-gem, would you mind sharing which settings you ended up using for mpileup and ivar consensus/variants? Is this for COVID amplicon data? @jkbonfield, I am confused about using --reference fasta together with -B. Aren't those supposed to cancel each other out? Or am I misunderstanding -B? Would you recommend bcftools mpileup over samtools mpileup? Can bcftools mpileup be piped into ivar? I described the issues I am having with indels and BAQ in andersen-lab/ivar#85. Thanks |
Currently, |
If evaluating bcftools here, it would be useful to check both the current 1.12 release and the current develop branch, as they have diverged substantially. The false ivar variants you show are mainly caused by alignment errors, which come from biasing alignments towards the reference rather than towards the other reads in the sample. There are two approaches: nullifying the quality of those bases so they don't count (BAQ), or doing a local reassembly in that region to get the correct alignments. Locally we have experimented with using FreeBayes for this and it does fix a number of ivar errors. The person doing this didn't evaluate bcftools though, so sadly we don't have any side-by-side comparisons. |
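To make the first approach a bit more concrete, here is a minimal sketch of the idea behind BAQ, assuming the usual description of it: each base gets a Phred-scaled per-base alignment quality, and the effective base quality is capped at that value, so bases that are probably misaligned (for example next to an indel) stop counting. The function name and the numbers are made up for illustration. This is not the htslib implementation, which derives the alignment qualities from a realignment of the read against the reference.

```python
def apply_baq(base_quals, baq_values):
    """Cap each base quality at its per-base alignment quality.

    Illustrative only: `baq_values` is a hypothetical, precomputed list of
    Phred-scaled alignment qualities, one per base of the read.
    """
    if len(base_quals) != len(baq_values):
        raise ValueError("need exactly one BAQ value per base")
    return [min(q, b) for q, b in zip(base_quals, baq_values)]


# Hypothetical read: good sequencing qualities throughout, but low
# alignment confidence for the two middle bases (e.g. beside an indel).
quals = [38, 38, 37, 36, 38, 38]
baq = [40, 40, 5, 0, 40, 40]

print(apply_baq(quals, baq))  # the two middle bases are capped to 5 and 0
```

A caller that ignores bases below some minimum quality would then skip the two middle bases, which is how this removes false variants and also how it can hide a real indel.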
Thanks @jkbonfield for the explanation about If BAQ is always in use around indels, why do I see a difference here when using -f vs. not using it? In fact, when I add -f it fails to detect what I think is a real indel: 2021_0405_01C21_AllPiles_28270.txt This is the pileup for the region with all the settings I tried. |
I also tried using
But where the coverage is zero, and I should get a stretch of Ns, I get the reference sequence instead. How can I use bcftools properly? The output.vcf.gz file only contains the variant calls, so I am losing the Ns. |
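For clarity, the behaviour being asked for can be sketched like this: any position whose depth falls below a chosen minimum becomes N instead of keeping the reference base. This is a plain-Python illustration with hypothetical names and toy data, not a description of what bcftools does internally. It assumes a consensus string and a per-position depth list of the same length are already available.

```python
def mask_low_coverage(consensus, depths, min_depth=1):
    """Replace bases with 'N' wherever depth is below min_depth.

    Illustrative only: `consensus` and `depths` are hypothetical inputs,
    one depth value per consensus position.
    """
    if len(consensus) != len(depths):
        raise ValueError("need exactly one depth value per position")
    return "".join(
        base if depth >= min_depth else "N"
        for base, depth in zip(consensus, depths)
    )


# Toy example: positions 3-5 have no coverage at all.
sequence = "ACGTACGT"
depth = [5, 3, 0, 0, 0, 2, 7, 1]

print(mask_low_coverage(sequence, depth))  # zero-depth positions become N
```

Raising `min_depth` makes the masking stricter, so positions supported by only a handful of reads are also reported as N.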
@IsabelFE the usage of |
Thanks @pd3, I will open a new issue if I decide to explore more the |
Hello.
Please refer to my post on Biostars for my question.
Thank you.