mpileup comparison #127
Comments
<rant> Honestly, I have no idea why @pjotrp decided to include this usage scenario. </rant> Each process of Also, BCF is not well supported yet, so the extra step of converting BCF into VCF is taken. Summarizing the above, here's a list of what's to be done:
Ok, thanks for your suggestion. I've been trying to get a successful sambamba mpileup run, but the threads always get stuck and I've had to kill every job. The steps I've taken are to first index the BAM file and then run
sambamba mpileup --nthreads=8
With a 2GB buffer size, that should be more than enough, I think.

Sambamba creates all the threads, and running "ps ux" shows that one of the threads generated by sambamba has the command
samtools mpileup /tmp/sambamba-fork-dnGgdy/33 -gu -l /tmp/sambamba-fork-dnGgdy/33.bed | bcftools call -cV indels
Initially all the threads take up some CPU and memory, but they quickly drop to 0% and never go away according to "ps ux". The file at /tmp/sambamba-fork-dnGgdy/33 is 0 bytes and /tmp/sambamba-fork-dnGgdy/33.bed is 23 bytes.

I have also tried running the following command directly
samtools mpileup /tmp/sambamba-fork-dnGgdy/33 -gu -l /tmp/sambamba-fork-dnGgdy/33.bed
and samtools gets stuck as well.

There seems to be a breakdown somewhere before the line of code in pileup.d that joins the threads, because they never finish. Could there be a race condition somewhere? According to test_suit.sh, mpileup is never tested. I realize it is one of the newer features of sambamba, so development may still be ongoing.
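To make a report like the one above easier to reproduce, here is a minimal sketch that lists each entry in the temporary directory with its size and kind. It rests on an assumption not confirmed anywhere in this thread: that it is worth checking whether the 0-byte chunk is a regular file or a named pipe, since a reader opening a named pipe with no writer blocks, which would look like the hang described. The helper name `describe_chunks` is hypothetical, and the directory path is the one quoted in the comment.

```python
import os
import stat


def describe_chunks(tmpdir):
    """Return (name, size_in_bytes, kind) for each entry in tmpdir, sorted by name.

    kind is "fifo" for a named pipe, "file" for a regular file, "other" otherwise.
    """
    rows = []
    for name in sorted(os.listdir(tmpdir)):
        # lstat so that a symlink is reported as itself, not as its target
        st = os.lstat(os.path.join(tmpdir, name))
        if stat.S_ISFIFO(st.st_mode):
            kind = "fifo"
        elif stat.S_ISREG(st.st_mode):
            kind = "file"
        else:
            kind = "other"
        rows.append((name, st.st_size, kind))
    return rows


# Example call (path taken from the comment above; it only exists during a run):
# for row in describe_chunks("/tmp/sambamba-fork-dnGgdy"):
#     print(*row, sep="\t")
```

If the chunk turns out to be a regular empty file rather than a pipe, this check rules that explanation out and the hang has to come from somewhere else.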
Sambamba mpileup is somewhat experimental and not well tested. It does a map-reduce using temporary files. I won't have time to look into it before April.
The multithreading issue should be fixed now, please check. I've made a number of other improvements to the tool, see v0.5.3 release notes.
This is a great tool. I am trying to compare mpileup with sambamba vs. samtools. Here's what I've seen so far:
samtools faidx ../human_g1k_v37.fa
time sambamba mpileup -t 20 -o ENCFF000CXI.sambamba.raw.bcf ../ENCFF000CXI.star.bam --samtools -gIf ../human_g1k_v37.fa
real 168m9.613s
user 121m13.895s
sys 2m7.975s
and on samtools, the real time was 62 minutes (I cleared the console, so I don't have exact numbers anymore). Why would sambamba mpileup take ~2 hours while samtools takes at most 1 hour? The BAM file is 1.6 GB and I was running on a 40-core machine. I am using sambamba v0.5.1. I haven't come across any mpileup comparisons using sambamba, so I don't have a good basis for any expectations.
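One way to read the `time` output above is to compare CPU time (user + sys) against wall-clock time (real). The sketch below parses the three values from the post and computes that ratio. It is only arithmetic on the reported numbers, not a diagnosis, and the helper names `parse_time` and `cpu_utilization` are made up for illustration.

```python
import re


def parse_time(value):
    """Convert a shell `time` field such as '168m9.613s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", value)
    if match is None:
        raise ValueError(f"unrecognized time value: {value!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


def cpu_utilization(real, user, sys_):
    """Return (user + sys) / real, the average number of cores kept busy."""
    return (parse_time(user) + parse_time(sys_)) / parse_time(real)


# Values copied from the sambamba run in the post.
ratio = cpu_utilization("168m9.613s", "121m13.895s", "2m7.975s")
print(f"average busy cores: {ratio:.2f}")  # prints "average busy cores: 0.73"
```

A ratio below 1 means that, averaged over the run, less than one core was busy even though `-t 20` was requested, so most of the wall-clock time was spent waiting rather than computing in parallel.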