sga index segfault with large values of -d #131
The command
sga index -d 20000000 -t 64 hsapiens.preprocess.filter.pass.merged.fa
segfaults with -d 20000000. Reducing to -d 1000000 works. Is each BWT batch limited in size, perhaps to 2 or 4 billion nucleotides? -d 20000000 with a mean sequence size of ~300 bp should correspond to a batch size of about 6 Gbp.
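That 2-or-4-billion guess is consistent with a batch position overflowing a 32-bit integer. A minimal sketch of the arithmetic, assuming (hypothetically; this is not sga's actual code) that positions within a batch are held in 32 bits:

```cpp
#include <cstdint>
#include <iostream>

int main() {
    const int64_t readsPerBatch  = 20000000; // sga index -d 20000000
    const int64_t meanReadLength = 300;      // ~300 bp mean sequence size
    const int64_t batchSymbols   = readsPerBatch * meanReadLength;

    // 6,000,000,000 bp does not fit in 32 bits: it wraps modulo 2^32.
    const uint32_t wrapped = static_cast<uint32_t>(batchSymbols);
    std::cout << "batch size (64-bit): " << batchSymbols << " bp\n"  // 6000000000
              << "as uint32_t:         " << wrapped      << " bp\n"; // 1705032704
    // An index that silently wraps like this would point far outside the
    // batch's buffers, consistent with a segfault rather than a clean error.
    return 0;
}
```

Under that assumption, batches above 2^31 symbols break a signed 32-bit index and batches above 2^32 break an unsigned one, which matches the 2-or-4-billion range in the question.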
Can …
Did you run out of memory with -d 20000000?
Whether it is worth using …
The fm-merge FASTA file is 20 GB, so it should be possible to construct the BWT in a single pass using SAIS in roughly 200 GB of RAM. I reported this issue because of the segfault, which is 😢. I'm happy with the -d 1000000 workaround.
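As a rough sanity check on that 200 GB figure (my arithmetic, not from the thread, assuming plain SAIS with a 64-bit suffix array, since 20 Gbp exceeds the 32-bit index range):

```cpp
#include <cstdint>
#include <iostream>

int main() {
    const int64_t textBytes = 20LL * 1000 * 1000 * 1000; // 20 GB of sequence, 1 byte/symbol
    const int64_t saBytes   = 8 * textBytes;             // 64-bit suffix array, 8 bytes/symbol
    std::cout << (textBytes + saBytes) / 1e9 << " GB\n"; // 180 GB before any working space
    return 0;
}
```

That lands at 180 GB for the text plus the suffix array alone, so ~200 GB including working space is the right ballpark.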
I don't believe so. It was using 76 GB of RAM when it crashed, and the machine has 2.5 TB available.
I'm using …
Have you read Optimal In-Place Suffix Sorting? https://arxiv.org/abs/1610.08305
sga index -d 1000000 -t 64 hsapiens.preprocess.filter.pass.merged.fa
205964.05s user 3080.39s system 232% cpu 24:56:18.90 total
9111 MB
Thanks for the update. I did see that paper from @rob-p's twitter - it's on my to-read list :)
Here are the wall-clock and memory results for SGA on human HG004 data with and without …
Interesting, thanks! I wouldn't have expected the runtimes to be (nearly) the same, but it is good to see.
It was surprising to me too. Running …