More memory problems with SARS-CoV-2 #684
Comments
To test how much memory is used, and where it exceeds the limit, you can limit memory usage the same way that Slurm does, using cgroups. I mostly followed this post, but I had to use the …
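For a quick local check without setting up cgroups, here is a rough sketch that caps a child process with `resource.setrlimit` instead. This is a different mechanism from the cgroup approach described above: `RLIMIT_AS` limits virtual address space rather than resident memory, so it will not match Slurm's accounting exactly. The helper name `run_with_memory_limit` and the toy allocation command are made up for illustration, and it assumes a Linux host.

```python
import resource
import subprocess
import sys


def run_with_memory_limit(command, limit_bytes):
    """Run command in a child process whose address space is capped.

    Hypothetical helper: uses RLIMIT_AS, not cgroups, so it only
    approximates the limit that Slurm would enforce.
    """

    def set_limit():
        # Runs in the child just before exec, so only the child is capped.
        resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))

    return subprocess.run(
        command, preexec_fn=set_limit, capture_output=True, text=True
    )


if __name__ == "__main__":
    one_gib = 1024 ** 3
    # Toy stand-in for a pipeline step: tries to allocate 4 GiB under a 1 GiB cap.
    toy_command = [sys.executable, "-c", "data = bytearray(4 * 1024 ** 3)"]
    result = run_with_memory_limit(toy_command, one_gib)
    print("return code:", result.returncode)
    print(result.stderr.strip())
```

A nonzero return code with a `MemoryError` traceback in stderr means the child hit the cap. Replacing the toy command with the real pipeline step gives a rough idea of which stage fails first.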
I reproduced the problem on my workstation with sample COVID242IPNT-Unknown_S69 from the 24 Jul 2020.M01841 run. It exceeded the memory limit after 54 minutes. I think it was in the remap stage.
I pinned it down to the Gotoh call, so this bug is more motivation to replace Gotoh. (See #556.)
Please note this is not gotoh2! "Gotoh" is a wrapper of alignment code that was in use by the lab well before I joined.
That's true, and it's also not really a bug in the Gotoh code. It's just that Gotoh is too memory-intensive for aligning sequences as long as SARS-CoV-2. I'm experimenting with minimap2.
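To put a rough number on that, here is a back-of-the-envelope sketch. It assumes a textbook Gotoh (affine-gap) alignment that keeps three full dynamic-programming matrices in memory with 4-byte cells. Whether the lab's wrapper actually stores its matrices this way is an assumption, and the function name is made up for illustration.

```python
def estimate_dp_bytes(query_len, ref_len, matrices=3, bytes_per_cell=4):
    """Estimate memory for full dynamic-programming matrices.

    Assumes `matrices` tables of (query_len + 1) x (ref_len + 1) cells,
    as in a textbook affine-gap alignment; real implementations may differ.
    """
    cells = (query_len + 1) * (ref_len + 1)
    return cells * matrices * bytes_per_cell


if __name__ == "__main__":
    sars_cov_2_len = 29903  # approximate length of the reference genome
    hiv_len = 9700  # roughly the length of an HIV-1 genome, for comparison

    for name, length in [("HIV-1", hiv_len), ("SARS-CoV-2", sars_cov_2_len)]:
        gib = estimate_dp_bytes(length, length) / 1024 ** 3
        print(f"{name}: about {gib:.1f} GiB for a full-length pairwise alignment")
```

Memory grows with the product of the two sequence lengths, so tripling the genome length costs roughly nine times the memory. Under these assumptions, aligning a full-length contig against the full SARS-CoV-2 reference needs on the order of 10 GB.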
I would be happy to see the end of the Gotoh code - it was completely unmaintainable and horrible before I put in the time trying to clean and document it, and it's still lousy! We've found minimap2 to be quite memory efficient indeed.
Although #643 fixed all the known memory problems with SARS-CoV-2 samples, we just ran into a new one. Those errors occurred during aln2counts.py, but this error seems to happen during the remap step. Maybe bowtie2 is overwhelmed by so many reads on such a long reference? The sample with the error is COVIDVOC1WG-Unknown_S1 from the 29 Jan 2021 run. The two FASTQ files are about 900 MB each. The remapped version took about 12 hours, and the assembled version took about 28 hours.

The error occurred again on the remapped version with samples COVID242IPNT, COVID241IPNT, COVID236IPNT, COVID234IPNT, COVID223IPNT, and COVID230IPNT in the 24 Jul 2020.M01841 run. The remapped version took just over two hours for sample COVID242IPNT, the fastest so far.