Empirical formula predicting disk space usage? #491
Comments
@akolesnikov on our team will look at this and reply later.
Thank you! An update: I've allocated a 500GB disk and am still seeing PAPI 10, so it would be great if I can get an understanding of this.
@SHuang-Broad, are you using a BAM file that is publicly available?
Unfortunately, it's not... Providing what I can:
All three BAMs are in the range of 15-20GB. I actually suspect that somewhere in these BAMs there are crazily high-coverage regions, and that's triggering something in DV to generate loads of temporary files.
@SHuang-Broad, yes, if coverage is that high I'd suggest subsampling the BAM file with a mapping quality of Q10 or Q20 and then trying again. I think the pipeline is getting stuck at the centromere. Do you happen to know the read N50?
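A minimal sketch of what that subsampling could look like with samtools; the Q20 mapping-quality cutoff and the 50% downsampling fraction below are illustrative values, not recommendations from this thread.

```bash
# Illustrative only: drop reads with MAPQ < 20, then optionally keep a
# random ~50% of what remains to cap coverage before running DV/PEPPER.
samtools view -b -q 20 -o sample.q20.bam sample.bam
samtools index sample.q20.bam

# Optional random downsampling: seed 42, keep ~50% of reads.
samtools view -b -s 42.5 -o sample.q20.ds50.bam sample.q20.bam
samtools index sample.q20.ds50.bam
```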
Good to know about that sub-sampling trick.
@SHuang-Broad It could be a disk limit issue, but I have a suspicion it might be memory related. Try launching the free command in the background at some interval, like 60 sec, before all the deepvariant ones:
free -s 60 > dv_mem_usage.txt
Then launch deepvariant on the same machine from another terminal. Once you get the failure, stop the logging from the command above and look to see whether memory has become exhausted. The above command can also be launched in parallel as part of the submitted command if that makes it easier. If memory turns out to be the problem, just increase the memory requirements for the job. Hope it helps,
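As a concrete (hedged) version of the above: start the monitor in the background, launch the pipeline, and kill the logger afterwards. The pipeline command below is a placeholder, not the actual invocation from this issue.

```bash
# Log memory usage every 60 s, as suggested above.
free -s 60 > dv_mem_usage.txt &
monitor_pid=$!

# Placeholder for the real PEPPER/DeepVariant command used in the workflow.
your_deepvariant_command_here

# Stop the logger once the run finishes (or fails) and inspect the log.
kill "$monitor_pid"
tail dv_mem_usage.txt
```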
Thanks, Paul! I do indeed sometimes see out-of-memory errors (e.g. for an even crazier sample with 140X coverage), but usually I get a PAPI error code 9 from that (based on experience, error code 9 indicates memory issues and error code 10 indicates disk issues; PAPI 10 is sort of a catch-all error code, so it's not very informative). Nevertheless, the procedure you described is very helpful and I'll give it a try.
Thanks!
Hi Steve, Sounds good. You can also try using the […]. Hope it helps,
An update: I've tried increasing both memory and disk, and it has worked! Given that the DV-PEPPER pipeline is a non-trivial part of the pipeline I'm building, I'll dig deeper with both @kishwarshafin's and Paul's suggestions. For the moment, when I initially prototyped the pipeline using a normal sample's ONT data, this is what I found:
[memory/CPU utilization plots from the prototype run, not reproduced here]
@SHuang-Broad Glad to hear it worked, and thank you for the nice visualization! Docker has multiple layers and DV expands its own data within them, which is something we've noticed with other folks in terms of resource requirements, and which you can sort of tell from the memory/CPU utilization profiles. You might be able to profile it more granularly, but it might be easier to start with some simple input files, and maybe a modified version of DV where you add debug information in the code, to get an idea of the points of memory/disk expansion. That way you can trace the code execution alongside the correlated flow of data processing.
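One way to get that kind of coarse profile without modifying DV is to sample memory and disk usage on an interval while the pipeline runs. This is only a sketch; the 60-second interval and the /data output directory are assumptions.

```bash
#!/usr/bin/env bash
# Sketch: append a timestamped memory/disk sample every 60 s so spikes can
# later be correlated with pipeline stages. Run it in the background (with &)
# before launching DV, and kill it when the run ends, as in the free example above.
interval=60
outdir=/data   # assumed directory where DV writes its intermediate files
while true; do
    {
        date +%s
        free -m | awk 'NR==2 {print "mem_used_mb", $3}'
        df -BG "$outdir" | awk 'NR==2 {print "disk_used_gb", $3}'
    } >> resource_usage.txt
    sleep "$interval"
done
```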
Hi @SHuang-Broad, thanks for bringing up the issue about the logging file size.
In a quick test on a WGS BAM, both now log at roughly 1-2 min intervals, which hopefully is more reasonable. Before the release, I'll check the size of the log files as well.
@pichuan thanks for the work!
I'm closing this issue now. I don't think we have a very easy solution for the original question in the title, but hopefully this thread is still useful. |
Have you checked the FAQ? https://github.com/google/deepvariant/blob/r1.2/docs/FAQ.md:
Yes
Describe the issue:
When running WDL workflows backed with PAPI, I get PAPI error 10, which indicates the disk is full.
Setup
kishwars/pepper_deepvariant:r0.4.1
kishwars/pepper_deepvariant:r0.4.1
Steps to reproduce:
Relevant part of the log file (which is over 200MB):
For one failed task, the input BAM size is 19GB, and allocated disk size is 300GB.
Does the quick start test work on your system?
Some inputs finish, while others fail using the exact same workflow (PAPI error 10), so it's unlikely to be a coding issue.
Any additional context:
We have successful runs with inputs of similar size to the ones that failed with PAPI 10, so I'm wondering whether there's an empirical formula for predicting disk space usage.
Additionally, is there a way to make DV less verbose? The log file grows to hundreds of MB, which makes debugging harder.
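This thread doesn't surface a built-in DV verbosity flag, but one hedged workaround is to compress the log stream on the fly so a verbose run doesn't also fill the disk with plain-text logs; the command name below is a placeholder.

```bash
# Placeholder command; substitute the real PEPPER/DeepVariant invocation.
# stdout and stderr are gzip-compressed as they are produced.
your_deepvariant_command_here 2>&1 | gzip -c > dv_run.log.gz

# Inspect later without decompressing to disk.
zless dv_run.log.gz
```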
Thanks!
Steve