Parallel instances #214
Comments
Thanks for reporting @gitMakeCoffee - it's definitely not a known issue! Were the problematic VCF + HTML outputs written into
Thank you for the reply.
Very interesting observation @gitMakeCoffee; Peter and I have discussed it a bit already. And yes, PCGR generates many intermediate files under the hood, so there might very well be some weaknesses there. Generally, I think the most likely weakness (i.e. the one causing issues when running in parallel) is in the last step of PCGR (reporting with RMarkdown); the first part should, in general, be more robust when it comes to handling sample-specific output.

However, on that note: have you looked at the log files for the samples that did not produce any VCFs (here I refer to the PCGR-annotated VCFs, containing the pcgr_acmg tag)? Also, are some of the pcgr_acmg VCF files from different samples (with different query VCFs) identical?

Thanks again for reporting this; it is very valuable for us when it comes to improving the intermediate file handling. I am confident we will get to the bottom of it and resolve it eventually :-)

best,
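One way to answer the question about identical outputs is to hash every output file of a given type and group the files by digest. The following is a minimal sketch and not part of PCGR: the directory name `pcgr_output` and the `*.vcf.gz` pattern are assumptions and need to be adapted to the actual output layout. Files ending in `.gz` are decompressed before hashing, so that differences in compression metadata alone do not hide identical content.

```python
import gzip
import hashlib
from collections import defaultdict
from pathlib import Path


def content_digest(path: Path) -> str:
    """Return the SHA-256 of a file's content, decompressing .gz files first."""
    opener = gzip.open if path.suffix == ".gz" else open
    digest = hashlib.sha256()
    with opener(path, "rb") as handle:
        # Read in 1 MiB chunks so large VCFs do not need to fit in memory.
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical_files(output_dir: str, pattern: str) -> list[list[Path]]:
    """Group files matching `pattern` under `output_dir` that share the same content."""
    groups = defaultdict(list)
    for path in sorted(Path(output_dir).rglob(pattern)):
        if path.is_file():
            groups[content_digest(path)].append(path)
    # Only groups with more than one member indicate duplicated outputs.
    return [paths for paths in groups.values() if len(paths) > 1]


if __name__ == "__main__":
    # Hypothetical directory and pattern; adjust to the real output layout.
    for group in find_identical_files("pcgr_output", "*.vcf.gz"):
        print("Identical content:", ", ".join(str(p) for p in group))
```

Any group that is printed contains files from different samples with the same content, which would be consistent with the mix-up described in this issue. The same function can be pointed at the HTML reports by changing the pattern.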
Thanks.
Sorry, I can't share the output files, as these are sensitive clinical data. However, maybe these issues could be replicated with public VCF files. Please let me know if you have any more questions; I'd be glad to help.
Hello,
I have been running PCGR (v1.4.1, GRCh37) on some clinical samples. These were just tests, so I didn't use proper pipelining workflows. I just used xargs to run samples in parallel:

The command runs correctly and generates reports for all samples.
However, upon closer inspection, some outputs (including reports) for different samples are exactly identical, as if they had been mixed up. I double-checked the input VCF files, which were of course very different.
Running xargs without the -P 4 option (i.e. running all samples sequentially) fixes the problem. In other words, it seems like this may be linked to PCGR running multiple instances in parallel.

Is it a known issue?
Thanks.
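If the mix-up comes from parallel instances sharing intermediate files, which is only a hypothesis in this thread, one thing worth trying is to give every sample its own output and temporary directory. The sketch below is a generic Python runner and not a confirmed fix. `build_command` is a hypothetical callback that returns the full command line for one sample (the actual PCGR arguments are deliberately left out), and `max_workers=4` mirrors `xargs -P 4`.

```python
import os
import subprocess
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_sample(sample_id, out_root, build_command):
    """Run one sample in its own output directory with a private TMPDIR."""
    out_dir = (Path(out_root) / sample_id).resolve()
    tmp_dir = out_dir / "tmp"
    tmp_dir.mkdir(parents=True, exist_ok=True)
    # Assumption: the tool and its helpers honour TMPDIR for scratch files.
    env = dict(os.environ, TMPDIR=str(tmp_dir))
    with open(out_dir / "run.log", "w") as log:
        result = subprocess.run(
            build_command(sample_id, out_dir),  # hypothetical callback
            cwd=out_dir,
            env=env,
            stdout=log,
            stderr=subprocess.STDOUT,
        )
    return result.returncode


def run_all(sample_ids, out_root, build_command, max_workers=4):
    """Run all samples, at most `max_workers` at a time; return exit codes."""
    sample_ids = list(sample_ids)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        codes = pool.map(lambda s: run_sample(s, out_root, build_command), sample_ids)
        return dict(zip(sample_ids, codes))
```

Each sample gets a separate working directory, log file, and `TMPDIR`, so any intermediate file written relative to those locations cannot collide with another sample. Whether this avoids the problem depends on where PCGR actually writes its intermediate files, which this thread does not establish.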