extract stuck re-writing output files
#6
Comments
@cbreenmachine I'm encountering a similar problem. Did you manage to find a solution or workaround?
Unfortunately no. Went back and forth with sys admins for weeks and they couldn't figure it out. Ended up abandoning this and using the gem mapper and bs_call separately. While gemBS is fast when it's working, it seems to be buggy and not actively maintained. I ended up losing a lot of time working around these types of issues. I'd recommend using bismark instead.
Bismark and gemBS do not perform the same analyses at all. Bismark takes no
account of variants, sequencing errors or sequencing inefficiencies, does
not do variant calling, etc.
The problem is that I cannot debug a problem that I cannot reproduce. If I
can be provided with a dataset and an environment that reliably causes the
problem then I can try to track it down. An error that only occurs in a
few installations is impossible for me to diagnose. Not being able to
track down the problems running in a particular environment that I do not
have access to does not mean that gemBS is no longer being maintained!
Yours,
Simon
On Fri, May 3, 2024 at 5:32 PM Coleman Breen ***@***.***> wrote:
Unfortunately no. Went back and forth with sys admins for weeks and they
couldn't figure it out. Ended up abandoning this and using the gem mapper
<https://github.com/smarco/gem3-mapper> and bs_call separately. While
gemBS is fast when it's working, it does not seem to be actively maintained
and buggy. I ended up losing a lot of time working around these types of
issues. I'd recommend using bismark
<https://www.bioinformatics.babraham.ac.uk/projects/bismark/> instead.
Hi Simon,
Thanks again for your help earlier with the non-stranded issue. I am having one (hopefully easy-to-fix) issue with methylation extraction.
Running extract on a few human WGBS datasets does not finish. For some context, mapping and calling on five WGBS datasets can take 2-3 days on my current system without a problem. Running extract will go for a week or more. The command itself seems to work, but it gets stuck in a loop: it writes an output file, does an asset check, decides the file needs to be created, and then starts the whole process over. Because of this, checking the size of the extract output folder every five minutes gives something like 22 G --> 28 G --> 34 G --> 41 G --> 20 G --> ...
Here's what I have tried:
I've interrupted the process and looked at the files, and they seem reasonable and match my collaborator's outputs. Of course, because I interrupted the process, the EOF and most of the data have not been written.
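For anyone who wants to reproduce the folder-size tracking described above, here is a minimal Python sketch. The function names (dir_size_bytes, find_drops, watch) and the default interval are my own choices for illustration and are not part of gemBS. A drop in size between two samples is only a hint that output files were removed and are being re-written; it does not say why.

```python
import os
import time


def dir_size_bytes(path):
    """Total size in bytes of all regular files under path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            try:
                total += os.path.getsize(os.path.join(root, name))
            except OSError:
                # File was removed between listing and stat; skip it.
                pass
    return total


def find_drops(sizes):
    """Return the indices i where sizes[i] is smaller than sizes[i - 1]."""
    return [i for i in range(1, len(sizes)) if sizes[i] < sizes[i - 1]]


def watch(path, interval_s=300, samples=12):
    """Sample the folder size every interval_s seconds and print any drop."""
    sizes = []
    for _ in range(samples):
        sizes.append(dir_size_bytes(path))
        if len(sizes) > 1 and sizes[-1] < sizes[-2]:
            print(f"size dropped: {sizes[-2]} -> {sizes[-1]} bytes")
        time.sleep(interval_s)
    return sizes
```

Calling watch("./extract") samples the folder every five minutes for an hour; find_drops can be applied to the returned list, or to any series of sizes recorded by hand, to locate the points where the output shrank.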
Is there any more information I can provide? Or is this a case of user-error?
Thanks very much!
Best,
Coleman
Here is console output in default mode:
/usr/local/lib/gemBS/bin/mextr --loglevel info --compress --md5 --regions-file ./extract/sample002_108_mextr_ctgs.bed --cpgfile ./extract/sample002_108_cpg.txt.gz --tabix ./calls/sample002_108.bcf
INFO - Launch: /usr/local/lib/gemBS/bin/mextr --loglevel info --compress --md5 --regions-file ./extract/sample003_109_mextr_ctgs.bed --cpgfile ./extract/sample003_109_cpg.txt.gz --tabix ./calls/sample003_109.bcf
INFO - Launch: /usr/local/lib/gemBS/bin/mextr --loglevel info --compress --md5 --regions-file ./extract/sample004_110_mextr_ctgs.bed --cpgfile ./extract/sample004_110_cpg.txt.gz --tabix ./calls/sample004_110.bcf
INFO - Launch: /usr/local/lib/gemBS/bin/mextr --loglevel info --compress --md5 --regions-file ./extract/sample005_111_mextr_ctgs.bed --cpgfile ./extract/sample005_111_cpg.txt.gz --tabix ./calls/sample005_111.bcf
INFO - Launch: /usr/local/lib/gemBS/bin/mextr --loglevel info --compress --md5 --regions-file ./extract/sample002_108_mextr_ctgs.bed --cpgfile ./extract/sample002_108_cpg.txt.gz --tabix ./calls/sample002_108.bcf
INFO - Launch: /usr/local/lib/gemBS/bin/mextr --loglevel info --compress --md5 --regions-file ./extract/sample003_109_mextr_ctgs.bed --cpgfile ./extract/sample003_109_cpg.txt.gz --tabix ./calls/sample003_109.bcf
INFO - Launch: /usr/local/lib/gemBS/bin/mextr --loglevel info --compress --md5 --regions-file ./extract/sample004_110_mextr_ctgs.bed --cpgfile ./extract/sample004_110_cpg.txt.gz --tabix ./calls/sample004_110.bcf
INFO - Launch: /usr/local/lib/gemBS/bin/mextr --loglevel info --compress --md5 --regions-file ./extract/sample005_111_mextr_ctgs.bed --cpgfile ./extract/sample005_111_cpg.txt.gz --tabix ./calls/sample005_111.bcf
INFO - Launch: /usr/local/lib/gemBS/bin/mextr --loglevel info --compress --md5 --regions-file ./extract/sample002_108_mextr_ctgs.bed --cpgfile ./extract/sample002_108_cpg.txt.gz --tabix ./calls/sample002_108.bcf
INFO - Launch: /usr/local/lib/gemBS/bin/mextr --loglevel info --compress --md5 --regions-file ./extract/sample003_109_mextr_ctgs.bed --cpgfile ./extract/sample003_109_cpg.txt.gz --tabix ./calls/sample003_109.bcf
INFO - Launch: /usr/local/lib/gemBS/bin/mextr --loglevel info --compress --md5 --regions-file ./extract/sample004_110_mextr_ctgs.bed --cpgfile ./extract/sample004_110_cpg.txt.gz --tabix ./calls/sample004_110.bcf
INFO - Launch: /usr/local/lib/gemBS/bin/mextr --loglevel info --compress --md5 --regions-file ./extract/sample005_111_mextr_ctgs.bed --cpgfile ./extract/sample005_111_cpg.txt.gz --tabix ./calls/sample005_111.bcf
INFO - Launch: /usr/local/lib/gemBS/bin/mextr --loglevel info --compress --md5 --regions-file ./extract/sample002_108_mextr_ctgs.bed --cpgfile ./extract/sample002_108_cpg.txt.gz --tabix ./calls/sample002_108.bcf
INFO - Launch: /usr/local/lib/gemBS/bin/mextr --loglevel info --compress --md5 --regions-file ./extract/sample003_109_mextr_ctgs.bed --cpgfile ./extract/sample003_109_cpg.txt.gz --tabix ./calls/sample003_109.bcf
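To make the repetition in the console output above easier to quantify, a saved copy of the log can be scanned for repeated mextr launches. This is a rough sketch: it assumes only that each launch command contains a --cpgfile argument, as in the lines shown here, and the helper names count_launches and relaunched are made up for this example.

```python
import re
from collections import Counter

# Matches the path that follows --cpgfile in each mextr command line.
CPGFILE_RE = re.compile(r"--cpgfile\s+(\S+)")


def count_launches(log_text):
    """Count how many times mextr was launched for each --cpgfile target."""
    return Counter(CPGFILE_RE.findall(log_text))


def relaunched(log_text):
    """Return the targets that were launched more than once, sorted by name."""
    counts = count_launches(log_text)
    return sorted(target for target, n in counts.items() if n > 1)
```

Passing the full text of a captured log to relaunched lists the samples for which extraction was started repeatedly, which may be a convenient summary to attach to a bug report.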
This pattern will continue over and over until interrupted. And then in debug mode (run on a different batch of five samples):
DEBUG - Asset check: "sample009_115_cpg.bed.gz" "./extract/sample009_115_cpg.bed.gz" Absent
DEBUG - Asset check: "sample009_115_cpg.bed.gz.md5" "./extract/sample009_115_cpg.bed.gz.md5" Absent
DEBUG - Asset check: "sample009_115_cpg.bb" "./extract/sample009_115_cpg.bb" Absent
DEBUG - Asset check: "sample009_115_cpg.bb.md5" "./extract/sample009_115_cpg.bb.md5" Absent
DEBUG - Asset check: "sample009_115_chg.bed.gz" "./extract/sample009_115_chg.bed.gz" Absent
DEBUG - Asset check: "sample009_115_chg.bed.gz.md5" "./extract/sample009_115_chg.bed.gz.md5" Absent
DEBUG - Asset check: "sample009_115_chg.bb" "./extract/sample009_115_chg.bb" Absent
DEBUG - Asset check: "sample009_115_chg.bb.md5" "./extract/sample009_115_chg.bb.md5" Absent
DEBUG - Asset check: "sample009_115_chh.bed.gz" "./extract/sample009_115_chh.bed.gz" Absent
DEBUG - Asset check: "sample009_115_chh.bed.gz.md5" "./extract/sample009_115_chh.bed.gz.md5" Absent
DEBUG - Asset check: "sample009_115_chh.bb" "./extract/sample009_115_chh.bb" Absent
DEBUG - Asset check: "sample009_115_chh.bb.md5" "./extract/sample009_115_chh.bb.md5" Absent
DEBUG - Asset check: "sample009_115_.bw" "./extract/sample009_115_.bw" Absent
DEBUG - Asset check: "sample009_115_.bw.md5" "./extract/sample009_115_.bw.md5" Absent
DEBUG - Asset check: "report.tex" "./report/GemBS_QC_Report.tex" Present
DEBUG - Asset check: "report.html" "./report/GemBS_QC_Report.html" Present
DEBUG - Avail slots: 45.0001, avail memory: 11.6 GB
DEBUG - No execution slots
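The asset checks in the debug output above can be grouped by status to see at a glance which files gemBS considers missing. The sketch below assumes the line format is exactly as shown here (a quoted asset name, a quoted path, then a status word); only the Absent and Present statuses appear in this log, so any other value is simply grouped under its own key. The name assets_by_status is hypothetical.

```python
import re
from collections import defaultdict

# Captures: asset name, asset path, status word (e.g. Absent, Present).
ASSET_RE = re.compile(r'Asset check: "([^"]+)" "([^"]+)" (\w+)')


def assets_by_status(log_text):
    """Group asset paths from 'Asset check' lines by their reported status."""
    groups = defaultdict(list)
    for _name, path, status in ASSET_RE.findall(log_text):
        groups[status].append(path)
    return dict(groups)
```

Comparing the Absent list against what is actually on disk at the moment of the check could show whether the files are truly missing or are being judged missing for some other reason.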