Add gatktool to conda env for CNNScoreVariants #1786

Closed
wants to merge 2 commits into from


FriederikeHanssen
Contributor

PR checklist

Closes #XXX

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool, have you followed the module conventions in the contribution docs?
  • If necessary, include test data in your PR.
  • Remove all TODO statements.
  • Emit the versions.yml file.
  • Follow the naming conventions.
  • Follow the parameters requirements.
  • Follow the input/output options guidelines.
  • Add a resource label
  • Use BioConda and BioContainers if possible to fulfil software requirements.
  • Ensure that the test works with either Docker / Singularity. Conda CI tests can be quite flaky:
    • PROFILE=docker pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
    • PROFILE=singularity pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware
    • PROFILE=conda pytest --tag <MODULE> --symlink --keep-workflow-wd --git-aware

@muffato
Member

muffato commented Nov 30, 2022

By the way, @FriederikeHanssen, do you know why this module uses the official container broadinstitute/gatk:4.3.0.0 rather than the biocontainers one, quay.io/biocontainers/gatk4:4.3.0.0--py36hdfd78af_0 (which has a Singularity version as well)?

@FriederikeHanssen
Contributor Author

Yes, because I could never get the biocontainers one to work due to the missing dependency in the conda package. I didn't think about building a mulled container at that point 🙈 I think I opened an issue somewhere to fix the conda recipe. Using a mulled one seems like a good solution though.
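
Editor's sketch: one quick way to confirm a missing Python dependency like the one described above is to ask the interpreter inside the conda environment whether the module can be located. The snippet below is a minimal sketch, not part of the PR. It assumes the import name is `gatktool`, as in the PR title, and the helper name `missing_modules` is hypothetical.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    # find_spec() returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumption: the package from the PR title is imported as "gatktool".
    required = ["gatktool"]
    missing = missing_modules(required)
    if missing:
        print("Missing Python modules:", ", ".join(missing))
    else:
        print("All required Python modules are importable.")
```

Running this with the `python` that belongs to the environment under test (rather than a system interpreter) is what makes the result meaningful.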

@muffato muffato self-requested a review November 30, 2022 09:14
@muffato
Member

muffato commented Nov 30, 2022

Oki. I'll give the mulled container a try 🤞🏼

@maxulysse
Member

@muffato
Member

muffato commented Nov 30, 2022

@FriederikeHanssen I misread the CI: the test is actually still failing on Conda despite your change. I didn't have to try a mulled container at all!

Here is the error log:

Error executing process > 'test_gatk4_cnnscorevariants:GATK4_CNNSCOREVARIANTS (test)'
                                                                                                                                                                                                      
Caused by:
  Process `test_gatk4_cnnscorevariants:GATK4_CNNSCOREVARIANTS (test)` terminated with an error exit status (3)
                                                                                                                                                                                                      
Command executed:

  gatk --java-options "-Xmx3g" CNNScoreVariants \
      --variant test.genome.vcf.gz \
      --output test.cnn.vcf.gz \
      --reference genome.fasta \
       \
       \
       \
       \
      --tmp-dir . \
  
  
  cat <<-END_VERSIONS > versions.yml
  "test_gatk4_cnnscorevariants:GATK4_CNNSCOREVARIANTS":
      gatk4: $(echo $(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*$//')
  END_VERSIONS

Command exit status:
  3

Command output:
  (empty)

Command error:
  Using GATK jar /lustre/scratch123/tol/teams/tolit/users/mm49/cache/nextflow/conda/env-03c4af1da17e374b2411e3484a94d244/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar
  Running:
      java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx3g -jar /lustre/scratch123/tol/teams/tolit/users/mm49/cache/nextflow/conda/env-03c4af1da17e374b2411e3484a94d244/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar CNNScoreVariants --variant test.genome.vcf.gz --output test.cnn.vcf.gz --reference genome.fasta --tmp-dir .
  21:35:48.407 INFO  NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/lustre/scratch123/tol/teams/tolit/users/mm49/cache/nextflow/conda/env-03c4af1da17e374b2411e3484a94d244/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
  21:35:48.538 INFO  CNNScoreVariants - ------------------------------------------------------------
  21:35:48.538 INFO  CNNScoreVariants - The Genome Analysis Toolkit (GATK) v4.3.0.0
  21:35:48.538 INFO  CNNScoreVariants - For support and documentation go to https://software.broadinstitute.org/gatk/
  21:35:48.538 INFO  CNNScoreVariants - Executing as mm49@tol-1-12-2 on Linux v4.15.0-175-generic amd64
  21:35:48.539 INFO  CNNScoreVariants - Java runtime: OpenJDK 64-Bit Server VM v11.0.15-internal+0-adhoc..src
  21:35:48.539 INFO  CNNScoreVariants - Start Date/Time: 30 November 2022 at 21:35:48 GMT
  21:35:48.539 INFO  CNNScoreVariants - ------------------------------------------------------------
  21:35:48.539 INFO  CNNScoreVariants - ------------------------------------------------------------
  21:35:48.540 INFO  CNNScoreVariants - HTSJDK Version: 3.0.1
  21:35:48.540 INFO  CNNScoreVariants - Picard Version: 2.27.5
  21:35:48.540 INFO  CNNScoreVariants - Built for Spark Version: 2.4.5
  21:35:48.540 INFO  CNNScoreVariants - HTSJDK Defaults.COMPRESSION_LEVEL : 2
  21:35:48.540 INFO  CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
  21:35:48.540 INFO  CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
  21:35:48.540 INFO  CNNScoreVariants - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
  21:35:48.540 INFO  CNNScoreVariants - Deflater: IntelDeflater
  21:35:48.540 INFO  CNNScoreVariants - Inflater: IntelInflater
  21:35:48.540 INFO  CNNScoreVariants - GCS max retries/reopens: 20
  21:35:48.541 INFO  CNNScoreVariants - Requester pays: disabled
  21:35:48.541 INFO  CNNScoreVariants - Initializing engine
  21:35:48.714 INFO  FeatureManager - Using codec VCFCodec to read file file://test.genome.vcf.gz
  21:35:48.719 WARN  IntelInflater - Zero Bytes Written : 0
  21:35:48.726 WARN  IntelInflater - Zero Bytes Written : 0
  21:35:48.732 INFO  CNNScoreVariants - Done initializing engine
  21:35:48.733 INFO  NativeLibraryLoader - Loading libgkl_utils.so from jar:file:/lustre/scratch123/tol/teams/tolit/users/mm49/cache/nextflow/conda/env-03c4af1da17e374b2411e3484a94d244/share/gatk4-4.3.0.0-0/gatk-package-4.3.0.0-local.jar!/com/intel/gkl/native/libgkl_utils.so
  21:35:48.832 INFO  CNNScoreVariants - Done scoring variants with CNN.
  21:35:48.832 INFO  CNNScoreVariants - Shutting down engine
  [30 November 2022 at 21:35:48 GMT] org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants done. Elapsed time: 0.01 minutes.
  Runtime.totalMemory()=2147483648
  java.lang.NullPointerException
        at org.broadinstitute.hellbender.utils.runtime.ProcessControllerAckResult.hasMessage(ProcessControllerAckResult.java:49)
        at org.broadinstitute.hellbender.utils.runtime.ProcessControllerAckResult.getDisplayMessage(ProcessControllerAckResult.java:69)
        at org.broadinstitute.hellbender.utils.runtime.StreamingProcessController.waitForAck(StreamingProcessController.java:235)
        at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.waitForAck(StreamingPythonScriptExecutor.java:216)
        at org.broadinstitute.hellbender.utils.python.StreamingPythonScriptExecutor.sendSynchronousCommand(StreamingPythonScriptExecutor.java:183)
        at org.broadinstitute.hellbender.tools.walkers.vqsr.CNNScoreVariants.onTraversalStart(CNNScoreVariants.java:313)
        at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1093)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
        at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
        at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
        at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
        at org.broadinstitute.hellbender.Main.main(Main.java:289)
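
Editor's sketch: the `versions.yml` step in the command above extracts the GATK version with `sed 's/^.*(GATK) v//; s/ .*$//'`. The Python snippet below mirrors those two substitutions to show what they keep. It is an illustration only: the function name `parse_gatk_version` is hypothetical, and the sample text is assembled from banner lines in the log above, not from real `gatk --version` output.

```python
import re


def parse_gatk_version(output: str) -> str:
    """Mimic: echo $(gatk --version 2>&1) | sed 's/^.*(GATK) v//; s/ .*$//'"""
    # The unquoted echo $(...) collapses newlines and repeated spaces.
    flat = " ".join(output.split())
    # First sed expression: drop everything up to and including "(GATK) v".
    flat = re.sub(r"^.*\(GATK\) v", "", flat)
    # Second sed expression: drop everything from the first space onwards.
    return re.sub(r" .*$", "", flat)


# Assumed sample, built from lines that appear in the log above.
sample = (
    "The Genome Analysis Toolkit (GATK) v4.3.0.0\n"
    "HTSJDK Version: 3.0.1\n"
    "Picard Version: 2.27.5\n"
)
print(parse_gatk_version(sample))  # prints 4.3.0.0
```

Note that this step sits after the `gatk` call in the script, so the failure shown in the trace is independent of it.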

@FriederikeHanssen
Contributor Author

lol I checked this on my phone this morning and didn't see that this was my PR. But it stands: I could never get it to work and eventually gave up. 😆

Found the issue I had in mind: broadinstitute/gatk#7811

@FriederikeHanssen
Contributor Author

Thanks for picking this up again, I completely forgot about this PR. This one would have benefited from being nagged by an i-am-stale-please-take-care-of-me-bot 😆

@muffato
Member

muffato commented Nov 30, 2022

OK. I'll close this PR and open an issue instead
