Segfault in the native PairHMM on certain CPU / JVM combinations #187

Open
droazen opened this issue May 26, 2023 · 0 comments

We have reports (https://gatk.broadinstitute.org/hc/en-us/community/posts/15590727717659-A-fatal-error-has-been-detected-by-the-Java-Runtime-Environment-when-running-Haplotypecaller-) that the GKL can crash on certain CPU architectures and/or with certain JVMs. Below is an excerpt of a core dump from a user running on an Intel(R) Xeon(R) Gold 6242R CPU with OpenJDK 64-Bit Server VM 17.0.3. Another user added the following:

"This particular error was quite prominent on all of my Skylake/Cascade Lake class hardware, especially when I used the OpenJDK provided by the OS repository. Is it possible for you to try OpenJDK from Eclipse Temurin (www.adoptium.net)? There are other options available as well; for example, using the Docker instance of GATK could help. You may also want to try reducing the number of pairHMM threads to 1 instead of the default 4 or whatever other value your command uses. AVX512-capable CPUs from Intel tend to have this issue with certain Java builds."
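
The thread-count workaround suggested above could be tried with something like the following sketch. It only assembles a command line and does not run anything. It assumes the `gatk` wrapper script is on the PATH and reuses the file names from the crash report below; the helper name `haplotypecaller_args` is made up for illustration. `--native-pair-hmm-threads` is the HaplotypeCaller argument that controls the native PairHMM thread count. The optional `--pair-hmm-implementation LOGLESS_CACHING` fallback to the pure-Java PairHMM is not part of the user's suggestion; it is included only as a further thing to try, and it is expected to be slower.

```python
import shlex


def haplotypecaller_args(reference, bam, output, pairhmm_threads=1, pairhmm_impl=None):
    """Build a HaplotypeCaller command line (hypothetical helper, not part of GATK)."""
    args = [
        "gatk", "HaplotypeCaller",   # assumes the gatk wrapper script is on PATH
        "-R", reference,
        "-I", bam,
        "-O", output,
        # Workaround from the forum post: 1 native PairHMM thread instead of the default 4.
        "--native-pair-hmm-threads", str(pairhmm_threads),
    ]
    if pairhmm_impl is not None:
        # Optional and not from the report: e.g. "LOGLESS_CACHING" selects the Java PairHMM.
        args += ["--pair-hmm-implementation", pairhmm_impl]
    return args


if __name__ == "__main__":
    # File names are the ones that appear in the crash report's command line.
    cmd = haplotypecaller_args("Gmax_508_v4.0.fa", "93Y21.bam", "rawvariants.vcf")
    print(shlex.join(cmd))
```

If the crash disappears with one thread, or only with the Java implementation, that would be useful information to add to this issue.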

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007f799b0bda54, pid=268132, tid=268133
#
# JRE version: OpenJDK Runtime Environment (17.0.3) (build 17.0.3-internal+0-adhoc..src)
# Java VM: OpenJDK 64-Bit Server VM (17.0.3-internal+0-adhoc..src, mixed mode, sharing, tiered, compressed oops, compressed class ptrs, g1 gc, linux-amd64)
# Problematic frame:
# C  [libc.so.6+0x8fa54]  __memset_sse2+0x54
#
# Core dump will be written. Default location: /scale03/nikhil/augustus2/snpcalling1/93Y21/core.268132
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

---------------  S U M M A R Y ------------

Command Line: -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 /scale03/nikhil/envs/gatk4/share/gatk4-4.4.0.0-0/gatk-package-4.4.0.0-local.jar HaplotypeCaller -R Gmax_508_v4.0.fa -I 93Y21.bam -O rawvariants.vcf

Host: Intel(R) Xeon(R) Gold 6242R CPU @ 3.10GHz, 40 cores, 2636G, CentOS Linux release 7.9.2009 (Core)
Time: Thu May 18 10:14:49 2023 CDT elapsed time: 28701.945198 seconds (0d 7h 58m 21s)

---------------  T H R E A D  ---------------

Current thread (0x00007f79940146c0):  JavaThread "main" [_thread_in_native, id=268133, stack(0x00007f799b4b7000,0x00007f799b5b8000)]

Stack: [0x00007f799b4b7000,0x00007f799b5b8000],  sp=0x00007f799b462e28,  free space=18014398509481647k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libc.so.6+0x8fa54]  __memset_sse2+0x54
C  [libgkl_pairhmm_omp8599022423123915931.so+0x1500f]  Java_com_intel_gkl_pairhmm_IntelPairHmm_computeLikelihoodsNative._omp_fn.0+0xcf

Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
J 7481  com.intel.gkl.pairhmm.IntelPairHmm.computeLikelihoodsNative([Ljava/lang/Object;[Ljava/lang/Object;[D)V (0 bytes) @ 0x00007f79858c7e9c [0x00007f79858c7e20+0x000000000000007c]
J 8402 c2 com.intel.gkl.pairhmm.IntelPairHmm.computeLikelihoods([Lorg/broadinstitute/gatk/nativebindings/pairhmm/ReadDataHolder;[Lorg/broadinstitute/gatk/nativebindings/pairhmm/HaplotypeDataHolder;[D)V (87 bytes) @ 0x00007f7985bbb200 [0x00007f7985bbb1c0+0x0000000000000040]
J 6576 c2 org.broadinstitute.hellbender.utils.pairhmm.VectorLoglessPairHMM.computeLog10Likelihoods(Lorg/broadinstitute/hellbender/utils/genotyper/LikelihoodMatrix;Ljava/util/List;Lorg/broadinstitute/hellbender/utils/pairhmm/PairHMMInputScoreImputator;)V (450 bytes) @ 0x00007f798575ef78 [0x00007f798575d840+0x0000000000001738]
J 7873 c2 org.broadinstitute.hellbender.tools.walkers.haplotypecaller.PairHMMLikelihoodCalculationEngine.computeReadLikelihoods(Lorg/broadinstitute/hellbender/utils/genotyper/LikelihoodMatrix;)V (132 bytes) @ 0x00007f79858c0fd4 [0x00007f79858bf100+0x0000000000001ed4]
J 8416 c2 org.broadinstitute.hellbender.tools.walkers.haplotypecaller.PairHMMLikelihoodCalculationEngine.computeReadLikelihoods(Lorg/broadinstitute/hellbender/tools/walkers/haplotypecaller/AssemblyResultSet;Lorg/broadinstitute/hellbender/utils/genotyper/SampleList;Ljava/util/Map;Z)Lorg/broadinstitute/hellbender/utils/genotyper/AlleleLikelihoods; (25 bytes) @ 0x00007f7985bc9850 [0x00007f7985bc70a0+0x00000000000027b0]
J 8280 c2 org.broadinstitute.hellbender.tools.walkers.haplotypecaller.HaplotypeCallerEngine.callRegion(Lorg/broadinstitute/hellbender/engine/AssemblyRegion;Lorg/broadinstitute/hellbender/engine/FeatureContext;Lorg/broadinstitute/hellbender/engine/ReferenceContext;)Ljava/util/List; (1934 bytes) @ 0x00007f7985b3ca1c [0x00007f7985b3a120+0x00000000000028fc]
J 9114% c2 org.broadinstitute.hellbender.engine.AssemblyRegionWalker.processReadShard(Lorg/broadinstitute/hellbender/engine/MultiIntervalLocalReadShard;Lorg/broadinstitute/hellbender/engine/ReferenceDataSource;Lorg/broadinstitute/hellbender/engine/FeatureManager;)V (154 bytes) @ 0x00007f7985d843ac [0x00007f7985d83c00+0x00000000000007ac]
j  org.broadinstitute.hellbender.engine.AssemblyRegionWalker.traverse()V+83
j  org.broadinstitute.hellbender.engine.GATKTool.doWork()Ljava/lang/Object;+19
j  org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool()Ljava/lang/Object;+34
j  org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs()Ljava/lang/Object;+225
j  org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain([Ljava/lang/String;)Ljava/lang/Object;+14
j  org.broadinstitute.hellbender.Main.runCommandLineProgram(Lorg/broadinstitute/hellbender/cmdline/CommandLineProgram;[Ljava/lang/String;)Ljava/lang/Object;+20
j  org.broadinstitute.hellbender.Main.mainEntry([Ljava/lang/String;)V+22
j  org.broadinstitute.hellbender.Main.main([Ljava/lang/String;)V+8
v  ~StubRoutines::call_stub

siginfo: si_signo: 11 (SIGSEGV), si_code: 2 (SEGV_ACCERR), si_addr: 0x00007f799b4a97c0
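
One detail in the log above may be worth checking: the "free space" figure in the `Stack:` line is implausibly large. The sketch below redoes the arithmetic with the addresses copied from the log. It assumes, without having checked the JVM source, that the figure is the byte difference `sp - stack_low` printed as an unsigned 64-bit value divided by 1024; the function names are made up for illustration. Under that assumption the reported number is reproduced exactly, which would mean the stack pointer sat below the low end of the reported stack range, with the faulting address `si_addr` lying between the two. This is an observation about the numbers, not a confirmed diagnosis.

```python
# Addresses copied from the THREAD section and the siginfo line of the crash log.
STACK_LOW = 0x00007F799B4B7000    # low end of the reported stack range
STACK_HIGH = 0x00007F799B5B8000   # high end of the reported stack range
SP = 0x00007F799B462E28           # stack pointer at the time of the crash
SI_ADDR = 0x00007F799B4A97C0      # faulting address from siginfo
REPORTED_FREE_K = 18014398509481647  # "free space" value printed in the log


def free_space_bytes(sp, stack_low):
    """Signed distance from the low end of the stack to sp (negative if sp is below it)."""
    return sp - stack_low


def wrapped_kib(n_bytes):
    """Assumed reporting rule: reinterpret as unsigned 64-bit, then divide by 1024."""
    return (n_bytes % 2**64) // 1024


if __name__ == "__main__":
    free = free_space_bytes(SP, STACK_LOW)
    print("stack size (bytes):", STACK_HIGH - STACK_LOW)
    print("sp - stack_low (bytes):", free)
    print("wrapped value matches log:", wrapped_kib(free) == REPORTED_FREE_K)
    print("si_addr between sp and stack_low:", SP < SI_ADDR < STACK_LOW)
```

If this reading is right, it would be consistent with the `memset` in the OpenMP region touching memory beyond the thread's usable stack, but confirming that would need the core dump or the GKL source.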