User gets StackOverflowError when using Multi-interval in GenomicsDBImport with GATK 4.0.6.0 #4994
Comments
It looks like there's a method in GenomicsDB ( We should find out how many intervals are in the user's list to confirm this theory.
It also looks like the method in question is O(n^2) when it could be O(n log n) if it sorted the interval list first...
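To illustrate the sort-first idea, here is a minimal sketch of an iterative overlap check. It is not the GenomicsDB code: the class `IntervalOverlapSketch`, the `Interval` type, and the `hasOverlap` method are hypothetical names made up for this example, and intervals are assumed to be closed ranges on a named contig. After sorting by contig and start, any overlapping pair implies that some adjacent pair overlaps, so a single loop is enough and the stack depth no longer grows with the number of intervals.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch; names and types are illustrative, not GenomicsDB's ImportConfig.
final class IntervalOverlapSketch {

    /** Closed interval [start, end] on one contig (assumed representation). */
    static final class Interval {
        final String contig;
        final long start;
        final long end;

        Interval(String contig, long start, long end) {
            this.contig = contig;
            this.start = start;
            this.end = end;
        }
    }

    /**
     * Returns true if any two intervals on the same contig share a position.
     * Sorts a copy of the input (O(n log n)), then compares each interval
     * with its predecessor in one loop, so no recursion is involved.
     */
    static boolean hasOverlap(List<Interval> intervals) {
        List<Interval> sorted = new ArrayList<>(intervals);
        sorted.sort(Comparator.comparing((Interval i) -> i.contig)
                .thenComparingLong(i -> i.start));
        for (int k = 1; k < sorted.size(); k++) {
            Interval prev = sorted.get(k - 1);
            Interval cur = sorted.get(k);
            if (prev.contig.equals(cur.contig) && cur.start <= prev.end) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Interval> disjoint = Arrays.asList(
                new Interval("chr1", 100, 200),
                new Interval("chr1", 201, 300),
                new Interval("chr2", 100, 200));
        List<Interval> overlapping = Arrays.asList(
                new Interval("chr1", 150, 300),
                new Interval("chr1", 100, 200));
        System.out.println(hasOverlap(disjoint));    // prints false
        System.out.println(hasOverlap(overlapping)); // prints true
    }
}
```

Because the check is a plain loop, an input with many thousands of intervals uses constant stack space; whether that many intervals is sensible for a single import is a separate question.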
broadinstitute/gatk#4994 Sort partitions and then look for overlaps - eliminate recursion
The stack overflow issue should now be fixed, but the original forum user reported having around 11k intervals, which I think is still probably too many to use at once. See #5066.
Closing -- this was patched.
@kgururaj We got this issue report in the forum. Could you please look into it? Thanks!
https://gatkforums.broadinstitute.org/gatk/discussion/12388/how-to-use-multi-interval-in-genomicsdbimport-with-gatk-4-0-6-0
I used GenomicsDBImport with an interval list file and got an error like the one below.
So what is the correct way to use Multi-interval in GenomicsDBImport?
gatk version: 4.0.6.0
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -Xmx4g -Xms4g -jar /mnt/gatk/gatk-4.0.6.0/gatk-package-4.0.6.0-local.jar GenomicsDBImport -L test.intervals --genomicsdb-workspace-path ../RAW_VCF/my_database -V file1 -V file2 -V file3
02:57:15.591 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/mnt/workshop/xinchen.pan/test/gatk/gatk-4.0.6.0/gatk-package-4.0.6.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
02:57:15.772 INFO GenomicsDBImport - ------------------------------------------------------------
02:57:15.772 INFO GenomicsDBImport - The Genome Analysis Toolkit (GATK) v4.0.6.0
02:57:15.772 INFO GenomicsDBImport - For support and documentation go to https://software.broadinstitute.org/gatk/
02:57:15.772 INFO GenomicsDBImport - Executing as on Linux v3.10.0-514.6.1.el7.x86_64 amd64
02:57:15.772 INFO GenomicsDBImport - Java runtime: OpenJDK 64-Bit Server VM v1.8.0_121-b13
02:57:15.773 INFO GenomicsDBImport - Start Date/Time: July 10, 2018 2:57:15 AM EDT
02:57:15.773 INFO GenomicsDBImport - ------------------------------------------------------------
02:57:15.773 INFO GenomicsDBImport - ------------------------------------------------------------
02:57:15.773 INFO GenomicsDBImport - HTSJDK Version: 2.16.0
02:57:15.773 INFO GenomicsDBImport - Picard Version: 2.18.7
02:57:15.773 INFO GenomicsDBImport - HTSJDK Defaults.COMPRESSION_LEVEL : 2
02:57:15.773 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
02:57:15.773 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
02:57:15.773 INFO GenomicsDBImport - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
02:57:15.774 INFO GenomicsDBImport - Deflater: IntelDeflater
02:57:15.774 INFO GenomicsDBImport - Inflater: IntelInflater
02:57:15.774 INFO GenomicsDBImport - GCS max retries/reopens: 20
02:57:15.774 INFO GenomicsDBImport - Using google-cloud-java patch 6d11bef1c81f885c26b2b56c8616b7a705171e4f from https://github.com/droazen/google-cloud-java/tree/dr_all_nio_fixes
02:57:15.774 INFO GenomicsDBImport - Initializing engine
02:57:18.389 INFO IntervalArgumentCollection - Processing 11228744 bp from intervals
02:57:18.437 INFO GenomicsDBImport - Done initializing engine
Created workspace ../RAW_VCF/my_database
02:57:18.583 INFO GenomicsDBImport - Vid Map JSON file will be written to ../RAW_VCF/my_database/vidmap.json
02:57:18.583 INFO GenomicsDBImport - Callset Map JSON file will be written to ../RAW_VCF/my_database/callset.json
02:57:18.583 INFO GenomicsDBImport - Complete VCF Header will be written to ../RAW_VCF/my_database/vcfheader.vcf
02:57:18.583 INFO GenomicsDBImport - Importing to array - ../RAW_VCF/my_database/genomicsdb_array
02:57:18.583 INFO ProgressMeter - Starting traversal
02:57:18.583 INFO ProgressMeter - Current Locus Elapsed Minutes Batches Processed Batches/Minute
02:57:31.082 INFO GenomicsDBImport - Shutting down engine
[July 10, 2018 2:57:31 AM EDT] org.broadinstitute.hellbender.tools.genomicsdb.GenomicsDBImport done. Elapsed time: 0.26 minutes.
Runtime.totalMemory()=4116185088
Exception in thread "main" java.lang.StackOverflowError
at com.intel.genomicsdb.model.ImportConfig.isThereChromosomeIntervalIntersection(ImportConfig.java:95)
at com.intel.genomicsdb.model.ImportConfig.isThereChromosomeIntervalIntersection(ImportConfig.java:104)
at com.intel.genomicsdb.model.ImportConfig.isThereChromosomeIntervalIntersection(ImportConfig.java:104)
at com.intel.genomicsdb.model.ImportConfig.isThereChromosomeIntervalIntersection(ImportConfig.java:104)
at com.intel.genomicsdb.model.ImportConfig.isThereChromosomeIntervalIntersection(ImportConfig.java:104)
at com.intel.genomicsdb.model.ImportConfig.isThereChromosomeIntervalIntersection(ImportConfig.java:104)
at com.intel.genomicsdb.model.ImportConfig.isThereChromosomeIntervalIntersection(ImportConfig.java:104)
This Issue was generated from your [forums]
[forums]: https://gatkforums.broadinstitute.org/gatk/discussion/12388/how-to-use-multi-interval-in-genomicsdbimport-with-gatk-4-0-6-0/p1