Funcotator: java.lang.IllegalArgumentException: Unexpected value: lncRNA #6708

Closed
pawelqs opened this issue Jul 17, 2020 · 18 comments

@pawelqs

pawelqs commented Jul 17, 2020

Bug Report

Affected tool(s) or class(es)

Funcotator

Affected version(s)

gatk-4.1.8.0
funcotator_dataSources.v1.7.20200521s

Description

I am trying to use Funcotator to annotate variants that I have already detected. Unfortunately, after a few seconds Funcotator stops with the error:

java.lang.IllegalArgumentException: Unexpected value: lncRNA

I have no idea what is wrong, and I could not find this error on the internet. Could it be a problem with the JRE?

Full log below.

Steps to reproduce

~/programs/gatk-4.1.8.0/gatk Funcotator --variant filtered_variants/P1.vcf.gz --reference ~/resources/hg38_for_bwa/hs38DH.fa --ref-version hg38 --data-sources-path ~/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s --output filtered_variants/P1.avcf.gz --output-file-format VCF

Expected behavior

Funcotator annotates my variants.

Actual behavior

(base) [pkus@master1 mutect_test]$ ~/programs/gatk-4.1.8.0/gatk Funcotator --variant filtered_variants/P1.vcf.gz --reference ~/resources/hg38_for_bwa/hs38DH.fa --ref-version hg38 --data-sources-path ~/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s --output filtered_variants/P1.avcf.gz --output-file-format VCF
Using GATK jar /home/pkus/programs/gatk-4.1.8.0/gatk-package-4.1.8.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/pkus/programs/gatk-4.1.8.0/gatk-package-4.1.8.0-local.jar Funcotator --variant filtered_variants/P1.vcf.gz --reference /home/pkus/resources/hg38_for_bwa/hs38DH.fa --ref-version hg38 --data-sources-path /home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s --output filtered_variants/P1.avcf.gz --output-file-format VCF
15:16:39.460 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/pkus/programs/gatk-4.1.8.0/gatk-package-4.1.8.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jul 17, 2020 3:16:39 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
15:16:39.785 INFO Funcotator - ------------------------------------------------------------
15:16:39.786 INFO Funcotator - The Genome Analysis Toolkit (GATK) v4.1.8.0
15:16:39.786 INFO Funcotator - For support and documentation go to https://software.broadinstitute.org/gatk/
15:16:39.787 INFO Funcotator - Executing as xxx on Linux v3.10.0-957.5.1.el7.x86_64 amd64
15:16:39.787 INFO Funcotator - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_251-b08
15:16:39.787 INFO Funcotator - Start Date/Time: July 17, 2020 3:16:39 PM CEST
15:16:39.787 INFO Funcotator - ------------------------------------------------------------
15:16:39.787 INFO Funcotator - ------------------------------------------------------------
15:16:39.788 INFO Funcotator - HTSJDK Version: 2.22.0
15:16:39.788 INFO Funcotator - Picard Version: 2.22.8
15:16:39.788 INFO Funcotator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
15:16:39.788 INFO Funcotator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
15:16:39.788 INFO Funcotator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
15:16:39.788 INFO Funcotator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
15:16:39.789 INFO Funcotator - Deflater: IntelDeflater
15:16:39.789 INFO Funcotator - Inflater: IntelInflater
15:16:39.789 INFO Funcotator - GCS max retries/reopens: 20
15:16:39.789 INFO Funcotator - Requester pays: disabled
15:16:39.789 INFO Funcotator - Initializing engine
15:16:40.573 INFO FeatureManager - Using codec VCFCodec to read file file:///home/pkus/mutect_test/filtered_variants/P1.vcf.gz
15:16:40.902 INFO Funcotator - Done initializing engine
15:16:40.903 INFO Funcotator - Validating Sequence Dictionaries...
15:16:40.971 INFO Funcotator - Processing user transcripts/defaults/overrides...
15:16:40.972 INFO Funcotator - Initializing data sources...
15:16:40.975 INFO DataSourceUtils - Initializing data sources from directory: /home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s
15:16:40.978 INFO DataSourceUtils - Data sources version: 1.7.2020429s
15:16:40.978 INFO DataSourceUtils - Data sources source: ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/funcotator/funcotator_dataSources.v1.7.20200429s.tar.gz
15:16:40.978 INFO DataSourceUtils - Data sources alternate source: gs://broad-public-datasets/funcotator/funcotator_dataSources.v1.7.20200429s.tar.gz
15:16:40.996 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/CancerGeneCensus_Table_1_full_2012-03-15.txt -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/cancer_gene_census/hg38/CancerGeneCensus_Table_1_full_2012-03-15.txt
15:16:41.003 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/simple_uniprot_Dec012014.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/simple_uniprot/hg38/simple_uniprot_Dec012014.tsv
15:16:41.007 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/Familial_Cancer_Genes.no_dupes.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/familial/hg38/Familial_Cancer_Genes.no_dupes.tsv
15:16:41.010 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/oreganno.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/oreganno/hg38/oreganno.tsv
15:16:41.013 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/dnaRepairGenes.20180524T145835.csv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/dna_repair_genes/hg38/dnaRepairGenes.20180524T145835.csv
15:16:41.017 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/Cosmic.db -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/cosmic/hg38/Cosmic.db
15:16:41.020 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/hgnc_download_Nov302017.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/hgnc/hg38/hgnc_download_Nov302017.tsv
15:16:41.024 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/hg38_All_20180418.vcf.gz -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/dbsnp/hg38/hg38_All_20180418.vcf.gz
15:16:41.032 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode_xrefseq_v90_38.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/gencode_xrefseq/hg38/gencode_xrefseq_v90_38.tsv
15:16:41.036 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/cosmic_tissue.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/cosmic_tissue/hg38/cosmic_tissue.tsv
15:16:41.045 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode.v34.annotation.REORDERED.gtf -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/gencode/hg38/gencode.v34.annotation.REORDERED.gtf
15:16:41.047 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode.v34.pc_transcripts.fa -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/gencode/hg38/gencode.v34.pc_transcripts.fa
15:16:41.050 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/cosmic_fusion.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/cosmic_fusion/hg38/cosmic_fusion.tsv
15:16:41.053 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/achilles_lineage_results.import.txt -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/achilles/hg38/achilles_lineage_results.import.txt
15:16:41.056 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/clinvar_20180429_hg38.vcf -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/clinvar/hg38/clinvar_20180429_hg38.vcf
15:16:41.064 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode_xhgnc_v90_38.hg38.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/gencode_xhgnc/hg38/gencode_xhgnc_v90_38.hg38.tsv
15:16:41.064 INFO Funcotator - Finalizing data sources (this step can be long if data sources are cloud-based)...
15:16:41.066 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/CancerGeneCensus_Table_1_full_2012-03-15.txt -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/cancer_gene_census/hg38/CancerGeneCensus_Table_1_full_2012-03-15.txt
15:16:41.083 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/simple_uniprot_Dec012014.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/simple_uniprot/hg38/simple_uniprot_Dec012014.tsv
15:16:41.540 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/Familial_Cancer_Genes.no_dupes.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/familial/hg38/Familial_Cancer_Genes.no_dupes.tsv
15:16:41.545 INFO DataSourceUtils - Setting lookahead cache for data source: Oreganno : 100000
15:16:41.556 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/oreganno.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/oreganno/hg38/oreganno.tsv
15:16:41.575 INFO FeatureManager - Using codec XsvLocatableTableCodec to read file file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/oreganno/hg38/oreganno.config
15:16:41.707 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/oreganno.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/oreganno/hg38/oreganno.tsv
15:16:41.709 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/oreganno.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/oreganno/hg38/oreganno.tsv
WARNING 2020-07-17 15:16:41 AsciiLineReader Creating an indexable source for an AsciiFeatureCodec using a stream that is neither a PositionalBufferedStream nor a BlockCompressedInputStream
15:16:41.717 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/dnaRepairGenes.20180524T145835.csv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/dna_repair_genes/hg38/dnaRepairGenes.20180524T145835.csv
15:16:41.723 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/Cosmic.db -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/cosmic/hg38/Cosmic.db
15:16:42.012 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/hgnc_download_Nov302017.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/hgnc/hg38/hgnc_download_Nov302017.tsv
15:16:42.274 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/hg38_All_20180418.vcf.gz -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/dbsnp/hg38/hg38_All_20180418.vcf.gz
15:16:42.274 INFO DataSourceUtils - Setting lookahead cache for data source: dbSNP : 100000
15:16:42.297 INFO FeatureManager - Using codec VCFCodec to read file file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/dbsnp/hg38/hg38_All_20180418.vcf.gz
15:16:43.390 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/hg38_All_20180418.vcf.gz -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/dbsnp/hg38/hg38_All_20180418.vcf.gz
15:16:43.481 INFO FeatureManager - Using codec VCFCodec to read file file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/dbsnp/hg38/hg38_All_20180418.vcf.gz
15:16:43.571 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode_xrefseq_v90_38.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/gencode_xrefseq/hg38/gencode_xrefseq_v90_38.tsv
15:16:43.878 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/cosmic_tissue.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/cosmic_tissue/hg38/cosmic_tissue.tsv
15:16:43.926 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode.v34.annotation.REORDERED.gtf -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/gencode/hg38/gencode.v34.annotation.REORDERED.gtf
15:16:43.926 INFO DataSourceUtils - Setting lookahead cache for data source: Gencode : 100000
15:16:43.937 WARN GencodeGtfCodec - GENCODE GTF Header line 1 has a version number that is above maximum tested version (v 28) (given: 34): ##description: evidence-based annotation of the human genome (GRCh38), version 34 (Ensembl 100) Continuing, but errors may occur.
15:16:43.938 WARN GencodeGtfCodec - GENCODE GTF Header line 1 has a version number that is above maximum tested version (v 28) (given: 34): ##description: evidence-based annotation of the human genome (GRCh38), version 34 (Ensembl 100) Continuing, but errors may occur.
15:16:43.939 INFO FeatureManager - Using codec GencodeGtfCodec to read file file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/gencode/hg38/gencode.v34.annotation.REORDERED.gtf
15:16:43.946 WARN GencodeGtfCodec - GENCODE GTF Header line 1 has a version number that is above maximum tested version (v 28) (given: 34): ##description: evidence-based annotation of the human genome (GRCh38), version 34 (Ensembl 100) Continuing, but errors may occur.
15:16:44.093 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode.v34.pc_transcripts.fa -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/gencode/hg38/gencode.v34.pc_transcripts.fa
15:16:54.854 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/cosmic_fusion.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/cosmic_fusion/hg38/cosmic_fusion.tsv
15:16:54.876 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/achilles_lineage_results.import.txt -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/achilles/hg38/achilles_lineage_results.import.txt
15:16:54.881 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/clinvar_20180429_hg38.vcf -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/clinvar/hg38/clinvar_20180429_hg38.vcf
15:16:54.882 INFO DataSourceUtils - Setting lookahead cache for data source: ClinVar_VCF : 100000
15:16:54.890 INFO FeatureManager - Using codec VCFCodec to read file file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/clinvar/hg38/clinvar_20180429_hg38.vcf
15:16:55.098 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/clinvar_20180429_hg38.vcf -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/clinvar/hg38/clinvar_20180429_hg38.vcf
15:16:55.199 INFO FeatureManager - Using codec VCFCodec to read file file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/clinvar/hg38/clinvar_20180429_hg38.vcf
15:16:55.375 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode_xhgnc_v90_38.hg38.tsv -> file:///home/pkus/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/gencode_xhgnc/hg38/gencode_xhgnc_v90_38.hg38.tsv
15:16:57.746 INFO Funcotator - Initializing Funcotator Engine...
15:16:57.777 INFO Funcotator - Creating a VCF file for output: file:/home/pkus/mutect_test/filtered_variants/P1.avcf.gz
15:16:57.894 INFO ProgressMeter - Starting traversal
15:16:57.894 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
15:16:57.979 INFO VcfFuncotationFactory - ClinVar_VCF 20180429_hg38 cache hits/total: 0/0
15:16:57.981 INFO VcfFuncotationFactory - dbSNP 9606_b151 cache hits/total: 0/0
15:16:57.991 INFO Funcotator - Shutting down engine
[July 17, 2020 3:16:57 PM CEST] org.broadinstitute.hellbender.tools.funcotator.Funcotator done. Elapsed time: 0.31 minutes.
Runtime.totalMemory()=883949568
java.lang.IllegalArgumentException: Unexpected value: lncRNA
at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfFeature$GeneTranscriptType.getEnum(GencodeGtfFeature.java:1052)
at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfFeature.<init>(GencodeGtfFeature.java:158)
at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfGeneFeature.<init>(GencodeGtfGeneFeature.java:19)
at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfGeneFeature.create(GencodeGtfGeneFeature.java:23)
at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfFeature$FeatureType$1.create(GencodeGtfFeature.java:753)
at org.broadinstitute.hellbender.utils.codecs.gtf.GencodeGtfFeature.create(GencodeGtfFeature.java:320)
at org.broadinstitute.hellbender.utils.codecs.gtf.AbstractGtfCodec.decode(AbstractGtfCodec.java:138)
at org.broadinstitute.hellbender.utils.codecs.gtf.AbstractGtfCodec.decode(AbstractGtfCodec.java:23)
at htsjdk.tribble.TribbleIndexedFeatureReader$QueryIterator.readNextRecord(TribbleIndexedFeatureReader.java:501)
at htsjdk.tribble.TribbleIndexedFeatureReader$QueryIterator.<init>(TribbleIndexedFeatureReader.java:441)
at htsjdk.tribble.TribbleIndexedFeatureReader.query(TribbleIndexedFeatureReader.java:297)
at org.broadinstitute.hellbender.engine.FeatureDataSource.refillQueryCache(FeatureDataSource.java:567)
at org.broadinstitute.hellbender.engine.FeatureDataSource.queryAndPrefetch(FeatureDataSource.java:536)
at org.broadinstitute.hellbender.engine.FeatureManager.getFeatures(FeatureManager.java:352)
at org.broadinstitute.hellbender.engine.FeatureContext.getValues(FeatureContext.java:173)
at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.queryFeaturesFromFeatureContext(DataSourceFuncotationFactory.java:314)
at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.getFeaturesFromFeatureContext(DataSourceFuncotationFactory.java:229)
at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.createFuncotations(DataSourceFuncotationFactory.java:207)
at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.createFuncotations(DataSourceFuncotationFactory.java:182)
at org.broadinstitute.hellbender.tools.funcotator.FuncotatorEngine.lambda$createFuncotationMapForVariant$0(FuncotatorEngine.java:147)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at org.broadinstitute.hellbender.tools.funcotator.FuncotatorEngine.createFuncotationMapForVariant(FuncotatorEngine.java:157)
at org.broadinstitute.hellbender.tools.funcotator.Funcotator.enqueueAndHandleVariant(Funcotator.java:904)
at org.broadinstitute.hellbender.tools.funcotator.Funcotator.apply(Funcotator.java:858)
at org.broadinstitute.hellbender.engine.VariantWalker.lambda$traverse$0(VariantWalker.java:104)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at org.broadinstitute.hellbender.engine.VariantWalker.traverse(VariantWalker.java:102)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1049)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)

I have also tried an older version of the Funcotator data sources, funcotator_dataSources.v1.6.20190124s. The resulting error is then:

org.broadinstitute.hellbender.exceptions.GATKException: Unable to query the database for geneName: NCRNA00115

@lbergelson
Member

It looks like a typo either in the datasource or in funcotator. It's finding something labelled "lncRNA" but looking for "lincRNA".
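
One way to confirm which label the GTF actually uses is to count the distinct `gene_type` values in it. The following is a minimal sketch, not part of GATK: `list_gene_types` is a hypothetical helper name, and it assumes the standard GENCODE attribute syntax (`gene_type "..."`) and a POSIX shell with `grep`, `sort`, and `uniq`.

```shell
# Hypothetical helper (not a GATK command): list the distinct gene_type
# values found in a GENCODE GTF, most frequent first.
list_gene_types() {
    # Skip header/comment lines, pull out each gene_type attribute,
    # then count how often each distinct value occurs.
    grep -v '^#' "$1" \
        | grep -o 'gene_type "[^"]*"' \
        | sort | uniq -c | sort -rn
}
```

It could be run against the GTF named in the log, for example `list_gene_types ~/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s/gencode/hg38/gencode.v34.annotation.REORDERED.gtf`. If `lncRNA` shows up in the output, the file carries the label that this GATK build rejects.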

@forg-jw

forg-jw commented Jul 18, 2020

I am also affected by this problem. My GATK version is 4.1.6.0 and my Funcotator data source is funcotator_dataSources.v1.7.20200521s.

@pawelqs
Author

pawelqs commented Jul 19, 2020

@lbergelson should I modify the datasource then?

@jonn-smith
Collaborator

jonn-smith commented Jul 20, 2020

@pawel125 @forg-yu Please do not modify the datasources - they are well-formed and correct, just newer than (and incompatible with) the GATK version you're using. See below for a quick solution.

This version of the Funcotator data sources is not supported yet. Datasources have to be released prior to merging code changes that support them. I have been working on this data sources release for quite some time, but the code changes have not gone in yet to support it.

Until the 4.1.9.0 GATK release, please continue to use data sources v1.6.20190124s.
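
To check ahead of time whether a data source bundle is newer than the installed GATK was tested against, one can read the GENCODE version from the GTF header. This is a minimal sketch, assuming the `##description:` header format quoted in the GencodeGtfCodec warning in the log above. It treats 28 as the maximum tested version only because that is what GATK 4.1.8.0 reports in that warning. `gencode_gtf_version` and `gtf_version_supported` are hypothetical helpers, not GATK commands.

```shell
# Hypothetical helper: print the GENCODE release number from the GTF
# header (the "version NN" part of the ##description: line).
gencode_gtf_version() {
    head -n 10 "$1" \
        | sed -n 's/^##description:.*version \([0-9][0-9]*\).*/\1/p' \
        | head -n 1
}

# Hypothetical helper: succeed only if the GTF version could be read
# and is at or below the given maximum (second argument).
gtf_version_supported() {
    version=$(gencode_gtf_version "$1")
    [ -n "$version" ] && [ "$version" -le "$2" ]
}
```

A call such as `gtf_version_supported path/to/gencode.annotation.gtf 28 || echo "GTF is newer than this GATK was tested with"` would flag the mismatch before a long run starts. The threshold should be taken from the warning printed by the GATK version actually in use.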

@jonn-smith
Collaborator

I've created a new issue to make sure the error message for this is better in the future (#6712). This will be included in 4.1.9.0.

@pawelqs
Author

pawelqs commented Jul 21, 2020

I have also tried that version and, as I mentioned, it also results in an error. Here is the full output:

(base) [pkus@wn45 mutect_test]$ ~/programs/gatk-4.1.8.0/gatk Funcotator --variant filtered_variants/P1.vcf.gz --reference ~/resources/hg38_for_bwa/hs38DH.fa --ref-version hg38 --data-sources-path ~/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s --output filtered_variants/P1.avcf.gz --output-file-format VCF
Using GATK jar /home/pkus/programs/gatk-4.1.8.0/gatk-package-4.1.8.0-local.jar
Running:
java -Dsamjdk.use_async_io_read_samtools=false -Dsamjdk.use_async_io_write_samtools=true -Dsamjdk.use_async_io_write_tribble=false -Dsamjdk.compression_level=2 -jar /home/pkus/programs/gatk-4.1.8.0/gatk-package-4.1.8.0-local.jar Funcotator --variant filtered_variants/P1.vcf.gz --reference /home/pkus/resources/hg38_for_bwa/hs38DH.fa --ref-version hg38 --data-sources-path /home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s --output filtered_variants/P1.avcf.gz --output-file-format VCF
12:28:16.251 INFO NativeLibraryLoader - Loading libgkl_compression.so from jar:file:/home/pkus/programs/gatk-4.1.8.0/gatk-package-4.1.8.0-local.jar!/com/intel/gkl/native/libgkl_compression.so
Jul 21, 2020 12:28:16 PM shaded.cloud_nio.com.google.auth.oauth2.ComputeEngineCredentials runningOnComputeEngine
INFO: Failed to detect whether we are running on Google Compute Engine.
12:28:16.537 INFO Funcotator - ------------------------------------------------------------
12:28:16.538 INFO Funcotator - The Genome Analysis Toolkit (GATK) v4.1.8.0
12:28:16.538 INFO Funcotator - For support and documentation go to https://software.broadinstitute.org/gatk/
12:28:16.541 INFO Funcotator - Executing as xxx on Linux v3.10.0-123.20.1.el7.x86_64 amd64
12:28:16.541 INFO Funcotator - Java runtime: Java HotSpot(TM) 64-Bit Server VM v1.8.0_251-b08
12:28:16.542 INFO Funcotator - Start Date/Time: July 21, 2020 12:28:16 PM CEST
12:28:16.542 INFO Funcotator - ------------------------------------------------------------
12:28:16.542 INFO Funcotator - ------------------------------------------------------------
12:28:16.542 INFO Funcotator - HTSJDK Version: 2.22.0
12:28:16.543 INFO Funcotator - Picard Version: 2.22.8
12:28:16.543 INFO Funcotator - HTSJDK Defaults.COMPRESSION_LEVEL : 2
12:28:16.543 INFO Funcotator - HTSJDK Defaults.USE_ASYNC_IO_READ_FOR_SAMTOOLS : false
12:28:16.543 INFO Funcotator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_SAMTOOLS : true
12:28:16.543 INFO Funcotator - HTSJDK Defaults.USE_ASYNC_IO_WRITE_FOR_TRIBBLE : false
12:28:16.543 INFO Funcotator - Deflater: IntelDeflater
12:28:16.543 INFO Funcotator - Inflater: IntelInflater
12:28:16.543 INFO Funcotator - GCS max retries/reopens: 20
12:28:16.543 INFO Funcotator - Requester pays: disabled
12:28:16.543 INFO Funcotator - Initializing engine
12:28:17.254 INFO FeatureManager - Using codec VCFCodec to read file file:///home/pkus/mutect_test/filtered_variants/P1.vcf.gz
12:28:17.687 INFO Funcotator - Done initializing engine
12:28:17.688 INFO Funcotator - Validating Sequence Dictionaries...
12:28:17.755 INFO Funcotator - Processing user transcripts/defaults/overrides...
12:28:17.756 INFO Funcotator - Initializing data sources...
12:28:17.759 INFO DataSourceUtils - Initializing data sources from directory: /home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s
12:28:17.775 INFO DataSourceUtils - Data sources version: 1.6.2019124s
12:28:17.776 INFO DataSourceUtils - Data sources source: ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/funcotator/funcotator_dataSources.v1.6.20190124s.tar.gz
12:28:17.776 INFO DataSourceUtils - Data sources alternate source: gs://broad-public-datasets/funcotator/funcotator_dataSources.v1.6.20190124s.tar.gz
12:28:17.795 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/chr1_b_bed.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/chr1_b_bed/hg38/chr1_b_bed.tsv
12:28:17.805 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/oreganno.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/oreganno/hg38/oreganno.tsv
12:28:17.827 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/simple_uniprot_Dec012014.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/simple_uniprot/hg38/simple_uniprot_Dec012014.tsv
12:28:17.835 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/Familial_Cancer_Genes.no_dupes.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/familial/hg38/Familial_Cancer_Genes.no_dupes.tsv
12:28:17.841 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/hg38_All_20170710.vcf.gz -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/dbsnp/hg38/hg38_All_20170710.vcf.gz
12:28:17.849 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/CancerGeneCensus_Table_1_full_2012-03-15.txt -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/cancer_gene_census/hg38/CancerGeneCensus_Table_1_full_2012-03-15.txt
12:28:17.856 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/Cosmic.db -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/cosmic/hg38/Cosmic.db
12:28:17.862 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/cosmic_tissue.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/cosmic_tissue/hg38/cosmic_tissue.tsv
12:28:17.868 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/chr1_a_bed.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/chr1_a_bed/hg38/chr1_a_bed.tsv
12:28:17.875 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/cosmic_fusion.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/cosmic_fusion/hg38/cosmic_fusion.tsv
12:28:17.882 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode.v28.annotation.REORDERED.gtf -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/gencode/hg38/gencode.v28.annotation.REORDERED.gtf
12:28:17.884 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode.v28.pc_transcripts.fa -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/gencode/hg38/gencode.v28.pc_transcripts.fa
12:28:17.902 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/dnaRepairGenes.20180524T145835.csv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/dna_repair_genes/hg38/dnaRepairGenes.20180524T145835.csv
12:28:17.909 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode_xhgnc_v90_38.hg38.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/gencode_xhgnc/hg38/gencode_xhgnc_v90_38.hg38.tsv
12:28:17.925 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/achilles_lineage_results.import.txt -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/achilles/hg38/achilles_lineage_results.import.txt
12:28:17.932 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode_xrefseq_v90_38.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/gencode_xrefseq/hg38/gencode_xrefseq_v90_38.tsv
12:28:17.939 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/hgnc_download_Nov302017.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/hgnc/hg38/hgnc_download_Nov302017.tsv
12:28:17.939 INFO Funcotator - Finalizing data sources (this step can be long if data sources are cloud-based)...
12:28:17.940 INFO DataSourceUtils - Setting lookahead cache for data source: chr1_b_bed : 100000
12:28:17.951 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/chr1_b_bed.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/chr1_b_bed/hg38/chr1_b_bed.tsv
12:28:17.967 INFO FeatureManager - Using codec XsvLocatableTableCodec to read file file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/chr1_b_bed/hg38/chr1_b_bed.config
12:28:17.995 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/chr1_b_bed.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/chr1_b_bed/hg38/chr1_b_bed.tsv
12:28:17.997 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/chr1_b_bed.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/chr1_b_bed/hg38/chr1_b_bed.tsv
WARNING 2020-07-21 12:28:17 AsciiLineReader Creating an indexable source for an AsciiFeatureCodec using a stream that is neither a PositionalBufferedStream nor a BlockCompressedInputStream
12:28:18.002 INFO DataSourceUtils - Setting lookahead cache for data source: Oreganno : 100000
12:28:18.009 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/oreganno.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/oreganno/hg38/oreganno.tsv
12:28:18.020 INFO FeatureManager - Using codec XsvLocatableTableCodec to read file file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/oreganno/hg38/oreganno.config
12:28:18.120 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/oreganno.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/oreganno/hg38/oreganno.tsv
12:28:18.121 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/oreganno.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/oreganno/hg38/oreganno.tsv
WARNING 2020-07-21 12:28:18 AsciiLineReader Creating an indexable source for an AsciiFeatureCodec using a stream that is neither a PositionalBufferedStream nor a BlockCompressedInputStream
12:28:18.125 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/simple_uniprot_Dec012014.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/simple_uniprot/hg38/simple_uniprot_Dec012014.tsv
12:28:18.424 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/Familial_Cancer_Genes.no_dupes.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/familial/hg38/Familial_Cancer_Genes.no_dupes.tsv
12:28:18.442 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/hg38_All_20170710.vcf.gz -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/dbsnp/hg38/hg38_All_20170710.vcf.gz
12:28:18.442 INFO DataSourceUtils - Setting lookahead cache for data source: dbSNP : 100000
12:28:18.452 INFO FeatureManager - Using codec VCFCodec to read file file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/dbsnp/hg38/hg38_All_20170710.vcf.gz
12:28:18.599 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/hg38_All_20170710.vcf.gz -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/dbsnp/hg38/hg38_All_20170710.vcf.gz
12:28:19.018 INFO FeatureManager - Using codec VCFCodec to read file file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/dbsnp/hg38/hg38_All_20170710.vcf.gz
12:28:19.213 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/CancerGeneCensus_Table_1_full_2012-03-15.txt -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/cancer_gene_census/hg38/CancerGeneCensus_Table_1_full_2012-03-15.txt
12:28:19.227 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/Cosmic.db -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/cosmic/hg38/Cosmic.db
12:28:19.401 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/cosmic_tissue.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/cosmic_tissue/hg38/cosmic_tissue.tsv
12:28:19.487 INFO DataSourceUtils - Setting lookahead cache for data source: chr1_a_bed : 100000
12:28:19.495 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/chr1_a_bed.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/chr1_a_bed/hg38/chr1_a_bed.tsv
12:28:19.500 INFO FeatureManager - Using codec XsvLocatableTableCodec to read file file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/chr1_a_bed/hg38/chr1_a_bed.config
12:28:19.505 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/chr1_a_bed.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/chr1_a_bed/hg38/chr1_a_bed.tsv
12:28:19.507 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/chr1_a_bed.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/chr1_a_bed/hg38/chr1_a_bed.tsv
WARNING 2020-07-21 12:28:19 AsciiLineReader Creating an indexable source for an AsciiFeatureCodec using a stream that is neither a PositionalBufferedStream nor a BlockCompressedInputStream
12:28:19.512 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/cosmic_fusion.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/cosmic_fusion/hg38/cosmic_fusion.tsv
12:28:19.522 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode.v28.annotation.REORDERED.gtf -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/gencode/hg38/gencode.v28.annotation.REORDERED.gtf
12:28:19.522 INFO DataSourceUtils - Setting lookahead cache for data source: Gencode : 100000
12:28:19.552 INFO FeatureManager - Using codec GencodeGtfCodec to read file file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/gencode/hg38/gencode.v28.annotation.REORDERED.gtf
12:28:19.589 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode.v28.pc_transcripts.fa -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/gencode/hg38/gencode.v28.pc_transcripts.fa
12:28:27.529 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/dnaRepairGenes.20180524T145835.csv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/dna_repair_genes/hg38/dnaRepairGenes.20180524T145835.csv
12:28:27.546 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode_xhgnc_v90_38.hg38.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/gencode_xhgnc/hg38/gencode_xhgnc_v90_38.hg38.tsv
12:28:28.862 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/achilles_lineage_results.import.txt -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/achilles/hg38/achilles_lineage_results.import.txt
12:28:28.866 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/gencode_xrefseq_v90_38.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/gencode_xrefseq/hg38/gencode_xrefseq_v90_38.tsv
12:28:31.215 INFO DataSourceUtils - Resolved data source file path: file:///home/pkus/mutect_test/hgnc_download_Nov302017.tsv -> file:///home/pkus/resources/gatk/funcotator/funcotator_dataSources.v1.6.20190124s/hgnc/hg38/hgnc_download_Nov302017.tsv
12:28:31.563 INFO Funcotator - Initializing Funcotator Engine...
12:28:31.593 INFO Funcotator - Creating a VCF file for output: file:/home/pkus/mutect_test/filtered_variants/P1.avcf.gz
12:28:31.731 INFO ProgressMeter - Starting traversal
12:28:31.731 INFO ProgressMeter - Current Locus Elapsed Minutes Variants Processed Variants/Minute
12:28:31.969 INFO VcfFuncotationFactory - dbSNP 9606_b150 cache hits/total: 0/0
12:28:31.975 INFO Funcotator - Shutting down engine
[July 21, 2020 12:28:31 PM CEST] org.broadinstitute.hellbender.tools.funcotator.Funcotator done. Elapsed time: 0.26 minutes.
Runtime.totalMemory()=2200961024
org.broadinstitute.hellbender.exceptions.GATKException: Unable to query the database for geneName: NCRNA00115
at org.broadinstitute.hellbender.tools.funcotator.dataSources.cosmic.CosmicFuncotationFactory.createFuncotationsOnVariant(CosmicFuncotationFactory.java:320)
at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.determineFuncotations(DataSourceFuncotationFactory.java:245)
at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.createFuncotations(DataSourceFuncotationFactory.java:211)
at org.broadinstitute.hellbender.tools.funcotator.FuncotatorEngine.createFuncotationMapForVariant(FuncotatorEngine.java:173)
at org.broadinstitute.hellbender.tools.funcotator.Funcotator.enqueueAndHandleVariant(Funcotator.java:904)
at org.broadinstitute.hellbender.tools.funcotator.Funcotator.apply(Funcotator.java:858)
at org.broadinstitute.hellbender.engine.VariantWalker.lambda$traverse$0(VariantWalker.java:104)
at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:184)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at org.broadinstitute.hellbender.engine.VariantWalker.traverse(VariantWalker.java:102)
at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1049)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:163)
at org.broadinstitute.hellbender.Main.mainEntry(Main.java:206)
at org.broadinstitute.hellbender.Main.main(Main.java:292)
Caused by: org.sqlite.SQLiteException: [SQLITE_IOERR_LOCK] I/O error in the advisory file locking logic (disk I/O error)
at org.sqlite.core.DB.newSQLException(DB.java:909)
at org.sqlite.core.DB.newSQLException(DB.java:921)
at org.sqlite.core.DB.throwex(DB.java:886)
at org.sqlite.core.NativeDB.prepare_utf8(Native Method)
at org.sqlite.core.NativeDB.prepare(NativeDB.java:127)
at org.sqlite.core.DB.prepare(DB.java:227)
at org.sqlite.jdbc3.JDBC3Statement.executeQuery(JDBC3Statement.java:81)
at org.broadinstitute.hellbender.tools.funcotator.dataSources.cosmic.CosmicFuncotationFactory.createFuncotationsOnVariant(CosmicFuncotationFactory.java:288)
... 26 more

Is it another typo?

@jonn-smith
Collaborator

@pawel125 This looks like a filesystem error - I/O error in the advisory file locking logic (disk I/O error). Are you using an NFS file system to store the datasources or some other kind of network-mounted drive?

To be clear - the first issue you had was not a typo. The v1.7 data sources are not backwards compatible and the code changes haven't been merged yet.

@pawelqs
Author

pawelqs commented Jul 21, 2020

Oh, yes, it is surely not a typo, it was just my mental shortcut because of the previous messages. I use the local storage on our computing cluster. I will try to download the files once again, maybe it will solve the problem.

@jonn-smith
Collaborator

Just making sure 😛

I don't think you have to download them again, but I have seen SQLite do some strange things on NFS drives sometimes. When I searched for the error, a couple of Stack Overflow posts indicated it was a SQLite + NFS issue.

If you have a local disk you can store the data sources on, that would probably fix the issue immediately. I'm not sure what the exact problem is with NFS + SQLite, unfortunately.
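To check whether a data source directory sits on a network mount, a small helper along these lines may help. It is only a sketch: it assumes a Linux host where `/proc/mounts` is readable, the function name `filesystem_type` and the `NETWORK_FS` set are made up for illustration, and it ignores escaped characters in mount points.

```python
import posixpath

# Illustrative, non-exhaustive set of network/cluster filesystem type names.
NETWORK_FS = {"nfs", "nfs4", "lustre", "cifs", "gpfs", "beegfs"}


def filesystem_type(path, mounts_text):
    """Return the fs type of the longest mount point containing `path`.

    `path` must be absolute; `mounts_text` is the content of /proc/mounts
    (device, mount point, fs type, options, ... separated by whitespace).
    """
    path = posixpath.normpath(path) + "/"
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        # Keep the most specific (longest) mount point that is a prefix of path.
        if path.startswith(prefix) and len(mount_point) >= len(best_mount):
            best_mount, best_type = mount_point, fs_type
    return best_type
```

On the cluster this could be called with the text read from `/proc/mounts` and the directory passed to `--data-sources-path`; a result in `NETWORK_FS` suggests the Cosmic SQLite file is not on a local disk.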

@forg-jw

forg-jw commented Jul 22, 2020

Thanks, funcotator_dataSources.v1.6.20190124s works fine for me. I think lbergelson is right: the bug is caused by the different abbreviations of long non-coding RNA, 'lncRNA' in gencode.v34.annotation.REORDERED.gtf of v1.7.20200521 versus 'lincRNA' in gencode.v28.annotation.REORDERED.gtf of v1.6.20190124. I tried substituting the whole gencode/ directory with the one from v1.6.20190124, and it worked OK. (Please forgive me for modifying the data source.)
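One way to see which `gene_type` values appear in one Gencode GTF but not in another is to tally them from the attribute column. The following is a rough sketch, not part of Funcotator; the function names are hypothetical, and it only looks at `gene` records and the `gene_type` attribute.

```python
import re
from collections import Counter

_GENE_TYPE = re.compile(r'gene_type "([^"]+)"')


def gene_types(gtf_lines):
    """Count gene_type values over the 'gene' records of a GTF."""
    counts = Counter()
    for line in gtf_lines:
        if line.startswith("#"):
            continue  # header/comment line
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9 or fields[2] != "gene":
            continue
        match = _GENE_TYPE.search(fields[8])
        if match:
            counts[match.group(1)] += 1
    return counts


def new_types(old_lines, new_lines):
    """Return gene_type values present in the new GTF but absent from the old one."""
    return sorted(set(gene_types(new_lines)) - set(gene_types(old_lines)))
```

Both functions accept any iterable of lines, so open file handles for the v28 and v34 GTFs can be passed directly.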

@jonn-smith
Copy link
Collaborator

@forg-yu Glad to hear you have it working!

You are correct that the long non-coding RNA tag differs between the Gencode versions. In addition, Gencode v34 uses several other tags that were not present in v28. The latest Funcotator code (not yet merged into master - PR #6660) has parser updates to allow for these new values, but the old code (GATK 4.1.8.1 and earlier) doesn't have these parsing updates. This is the unfortunate price we pay for updating the Gencode datasource with the new datasources release. ☹️

The issue you ran into is not exactly a bug, but an artifact of our data source release process. In order to test them, the data sources must be posted before the code changes to support them (so we can test the code against the data sources as released). Unfortunately there was no mechanism to warn users that newer data source versions are not yet supported (checks against older versions were already present). I've created an issue (#6712) and a branch (jts_funcotator_version_max_6712) that adds in such checks, so pretty soon there will be a warning rather than a confusing stack trace.
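The idea of a maximum-version check can be illustrated roughly as follows. This is not the code on the branch; the function names are invented, the version pattern is inferred from the directory names in this thread (for example `funcotator_dataSources.v1.7.20200521s`), and the bounds are supplied by the caller.

```python
import re

# Matches the "v<major>.<minor>.<yyyymmdd>" part of a data source name.
_VERSION = re.compile(r"v(\d+)\.(\d+)\.(\d{8})")


def parse_version(name):
    """Extract (major, minor, date) from a data source directory name."""
    match = _VERSION.search(name)
    if match is None:
        raise ValueError(f"no data source version found in {name!r}")
    return tuple(int(group) for group in match.groups())


def check_supported(name, minimum, maximum):
    """Classify a data source version against inclusive lower/upper bounds."""
    version = parse_version(name)
    if version < minimum:
        return "too old"
    if version > maximum:
        return "too new"
    return "ok"
```

With an upper bound of `(1, 6, 20190124)`, chosen here only as an example, the v1.7.20200521s bundle would be reported as "too new" before any parsing of its Gencode file is attempted.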

@jonn-smith
Collaborator

@forg-yu Also, moving over the old Gencode datasource is totally fine. The datasources are a bundle, but are designed to be changed by the user. My comment earlier was more to prevent you from doing a find/replace on lincRNA->lncRNA because other things had changed as well.

You'll probably want to move to the latest version when it's merged in, though, since it will contain Gencode v34 and not v28.

@forg-jw

forg-jw commented Jul 22, 2020

@jonn-smith, got it, thank you for your clarification.

@pawelqs
Author

pawelqs commented Jul 22, 2020

@jonn-smith I still have the problem with SQLite and I have no idea how to store the data more locally than I do now, keeping the files in the file system. Our cluster uses Lustre, does it cause the problem?

@jonn-smith
Collaborator

jonn-smith commented Jul 22, 2020

@pawel125 From what I've found there are several posts mentioning issues with Lustre and sqlite:

I haven't looked into it, but maybe one of them can help. If you happen to have a /local drive that is a purely local disk on a single node, then you can work from there with no issue and it won't change Funcotator's runtime.

We don't seem to have a Lustre filesystem I can play with so I can't really do any testing.
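To find out whether the locking error can be reproduced outside GATK, a read-only probe with Python's standard `sqlite3` module might be enough. This is a sketch under assumptions: the function name `can_query_sqlite` is made up, and whether it triggers the same `SQLITE_IOERR_LOCK` on a given Lustre or NFS mount has not been verified here.

```python
import pathlib
import sqlite3


def can_query_sqlite(db_path):
    """Try a trivial read-only query; return (True, None) or (False, message)."""
    # Opening through a file: URI with mode=ro avoids creating the file if it is missing.
    uri = pathlib.Path(db_path).resolve().as_uri() + "?mode=ro"
    try:
        conn = sqlite3.connect(uri, uri=True)
        try:
            # Reading the schema table makes SQLite take a shared lock on the file.
            conn.execute("SELECT name FROM sqlite_master LIMIT 1").fetchall()
        finally:
            conn.close()
    except sqlite3.Error as exc:
        return False, str(exc)
    return True, None
```

Running it once against `Cosmic.db` in its current location and once against a copy in a node-local directory (for instance under `/tmp`, if that is a local disk on your nodes) would show whether the filesystem is the difference.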

@tedsharpe
Contributor

I'm probably being pedantic, but a lincRNA is a subtype of lncRNA. Specifically, a lincRNA is a long intergenic non-coding RNA. [1]

[1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5889127/

@jonn-smith
Collaborator

@tedsharpe Interesting. That's good to know. Both annotations are still in the code (to preserve backward compatibility), so we can now cover both kinds of ncRNAs.

@pawelqs
Author

pawelqs commented Jul 23, 2020

@jonn-smith Thank you!
