Funcotator logic bug #6289

Closed
2 tasks
tmm211 opened this issue Nov 27, 2019 · 26 comments

@tmm211

tmm211 commented Nov 27, 2019

Bug Report

Affected tool(s) or class(es)

Funcotator

Affected version(s)

  • v.4.1.3.0 using us.gcr.io/broad-gatk/gatk:4.1.3.0
  • 11/22 failed in Terra

Description

There is a bug in how Funcotator handles this variant. The failure occurs on a variant after chr14:24655355.

Stacktrace

[Screenshot: Screen Shot 2019-11-27 at 4 37 31 PM]

Jonn has the input files, log file, and WDL.

Steps to reproduce

All the inputs and the full pipeline needed to reproduce this issue are listed in Zendesk ticket 3847. Contact Tiffany for access.

Expected behavior

The tool should handle this situation more gracefully?

Actual behavior

It fails with a java.lang.StringIndexOutOfBoundsException: String index out of range: 776


@lincoln-harris

hey guys I'm seeing the same error

@jonn-smith
Collaborator

@tmm211 @lincoln-harris I'll put this back on my radar. Thanks for your patience.

@lydiarck

I have the same error; let me know if it would be helpful to have my input files, logs, etc. I'm seeing it in version 4.1.4.0.

@shandy79

shandy79 commented Feb 6, 2020

Also seeing this in 4.1.4.1. From my testing, this is happening back through version 4.1.1.0, with 4.1.0.0 being unaffected.

@tmm211
Author

tmm211 commented Jul 13, 2020

@jonn-smith I am working with someone on a featured workspace for a paper that will likely highlight this issue. Any thoughts on when it will get prioritized?

@jonn-smith
Collaborator

I'm going to review all the Funcotator issues in the next few weeks and start addressing them then.

@jonn-smith
Collaborator

The example variant from the above Terra run is:

chr14   24655355        .       C       CG      .       .

@DadongZ

DadongZ commented Sep 8, 2020

I got the same error with GATK 4.1.8.0. Any solutions?

@jonn-smith
Collaborator

This will be the next bug I look into. It's part of a family of issues that all relate to how the predicted protein change is created.

@samanthahv

Another user reported having the same issue with gatk 4.1.7.0

@alanhoyle

I am seeing this same error with GATK 4.1.9.0. At the very least, it would be nice if Funcotator caught the error, printed a warning, omitted the problematic variant, and continued instead of crashing and leaving truncated output. The VCF I'm annotating is WGS with 150+ samples, and it's crashing on a variant on chr9.
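
The "catch, warn, omit, continue" behavior requested above could look roughly like the following. This is a minimal sketch, not Funcotator code: the class `SkipOnFailure`, the method `annotateAll`, and the stand-in annotator are all hypothetical names chosen for illustration, and the records are plain strings rather than real variant objects.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class SkipOnFailure {
    // Hypothetical helper (not part of GATK): applies `annotate` to each record.
    // A record whose annotation throws a RuntimeException is reported on stderr,
    // added to `skipped`, and left out of the result instead of aborting the run.
    static <V, A> List<A> annotateAll(List<V> records,
                                      Function<V, A> annotate,
                                      List<V> skipped) {
        List<A> annotated = new ArrayList<>();
        for (V record : records) {
            try {
                annotated.add(annotate.apply(record));
            } catch (RuntimeException e) {
                System.err.println("WARNING: skipping record " + record + ": " + e);
                skipped.add(record);
            }
        }
        return annotated;
    }

    public static void main(String[] args) {
        List<String> records = List.of("ACGT", "AC", "ACGTAC");
        List<String> skipped = new ArrayList<>();
        // Stand-in annotator: substring(0, 4) throws for strings shorter than 4,
        // which mimics the kind of failure reported in this issue.
        List<String> annotated = annotateAll(records, s -> s.substring(0, 4), skipped);
        System.out.println("annotated: " + annotated);
        System.out.println("skipped:   " + skipped);
    }
}
```

The trade-off in this design is that skipped records silently lose their annotations unless the warnings are monitored, so collecting them in a list (as above) makes it possible to report them at the end of a run.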

@jonn-smith
Collaborator

@alanhoyle This is on the todo list for this quarter - there are a variety of similar issues with the protein change string that should all be resolved together.

@alanhoyle

Thank you. The line(s) that are causing the problem for me begin as follows:

chr9    67726241        .       T       A     104    PASS
chr9    67726241        .       TCA     TCACACA,TCACACACA,TCACA,T    182	PASS

This seems to be an interesting position.

@jonn-smith
Collaborator

Thanks for the test case! From what I've seen in the past this issue happens with indels, so this makes sense to me.
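
To make the indel observation concrete, here is a small sketch that classifies the alleles quoted in this thread purely by comparing REF and ALT lengths. The class `AlleleKind` and the method `classify` are hypothetical, and the length-based rule is a simplification: it ignores symbolic alleles, `*`, and breakends, and it is not how GATK itself determines variant types.

```java
class AlleleKind {
    // Simplified, length-based classification for illustration only.
    // Assumes plain base-string alleles (no symbolic alleles such as <DEL>).
    static String classify(String ref, String alt) {
        if (alt.length() > ref.length()) return "insertion";
        if (alt.length() < ref.length()) return "deletion";
        return ref.length() == 1 ? "SNP" : "MNP";
    }

    public static void main(String[] args) {
        // Alleles from the chr9:67726241 record reported above.
        String ref = "TCA";
        for (String alt : "TCACACA,TCACACACA,TCACA,T".split(",")) {
            System.out.println(ref + " -> " + alt + ": " + classify(ref, alt));
        }
        // Allele from the chr14:24655355 record reported above.
        System.out.println("C -> CG: " + classify("C", "CG"));
    }
}
```

Under this rule, both reported sites carry at least one insertion allele.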

@alanhoyle

After a good deal of editing, this is an almost minimal complete VCF that throws the error (GRCh38):

##fileformat=VCFv4.1
##contig=<ID=chr9,length=138394717>
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	Sample
chr9	67726241	.	TCA	TCACACA,TCACACACA	182	PASS	.	GT	./.

The actual error I'm getting is this:

[February 22, 2021 11:11:51 PM UTC] org.broadinstitute.hellbender.tools.funcotator.Funcotator done. Elapsed time: 0.12 minutes.
Runtime.totalMemory()=2552233984
java.lang.StringIndexOutOfBoundsException: String index out of range: 117
	at java.lang.String.substring(String.java:1963)
	at org.broadinstitute.hellbender.tools.funcotator.ProteinChangeInfo.initializeForInsertion(ProteinChangeInfo.java:256)
	at org.broadinstitute.hellbender.tools.funcotator.ProteinChangeInfo.<init>(ProteinChangeInfo.java:93)
	at org.broadinstitute.hellbender.tools.funcotator.ProteinChangeInfo.create(ProteinChangeInfo.java:371)
	at org.broadinstitute.hellbender.tools.funcotator.dataSources.gencode.GencodeFuncotationFactory.createSequenceComparison(GencodeFuncotationFactory.java:2045)
	at org.broadinstitute.hellbender.tools.funcotator.dataSources.gencode.GencodeFuncotationFactory.createCodingRegionFuncotationForProteinCodingFeature(GencodeFuncotationFactory.java:1235)
	at org.broadinstitute.hellbender.tools.funcotator.dataSources.gencode.GencodeFuncotationFactory.createExonFuncotation(GencodeFuncotationFactory.java:1086)
	at org.broadinstitute.hellbender.tools.funcotator.dataSources.gencode.GencodeFuncotationFactory.createGencodeFuncotationOnSingleTranscript(GencodeFuncotationFactory.java:1020)
	at org.broadinstitute.hellbender.tools.funcotator.dataSources.gencode.GencodeFuncotationFactory.createFuncotationsHelper(GencodeFuncotationFactory.java:847)
	at org.broadinstitute.hellbender.tools.funcotator.dataSources.gencode.GencodeFuncotationFactory.createFuncotationsHelper(GencodeFuncotationFactory.java:831)
	at org.broadinstitute.hellbender.tools.funcotator.dataSources.gencode.GencodeFuncotationFactory.lambda$createGencodeFuncotationsByAllTranscripts$0(GencodeFuncotationFactory.java:508)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
	at org.broadinstitute.hellbender.tools.funcotator.dataSources.gencode.GencodeFuncotationFactory.createGencodeFuncotationsByAllTranscripts(GencodeFuncotationFactory.java:509)
	at org.broadinstitute.hellbender.tools.funcotator.dataSources.gencode.GencodeFuncotationFactory.createFuncotationsOnVariant(GencodeFuncotationFactory.java:564)
	at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.determineFuncotations(DataSourceFuncotationFactory.java:243)
	at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.createFuncotations(DataSourceFuncotationFactory.java:211)
	at org.broadinstitute.hellbender.tools.funcotator.DataSourceFuncotationFactory.createFuncotations(DataSourceFuncotationFactory.java:182)
	at org.broadinstitute.hellbender.tools.funcotator.FuncotatorEngine.lambda$createFuncotationMapForVariant$0(FuncotatorEngine.java:147)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
	at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1384)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:566)
	at org.broadinstitute.hellbender.tools.funcotator.FuncotatorEngine.createFuncotationMapForVariant(FuncotatorEngine.java:157)
	at org.broadinstitute.hellbender.tools.funcotator.Funcotator.enqueueAndHandleVariant(Funcotator.java:907)
	at org.broadinstitute.hellbender.tools.funcotator.Funcotator.apply(Funcotator.java:861)
	at org.broadinstitute.hellbender.engine.VariantWalker.lambda$traverse$0(VariantWalker.java:104)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.accept(ForEachOps.java:183)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:175)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.Iterator.forEachRemaining(Iterator.java:116)
	at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
	at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
	at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
	at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
	at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
	at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:485)
	at org.broadinstitute.hellbender.engine.VariantWalker.traverse(VariantWalker.java:102)
	at org.broadinstitute.hellbender.engine.GATKTool.doWork(GATKTool.java:1049)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.runTool(CommandLineProgram.java:140)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMainPostParseArgs(CommandLineProgram.java:192)
	at org.broadinstitute.hellbender.cmdline.CommandLineProgram.instanceMain(CommandLineProgram.java:211)
	at org.broadinstitute.hellbender.Main.runCommandLineProgram(Main.java:160)
	at org.broadinstitute.hellbender.Main.mainEntry(Main.java:203)
	at org.broadinstitute.hellbender.Main.main(Main.java:289)
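
The top frame of the trace is `String.substring`, which throws `StringIndexOutOfBoundsException` whenever the requested end index is past the end of the string. The sketch below reproduces that general failure mode on a made-up sequence and shows one defensive alternative. It is not the code in `ProteinChangeInfo.initializeForInsertion`; the class `SafeSubstring`, the helper `clampedSubstring`, and the sequence are assumptions for illustration.

```java
class SafeSubstring {
    // Hypothetical helper (not GATK code): behaves like s.substring(begin, end),
    // but clamps both indices into [0, s.length()] instead of throwing.
    static String clampedSubstring(String s, int begin, int end) {
        int b = Math.max(0, Math.min(begin, s.length()));
        int e = Math.max(b, Math.min(end, s.length()));
        return s.substring(b, e);
    }

    public static void main(String[] args) {
        String protein = "MKTAYIAK"; // made-up 8-residue sequence
        try {
            protein.substring(6, 12); // end index 12 is past length 8
        } catch (StringIndexOutOfBoundsException ex) {
            System.out.println("substring threw: " + ex.getMessage());
        }
        // The clamped variant returns whatever is available instead of throwing.
        System.out.println("clamped: " + clampedSubstring(protein, 6, 12));
    }
}
```

Clamping only avoids the crash. Whether a truncated string yields a correct protein change depends on why the index was out of range in the first place, so a guard like this would be a stopgap rather than a fix for the underlying logic.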

@lydiarck

Is there an update on this? Thanks!

@jnktsj

jnktsj commented Sep 4, 2021

While searching for the error, I found this GitHub issue! I'm having the same error with GATK 4.2.0.0… GP is trying to incorporate Funcotator into the clinical pipeline, so it would be great if this fix could be made as soon as possible. Thanks!
cc: @NiallJLennon @CarrieCibulskis

@jonn-smith
Collaborator

jonn-smith commented Sep 4, 2021

@lydiarck - Not yet. Our priorities shifted and I haven't had time to address this.

@jnktsj That's great to hear! I'm surprised because this is the first I've heard of it. If you have some time, I'd love to discuss it with you (and / or Niall and / or Carrie). I'm particularly interested in how you will incorporate new versions of the software as updates are made.

Things are starting to slow down, so I should actually be able to start taking a look next week or the week after. It's going to require refactoring several things deep in the Funcotator Engine, which is why it hasn't happened yet.

@jkobject

Hi, I'm getting the same error trying to run this new Mutect2 pipeline on CCLE. We will not annotate Mutect2 output with Funcotator in the meantime, but it would be very useful to us if this problem were solved! (It impacts ~15% of our samples.)

Thank you @jonn-smith !

@droazen
Contributor

droazen commented Oct 14, 2021

@jkobject @jnktsj Thanks for the reports -- we are going to prioritize a fix for this issue this quarter!

@droazen
Contributor

droazen commented Oct 21, 2021

@jkobject @jnktsj @lydiarck We have a prospective fix for this issue that at least avoids the crash: #7513

It should be part of the next GATK release, or you can try it out yourself if you're comfortable building the GATK from source.

@jonn-smith
Collaborator

OK, guys. I just merged in a fix for this. If you have a chance, give it another try with the latest GATK main branch.

@jnktsj @alanhoyle @jkobject @DadongZ @lincoln-harris @lydiarck @shandy79 @samanthahv

@alanhoyle

@jonn-smith is this in a release or would we have to download/compile to test?

@jonn-smith
Collaborator

@alanhoyle Right now it's just in the code, so you'd have to compile it. If you can wait, we're planning on doing a release sometime in the next week or two.

@jkobject

Hello. Having now run it on all our samples, we still see this error.

It happens only in the WES data (1800 samples), and in only 13 of them. It does not happen in the WGS data (600 samples).

This is in GATK 4.2.6.1.

Let me know if you want us to share some example in a workspace.

@grimmem1

Has this bug been addressed in recent Funcotator versions? If so, which version?
