Funcotator: java.lang.IllegalArgumentException: Unexpected value: lncRNA #6708
Comments
It looks like a typo either in the datasource or in Funcotator. It's finding something labelled "lncRNA" but looking for "lincRNA".
I am also troubled by this problem. My GATK version is 4.1.6.0 and the Funcotator data source is funcotator_dataSources.v1.7.20200521s.
@lbergelson should I modify the datasource then?
@pawel125 @forg-yu Please do not modify the data sources: they are well-formed and correct, just newer than (and incompatible with) the GATK version you're using. See below for a quick solution. This version of the Funcotator data sources is not supported yet. Data sources have to be released prior to merging the code changes that support them. I have been working on this data sources release for quite some time, but the code changes to support it have not gone in yet. Until the 4.1.9.0 GATK release, please continue to use v1.6.20190124.
I've created a new issue to make sure the error message for this is better in the future (#6712). This will be included in 4.1.9.0.
I have also tried this version and, as I mentioned, it also results in an error. Here is the full output:
Is it another typo?
@pawel125 This looks like a filesystem error. To be clear, the first issue you had was not a typo. The v1.7 data sources are not backwards compatible and the code changes haven't been merged yet.
Oh, yes, it is surely not a typo; that was just my mental shortcut because of the previous messages. I use the local storage on our computing cluster. I will try to download the files once again; maybe that will solve the problem.
Just making sure 😛 I don't think you have to download them again, but I have seen SQLite do some strange things on NFS drives. When I looked up the error, a couple of Stack Overflow posts indicated it was a SQLite + NFS issue. If you have a local disk you can store the data sources on, that would probably fix the issue immediately. Unfortunately, I'm not sure what the exact problem with NFS + SQLite is.
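One way to test the filesystem theory independently of GATK is to check whether SQLite can create, lock, and write a small database in the directory that holds the data sources. The following is only a rough Python sketch: the function name `sqlite_locking_works` is made up for illustration, and it is an assumption that a write probe like this exercises the same locking path that fails inside Funcotator.

```python
import os
import sqlite3
import tempfile


def sqlite_locking_works(directory):
    """Return True if SQLite can create, lock, and write a database in `directory`.

    Hypothetical diagnostic helper; it is not part of GATK or Funcotator.
    """
    # Create an empty scratch file inside the directory under test.
    fd, db_path = tempfile.mkstemp(suffix=".sqlite", dir=directory)
    os.close(fd)
    try:
        conn = sqlite3.connect(db_path, timeout=5)
        try:
            # Writing forces SQLite to take file locks on this filesystem.
            conn.execute("CREATE TABLE probe (id INTEGER PRIMARY KEY)")
            conn.execute("INSERT INTO probe (id) VALUES (1)")
            conn.commit()
            rows = conn.execute("SELECT COUNT(*) FROM probe").fetchone()[0]
            return rows == 1
        finally:
            conn.close()
    except sqlite3.Error:
        # Locking or I/O failures reported by SQLite land here.
        return False
    finally:
        if os.path.exists(db_path):
            os.remove(db_path)
```

Calling it once on the shared directory and once on a node-local directory (for example a scratch disk) would show whether the two behave differently. A `True` result does not prove the filesystem is safe for SQLite, only that this simple probe succeeded.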
Thanks, funcotator_dataSources.v1.6.20190124s works fine for me. I think lbergelson is right: the bug is caused by the different abbreviations for long non-coding RNA, 'lncRNA' in gencode.v34.annotation.REORDERED.gtf of v1.7.20200521 and 'lincRNA' in gencode.v28.annotation.REORDERED.gtf of v1.6.20190124. I tried substituting the whole gencode/ directory with the one from v1.6.20190124, and it worked OK. (Please forgive me for modifying the data source.)
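For anyone who wants to confirm which tags their own copy of the Gencode file uses, here is a minimal Python sketch that tallies the `gene_type` values on the gene records of a GTF. The function name `count_gene_types` is hypothetical, and the sketch assumes the usual GENCODE layout (tab-separated, attributes in the ninth column written as `gene_type "..."`).

```python
import gzip
import re
from collections import Counter

# GENCODE writes attributes as: key "value"; key "value"; ...
GENE_TYPE_RE = re.compile(r'gene_type "([^"]+)"')


def count_gene_types(gtf_path):
    """Count gene_type values on the 'gene' records of a GENCODE-style GTF."""
    opener = gzip.open if str(gtf_path).endswith(".gz") else open
    counts = Counter()
    with opener(gtf_path, "rt") as handle:
        for line in handle:
            if line.startswith("#"):
                continue  # header/comment lines
            fields = line.rstrip("\n").split("\t")
            # Only look at gene-level records so each gene is counted once.
            if len(fields) < 9 or fields[2] != "gene":
                continue
            match = GENE_TYPE_RE.search(fields[8])
            if match:
                counts[match.group(1)] += 1
    return counts
```

Running this on the two files named above and comparing the keys of the two counters should list every tag that exists in one release but not the other, not only the lncRNA/lincRNA pair.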
@forg-yu Glad to hear you have it working! You are correct about the difference in the long non-coding RNA tag between the Gencode versions. In addition to this, there are several other tags used in Gencode v34 that were not present in v28. The latest Funcotator code (not yet merged into master, PR #6660) has parser updates to allow for these new values, but the old code (GATK 4.1.8.1 and earlier) doesn't have these parsing updates. This is the unfortunate price we pay for updating the Gencode datasource with the new datasources release. The issue you ran into is not exactly a bug, but an artifact of our data source release process. In order to test them, the data sources must be posted before the code changes to support them (so we can test the code against the data sources as released). Unfortunately, there was no mechanism to warn users that newer data source versions are not yet supported (checks against older versions were already present). I've created an issue (#6712) and a branch (jts_funcotator_version_max_6712) that adds such checks, so pretty soon there will be a warning rather than a confusing stack trace.
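To illustrate the kind of check described above, here is a small Python sketch of a minimum/maximum version guard on the data source directory name. It is not the code from the branch: the function names and the bounds are assumptions chosen for this example, and the version is simply parsed from names such as `funcotator_dataSources.v1.6.20190124s`.

```python
import re

# Hypothetical bounds for illustration only; not GATK's real values.
MIN_SUPPORTED = (1, 6, 20190124)
MAX_SUPPORTED = (1, 6, 20190124)

# Matches the "v<major>.<minor>.<yyyymmdd>" part of a data source name.
VERSION_RE = re.compile(r"v(\d+)\.(\d+)\.(\d{8})")


def parse_version(name):
    """Extract (major, minor, date) from a data source directory name."""
    match = VERSION_RE.search(name)
    if match is None:
        raise ValueError(f"no version found in {name!r}")
    return tuple(int(part) for part in match.groups())


def check_supported(name, minimum=MIN_SUPPORTED, maximum=MAX_SUPPORTED):
    """Classify a data source release as 'too old', 'too new', or 'ok'."""
    version = parse_version(name)
    if version < minimum:
        return "too old"
    if version > maximum:
        return "too new"
    return "ok"
```

With these example bounds, `check_supported("funcotator_dataSources.v1.7.20200521s")` returns `"too new"`, which is the case that previously surfaced only as a stack trace.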
@forg-yu Also, moving over the old Gencode datasource is totally fine. The datasources are a bundle, but are designed to be changed by the user. My comment earlier was more to prevent you from doing a find/replace on You'll probably want to move to the latest version when it's merged in, though, since it will contain Gencode v34 and not v28.
@jonn-smith, got it, thank you for your clarification.
@jonn-smith I still have the problem with SQLite, and I have no idea how to store the data more locally than I do now, keeping the files in the file system. Our cluster uses Lustre; could that be causing the problem?
@pawel125 From what I've found there are several posts mentioning issues with Lustre and sqlite:
I haven't looked into it, but maybe one of them can help. If you happen to have a We don't seem to have a Lustre filesystem I can play with so I can't really do any testing.
I'm probably being pedantic, but a lincRNA is a subtype of lncRNA. Specifically, a lincRNA is a long intergenic non-coding RNA.
@tedsharpe Interesting. That's good to know. Both annotations are still in the code (to preserve backward compatibility), so we can now cover both kinds of ncRNAs.
@jonn-smith Thank you!
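As a rough illustration of why keeping both values matters, here is a Python sketch of a strict lookup that accepts both tags and rejects anything it does not know. This is not Funcotator's Java code; the `GeneType` enum and `parse_gene_type` function are hypothetical, and the listed values are a small subset chosen for the example.

```python
from enum import Enum


class GeneType(Enum):
    """Hypothetical subset of gene types, keyed by the tag used in the GTF."""
    PROTEIN_CODING = "protein_coding"
    LINCRNA = "lincRNA"  # tag used in the older Gencode release (v28)
    LNCRNA = "lncRNA"    # tag used in the newer Gencode release (v34)


def parse_gene_type(text):
    """Map a GTF tag to a GeneType, failing loudly on unknown tags."""
    try:
        return GeneType(text)
    except ValueError:
        raise ValueError(f"Unexpected value: {text}") from None
```

A parser that only knew `lincRNA` would raise on a v34 file, while one that knows both values reads either release.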
Bug Report
Affected tool(s) or class(es)
Funcotator
Affected version(s)
gatk-4.1.8.0
funcotator_dataSources.v1.7.20200521s
Description
I am trying to use Funcotator to annotate variants that I have already detected. Unfortunately, after a few seconds Funcotator stops with the error:
I have no idea what is wrong, and I did not find this error on the internet. Could it be a problem with the JRE?
Full log below.
Steps to reproduce
~/programs/gatk-4.1.8.0/gatk Funcotator --variant filtered_variants/P1.vcf.gz --reference ~/resources/hg38_for_bwa/hs38DH.fa --ref-version hg38 --data-sources-path ~/resources/gatk/funcotator2/funcotator_dataSources.v1.7.20200521s --output filtered_variants/P1.avcf.gz --output-file-format VCF
Expected behavior
Funcotator annotates my variants.
Actual behavior
I have also tried to use an older version of the Funcotator data sources, funcotator_dataSources.v1.6.20190124s; the resulting error is then: