Format for breakend variants in VCF #1476

Closed
janrehker opened this issue Aug 14, 2023 · 8 comments

@janrehker

Hi,

According to #441, VEP should now be able to handle breakend variants.

I tried

1 234919885 bnd A [chr1:17124942[A . . SVTYPE=BND .

which is actually the example for VCF input in the structural variant section of https://www.ensembl.org/info/docs/tools/vep/vep_formats.html
I tried to submit this variant through the VEP web interface (VEP v110), but unfortunately it did not work.
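
For readers unfamiliar with the bracket notation in the ALT column, here is a minimal Python sketch that splits a breakend ALT string into its mate contig and position. It assumes the four breakend forms described in the VCF specification (`t[p[`, `t]p]`, `]p]t`, `[p[t`); the function name `parse_breakend_alt` and the returned dictionary keys are illustrative and are not part of VEP.

```python
import re

# Breakend ALT forms from the VCF specification: t[p[  t]p]  ]p]t  [p[t
# where t is the replacement sequence and p is the mate position "contig:pos".
BND_ALT = re.compile(
    r"^(?P<lead>[ACGTNacgtn.]*)"
    r"(?P<open>[\[\]])(?P<chrom>[^\[\]]+):(?P<pos>\d+)(?P<close>[\[\]])"
    r"(?P<trail>[ACGTNacgtn.]*)$"
)


def parse_breakend_alt(alt):
    """Return mate information for a breakend ALT, or None if it is not one."""
    m = BND_ALT.match(alt)
    if m is None or m["open"] != m["close"]:
        return None
    # Exactly one side of the brackets carries the replacement bases.
    if bool(m["lead"]) == bool(m["trail"]):
        return None
    return {
        "mate_chrom": m["chrom"],
        "mate_pos": int(m["pos"]),
        "bracket": m["open"],
        "bases_before_bracket": bool(m["lead"]),
    }


if __name__ == "__main__":
    # ALT value from the example record above.
    print(parse_breakend_alt("[chr1:17124942[A"))
```

For the example record, the mate is on `chr1` at position 17124942, and the base `A` follows the brackets.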

I might be holding it wrong.

Best regards,
Jan

@nuno-agostinho
Contributor

nuno-agostinho commented Aug 14, 2023

Hi @janrehker,

Thanks for reporting this issue.

Web VEP does support breakend variants (including that example), but for some reason it does not accept one when it is submitted as the first variant. For now, you need to put another variant first.

We recently updated VEP to avoid this issue; I just tested the fix and it works in the command-line version, but not in the web interface. I am going to try to understand what is going on and I will let you know.

Kind regards,
Nuno

@janrehker
Author

Thanks for letting me know! Putting an SNV on the first line did indeed work for me, and the slightly modified Manta results we used as a positive control were annotated as truncated, as we expected.

We're trying the command line version now.

Thanks a bunch for this new feature!

Best regards,
Jan

@janrehker
Author

janrehker commented Aug 23, 2023

Hi,

I am encountering an issue similar to the one @osowiecki reported in #441 (I moved it from there to here because this thread is still open and it fits the general topic better).

In my case it is DELLY output:
WARNING: Line 187 skipped (2 227032111 BND00000186 A [1:106864407[A . Low...): start > end+1 : (START=227032112, END=106864407)

This is the line, as it went into vep:

2 227032111 BND00000186 A [1:106864407[A . LowQual IMPRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv0.7.8;CHR2=1;END=106864407;PE=2;MAPQ=60;CT=5to5;CIPOS=-231,231;CIEND=-231,231 GT:GL:GQ:FT:RCL:RC:RCR:CN:DR:DV:RR:RV 0/0:0,-19.006,-594:10000:PASS:8:5259:1:1169:101:2:0:0
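
To find out which records in a caller's output are likely to hit this warning, a short Python sketch like the following could list breakend records whose INFO `END` is smaller than `POS`. This is a simplified proxy and does not reproduce VEP's internal `start > end+1` check. The helper names `info_to_dict` and `suspect_breakends` are hypothetical, and the input is assumed to be tab-delimited VCF lines.

```python
def info_to_dict(info):
    """Split a VCF INFO string into a dict; flag keys map to True."""
    fields = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def suspect_breakends(lines):
    """Yield (line_no, variant_id, pos, end) for BND records with END < POS."""
    for line_no, line in enumerate(lines, start=1):
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # not a complete VCF record
        info = info_to_dict(cols[7])
        end = info.get("END")
        if info.get("SVTYPE") == "BND" and isinstance(end, str) and end.isdigit():
            pos = int(cols[1])
            if int(end) < pos:
                yield line_no, cols[2], pos, int(end)


if __name__ == "__main__":
    # The DELLY record from this comment, without the FORMAT and sample columns.
    record = "\t".join([
        "2", "227032111", "BND00000186", "A", "[1:106864407[A", ".", "LowQual",
        "IMPRECISE;SVTYPE=BND;SVMETHOD=EMBL.DELLYv0.7.8;CHR2=1;END=106864407;"
        "PE=2;MAPQ=60;CT=5to5;CIPOS=-231,231;CIEND=-231,231",
    ])
    for hit in suspect_breakends([record]):
        print(hit)
```

For this record, `END` (106864407) refers to the mate position on chromosome 1, so it is smaller than `POS` (227032111) on chromosome 2.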

This is the command line I used in vep 110.1:
vep --offline --cache --dir_cache /mnt/NFS/pathonas01/pathodata/miniconda3_ballm/envs/vep110/.vep --use_given_ref --distance 5000 --force_overwrite --input_file test2_nCHR.vcf.gz --max_sv_size 280000000 --output_file test_delly.vcf --hgvs --fasta /mnt/NFS/pathonas01/pathodata/miniconda3_ballm/envs/vep110/.vep/Homo_sapiens.GRCh38.dna_sm.toplevel.fa.gz --vcf

It appears that Manta output is recognized as such and can therefore be processed. In the meantime, would it help to reformat the ID field (BND00000186), the INFO field, or other parts of the VCF entry in some way?

Best regards,
Jan

@nakib103
Contributor

nakib103 commented Aug 24, 2023

Hello @janrehker,

Thanks for the query!

It seems the issue is coming from the END field in the INFO column. VEP is taking that as the end position and complaining because it is less than the start position. If you remove END from the INFO field, it will work.

We will look into this further and let you know once we have an update.
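
A minimal sketch of this workaround in Python, assuming tab-delimited VCF lines and that only records with `SVTYPE=BND` should be changed. The function name `drop_end_from_bnd` is hypothetical; this is a pre-processing step applied before running VEP, not part of VEP itself.

```python
def drop_end_from_bnd(line):
    """Return the VCF line with END removed from INFO when SVTYPE=BND."""
    if line.startswith("#"):
        return line  # header lines pass through unchanged
    newline = "\n" if line.endswith("\n") else ""
    cols = line.rstrip("\n").split("\t")
    if len(cols) < 8:
        return line  # not a complete VCF record
    items = cols[7].split(";")
    if "SVTYPE=BND" not in items:
        return line
    # Match the key exactly so that CIEND=... and similar keys are kept.
    kept = [item for item in items if not item.startswith("END=")]
    cols[7] = ";".join(kept) if kept else "."
    return "\t".join(cols) + newline


if __name__ == "__main__":
    record = "\t".join([
        "2", "227032111", "BND00000186", "A", "[1:106864407[A", ".", "LowQual",
        "IMPRECISE;SVTYPE=BND;CHR2=1;END=106864407;CIPOS=-231,231;CIEND=-231,231",
    ])
    print(drop_end_from_bnd(record))
```

Applying the function to every line of the input and writing the result to a new file leaves header lines and non-breakend records untouched.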

Best regards,
Nakib

@janrehker
Author

Hi @nakib103,

This seems to be a feasible workaround, and VEP is now annotating those variants.

I am not sure, though, whether this check makes sense for this type of variant, where the breakpoints can even be on different chromosomes.

Nevertheless, big thanks once again for providing a workaround!

Best regards,
Jan

@nuno-agostinho
Contributor

nuno-agostinho commented Aug 30, 2023

Hi again @janrehker, hope you are having a great day!

I cannot find official Manta documentation on how they represent breakends. However, according to my interpretation of the VCF 4.4 specification, END should represent the end position on the reference only and should be larger than the start of the variant.

There is some documentation on Manta VCFs using END2 to represent the Position of breakpoint on CHR2: https://gatk.broadinstitute.org/hc/en-us/articles/5334587352219-How-to-interpret-SV-VCFs.

Could you tell me if you are using the latest version of Manta? Thanks!

Best,
Nuno

@janrehker
Author

janrehker commented Sep 29, 2023

Hi Nuno,

Sorry for the late reply. It wouldn't surprise me if the Manta output is incompatible with the VCF specification. There does not seem to have been much activity there in recent years.

The version of Manta we use is:

v1.6.0 - 2019-06-25

which would be the latest version, according to https://github.com/Illumina/manta/blob/master/CHANGELOG.md
(...assuming I am not living under a rock and they switched to a different repo)

Best regards,
Jan

@nuno-agostinho
Contributor

Hi @janrehker, thanks for letting me know you are using the latest version (I think there have been no new versions since 2019).

I updated the code to take such breakend variants into account. This fix will be available in the next VEP release (111). I will close this ticket when the fix is merged into our codebase.

Thanks for reporting this issue!

Best regards,
Nuno
