Standardise SV types across inputs: VEP, VCF and Region #1370

Merged


nuno-agostinho
Contributor

@nuno-agostinho nuno-agostinho commented Mar 6, 2023

ENSVAR-5226

Changelog

  • Validation of each format is now encapsulated in the respective file
  • All SO terms are now standardised in Parser.pm and available to every input format: INS, DEL, TDUP, DUP, CNV, INV and BND (including when enclosed in <>)
    • VEP_input now accepts the same terms as VCF, including INV, BND, <CN3>, <DEL>, etc.
    • Because of its validation restrictions, Region accepts most terms, but not those enclosed in <>, such as <CN0> and <INV>
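
To make the standardisation above concrete, here is a minimal Python sketch of how one allele string could be reduced to a bare SV term. It is only an illustration and not the Perl code in Parser.pm: the function name `normalise_sv_term` is hypothetical, and grouping copy-number alleles such as <CN3> under CNV is an assumption, since this PR does not state how VEP maps them.

```python
import re

# Terms listed in the changelog of this PR.
SV_TERMS = {"INS", "DEL", "TDUP", "DUP", "CNV", "INV", "BND"}


def normalise_sv_term(allele):
    """Return the bare SV term for an allele string, or None if unrecognised.

    Hypothetical helper for illustration; not part of VEP.
    """
    term = allele.strip()
    # Drop one pair of enclosing angle brackets, e.g. "<DEL>" -> "DEL".
    if term.startswith("<") and term.endswith(">"):
        term = term[1:-1]
    if term in SV_TERMS:
        return term
    # Assumption: copy-number alleles (CN0, CN3, ...) are grouped under CNV.
    if re.fullmatch(r"CN\d+", term):
        return "CNV"
    return None


for allele in ("<DEL>", "INV", "<CN3>", "T/C"):
    print(allele, "->", normalise_sv_term(allele))
```

With a single lookup of this kind, every input format can share one list of accepted terms, leaving only format-specific validation (such as Region rejecting <>) in each parser.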

Discussion

Does it make sense to accept all SV types in all types of inputs? For instance:

  • What does INS do if we don't have the inserted sequence in VEP_input/Region?
  • What is the purpose of CNV without specifying the number of copies?
  • BND can be specified without mates, but VEP currently does not support breakend variants (this will change if we merge "Add support for breakend variants from VCF" #1399)

Testing

  • I added unit tests for the new changes; please also check that VEP works with some arbitrary example inputs
  • VCF input should return the same result before and after the changes
  • Test that VEP_input accepts the new terms. Example:
1   20000     30000     <CN4>   +    cn4
1   881907    881906    -/C     +
5   140532    140532    T/C     +
12  1017956   1017956   T/A     +
12  1017956   1017956   INV     +    inv1
2   946507    946507    G/C     +
14  19584687  19584687  C/T     -
19  66520     66520     G/A     +    var1
8   150029    150029    A/T     +    var2
  • Test that Region accepts the new terms. Example:
chr21:10-10:1/A
chr21:10-20:1/DUP
21:25587759-25587769/INV
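
For reviewers who want to sanity-check the example lines above outside VEP, the following Python sketch splits both formats into fields and flags SV alleles. It is a rough illustration under stated assumptions, not VEP's parser: the function names, the returned dictionary keys and the regular expression are all hypothetical, the Region pattern is inferred only from the three examples above (optional strand of 1 or -1), and rejecting <>-enclosed alleles in Region mirrors the changelog rather than the actual validation code.

```python
import re

# Terms listed in the changelog of this PR.
SV_TERMS = {"INS", "DEL", "TDUP", "DUP", "CNV", "INV", "BND"}

# Assumed Region shape, inferred from the examples: chr:start-end[:strand]/allele
REGION_RE = re.compile(
    r"^(?P<chr>[^:\s]+):(?P<start>\d+)-(?P<end>\d+)"
    r"(?::(?P<strand>[-+]?1))?/(?P<allele>\S+)$"
)


def is_sv_allele(allele):
    """True if the allele is a listed SV term or a CN<n> allele, bare or in <>."""
    bare = allele.strip("<>")
    return bare in SV_TERMS or bool(re.fullmatch(r"CN\d+", bare))


def parse_vep_input_line(line):
    """Split a whitespace-separated VEP_input line into named fields."""
    fields = line.split()
    if len(fields) < 5:
        raise ValueError(f"expected at least 5 columns: {line!r}")
    chrom, start, end, allele, strand = fields[:5]
    return {
        "chr": chrom, "start": int(start), "end": int(end),
        "allele": allele, "strand": strand,
        "id": fields[5] if len(fields) > 5 else None,  # identifier is optional
        "is_sv": is_sv_allele(allele),
    }


def parse_region(text):
    """Parse a Region string; <>-enclosed alleles are rejected (per changelog)."""
    match = REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a Region string: {text!r}")
    allele = match["allele"]
    if allele.startswith("<"):
        raise ValueError(f"Region does not accept <>-enclosed terms: {allele}")
    return {
        "chr": match["chr"], "start": int(match["start"]),
        "end": int(match["end"]), "strand": match["strand"],
        "allele": allele, "is_sv": is_sv_allele(allele),
    }


print(parse_vep_input_line("12  1017956   1017956   INV     +    inv1"))
print(parse_region("chr21:10-20:1/DUP"))
```

Note that the sketch leaves the insertion line from the example (start 881907, end 881906) untouched, because it performs no coordinate validation.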

@dglemos dglemos merged commit 059d4ea into Ensembl:postreleasefix/110 Jun 14, 2023
@nuno-agostinho nuno-agostinho deleted the improve/vep-input-sv branch June 14, 2023 07:52
@dglemos
Contributor

dglemos commented Jun 14, 2023

Merged into release/110 and main
