feat: 438 vep annotation order #441

Merged
merged 6 commits into from
Sep 14, 2023

ericblanc20
Contributor

Impose user-defined annotation order for VEP.
Previously, the wrapper used the default order, based on the ENSEMBL canonical transcript.
Because of that choice, many variants were not uploaded to cBioPortal.

The default choice now prioritizes protein-coding genes with MANE transcripts.
Many variants are now recovered in cBioPortal views.

@ericblanc20 ericblanc20 requested a review from mbenary September 13, 2023 13:36
@ericblanc20 ericblanc20 linked an issue Sep 13, 2023 that may be closed by this pull request
@coveralls
coveralls commented Sep 13, 2023

Coverage Status

coverage: 85.642% (+0.01%) from 85.63% when pulling 17d5c19 on 438-vep-annotation-order into 4fd0c73 on main.

Contributor

@mbenary mbenary left a comment

Just a few tiny things, sorry for being picky.

@@ -26,7 +26,31 @@
Step Output
===========

TODO
Annotations can be done on all genes & transcripts overlapping with the variant locus, or
Contributor

Rephrasing:

Users can annotate all genes ...

so it fits better with the second part of the sentence.

In the latter case, the output vcf file will only contain one annotation per variant, while
in the former case, there might be over 100 annotations for each variant.

The ordering of features drinving the representative annotation choice is under user control.
Contributor

Typo:

driving

@@ -115,7 +139,7 @@
assembly: GRCh38
cache_version: 102 # WARNING- this must match the wrapper's vep version!
tx_flag: "gencode_basic" # The flag selecting the transcripts. One of "gencode_basic", "refseq", and "merged".
pick: yes # Other option: no (report one or all consequences)
pick_order: ["biotype", "mane", "appris", "tsl", "ccds", "canonical", "rank", "length"]
Contributor

Is it possible to print a constant here? If yes, one could make PICK_ORDER a global constant and reuse it for the configuration.

Contributor Author

This is the default order. In check_config, the code tests the validity of all the options provided by the user in the YAML file.
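
The reviewer's suggestion and the validation described in this reply could be combined roughly as follows. This is only a sketch under assumptions: `PICK_ORDER`, `check_pick_order`, and `pick_order_argument` are hypothetical names, not the pipeline's actual code, and the allowed criteria are simply those listed in the default configuration above. The only VEP behaviour relied on is that its `--pick_order` option takes a comma-separated list.

```python
# Hypothetical module-level constant: the default order shown in the configuration above.
PICK_ORDER = ("biotype", "mane", "appris", "tsl", "ccds", "canonical", "rank", "length")


def check_pick_order(user_order, allowed=PICK_ORDER):
    """Return the validated order; raise ValueError on unknown or repeated criteria."""
    unknown = [criterion for criterion in user_order if criterion not in allowed]
    if unknown:
        raise ValueError(f"Unknown pick_order criteria: {unknown}")
    if len(set(user_order)) != len(user_order):
        raise ValueError("Criteria in pick_order must not be repeated")
    return list(user_order)


def pick_order_argument(order):
    """Join the criteria with commas, the form expected by VEP's --pick_order option."""
    return ",".join(order)
```

With a single constant, the documented default and the validation logic cannot drift apart, which seems to be the point of the reviewer's question.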


if selected:
    for criterion in criteria:
        codes[criterion] = get_value(criterion, fields[criterion])
Contributor

Probably a very simple thing: Why is codes changed here? It's not used afterwards.

Contributor Author

@ericblanc20 ericblanc20 Sep 14, 2023

It's used before... The codes store the current values for the selected (best) annotation:

  1. The codes are all set to their maximum values (so the least likely to be selected).
  2. For each annotation, we loop over all criteria:
    1. Get the criterion code for the current annotation.
    2. If the code is unknown or equal to the code of the selected annotation, continue to the next criterion.
    3. If the code is smaller than the code of the selected annotation, then the current annotation is better than the selected one and should replace it.
  3. If the current annotation is selected, set all codes to this annotation's values, so that they now describe the newly selected annotation.

I hope it makes sense...
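
The steps above can be sketched as a small, self-contained function. This is an illustration, not the wrapper's code: `pick_annotation` is a hypothetical name, `get_value` is a toy stand-in for the real helper, and two behaviours are assumptions not stated in the comment (a larger code ends the comparison for that annotation, and unknown codes are stored as the maximum value).

```python
import math


def get_value(criterion, raw):
    """Toy stand-in for the wrapper's get_value: lower code is better, None means unknown."""
    if raw in (None, ""):
        return None
    if criterion == "biotype":
        return 0 if raw == "protein_coding" else 1
    return int(raw)


def pick_annotation(annotations, criteria):
    """Return the best annotation (a dict of raw fields), or None if none qualifies."""
    # Step 1: start from the worst possible code for every criterion.
    codes = {criterion: math.inf for criterion in criteria}
    best = None
    # Step 2: compare each annotation with the currently selected one.
    for fields in annotations:
        selected = False
        for criterion in criteria:
            code = get_value(criterion, fields.get(criterion))
            if code is None or code == codes[criterion]:
                continue  # unknown or tied: the next criterion decides
            # Assumption: the first criterion that differs settles the comparison.
            selected = code < codes[criterion]
            break
        # Step 3: remember the codes of the newly selected annotation.
        if selected:
            for criterion in criteria:
                value = get_value(criterion, fields.get(criterion))
                codes[criterion] = math.inf if value is None else value
            best = fields
    return best
```

In this sketch the update of `codes` in step 3 matters for the next loop iteration: every later annotation is compared against the values stored there.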

Contributor

@mbenary mbenary left a comment

LGTM

@ericblanc20 ericblanc20 merged commit c28d67e into main Sep 14, 2023
@ericblanc20 ericblanc20 deleted the 438-vep-annotation-order branch September 14, 2023 10:05
@tedil tedil mentioned this pull request Jun 28, 2024
This was referenced Dec 9, 2024
Development

Successfully merging this pull request may close these issues.

vep annotation order
3 participants