Vep update #482

Merged 11 commits on Jan 22, 2024
2 changes: 2 additions & 0 deletions CHANGELOG.md
@@ -23,6 +23,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- GATK CNVCaller uses segments instead of intervals, filters out "reference" segments between the calls, and fixes a bug with how `ch_readcount_intervals` was handled [#472](https://github.com/nf-core/raredisease/pull/472)
- bwa aligner [#474](https://github.com/nf-core/raredisease/pull/474)
- Add FOUND_IN tag, which mentions the variant caller that found the mutation, in the INFO column of the vcf files [#471](https://github.com/nf-core/raredisease/pull/471)
- A new parameter `vep_plugin_files` to supply files required by vep plugins [#482](https://github.com/nf-core/raredisease/pull/482)
- New workflow for annotating mobile elements [#483](https://github.com/nf-core/raredisease/pull/483)

### `Changed`
@@ -43,6 +44,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Changed the name of the parameter from `skip_cnv_calling` to `skip_germlinecnvcaller` [#435](https://github.com/nf-core/raredisease/pull/435)
- Check SVDB query input files for existence and correct format [#476](https://github.com/nf-core/raredisease/pull/476)
- Change hardcoded platform value to params.platform in align_MT.config [#475](https://github.com/nf-core/raredisease/pull/475)
- Installed the nf-core version of ensemblvep/vep module [#482](https://github.com/nf-core/raredisease/pull/482)

### `Fixed`

@@ -6,12 +6,14 @@ frameshift_variant
stop_lost
start_lost
transcript_amplification
feature_elongation
feature_truncation
inframe_insertion
inframe_deletion
missense_variant
protein_altering_variant
splice_region_variant
splice_donor_5th_base_variant
splice_region_variant
splice_donor_region_variant
splice_polypyrimidine_tract_variant
incomplete_terminal_codon_variant
@@ -26,14 +28,14 @@ non_coding_transcript_exon_variant
intron_variant
NMD_transcript_variant
non_coding_transcript_variant
coding_transcript_variant
upstream_gene_variant
downstream_gene_variant
TFBS_ablation
TFBS_amplification
TF_binding_site_variant
regulatory_region_ablation
regulatory_region_amplification
feature_elongation
regulatory_region_variant
feature_truncation
intergenic_variant
sequence_variant
19 changes: 19 additions & 0 deletions assets/vep_plugin_files_schema.json
@@ -0,0 +1,19 @@
{
    "$schema": "http://json-schema.org/draft-07/schema",
    "$id": "https://raw.githubusercontent.com/nf-core/raredisease/master/assets/vep_plugin_files_schema.json",
    "title": "Schema for VEP plugin files and their indices",
    "description": "Schema for VEP plugin files and their indices",
    "type": "array",
    "items": {
        "type": "object",
        "properties": {
            "vep_files": {
                "type": "string",
                "format": "file-path",
                "exists": true,
                "errorMessage": "Path to vep plugin files and their indices"
            }
        },
        "required": ["vep_files"]
    }
}
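
The schema above asks for two things per row: a `vep_files` entry, and a path that exists. The following is a minimal Python sketch of those two checks, assuming the samplesheet is a CSV whose header column is named `vep_files` after the schema property. The function name `check_vep_plugin_csv` is hypothetical and not part of the pipeline, which performs its own schema validation; remote URLs are skipped here rather than fetched.

```python
import csv
from pathlib import Path


def check_vep_plugin_csv(csv_path):
    """Return a list of problems found in a vep_plugin_files CSV.

    Mirrors two rules from the schema: `vep_files` is required,
    and a local path it names must exist. URLs are not checked.
    """
    problems = []
    with open(csv_path, newline="") as handle:
        reader = csv.DictReader(handle)
        if "vep_files" not in (reader.fieldnames or []):
            return ["missing required column: vep_files"]
        for row in reader:
            # reader.line_num is the number of the line just read
            line_no = reader.line_num
            value = (row["vep_files"] or "").strip()
            if not value:
                problems.append(f"line {line_no}: empty vep_files entry")
            elif "://" not in value and not Path(value).exists():
                problems.append(f"line {line_no}: file not found: {value}")
    return problems
```

An empty return list means the file passed both checks; anything else is a human-readable list of the rows that need attention.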
9 changes: 4 additions & 5 deletions conf/modules/annotate_genome_snvs.config
@@ -79,16 +79,15 @@ process {
ext.prefix = { "${vcf.simpleName}_rohann_vcfanno_filter_vep" }
ext.args = [
'--dir_plugins vep_cache/Plugins',
'--plugin LoFtool,vep_cache/LoFtool_scores.txt',
'--plugin pLI,vep_cache/pLI_values_107.txt',
'--plugin SpliceAI,snv=vep_cache/spliceai_21_scores_raw_snv_-v1.3-.vcf.gz,indel=vep_cache/spliceai_21_scores_raw_snv_-v1.3-.vcf.gz',
'--plugin MaxEntScan,vep_cache/fordownload,SWA,NCSS',
'--plugin LoFtool,LoFtool_scores.txt',
'--plugin pLI,pLI_values_107.txt',
'--plugin SpliceAI,snv=spliceai_21_scores_raw_snv_-v1.3-.vcf.gz,indel=spliceai_21_scores_raw_snv_-v1.3-.vcf.gz',
'--distance 5000',
'--buffer_size 20000',
'--format vcf --max_sv_size 248956422',
'--appris --biotype --cache --canonical --ccds --compress_output bgzip',
'--domains --exclude_predicted --force_overwrite',
'--hgvs --humdiv --no_progress --no_stats --numbers',
'--hgvs --humdiv --no_progress --numbers',
'--merged --polyphen p --protein --offline --regulatory --sift p --symbol --tsl',
'--uniprot --vcf'
].join(' ')
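
The updated `--plugin` arguments name their data files without the `vep_cache/` prefix, which suggests those files are now expected to arrive through `vep_plugin_files`. As a rough sanity check, the Python sketch below lists plugin data files that the arguments reference but that are absent from a supplied list of paths. Both function names are hypothetical, and the rule "a plugin argument containing a dot is a file" is a heuristic assumed for this sketch, not something VEP or the pipeline defines. Index files are not considered.

```python
def plugin_files_from_args(args):
    """Collect data-file names referenced by `--plugin` options.

    Heuristic (an assumption of this sketch): a plugin argument is treated
    as a file reference when it contains a dot, e.g. `pLI_values_107.txt`.
    """
    files = set()
    tokens = " ".join(args).split()
    for flag, value in zip(tokens, tokens[1:]):
        if flag != "--plugin":
            continue
        # value looks like "Name,arg1,key=arg2"; the first field is the plugin name
        for field in value.split(",")[1:]:
            candidate = field.split("=", 1)[-1]
            if "." in candidate:
                files.add(candidate)
    return files


def missing_plugin_files(args, supplied_paths):
    """Return referenced plugin files that are absent from the supplied paths."""
    supplied = {path.rsplit("/", 1)[-1] for path in supplied_paths}
    return sorted(plugin_files_from_args(args) - supplied)
```

Passing the `ext.args` list entries and the paths from the plugin-files CSV to `missing_plugin_files` gives a quick view of what still has to be added to the CSV.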
4 changes: 2 additions & 2 deletions conf/modules/annotate_mobile_elements.config
@@ -40,12 +40,12 @@ process {
ext.args = { [
'--dir_cache vep_cache',
'--dir_plugins vep_cache/Plugins',
'--plugin pLI,vep_cache/pLI_values_107.txt',
'--plugin pLI,pLI_values_107.txt',
'--appris --biotype --buffer_size 100 --canonical --cache --ccds',
'--compress_output bgzip --distance 5000 --domains',
'--exclude_predicted --force_overwrite --format vcf',
'--fork 4 --hgvs --humdiv --max_sv_size 248956422 --merged',
'--no_progress --no_stats --numbers --per_gene --polyphen p',
'--no_progress --numbers --per_gene --polyphen p',
'--protein --offline --regulatory --sift p',
'--symbol --tsl --uniprot --vcf'
].join(' ') }
9 changes: 4 additions & 5 deletions conf/modules/annotate_mt_snvs.config
@@ -20,16 +20,15 @@ process {
withName: '.*ANNOTATE_MT_SNVS:ENSEMBLVEP_MT' {
ext.args = [
'--dir_plugins vep_cache/Plugins',
'--plugin LoFtool,vep_cache/LoFtool_scores.txt',
'--plugin pLI,vep_cache/pLI_values_107.txt',
'--plugin SpliceAI,snv=vep_cache/spliceai_21_scores_raw_snv_-v1.3-.vcf.gz,indel=vep_cache/spliceai_21_scores_raw_snv_-v1.3-.vcf.gz',
'--plugin MaxEntScan,vep_cache/fordownload,SWA,NCSS',
'--plugin LoFtool,LoFtool_scores.txt',
'--plugin pLI,pLI_values_107.txt',
'--plugin SpliceAI,snv=spliceai_21_scores_raw_snv_-v1.3-.vcf.gz,indel=spliceai_21_scores_raw_snv_-v1.3-.vcf.gz',
'--distance 0',
'--buffer_size 20000',
'--format vcf --fork 4 --max_sv_size 248956422',
'--appris --biotype --cache --canonical --ccds --compress_output bgzip',
'--domains --exclude_predicted --force_overwrite',
'--hgvs --humdiv --no_progress --no_stats --numbers',
'--hgvs --humdiv --no_progress --numbers',
'--merged --polyphen p --protein --offline --regulatory --sift p --symbol --tsl --vcf',
'--uniprot'
].join(' ')
4 changes: 2 additions & 2 deletions conf/modules/annotate_structural_variants.config
@@ -46,12 +46,12 @@ process {
ext.args = [
'--dir_cache vep_cache',
'--dir_plugins vep_cache/Plugins',
'--plugin pLI,vep_cache/pLI_values_107.txt',
'--plugin pLI,pLI_values_107.txt',
'--appris --biotype --buffer_size 100 --canonical --cache --ccds',
'--compress_output bgzip --distance 5000 --domains',
'--exclude_predicted --force_overwrite --format vcf',
'--fork 4 --hgvs --humdiv --max_sv_size 248956422 --merged',
'--no_progress --no_stats --numbers --per_gene --polyphen p',
'--no_progress --numbers --per_gene --polyphen p',
'--protein --offline --regulatory --sift p',
'--symbol --tsl --uniprot --vcf'
].join(' ')
7 changes: 1 addition & 6 deletions conf/test.config
@@ -58,10 +58,5 @@ params {
vep_cache = "https://raw.githubusercontent.com/nf-core/test-datasets/raredisease/reference/vep_cache_and_plugins.tar.gz"
vep_filters = "https://raw.githubusercontent.com/nf-core/test-datasets/raredisease/reference/hgnc.txt"
vep_cache_version = 107
ramprasadn marked this conversation as resolved.
}

process {
withName: '.*FILTERVEP.*' {
container = "docker.io/ensemblorg/ensembl-vep:release_107.0"
}
vep_plugin_files = "https://raw.githubusercontent.com/nf-core/test-datasets/raredisease/reference/vep_files.csv"
}
7 changes: 1 addition & 6 deletions conf/test_one_sample.config
@@ -58,10 +58,5 @@ params {
vep_cache = "https://raw.githubusercontent.com/nf-core/test-datasets/raredisease/reference/vep_cache_and_plugins.tar.gz"
vep_filters = "https://raw.githubusercontent.com/nf-core/test-datasets/raredisease/reference/hgnc.txt"
vep_cache_version = 107
Collaborator: 110?

Collaborator Author: Same here

}

process {
withName: '.*FILTERVEP.*' {
container = "docker.io/ensemblorg/ensembl-vep:release_107.0"
}
vep_plugin_files = "https://raw.githubusercontent.com/nf-core/test-datasets/raredisease/reference/vep_files.csv"
}
25 changes: 13 additions & 12 deletions docs/usage.md
@@ -225,22 +225,23 @@ The mandatory and optional parameters for each category are tabulated below.
| vcfanno_resources<sup>2</sup> | vcfanno_lua |
| vcfanno_toml<sup>3</sup> | vep_filters<sup>8</sup> |
| vep_cache_version | cadd_resources<sup>9</sup> |
| vep_cache<sup>4</sup> | |
| vep_cache<sup>4</sup> | vep_plugin_files<sup>10</sup> |
| gnomad_af<sup>5</sup> | |
| score_config_snv<sup>6</sup> | |

<sup>1</sup>Genome version is used by VEP. You have the option to choose between GRCh37 and GRCh38.<br />
<sup>2</sup>Path to VCF files and their indices used by vcfanno. Sample file [here](https://github.com/nf-core/test-datasets/blob/raredisease/reference/vcfanno_resources.txt).<br />
<sup>3</sup>Path to a vcfanno configuration file. Sample file [here](https://github.com/nf-core/test-datasets/blob/raredisease/reference/vcfanno_config.toml).<br />
<sup>4</sup> VEP caches can be downloaded [here](https://www.ensembl.org/info/docs/tools/vep/script/vep_cache.html#cache).
VEP plugins and associated files may be installed in the cache directory, and the plugin pLI is mandatory to install.
VEP plugins may be installed in the cache directory; the pLI plugin is mandatory. To supply files required by VEP plugins, use the `vep_plugin_files` parameter.
See example cache [here](https://raw.githubusercontent.com/nf-core/test-datasets/raredisease/reference/vep_cache_and_plugins.tar.gz).<br />
<sup>5</sup> GnomAD VCF files can be downloaded from [here](https://gnomad.broadinstitute.org/downloads). The option `gnomad_af` expects a tab-delimited file with
no header and the following columns: `CHROM POS REF_ALLELE ALT_ALLELE AF`. Sample file [here](https://github.com/nf-core/test-datasets/blob/raredisease/reference/gnomad_reformated.tab.gz).<br />
<sup>6</sup>Used by GENMOD for ranking the variants. Sample file [here](https://github.com/nf-core/test-datasets/blob/raredisease/reference/rank_model_snv.ini).<br />
<sup>7</sup>Used by GENMOD while modeling the variants. Contains a list of loci that show [reduced penetrance](https://medlineplus.gov/genetics/understanding/inheritance/penetranceexpressivity/) in people. Sample file [here](https://github.com/nf-core/test-datasets/blob/raredisease/reference/reduced_penetrance.tsv).<br />
<sup>8</sup> This file contains a list of candidate genes (with [HGNC](https://www.genenames.org/) IDs) that is used to split the variants into candidate variants and research variants. Research variants contain all the variants, while candidate variants are a subset of research variants and are associated with candidate genes. Sample file [here](https://github.com/nf-core/test-datasets/blob/raredisease/reference/hgnc.txt). Not required if `--skip_vep_filter` is set to true.<br />
<sup>9</sup>Path to a folder containing cadd annotations. Equivalent of the data/annotations/ folder described [here](https://github.com/kircherlab/CADD-scripts/#manual-installation), and it is used to calculate CADD scores for small indels. <br />
<sup>10</sup>A CSV file that describes the files used by VEP's named and custom plugins. Sample file [here](https://github.com/nf-core/test-datasets/blob/raredisease/reference/vep_files.csv). <br />
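
As a rough illustration of footnote 10, the CSV could be generated from a directory that holds the plugin data files and their indices. The Python sketch below assumes a single header column named `vep_files`, taken from the property name in `assets/vep_plugin_files_schema.json`; the function name `write_vep_plugin_csv` and the directory layout are hypothetical and not part of the pipeline.

```python
import csv
from pathlib import Path


def write_vep_plugin_csv(plugin_dir, out_csv):
    """Write one row per regular file in `plugin_dir` under a `vep_files` header.

    Returns the number of files listed. Subdirectories are ignored.
    """
    files = sorted(p for p in Path(plugin_dir).iterdir() if p.is_file())
    with open(out_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["vep_files"])  # header name assumed from the schema
        for path in files:
            writer.writerow([str(path.resolve())])
    return len(files)
```

The resulting file can then be passed to the pipeline through `vep_plugin_files`. Absolute paths are written so that the CSV stays valid regardless of the launch directory.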

> NB: We use CADD only to annotate small indels. To annotate SNVs with precomputed CADD scores, pass the file containing CADD scores as a resource to vcfanno instead. Files containing the precomputed CADD scores for SNVs can be downloaded from [here](https://cadd.gs.washington.edu/download) (description: "All possible SNVs of GRCh3<7/8>/hg3<7/8>")

@@ -251,22 +252,22 @@ no header and the following columns: `CHROM POS REF_ALLELE ALT_ALLELE AF`. Sampl
| genome | reduced_penetrance |
| svdb_query_dbs/svdb_query_bedpedbs<sup>1</sup> | |
| vep_cache_version | vep_filters |
| vep_cache | |
| vep_cache | vep_plugin_files |
| score_config_sv | |

<sup>1</sup> A CSV file that describes the databases (VCFs or BEDPEs) used by SVDB for annotating structural variants. Sample file [here](https://github.com/nf-core/test-datasets/blob/raredisease/reference/svdb_querydb_files.csv). Information about the column headers can be found [here](https://github.com/J35P312/SVDB#Query).

##### 9. Mitochondrial annotation

| Mandatory | Optional |
| ----------------- | ----------- |
| genome | vep_filters |
| mito_name | |
| vcfanno_resources | |
| vcfanno_toml | |
| vep_cache_version | |
| vep_cache | |
| score_config_mt | |
| Mandatory | Optional |
| ----------------- | ---------------- |
| genome | vep_filters |
| mito_name | vep_plugin_files |
| vcfanno_resources | |
| vcfanno_toml | |
| vep_cache_version | |
| vep_cache | |
| score_config_mt | |

##### 10. Mobile element annotation

2 changes: 1 addition & 1 deletion main.nf
@@ -54,7 +54,7 @@ params.vcfanno_toml = WorkflowMain.getGenomeAttribute(params,
params.vcfanno_lua = WorkflowMain.getGenomeAttribute(params, 'vcfanno_lua')
params.vep_cache = WorkflowMain.getGenomeAttribute(params, 'vep_cache')
params.vep_cache_version = WorkflowMain.getGenomeAttribute(params, 'vep_cache_version')

params.vep_plugin_files = WorkflowMain.getGenomeAttribute(params, 'vep_plugin_files')
/*
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
VALIDATE & PRINT PARAMETER SUMMARY
5 changes: 5 additions & 0 deletions modules.json
@@ -115,6 +115,11 @@
"git_sha": "29984d70aea47d06f0062a1785d76c357dd40ea9",
"installed_by": ["modules"]
},
"ensemblvep/vep": {
"branch": "master",
"git_sha": "76a0696a60c41c57fc5f6040ac31b11ce5d4d8dd",
"installed_by": ["modules"]
},
"expansionhunter": {
"branch": "master",
"git_sha": "0260e5d22372eae434816d6970dedf3f5adc0053",
7 changes: 7 additions & 0 deletions modules/nf-core/ensemblvep/vep/environment.yml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.