kircherlab/CADD-browserTracks

CADD-browserTracks

This repository contains the hub.txt file that provides genome browser tracks for CADD versions 1.3 to 1.7 in the UCSC Genome Browser, the NCBI Genome Data Viewer and the Ensembl genome browser.

Usage

To view the tracks in the UCSC Genome Browser, you need to link to hub.txt, e.g. by clicking this link for hg19/GRCh37 and this link for hg38/GRCh38.
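If the links above are not available, a link of this kind can also be built by hand. The sketch below assumes the UCSC Genome Browser's `hgTracks` URL parameters `db` and `hubUrl`. The value of `HUB_URL` is a hypothetical placeholder, not the actual location of this repository's hub.txt, and the function name `ucsc_hub_link` is made up for illustration.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: substitute the real, publicly reachable URL of hub.txt.
HUB_URL = "https://example.org/path/to/hub.txt"


def ucsc_hub_link(db: str, hub_url: str = HUB_URL) -> str:
    """Build a UCSC Genome Browser link that attaches a track hub for one assembly."""
    if db not in ("hg19", "hg38"):
        raise ValueError(f"unsupported assembly: {db}")
    # urlencode percent-encodes the hub URL so it survives as a query value.
    query = urlencode({"db": db, "hubUrl": hub_url})
    return f"https://genome.ucsc.edu/cgi-bin/hgTracks?{query}"


for db in ("hg19", "hg38"):
    print(ucsc_hub_link(db))
```

Restricting `db` to the two assemblies named in this README keeps a typo from silently producing a link to an assembly the hub does not cover.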

NCBI GDV: hg19/GRCh37 and hg38/GRCh38

Ensembl: hg19/GRCh37 and hg38/GRCh38. Please note that we have had trouble with Ensembl (no option for older CADD releases), and you may need to open the above links twice before the Ensembl genome browser registers the track.

About this track

This is a track hub for the UCSC Genome Browser, the NCBI Genome Data Viewer and the Ensembl genome browser.

It displays, for each position, the highest CADD score among the three possible SNVs. It is available for every determined genome position (i.e. non-N bases) on the major chromosomes in the reference genome.
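The per-position reduction described above can be illustrated with a minimal sketch. The scores below are made-up values for illustration only, and `max_score_per_position` is a hypothetical helper, not the code used to build the tracks.

```python
# Illustrative, made-up scores: (position, alternate allele) -> CADD score.
# Each position has three entries, one per possible SNV.
snv_scores = {
    (1000, "C"): 12.3, (1000, "G"): 8.1, (1000, "T"): 15.7,
    (1001, "A"): 0.4, (1001, "C"): 2.2, (1001, "G"): 1.9,
}


def max_score_per_position(scores):
    """Collapse per-SNV scores to one value per position by keeping the maximum."""
    best = {}
    for (pos, _alt), score in scores.items():
        if pos not in best or score > best[pos]:
            best[pos] = score
    return best


track = max_score_per_position(snv_scores)
print(track)
```

Each position ends up with a single value, which is the shape of data a one-value-per-base track such as a bigWig can hold.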

The bigWig datasets that are displayed in the tracks are located on our webserver.

About CADD

CADD (short for Combined Annotation Dependent Depletion) is a tool for scoring the deleteriousness of single nucleotide variants, multi-nucleotide substitutions and insertion/deletion variants in the human genome (currently supported builds: GRCh37/hg19 and GRCh38/hg38).

While many variant annotation and scoring tools are around, most annotations tend to exploit a single information type (e.g. conservation) and/or are restricted in scope (e.g. to missense changes). Thus, a broadly applicable metric that objectively weights and integrates diverse information is needed. Combined Annotation Dependent Depletion (CADD) is a framework that integrates multiple annotations into one metric by contrasting variants that survived natural selection with simulated mutations.

CADD can quantitatively prioritize functional, deleterious, and disease causal variants across a wide range of functional categories, effect sizes and genetic architectures, and can be used to prioritize causal variation in both research and clinical settings.

CADD has been described in four publications. The most recent manuscript describes CADD v1.7, an extension to the annotations included in the model. Most prominently, this version improves the scoring of coding variants with features derived from the ESM-1v protein language model as well as the scoring of regulatory variants with features derived from a convolutional neural network trained on regions of open chromatin:

Schubach M, Maass T, Nazaretyan L, Röner S, Kircher M.
CADD v1.7: Using protein language models, regulatory CNNs and other nucleotide-level scores to improve genome-wide variant predictions.
Nucleic Acids Res. 2024 Jan 5. doi: 10.1093/nar/gkad989.
PubMed PMID: 38183205.

Then there is CADD-Splice (CADD v1.6), which specifically improved the prediction of splicing effects:

Rentzsch P, Schubach M, Shendure J, Kircher M.
CADD-Splice-improving genome-wide variant effect prediction using deep learning-derived splice scores.
Genome Med. 2021 Feb 22. doi: 10.1186/s13073-021-00835-9.
PubMed PMID: 33618777.

Our third manuscript describes the updates between the initial publication and CADD v1.4, introduces CADD for GRCh38 and explains how we envision the use of CADD. It was published by Nucleic Acids Research in 2018:

Rentzsch P, Witten D, Cooper GM, Shendure J, Kircher M.
CADD: predicting the deleteriousness of variants throughout the human genome.
Nucleic Acids Res. 2018 Oct 29. doi: 10.1093/nar/gky1016.
PubMed PMID: 30371827.

Finally, the original manuscript describing the method was published by Nature Genetics in 2014:

Kircher M, Witten DM, Jain P, O'Roak BJ, Cooper GM, Shendure J.
A general framework for estimating the relative pathogenicity of human genetic variants.
Nat Genet. 2014 Feb 2. doi: 10.1038/ng.2892.
PubMed PMID: 24487276.

For updates and further information, please check either of our websites.

Copyright

Copyright (c) University of Washington, Hudson-Alpha Institute for Biotechnology and Berlin Institute of Health at Charite -- Universitaetsmedizin Berlin 2013-2023. All rights reserved.

Permission is hereby granted, to all non-commercial users and licensees of CADD (Combined Annotation Dependent Framework, licensed by the University of Washington) to obtain copies of this software and associated documentation files (the "Software"), to use the Software without restriction, including rights to use, copy, modify, merge, and distribute copies of the Software. The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
