One of the projects from the 2021 GP2/IPDGC Hackathon. The related manuscript can be found on [bioRxiv](https://www.biorxiv.org/content/10.1101/2022.05.04.490670v1).
Contributors: Shilpa Rao, Konstantin Senkevich, Prabhjyot Saini, Paula Reyes P, Will Scotton, Anni Moore, Devina Chetty
The goal for this project was to develop a pipeline for colocalization analysis.
Colocalization is an analysis that tests whether the effect of a SNP on a phenotype is mediated by gene expression. It can be used to prioritize genes underlying GWAS hits and to decode non-coding variant associations. Colocalization integrates eQTL data to determine whether a non-coding variant nominated through GWAS 'colocalizes' with a known eQTL, suggesting a potential causal mechanism for that variant. This workflow serves as an example of how to use and format GWAS summary statistics and eQTL data for colocalization and visualization.
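To make the idea concrete, here is a minimal sketch of the approximate Bayes factor calculation that underlies colocalization as described for the coloc method. The scripts in this repository are written in R; the Python below is only a language-neutral illustration and is not the pipeline's code. The function names, the toy effect sizes, and the prior standard deviation of 0.15 are assumptions made for this example. The priors `p1`, `p2`, and `p12` follow the commonly cited coloc defaults.

```python
import math

def log_sum(xs):
    """Numerically stable log(sum(exp(x)))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def log_diff(a, b):
    """Stable log(exp(a) - exp(b)); requires a > b."""
    return a + math.log(-math.expm1(b - a))

def log_abf(beta, varbeta, prior_sd=0.15):
    """Wakefield approximate Bayes factor (log scale) for one SNP.
    prior_sd is an assumed prior standard deviation of the effect."""
    z = beta / math.sqrt(varbeta)
    r = prior_sd ** 2 / (prior_sd ** 2 + varbeta)
    return 0.5 * (math.log(1 - r) + r * z * z)

def coloc_posteriors(l1, l2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of hypotheses H0..H4 from per-SNP log ABFs.
    l1 and l2 must cover the same SNPs in the same order (at least two)."""
    lsum = [a + b for a, b in zip(l1, l2)]
    s1, s2, s12 = log_sum(l1), log_sum(l2), log_sum(lsum)
    lh = [
        0.0,                                                   # H0: no association
        math.log(p1) + s1,                                     # H1: trait 1 only
        math.log(p2) + s2,                                     # H2: trait 2 only
        math.log(p1) + math.log(p2) + log_diff(s1 + s2, s12),  # H3: distinct variants
        math.log(p12) + s12,                                   # H4: shared variant
    ]
    total = log_sum(lh)
    return {f"H{i}": math.exp(v - total) for i, v in enumerate(lh)}

# Toy data (made up): five SNPs, the first one strongly associated in both datasets.
gwas = [log_abf(b, 0.01) for b in [0.6, 0.0, 0.0, 0.0, 0.0]]
eqtl = [log_abf(b, 0.01) for b in [0.6, 0.0, 0.0, 0.0, 0.0]]
print(coloc_posteriors(gwas, eqtl))
```

A high posterior for H4 supports a shared causal variant, while a high posterior for H3 points to distinct variants driving the GWAS and eQTL signals.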
- Download/identify your desired GWAS summary statistics and eQTL data
- Format the data and perform colocalization
- Visualization of colocalization
- Clone the repo
git clone https://github.com/ipdgc/Colocalization-Pipeline.git
These R scripts contain examples of how to perform colocalization with eQTL data from eQTLGen and Parkinson's disease summary statistics from Nalls et al. 2019. By changing the file paths, you can use these scripts with any eQTL data and GWAS summary statistics.
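Whatever datasets you substitute, the two tables need to be restricted to the SNPs they share before colocalization. The following is a minimal sketch of that formatting step, written in Python purely for illustration (the repository's scripts are in R). The column names `SNP`, `beta`, and `se` and the function name `harmonize` are hypothetical; real summary statistics use different headers, and allele alignment between the datasets is not handled here.

```python
def harmonize(gwas_rows, eqtl_rows):
    """Keep SNPs present in both datasets and derive varbeta = se^2.
    Each row is a dict with the hypothetical keys 'SNP', 'beta', 'se'."""
    eqtl_by_snp = {row["SNP"]: row for row in eqtl_rows}
    merged = []
    for g in gwas_rows:
        e = eqtl_by_snp.get(g["SNP"])
        if e is None:
            continue  # SNP missing from the eQTL data, so drop it
        merged.append({
            "SNP": g["SNP"],
            "beta_gwas": float(g["beta"]),
            "varbeta_gwas": float(g["se"]) ** 2,
            "beta_eqtl": float(e["beta"]),
            "varbeta_eqtl": float(e["se"]) ** 2,
        })
    return merged

# Toy rows (made up) standing in for lines read from the summary statistic files.
gwas_rows = [
    {"SNP": "rs1", "beta": "0.12", "se": "0.02"},
    {"SNP": "rs2", "beta": "-0.05", "se": "0.03"},
]
eqtl_rows = [
    {"SNP": "rs2", "beta": "0.30", "se": "0.05"},
    {"SNP": "rs3", "beta": "0.10", "se": "0.04"},
]
shared = harmonize(gwas_rows, eqtl_rows)
print([row["SNP"] for row in shared])
```

Rows could come from `csv.DictReader` in practice; values are converted with `float` because such readers return strings.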
- Assumes a single causal variant
- Reduced power in the presence of multiple causal variants
- Visualization of colocalization between eQTL and GWAS data
- Comprehensive plots of colocalization between GWAS and eQTL signals and correlation between GWAS and eQTL p-values
- Visual information on effect size and direction of effect, distinguishing between congruous and incongruous effects
For more examples, please refer to the eQTpLot and coloc documentation.
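The congruous/incongruous distinction can be sketched as a comparison of effect directions: a variant is congruous when its effect on gene expression and its effect on the trait point the same way, and incongruous when they point in opposite directions. The Python below is an illustration only, not eQTpLot's implementation; the function name is hypothetical, and it assumes both effect sizes are reported for the same effect allele.

```python
def classify_congruence(beta_gwas, beta_eqtl):
    """Label a variant by whether its GWAS and eQTL effects share a direction.
    Assumes both betas refer to the same effect allele."""
    product = beta_gwas * beta_eqtl
    if product > 0:
        return "congruous"
    if product < 0:
        return "incongruous"
    return "undetermined"  # at least one effect estimate is exactly zero

# Toy effect sizes (made up).
print(classify_congruence(0.12, 0.30))
print(classify_congruence(0.12, -0.30))
```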