Skip to content

mcgml/gwas2vcfweb

 
 

Repository files navigation

Web interface for gwas2vcf

Web interface for processing GWAS summary data to VCF format

Install

# get source
git clone --recurse-submodules git@github.com:MRCIEU/gwas2vcfweb.git
cd gwas2vcfweb

# configure host

## store cromwell outputs here
mkdir -p /data/gwas2vcfweb

## upload & download from here
mkdir -p /data/gwas2vcfweb/data

## DB files

### Reference FASTA
mkdir -p /data/reference_genomes
cd /data/reference_genomes
wget ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/human_g1k_v37.fasta.gz
wget ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/human_g1k_v37.fasta.fai.gz
wget ftp://gsapubftp-anonymous@ftp.broadinstitute.org/bundle/b37/human_g1k_v37.dict.gz
gzip -d human_g1k_v37.fasta.gz
gzip -d human_g1k_v37.fasta.fai.gz
gzip -d human_g1k_v37.dict.gz

### 1kg
mkdir -p /data/1kg
cd /data/1kg
wget ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz
wget ftp://ftp.1000genomes.ebi.ac.uk/vol1/ftp/release/20130502/ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz.tbi
gzip -d ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz
gzip -d ALL.wgs.phase3_shapeit2_mvncall_integrated_v5b.20130502.sites.vcf.gz.tbi

### dbsnp
mkdir -p /data/dbsnp
cd /data/dbsnp
wget ftp://ftp.ncbi.nih.gov/snp/latest_release/VCF/GCF_000001405.25.gz
wget ftp://ftp.ncbi.nih.gov/snp/latest_release/VCF/GCF_000001405.25.gz.tbi
mv GCF_000001405.25.gz dbsnp.v153.b37.vcf.gz
mv GCF_000001405.25.gz.tbi dbsnp.v153.b37.vcf.gz.tbi

# build stack
docker-compose build --no-cache

## this hash MUST be the same as in the wdl file
cd gwas2vcf
hash=$(git rev-parse HEAD)
docker build --no-cache -t gwas2vcf:"$hash" .

# start
cd ..
docker-compose -p gwas2vcfweb -f ./docker-compose.yml up -d

Test

# Run unit tests
docker exec -it gwas2vcfweb_web_1 pytest -v

Usage

Web

  1. Point browser to http://<hostname>. Port can be configure in the compose file.
  2. Upload file & obtain job identifier
  3. Download results for job identifier

Citation

Lyon, M.S., Andrews, S.J., Elsworth, B. et al. The variant call format provides efficient and robust storage of GWAS summary statistics. Genome Biol 22, 32 (2021). https://doi.org/10.1186/s13059-020-02248-0

About

Web interface for gwas2vcf

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • HTML 50.5%
  • Python 36.0%
  • WDL 13.0%
  • Shell 0.5%