Association testing pipeline
#Introduction
An association study is a complex analysis, and each one has to consider:
- the disease/phenotype being studied and its mode of inheritance
- population structure
- other covariates
For this reason it is difficult to build a high quality, generic pipeline to do an association study.
The purpose of this pipeline is to perform a very superficial initial analysis that can be used as one piece of information to guide a rigorous analysis. Of course, we would encourage users to build their own Nextflow script for their rigorous analysis, perhaps using our script as a start.
Our script, `plink-assoc.nf`, takes as input PLINK files that have been through quality control and
- does a principal component analysis on the data, and produces pictures from that;
- performs a simple association test giving odds ratios and raw and adjusted p-values.
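As a rough illustration of the quantities reported by a simple association test, the sketch below assumes the basic allelic test on a 2x2 table of allele counts. This is an assumption for illustration; the exact test depends on the options chosen. The symbols a, b, c, d and the counts in the worked example are made up.

```latex
% Assumed 2x2 allele-count table (illustrative symbols, not pipeline output):
%             minor  major
%  cases        a      b
%  controls     c      d
\[
\mathrm{OR} = \frac{a/b}{c/d} = \frac{ad}{bc},
\qquad
\chi^2 = \frac{N\,(ad - bc)^2}{(a+b)(c+d)(a+c)(b+d)},
\quad N = a+b+c+d .
\]
% Made-up counts: a = 30, b = 70, c = 20, d = 80, so N = 200
\[
\mathrm{OR} = \frac{30 \cdot 80}{70 \cdot 20} \approx 1.71,
\qquad
\chi^2 = \frac{200\,(2400 - 1400)^2}{100 \cdot 100 \cdot 50 \cdot 150} \approx 2.67 .
\]
```

The raw p-value is then read from a chi-squared distribution with one degree of freedom.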
The pipeline is run with: `nextflow run plink-assoc.nf`
The key options are:
- `input_dir`, `output_dir`: where the input is read from and where the output is written;
- `input_pat`: the base name of the set of PLINK bed, bim and fam files (this should match only one set).
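Since `input_pat` should match only one set of files, it can help to check this before launching the pipeline. The following is a minimal sketch, not part of the pipeline: the function name `count_plink_sets` is hypothetical, and it assumes the pattern is matched as a shell glob followed by `.bed`, `.bim` and `.fam`.

```shell
#!/bin/sh
# Hypothetical pre-flight helper (not part of plink-assoc.nf).
# Prints how many complete bed/bim/fam sets in directory $1 match pattern $2.
# Assumption: the pattern behaves like a shell glob followed by the extension.
count_plink_sets() {
    dir=$1
    pat=$2
    n=0
    # $pat is left unquoted on purpose so that wildcards in it are expanded.
    for bed in "$dir"/$pat.bed; do
        [ -f "$bed" ] || continue      # an unmatched glob stays literal; skip it
        stem=${bed%.bed}
        if [ -f "$stem.bim" ] && [ -f "$stem.fam" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Example use (paths are illustrative):
#   count_plink_sets /data/qc-output raw-GWA-data
```

A result other than 1 suggests that the pattern is either too broad or does not point at a complete bed, bim and fam set.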
By default a chi2 test for association is done, but you can run several different tests in one run by setting the appropriate parameters to 1. Note that at least one must be set to 1.
- `chi2`: should a chi2 test be used? (0 or 1)
- `fisher`: should Fisher's exact test be used?
- `linear`: should linear regression be used?
- `logistic`: should logistic regression be used?
- `gemma`: should gemma be used?
For all the tests except gemma, you can also choose whether to adjust for multiple testing using Bonferroni correction:
- `adjust`
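As a reminder of what Bonferroni correction does, each raw p-value is multiplied by the number of tests performed and capped at 1. The number of tests m and the raw p-value in the worked example below are made up for illustration.

```latex
% m = number of tests (for example, SNPs tested), p_i = raw p-value of test i
\[
p_i^{\mathrm{Bonf}} = \min(1,\; m\, p_i)
\]
% Made-up numbers: m = 500000 tests, raw p = 10^{-8}
\[
p^{\mathrm{Bonf}} = \min\!\left(1,\; 5 \times 10^{5} \cdot 10^{-8}\right) = 5 \times 10^{-3}
\]
```

Equivalently, with these illustrative numbers a raw p-value has to fall below 0.05 / 500000 = 10^{-7} to remain significant at the 0.05 level after correction.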
For example,
```
nextflow run plink-assoc.nf --input_pat raw-GWA-data --chi2 1 --logistic 1 --adjust 1
```
analyses the `raw-GWA-data` bed, bim and fam files, performs a chi2 test and a logistic regression test, and also applies multiple testing correction.
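When the pipeline is launched from another script, the rule that at least one test must be set to 1 can be enforced before Nextflow starts. The sketch below is a hypothetical wrapper, not something shipped with the pipeline: the function name `build_assoc_cmd` is invented, and it uses only the options described on this page.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the pipeline): prints a nextflow command
# with every test option set explicitly to 0 or 1.
# Usage: build_assoc_cmd <input_pat> [test names...]
build_assoc_cmd() {
    input_pat=$1
    shift
    cmd="nextflow run plink-assoc.nf --input_pat $input_pat"
    selected=0
    for t in chi2 fisher linear logistic gemma; do
        flag=0
        for want in "$@"; do
            if [ "$want" = "$t" ]; then
                flag=1
            fi
        done
        if [ "$flag" -eq 1 ]; then
            selected=1
        fi
        cmd="$cmd --$t $flag"
    done
    # Names that are not one of the five tests are ignored, so a call with
    # only unknown names fails here as well.
    if [ "$selected" -eq 0 ]; then
        echo "at least one test must be set to 1" >&2
        return 1
    fi
    echo "$cmd"
}

# Example use:
#   build_assoc_cmd raw-GWA-data chi2 logistic
```

Setting every option explicitly means the printed command does not depend on the default value of `chi2`. Options such as `--adjust 1`, `--input_dir` and `--output_dir` can be appended to the printed command as needed.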
#Output
The output can be found in the specified output directory. The key outputs are:
- A PDF report of the QC that was done;
- A set of PLINK files.
The PDF report will explain what was done and describe the other output files.