Example Usage
In the following, we give a brief example of how to use GNetLMM. As a case study, we use a subset of the genotypes from the 1000 Genomes Project [1] and simulated phenotypes.
All commands can be found in demos/run_GNetLMM.sh. The individual steps are summarized below.
Go to the bin folder, create the output folder, set the filenames and parameters:
mkdir out
BFILE=./../data/1000G_chr22/chrom22_subsample20_maf0.10 #specify here bed basename
FFILE=./../data/1000G_chr22/ones.txt
PFILE=./out/pheno
CFILE=./out/chrom22
ASSOC0FILE=./out/lmm
GFILE=./out/genes
ANCHOR_THRESH=1e-6
ANCHORFILE=./out/cisanchor_thresh1e-6_wnd2000.txt
WINDOW=2000
VFILE=./out/vstructures_thresh1e-6_wnd2000
ASSOCFILE=./out/gnetlmm_thresh1e-6_wnd2000
PLOTFILE=./out/power.pdf
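The anchor, v-structure and association filenames above all embed the threshold and window values. As a minimal sketch, the names can be derived from the two parameters so that they cannot drift apart when a parameter is changed. The `SUFFIX` variable is a hypothetical helper introduced here for illustration; it is not part of demos/run_GNetLMM.sh.

```shell
# Tuning parameters, as in the setup above.
ANCHOR_THRESH=1e-6
WINDOW=2000

# Hypothetical helper variable: one shared suffix built from both parameters.
SUFFIX="thresh${ANCHOR_THRESH}_wnd${WINDOW}"

# Output names derived from the suffix instead of being typed out by hand.
ANCHORFILE=./out/cisanchor_${SUFFIX}.txt
VFILE=./out/vstructures_${SUFFIX}
ASSOCFILE=./out/gnetlmm_${SUFFIX}

echo "$ANCHORFILE"
echo "$VFILE"
echo "$ASSOCFILE"
```

With the values shown, this yields the same three filenames as the explicit assignments above.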
Simulating phenotype:
./../GNetLMM/bin/gNetLMM_simPheno --bfile $BFILE --pfile $PFILE
Creating the kinship matrix:
./../GNetLMM/bin/gNetLMM_preprocess --bfile $BFILE --cfile $CFILE
Running the initial association scan:
for i in $(seq 0 10000 40000)
do
./../GNetLMM/bin/gNetLMM_analyse --initial_scan --bfile $BFILE --pfile $PFILE --ffile $FFILE --cfile $CFILE.cov --assoc0file $ASSOC0FILE --startSnpIdx $i --nSnps 10000
done
./../GNetLMM/bin/gNetLMM_analyse --merge_assoc0_scan --assoc0file $ASSOC0FILE --nSnps 10000 --bfile $BFILE
Here, we split the SNPs into 5 blocks of 10000 SNPs each to demonstrate how the initial association scan can easily be parallelized and run on a cluster.
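The block start indices in the loop above are hard-coded. As a rough sketch, they could instead be computed from the number of SNPs and the block size. Both `block_starts` and `count_snps` are hypothetical helpers, not part of GNetLMM. `count_snps` relies on the PLINK convention that the .bim file holds one line per variant. The total of 50000 SNPs is assumed here purely for illustration.

```shell
# Hypothetical helper: print the start index of each block,
# given a total count ($1) and a block size ($2).
block_starts() {
    total=$1
    size=$2
    i=0
    while [ "$i" -lt "$total" ]; do
        echo "$i"
        i=$((i + size))
    done
}

# Hypothetical helper: count SNPs via the PLINK .bim file of a bed basename
# (one line per variant).
count_snps() {
    wc -l < "$1.bim" | tr -d ' '
}

# Illustration with an assumed total of 50000 SNPs and blocks of 10000.
block_starts 50000 10000
```

For the assumed total, this prints the same five start indices as `seq 0 10000 40000`. With real data, `block_starts "$(count_snps "$BFILE")" 10000` would adapt the loop to the actual number of SNPs.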
Computing the marginal gene-gene correlations when splitting the genes in groups of size 25:
for i in $(seq 0 25 100)
do
./../GNetLMM/bin/gNetLMM_analyse --gene_corr --pfile $PFILE --gfile $GFILE.startTrait_$i --startTraitIdx $i --nTraits 25
done
./../GNetLMM/bin/gNetLMM_analyse --merge_corr --gfile $GFILE --pfile $PFILE --nTraits 25
Computing the cis anchors:
./../GNetLMM/bin/gNetLMM_analyse --compute_anchors --bfile $BFILE --pfile $PFILE --assoc0file $ASSOC0FILE --anchorfile $ANCHORFILE --anchor_thresh=$ANCHOR_THRESH --window=$WINDOW --cis
Finding v-structures:
for i in $(seq 0 10 90)
do
./../GNetLMM/bin/gNetLMM_analyse --find_vstructures --pfile $PFILE --gfile $GFILE --anchorfile $ANCHORFILE --assoc0file $ASSOC0FILE --window $WINDOW --vfile $VFILE --bfile $BFILE --startTraitIdx $i --nTraits 10
done
./../GNetLMM/bin/gNetLMM_postprocess --concatenate --infiles $VFILE --outfile $VFILE
Updating the associations:
for i in $(seq 0 10 90)
do
./../GNetLMM/bin/gNetLMM_analyse --update_assoc --bfile $BFILE --pfile $PFILE --cfile $CFILE.cov --ffile $FFILE --vfile $VFILE --assocfile $ASSOCFILE --startTraitIdx $i --nTraits 10
done
./../GNetLMM/bin/gNetLMM_postprocess --concatenate --infiles $ASSOCFILE --outfile $ASSOCFILE
Here, we split the genes into 10 blocks of 10 genes each to demonstrate how these steps can be parallelized.
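On a cluster, each block would typically be submitted as a separate job. On a single multi-core machine, one possible pattern is to start each block as a background job and wait for all of them before concatenating. The sketch below shows only the pattern: `run_block` is a hypothetical stand-in for one per-block gNetLMM_analyse call and merely writes a log file, and `OUTDIR` is a temporary directory created for the illustration.

```shell
# Temporary directory for the illustration only.
OUTDIR=$(mktemp -d)

# Hypothetical stand-in for one per-block gNetLMM_analyse call;
# here it only records which block it was given.
run_block() {
    echo "processed traits starting at $1" > "$OUTDIR/block_$1.log"
}

# Launch every block as a background job.
for i in $(seq 0 10 90)
do
    run_block "$i" &
done

# Block until all background jobs have finished; only then is it safe
# to run the concatenation step.
wait

ls "$OUTDIR" | wc -l
```

This assumes that the blocks are independent of each other and that the machine has enough memory to run them all at once; otherwise the number of simultaneous jobs would need to be limited.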
Update initial association results:
./../GNetLMM/bin/gNetLMM_postprocess --merge_assoc --assoc0file $ASSOC0FILE --assocfile $ASSOCFILE
Creating a human-readable output file for the v-structures:
./../GNetLMM/bin/gNetLMM_postprocess --nice_output --bfile $BFILE --pfile $PFILE --vfile $VFILE --outfile $VFILE.nice
Running the algorithm again, this time blocking the causal chain anchor SNP -> anchor gene -> focal gene by conditioning on the focal gene:
for i in $(seq 0 10 90)
do
./../GNetLMM/bin/gNetLMM_analyse --block_assoc --bfile $BFILE --pfile $PFILE --cfile $CFILE.cov --ffile $FFILE --vfile $VFILE --assocfile $ASSOCFILE.block --startTraitIdx $i --nTraits 10
done
./../GNetLMM/bin/gNetLMM_postprocess --concatenate --infiles $ASSOCFILE.block --outfile $ASSOCFILE.block
./../GNetLMM/bin/gNetLMM_postprocess --merge_assoc --assoc0file $ASSOC0FILE --assocfile $ASSOCFILE.block
Plotting results:
./../GNetLMM/bin/gNetLMM_postprocess --plot_power --assocfile $ASSOCFILE --assoc0file $ASSOC0FILE --plotfile $PLOTFILE --pfile $PFILE --bfile $BFILE --window $WINDOW --blockfile $ASSOCFILE.block
GNet-LMM increases power compared to a standard LMM, whereas Block-LMM decreases power because the causal chain is interrupted.
Converting the updated associations into a human-readable format:
./../GNetLMM/bin/gNetLMM_postprocess --nice_output --bfile $BFILE --pfile $PFILE --vfile $VFILE --outfile $VFILE.nice --assocfile $ASSOCFILE --assoc0file $ASSOC0FILE --blockfile $ASSOCFILE.block
[1]: 1000 Genomes Project Consortium. An integrated map of genetic variation from 1,092 human genomes. Nature 491, 56-65 (2012).