Tutorial 2: Special Options
The mtag command line tool allows users to take advantage of several "special cases" of MTAG described in the Online Methods of the paper. Here, we walk through these options and describe their underlying assumptions. While all of the options described below tend to speed up the runtime of mtag, they will lead to non-optimal -- and possibly even misleading -- results if the underlying assumptions do not hold. Please take care in checking that your data approximately satisfies the assumptions (constant sample size, no sample overlap, etc.) before using any of the flags!
All applications of MTAG discussed below use two sets of summary statistics that have been specifically formatted for the command line tool (see the Sample GWAS Results and Data Format section in the first tutorial). We use the main results of a GWAS on educational attainment by Okbay et al. (2016) (EA2) along with a GWAS on educational attainment of individuals in the UK Biobank (UKB). These EA2 and UKB summary statistics can be found here and here.
Assumes: no overlap between any of the cohorts in any pair of GWAS studies fed into mtag.
When there is no sample overlap between any pair of GWAS summary statistics used in mtag, we can use the --no_overlap flag to automatically set the residual covariance terms (i.e., the off-diagonal terms of Sigma) to 0. As a result, for T summary statistics files, bivariate LD Score Regression only needs to be run T times rather than T(T+1)/2 times. The flag leads to a significant reduction in runtime only when T is large and Omega is not estimated numerically. This specification of MTAG does not account for correlation in estimation error across traits that is due to bias, which means that the resulting MTAG standard errors should be inflated by the square root of the estimated LD score intercept (Sigma's diagonal terms).
python mtag/mtag.py \
--sumstats EducAtt_ea2.txt,EducAtt_ukb.txt \
--out ./tutorial_results_2.1 \
--stream_stdout \
--no_overlap &
[...]
Summary of MTAG results:
------------------------
   Trait                N (max)  N (mean)  # SNPs used  GWAS mean chi^2  MTAG mean chi^2  GWAS equiv. (max) N
1  .../EducAtt_ea2.txt   250000    134825       790847            2.011            2.501               371134
2  .../EducAtt_ukb.txt    52863     52863       790847            1.204            2.468               380237
Estimated Omega:
[[ 6.492e-06 5.026e-06]
[ 5.026e-06 3.970e-06]]
Estimated Sigma:
[[ 0.92 0. ]
[ 0. 1.028]]
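To make the runtime saving from --no_overlap concrete, the short sketch below counts how many LD Score regressions are needed to fill in Sigma for T input files, with and without the flag. It only restates the T versus T(T+1)/2 arithmetic from the description above; the function name `ldsc_runs` is hypothetical and is not part of mtag.

```python
def ldsc_runs(T, no_overlap=False):
    """Count the LD Score regressions needed to fill in Sigma for T traits.

    Hypothetical helper for illustration only; it is not part of mtag.
    """
    if T < 1:
        raise ValueError("T must be a positive integer")
    if no_overlap:
        # Off-diagonal terms of Sigma are fixed at 0, so only the
        # T diagonal terms have to be estimated.
        return T
    # T diagonal terms plus T(T-1)/2 distinct off-diagonal terms.
    return T * (T + 1) // 2


for T in (2, 5, 20):
    print(T, ldsc_runs(T), ldsc_runs(T, no_overlap=True))
```

With the two files used in this tutorial the difference is a single regression (3 versus 2), which is why the flag matters mainly when T is large.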
Assumes: the T summary statistics used in MTAG are GWAS estimates for traits that are perfectly correlated with one another, i.e., each GWAS is on a different measure of the same "trait".
When multiple GWAS are assumed to be on different measures of the same trait with (possibly) different degrees of measurement error, the --perfect_gencov flag can be used to pin down the covariance terms of Omega so that genetic correlations across the separate GWAS traits are unity.
Our set of sample summary statistics appears to fulfill this condition, as the individual "traits" we are analyzing both measure years of education. From the estimates of Omega listed above, we can see that the two traits are almost perfectly genetically correlated with one another anyway, so it is no surprise that our results change only marginally when we restrict the genetic correlation to be 1:
python mtag/mtag.py \
--sumstats EducAtt_ea2.txt,EducAtt_ukb.txt \
--out ./tutorial_results_2.2 \
--stream_stdout \
--perfect_gencov &
[...]
Summary of MTAG results:
------------------------
   Trait                N (max)  N (mean)  # SNPs used  GWAS mean chi^2  MTAG mean chi^2  GWAS equiv. (max) N
1  .../EducAtt_ea2.txt   250000    134825       790847            2.011            2.022               252528
2  .../EducAtt_ukb.txt    52863     52863       790847            1.204            2.022               264574
Estimated Omega:
[[ 6.492e-06 5.077e-06]
[ 5.077e-06 3.970e-06]]
Estimated Sigma:
[[ 0.92 0.385]
[ 0.385 1.028]]
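As a quick check on the two Omega matrices, the sketch below computes the genetic correlation implied by the unrestricted estimate from the --no_overlap run, and the off-diagonal value that a genetic correlation of exactly 1 implies for the same diagonal terms. It uses the standard definition of a correlation, r_g = Omega_12 / sqrt(Omega_11 * Omega_22); the variable names are our own, and the numbers are copied from the log output shown in this tutorial.

```python
import math

# Omega estimates copied from the --no_overlap output earlier in this tutorial.
omega_11 = 6.492e-06  # genetic variance term for EA2
omega_22 = 3.970e-06  # genetic variance term for UKB
omega_12 = 5.026e-06  # unrestricted genetic covariance term

# Genetic correlation implied by the unrestricted estimate.
r_g = omega_12 / math.sqrt(omega_11 * omega_22)

# Off-diagonal term implied by r_g = 1 with the same diagonal terms.
pinned_omega_12 = math.sqrt(omega_11 * omega_22)

print(f"implied genetic correlation: {r_g:.3f}")             # 0.990
print(f"off-diagonal under r_g = 1:  {pinned_omega_12:.3e}")  # 5.077e-06
```

The second value matches the off-diagonal term reported in the --perfect_gencov output, and the first shows how close the unrestricted estimate already was to 1.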
Requires: --perfect_gencov
Assumes: Variation between "traits" is due only to non-genetic factors. All summary statistics files used in MTAG have the same heritability, as they are considered to be results on the same measure of a single trait.
If we also specify --equal_h2 (equal heritability of traits) in addition to --perfect_gencov, then we are assuming that the multiple input files are GWA studies of the same measure of a single trait. In this case, we can use mtag to implement a type of inverse-variance meta-analysis that can handle sample overlap in the GWAS results. As described in the Online Methods section of the accompanying paper, the MTAG effect size estimator simplifies such that it no longer requires an estimate of Omega.
python mtag/mtag.py \
--sumstats EducAtt_ea2.txt,EducAtt_ukb.txt \
--out ./tutorial_results_2.3 \
--stream_stdout \
--perfect_gencov \
--equal_h2 &
[...]
Summary of MTAG results:
------------------------
   Trait                N (max)  N (mean)  # SNPs used  GWAS mean chi^2  MTAG mean chi^2  GWAS equiv. (max) N
1  .../EducAtt_ea2.txt   250000    134825       790847            2.011            1.998               246760
2  .../EducAtt_ukb.txt    52863     52863       790847            1.204            1.998               258531
Omega hat not computed because --equal_h2 was used.
Estimated Sigma:
[[ 0.92 0.385]
[ 0.385 1.028]]
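For intuition about what an inverse-variance meta-analysis that allows for sample overlap looks like, here is a minimal sketch for a single SNP and two studies. It is a generic generalized least squares combination under an assumed 2x2 sampling covariance matrix, not mtag's actual implementation; the function name and all input values are hypothetical and are not taken from the EA2 or UKB files.

```python
def gls_meta_two_studies(beta1, beta2, var1, var2, cov12=0.0):
    """Combine two estimates of the same effect, allowing correlated errors.

    Hypothetical illustration (not mtag code). var1 and var2 are the sampling
    variances of the two estimates and cov12 is their sampling covariance,
    which is nonzero when the studies share samples.
    """
    denom = var1 + var2 - 2.0 * cov12
    if denom <= 0:
        raise ValueError("sampling covariance matrix must be positive definite")
    # Weights proportional to the row sums of the inverse covariance matrix.
    w1 = (var2 - cov12) / denom
    w2 = (var1 - cov12) / denom
    beta = w1 * beta1 + w2 * beta2
    # Sampling variance of the combined estimate.
    var = (var1 * var2 - cov12 ** 2) / denom
    return beta, var, (w1, w2)


# Made-up inputs: no overlap (cov12 = 0) versus some overlap (cov12 = 0.5).
print(gls_meta_two_studies(0.10, 0.20, 1.0, 4.0))
print(gls_meta_two_studies(0.10, 0.20, 1.0, 4.0, cov12=0.5))
```

With `cov12=0` the weights reduce to the usual inverse-variance weights (0.8 and 0.2 for these inputs). A positive covariance shifts weight further toward the more precise study and gives a larger variance for the combined estimate than the no-overlap formula would.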