Run one of a selection of statistical tests for gene set enrichment on a matrix of gene weights.
- clone the repo and its submodule: `git clone --recurse-submodules https://github.com/perslab/19-BMI-brain-genesettests.git`
- go to the directory: `cd 19-BMI-brain-genesettests`
- start an R session: `R`
- install renv: `install.packages("renv")`
- install R package dependencies for the code: `renv::restore()` or `renv::hydrate()`. If this fails, install the packages manually using the R command `install.packages()`. See the full list of packages as well as the R version under 'R session info' below.
- quit R: `quit("no")`
- add paths to the data, the test and any other parameters in `call_run_geneset_tests_celltypes_vs_BMI.sh`
- run the analysis: `bash call_run_geneset_tests_celltypes_vs_BMI.sh`
Takes a gene x annotation table of weights, where the first column contains gene names and the remaining column names are cell types or similar annotations.
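As an illustration of the layout described above, the sketch below builds a tiny gene x annotation table in R and writes it to disk. The gene names, cell type names, weights and file name are all invented for the example, and csv is assumed as the on-disk format, which the description above does not state explicitly.

```r
# Hypothetical toy input: the first column holds gene names,
# the remaining columns hold one weight per cell type (all values invented).
weights <- data.frame(
  gene = c("GeneA", "GeneB", "GeneC", "GeneD"),
  celltype_1 = c(0.90, 0.10, 0.00, 0.45),
  celltype_2 = c(0.05, 0.80, 0.30, 0.00),
  stringsAsFactors = FALSE
)

# Write the table without row names so that gene names stay in the first column.
# The csv format and the file name are assumptions made for this example.
toy_path <- file.path(tempdir(), "toy_gene_weights.csv")
write.csv(weights, toy_path, row.names = FALSE, quote = FALSE)

# Read the file back to check the layout.
check <- read.csv(toy_path, stringsAsFactors = FALSE)
print(colnames(check))
```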
A csv table whose contents depend on the statistical test used.
To reproduce the results from the paper, adjust the file path parameters in `call_run_geneset_tests_celltypes_vs_BMI.sh` and leave the other parameters as they are, then run `bash call_run_geneset_tests_celltypes_vs_BMI.sh`
Adjust the file path parameters in `call_run_geneset_tests_celltypes_vs_modules.sh` and leave the other parameters as they are, then run `bash call_run_geneset_tests_celltypes_vs_modules.sh`
Rscript ./code/run_geneset_tests.R --help
- computing empirical p-values can be impractically slow, especially for `t.test` and `wilcox.test`
- the `GSEA` function uses the liger package, which computes p-values by permuting gene labels on the input weights. This is faster than the original GSEA algorithm but may introduce false positives when testing against gene sets where some genes are co-expressed.
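
To make the two notes above more concrete, here is a minimal base R sketch of an empirical p-value obtained by permuting gene labels. It is not the code used by `run_geneset_tests.R` or by liger; the weights, the gene set, the test statistic (a difference in means) and the number of permutations are all assumptions chosen for illustration.

```r
set.seed(1)  # makes the toy example reproducible

# Invented weights for 200 genes in a single cell type.
n_genes <- 200
weights <- rexp(n_genes)
names(weights) <- paste0("gene", seq_len(n_genes))

# Invented gene set made up of the first 20 genes.
geneset <- names(weights)[1:20]
in_set <- names(weights) %in% geneset

# Test statistic: mean weight inside the set minus mean weight outside it.
stat <- function(w, idx) mean(w[idx]) - mean(w[!idx])
observed <- stat(weights, in_set)

# Null distribution: shuffle which genes carry the set membership label.
n_perm <- 1000
null_stats <- replicate(n_perm, stat(weights, sample(in_set)))

# One-sided empirical p-value with the usual +1 correction,
# so the estimate can never be exactly zero.
p_emp <- (sum(null_stats >= observed) + 1) / (n_perm + 1)
print(p_emp)
```

Each permutation recomputes the statistic once, so the running time grows with the number of permutations multiplied by the cost of a single test. Shuffling labels also treats genes as exchangeable, which is the assumption that co-expression within a gene set violates.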
R version 3.5.3 (2019-03-11)
Platform: x86_64-pc-linux-gnu (64-bit)
locale: LC_CTYPE=en_US.UTF-8, LC_NUMERIC=C, LC_TIME=en_US.UTF-8, LC_COLLATE=en_US.UTF-8, LC_MONETARY=en_US.UTF-8, LC_MESSAGES=en_US.UTF-8, LC_PAPER=en_US.UTF-8, LC_NAME=C, LC_ADDRESS=C, LC_TELEPHONE=C, LC_MEASUREMENT=en_US.UTF-8 and LC_IDENTIFICATION=C
attached base packages: parallel, stats, graphics, grDevices, datasets, utils, methods and base
other attached packages: pander(v.0.6.3), liger(v.1.0), here(v.0.1), optparse(v.1.6.2), Matrix(v.1.2-15), usethis(v.1.4.0), devtools(v.2.0.1), magrittr(v.1.5) and workflowr(v.1.4.0)
loaded via a namespace (and not attached): Rcpp(v.1.0.2), compiler(v.3.5.3), prettyunits(v.1.0.2), base64enc(v.0.1-3), remotes(v.2.0.2), tools(v.3.5.3), digest(v.0.6.20), pkgbuild(v.1.0.2), pkgload(v.1.0.2), evaluate(v.0.14), memoise(v.1.1.0), lattice(v.0.20-38), rlang(v.0.3.3), cli(v.1.1.0), xfun(v.0.8), withr(v.2.1.2), knitr(v.1.24), desc(v.1.2.0), fs(v.1.2.6), rprojroot(v.1.3-2), grid(v.3.5.3), getopt(v.1.20.3), glue(v.1.3.1), R6(v.2.4.0), processx(v.3.2.0), rmarkdown(v.1.14), sessioninfo(v.1.1.1), callr(v.3.0.0), backports(v.1.1.2), ps(v.1.2.1), htmltools(v.0.3.6), assertthat(v.0.2.1), renv(v.0.6.0-108) and crayon(v.1.3.4)
A workflowr project.