SpiderLearner Workflow

This repository contains the code for simulations testing the performance of our R package ensembleGGM and an illustrative application to ovarian cancer data.

Ensemble learning can be computationally expensive, so most of the code in this repository is designed to run in a large-scale, multi-core computing environment. Our simulations were conducted primarily at the Massachusetts Green High Performance Computing Center (https://www.mghpcc.org/).

So that users can pilot these simulations and applications locally and allocate resources according to the hardware they have, we use the config R package together with the configuration files Simulations/config.yml and Applications/config.yml. These files define pilot versions of the scripts that use a small number of folds (2-3) and a small number of cores (1-2). For full analyses we recommend K=10 folds in general. The number of cores can be adjusted to suit the available hardware.
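As a rough illustration of how the config package can switch between a pilot run and a full run, here is a minimal sketch. The field names (nFolds, nCores), the "full" configuration name, and the values for the full run are assumptions made for this example and are not taken from the repository's config.yml files. The sketch writes a temporary YAML file so that it runs on its own.

```r
# Hypothetical sketch: nFolds, nCores, and the "full" configuration are
# illustrative assumptions, not the contents of the repository's config files.
library(config)

# Write a small example configuration to a temporary file.
cfg_path <- tempfile(fileext = ".yml")
writeLines(c(
  "default:",
  "  nFolds: 2",
  "  nCores: 1",
  "",
  "full:",
  "  nFolds: 10",
  "  nCores: 8"
), cfg_path)

# With no configuration named, config::get() reads the "default" entry
# (unless the R_CONFIG_ACTIVE environment variable says otherwise).
pilot <- config::get(file = cfg_path)

# A named configuration inherits from "default" and overrides its values.
full <- config::get(config = "full", file = cfg_path)

cat("pilot:", pilot$nFolds, "folds,", pilot$nCores, "cores\n")
cat("full: ", full$nFolds, "folds,", full$nCores, "cores\n")
```

Under this layout, a script could read the fold and core counts from the returned list instead of hard-coding them, and the active configuration could be chosen at run time by setting R_CONFIG_ACTIVE before launching R.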

Please use GitHub issues for general questions. For questions specific to your own use case, email Kate Hoff Shutta at kshutta at hsph.harvard.edu.

Repository contents:

Application
Figures
Simulations
Tables
.gitignore
0.generateGoldStandards.sh
1.runSimulationsABCD.sh
2.extraSims.sh
3.runSimulationsABCD_clime.sh
README.md
dependencies.R

katehoffshutta/SpiderLearnerWorkflow
