This tutorial provides Jupyter notebooks, based on the Hail GWAS tutorial, that demonstrate how to perform a genome-wide association study (GWAS) on a VCF file while storing Hail data structures in external S3 storage.
To get started, clone this repository:
git clone https://github.com/crs4/hail_tutorial.git
cd hail_tutorial
Then, start the Docker environment:
docker compose up -d
On the first run, Docker will download two images:
- hail_tutorial: The environment for running the tutorials in a Jupyter Lab server.
- minio: A high-performance object storage service with an API compatible with Amazon S3.
To shut down the Docker containers, run:
docker compose down
To access Jupyter Lab:
- Open a browser and go to localhost:18888.
- Enter the password 12345678 (only required the first time).
To access the MinIO web interface:
- Open localhost:9001 in a browser.
- Use the credentials:
  - Username: root
  - Password: passpass
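If the browser cannot reach either address right after startup, the containers may still be initializing. The following is a minimal sketch, using only the Python standard library, that polls the two ports listed above until they accept TCP connections. The function names and the service labels are illustrative and are not part of this repository.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_services(services: dict, retries: int = 10, delay: float = 2.0) -> dict:
    """Poll each (host, port) pair until it accepts connections or retries run out."""
    status = {name: False for name in services}
    for attempt in range(retries):
        for name, (host, port) in services.items():
            if not status[name]:
                status[name] = port_is_open(host, port)
        if all(status.values()):
            break
        if attempt < retries - 1:
            time.sleep(delay)
    return status


# Example call (ports taken from this README; the labels are illustrative):
# wait_for_services({"jupyter": ("localhost", 18888), "minio": ("localhost", 9001)})
```

A successful TCP connection only shows that something is listening on the port; it does not confirm that Jupyter Lab or MinIO has finished starting.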
Once you run a Jupyter notebook, the data-hail bucket will be created in MinIO.
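For orientation, this is roughly how a Hail session can be pointed at an S3-compatible store such as MinIO through Spark's Hadoop S3A connector. It is a sketch under stated assumptions, not the configuration the notebooks actually use: the endpoint http://minio:9000 is hypothetical (this README lists only port 9001), the MinIO root credentials above are assumed to double as S3 access keys, and the hadoop-aws connector is assumed to be available to Spark.

```python
def minio_spark_conf(endpoint: str, access_key: str, secret_key: str) -> dict:
    """Build Spark properties that direct the Hadoop S3A connector to an S3-compatible endpoint."""
    return {
        "spark.hadoop.fs.s3a.endpoint": endpoint,
        "spark.hadoop.fs.s3a.access.key": access_key,
        "spark.hadoop.fs.s3a.secret.key": secret_key,
        # Address buckets as <endpoint>/<bucket> instead of <bucket>.<endpoint>.
        "spark.hadoop.fs.s3a.path.style.access": "true",
        # Enable TLS only when the endpoint URL uses https.
        "spark.hadoop.fs.s3a.connection.ssl.enabled": str(endpoint.startswith("https")).lower(),
    }


def init_hail_with_minio(
    endpoint: str = "http://minio:9000",  # ASSUMPTION: hypothetical S3 API address
    access_key: str = "root",             # ASSUMPTION: root user reused as access key
    secret_key: str = "passpass",         # ASSUMPTION: root password reused as secret key
):
    """Initialize Hail with the S3A settings above and return the hail module."""
    import hail as hl  # imported lazily so the helper above works without Hail installed

    hl.init(spark_conf=minio_spark_conf(endpoint, access_key, secret_key))
    return hl
```

With a session configured this way, paths of the form s3a://data-hail/... would resolve to the data-hail bucket.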
The notebooks folder contains three Jupyter notebooks:
- Hail_tutorial-GWAS-vcf.ipynb: A complete GWAS analysis using a VCF file.
- Hail_tutorial-GWAS-vcf-medium.ipynb: A medium-scale example of GWAS analysis.
- Hail_tutorial_GWAS_vcf_mini.ipynb: A minimal version with essential GWAS code.
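As a rough idea of what the essential GWAS code involves, the sketch below follows the pattern of the upstream Hail GWAS tutorial: import a VCF, write it as a MatrixTable, read it back, and run a linear regression per variant. It is not copied from the notebooks. The helper names are illustrative, the s3a:// path scheme is an assumption, and Hail is assumed to be already initialized with access to the bucket.

```python
def matrix_table_path(bucket: str, name: str) -> str:
    """Build an object-store path for a MatrixTable (s3a scheme is an assumption)."""
    return f"s3a://{bucket}/{name}.mt"


def vcf_to_matrix_table(vcf_path: str, mt_path: str):
    """Import a VCF, persist it as a MatrixTable, and return the stored copy."""
    import hail as hl  # imported lazily; requires a working Hail installation

    hl.import_vcf(vcf_path).write(mt_path, overwrite=True)
    return hl.read_matrix_table(mt_path)


def run_linear_gwas(mt, phenotype):
    """Regress a per-sample phenotype on alternate-allele counts for every variant.

    `phenotype` is a numeric column-indexed expression on `mt`,
    for example a field added with mt.annotate_cols(...).
    """
    import hail as hl

    return hl.linear_regression_rows(
        y=phenotype,
        x=mt.GT.n_alt_alleles(),
        covariates=[1.0],  # intercept term only
    )
```

A full analysis would normally add sample and variant quality control before the regression, as the upstream Hail tutorial does.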
- Ensure Docker is running and Docker Compose is installed before starting the environment.
- If you encounter issues accessing Jupyter Lab, check if the container is running:
  docker compose ps
- Restart the Docker environment if needed:
  docker compose down && docker compose up -d
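The container check can also be scripted. The sketch below parses the output of docker compose ps --format json and lists services that are not running. It assumes Docker Compose v2, where this output is either a JSON array or one JSON object per line depending on the release, and where each entry carries "Service" and "State" fields. The function names are illustrative.

```python
import json
import subprocess


def parse_compose_ps(output: str) -> list:
    """Parse `docker compose ps --format json` output into a list of dicts.

    Accepts both a single JSON array and one JSON object per line.
    """
    text = output.strip()
    if not text:
        return []
    if text.startswith("["):
        return json.loads(text)
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def stopped_services(containers: list) -> list:
    """Return the names of services whose state is anything other than 'running'."""
    return [
        c.get("Service", c.get("Name", "unknown"))
        for c in containers
        if c.get("State") != "running"
    ]


def compose_status() -> list:
    """Run `docker compose ps` in the current directory and return parsed entries."""
    result = subprocess.run(
        ["docker", "compose", "ps", "--all", "--format", "json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_compose_ps(result.stdout)
```

Calling stopped_services(compose_status()) from the repository directory would return an empty list when every container is up.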