The Bioinformatics Group is organizing an R programming workshop this February for BRI scientists and employees. The workshop is geared towards people who have never programmed or have limited programming experience in R. We will teach you how to organize your code and data and how to produce some common plots and figures.
The workshop curriculum is largely based on the publicly available Data Carpentry workshop "Data Analysis and Visualization in R for Ecologists". However, the curriculum has been tailored to work with the COVID-19 CYTOF and clinical metadata dataset provided by Hamid Bolouri and Cate Speake and published in JCI:
- Bolouri, H., Speake, C., Skibinski, D., Long, S. A., Hocking, A. M., Campbell, D. J., Hamerman, J. A., Malhotra, U., & Buckner, J. H. (2021). The COVID-19 immune landscape is dynamically and reversibly correlated with disease severity. Journal of Clinical Investigation, 131(3), 1–14. https://doi.org/10.1172/JCI143648.
This repository houses the schedule, code, and resources for the workshop.
For more information regarding the mission and purpose of Data Carpentry, visit their website. For further reading about the Carpentry approach to teaching scientific computing, see their paper.
When and Where: The workshop will run for 4 mornings (9am to 12pm), Feb 12 to 15, and will be held in person in 4N. Though we recognize the commitment of a 4-morning workshop, we ask attendees to attend all sessions, as the curriculum is designed to progressively build upon skills from prior days. As of 2/1/2024, course enrollment is full and, due to space constraints, we cannot accommodate additional participants. However, if you would like to be added to the waitlist, please reach out.
Requirements: You must bring a personal or BRI-provided laptop running a Mac, Linux, or Windows operating system (not a tablet, Chromebook, etc.) on which you are able to install software. You should also have a few specific software packages installed (listed below). For any questions regarding installing software, please contact the BRI helpdesk.
Contact: Please contact Matthew Lawrance (mlawrance@benaroyaresearch.org) with any questions regarding the schedule, workshop goals, or curriculum.
- Complete the Pre-Workshop survey by Friday Feb 2nd. Click here to take the Google survey.
- Install R and RStudio on your machine. Follow the instructions here.
  - Note that R must be installed before installing and using RStudio.
  - Please email the BRI helpdesk regarding any issues installing the software.
- Install necessary packages.
NOTE: The time required to download and install packages can vary. Please download and install these packages before the workshop starts!
During the course we will need a number of R packages. Packages contain useful R code written by other people.
To install these packages, open RStudio (after R is installed) and copy and paste the following command into the console window (look for a blinking cursor on the bottom left), then press Enter (Windows and Linux) or Return (macOS) to execute the command.
install.packages(c("tidyverse","ggpubr"))
We will also need to download several packages from Bioconductor, a website that hosts many helpful packages for biological data analysis. Enter the following command in the console and press Enter or Return to execute it.
if (!require("BiocManager", quietly = TRUE))
  install.packages("BiocManager")
BiocManager::install(c("ComplexHeatmap", "PCAtools"))
Alternatively, you can install the packages using RStudio’s graphical user interface by going to Tools > Install Packages and typing the names of the packages separated by commas. When the installation has finished, you can try to load the packages by pasting the following code into the console and pressing Run on the top right of the screen:
library(tidyverse)
library(ggpubr)
library(ComplexHeatmap)
library(PCAtools)
If you do not see an error like "there is no package called ‘...’", you are good to go!
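If you would rather check all four packages at once without loading them, the snippet below is a minimal sketch of one way to do it. It assumes only the package names listed in the installation steps; the variable names `required`, `installed`, and `not_installed` are illustrative and not part of the workshop materials.

```r
# Packages named in the installation steps above
required <- c("tidyverse", "ggpubr", "ComplexHeatmap", "PCAtools")

# requireNamespace() returns TRUE or FALSE for each package
# without attaching it to the current session
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

# Collect the names of any packages that could not be found
not_installed <- required[!installed]

if (length(not_installed) == 0) {
  message("All workshop packages are installed.")
} else {
  message("Still missing: ", paste(not_installed, collapse = ", "))
}
```

If any package names are reported as missing, rerun the matching install command and then run this check again.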
Download a subsampled version of the publicly available Bolouri et al. (2021) COVID-19 CYTOF and clinical metadata dataset here.
We will use this public, collaborative document https://pad.riseup.net/p/BRI_Data_Carpentry_2024 for chatting, taking notes, and sharing URLs and bits of code. This collaborative document will expire 60 days from its creation (May 13th, 2022).
Time | Topic | Data Carpentry Curriculum Reference |
---|---|---|
8:30am | Pre-Workshop Help | |
9:00am | Official Workshop Start/Introduction | Before we start |
9:10am | Introduction to R | Learn Rstudio! |
10:30am-10:45am | BREAK | |
10:45am-11:15am | Starting with Data in R (dataframes) | Starting with data |
11:15am-12:00pm | Starting with Data in R Cont. (factors) | |
12:00pm | End of Day 1 |
Time | Topic | Data Carpentry Curriculum Reference |
---|---|---|
9:00am | Manipulating Data in R (summarizing, reshaping) | https://datacarpentry.org/R-ecology-lesson/03-dplyr.html#Pipes |
10:15am-10:30am | BREAK | |
10:30am-12:00pm | Visualizing Data (ggplot) | https://datacarpentry.org/R-ecology-lesson/04-visualization-ggplot2.html |
12:00pm | End of Day 2 |
Time | Topic | Data Carpentry Curriculum Reference |
---|---|---|
9:00am | Visualizing Data Continued (pipes, faceting, themes) | https://datacarpentry.org/R-ecology-lesson/04-visualization-ggplot2.html |
10:15am-10:30am | BREAK | |
10:30am-12:00pm | Visualizing Data Cont. (customization, arranging, exporting) | https://datacarpentry.org/R-ecology-lesson/04-visualization-ggplot2.html |
12:00pm | End of Day 3 |
Time | Topic | Data Carpentry Curriculum Reference |
---|---|---|
10:30am-12:00pm | Panel Discussion | |
12:00pm | End of Workshop! |