Workshop Schedule

Pre-requisite for this workshop: The Basic Data Skills Introduction to the command-line interface workshop or a working knowledge of the command line and cluster computing.

Pre-reading

Day 1

| Time | Topic | Instructor |
|:---|:---|:---|
| 09:30 - 09:45 | Workshop Introduction | Meeta |
| 09:45 - 10:25 | Working in an HPC environment - Review | Emma |
| 10:25 - 11:05 | Project Organization (using Data Management best practices) | Meeta |
| 11:05 - 11:45 | Quality Control of Sequence Data: Running FASTQC | Emma |
| 11:45 - 12:00 | Overview of self-learning materials and homework submission | Meeta |

Before the next class:

  1. Please study the contents and work through all the code within the following lessons:
  2. Complete the exercises:
    • Each lesson above contains exercises; please go through each of them.
    • Submit your answers to the questions via Google Forms by the day before the next class.

Questions?

  • If you get stuck due to an error while running code in the lesson, email us.

Day 2

| Time | Topic | Instructor |
|:---|:---|:---|
| 09:30 - 10:30 | Self-learning lessons review | All |
| 10:30 - 11:10 | Expression quantification: Theory and Tools | Meeta |
| 11:10 - 11:50 | Quantifying expression using alignment-free methods (Salmon) | Emma |
| 11:50 - 12:00 | Review of workflow | Emma |

Before the next class:

  1. Please study the contents and work through all the code within the following lessons:
  • Quantifying expression using alignment-free methods (Salmon on multiple samples)

    Now that we know how to quantify one sample with Salmon, this lesson guides you through running multiple samples by creating a job submission script.

  • QC with Alignment Data

    Besides transcript-level quantification, we also want to understand the quality of the mapping, which the Salmon output does not provide.

    This lesson will cover:
    - Aligning the reads with an aligner, STAR
    - Assessing QC metrics among samples

  • Documenting Steps in the Workflow with MultiQC

    It would be great to have a summary document of all QC results from the previous analysis.

    This lesson will cover:
    - Generating such a summary report with MultiQC
    - Generating alignment metrics with Qualimap

    NOTE: To run through the code above, you will need to be logged into O2 and working on a compute node (i.e. your command prompt should have the word compute in it).

    1. Log in using ssh rc_trainingXX@o2.hms.harvard.edu and enter your password (replace the "XX" in the username with the number you were assigned in class).
    2. Once you are on the login node, use srun --pty -p interactive -t 0-2:30 --mem 8G /bin/bash to get on a compute node or as specified in the lesson.
    3. Proceed only once your command prompt has the word compute in it.
    4. If you log out between lessons (using the exit command twice), follow steps 1 and 2 above to log back in and get on a compute node when you resume the self-learning lessons.
  2. Complete the exercises:
    • Each lesson above contains exercises; please go through each of them.
    • Submit your answers to the questions via Google Forms by the day before the next class.
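
The login steps in the NOTE above can be sketched as a short shell snippet. The ssh and srun commands are copied from the NOTE and shown as comments, since they must be run interactively. The on_compute_node helper is hypothetical (it is not part of the workshop materials), and it assumes that the hostname of a compute node contains the word "compute", mirroring the prompt check described in step 3.

```shell
#!/usr/bin/env bash
# Step 1 (on your own machine): replace XX with the number assigned in class.
#   ssh rc_trainingXX@o2.hms.harvard.edu
# Step 2 (on the login node): request an interactive session on a compute node.
#   srun --pty -p interactive -t 0-2:30 --mem 8G /bin/bash

# Hypothetical helper: succeeds if the given hostname contains "compute".
# With no argument it checks the current machine's hostname.
# Assumption: compute-node hostnames contain the word "compute".
on_compute_node() {
    local host="${1:-$(hostname)}"
    [[ "$host" == *compute* ]]
}

# Step 3: proceed only once you are on a compute node.
if on_compute_node; then
    echo "OK: this looks like a compute node."
else
    echo "Not on a compute node yet; run the srun command from step 2 first."
fi
```

If the check reports that you are not on a compute node, repeat step 2 before running any lesson code.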

Questions?

  • If you get stuck due to an error while running code in the lesson, email us.

Day 3

| Time | Topic | Instructor |
|:---|:---|:---|
| 09:30 - 10:10 | Self-learning lessons review | All |
| 10:10 - 11:10 | Automating the RNA-seq workflow | Will |
| 11:10 - 11:45 | Troubleshooting RNA-seq Data Analysis | Emma |
| 11:45 - 12:00 | Wrap up | Will |


Resources


Building on this workshop


These materials have been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.