---
layout: page
title: Resources
published: true
---

This page contains course descriptions for fredhutch.io classes. If you're interested in documentation about computing at Fred Hutch, or looking for additional tutorials or online lessons to learn programming, please see the Fred Hutch Biomedical Data Science Wiki.

Self-paced learning: Each course below includes a link to the publicly available online training materials. You are welcome to read through the teaching notes and other resources at your own pace. Feel free to join weekly office hours or a community discussion group to ask questions; please see the calendar of events and the Wiki page on training, office hours, and community groups for more information.

Instructor-led sessions: Most courses meet for 1-2 hours per class over 3-4 classes; see the course listings below for more information. If you are a Fred Hutch or SCCA employee, you can register for classes in Hutch Learning (internal link). If the class you're interested in isn't currently available, we recommend adding your name to the interest tracking system so you will receive a notification when the course is available again. If the class you're interested in is currently full, add your name to the waitlist so we can add you if someone drops. We also announce new classes in the monthly Coop newsletter.

Concepts courses

The following courses do not require prior technical knowledge, and do not teach coding:

Coding courses

The following courses teach coding skills related to biomedical research. "Introduction" courses do not require previous coding experience. See course descriptions for the pre-requisites of other courses.

Retired and archived training materials

Earlier versions of courses, as well as classes that are no longer offered, are available for reference here.

Data for Data Science

Researchers face a growing data management challenge, starting with data collection and continuing through data analysis, publication, and archival. Potential problems research labs may face include scaling their data management methods to many and/or very large data files, fully documenting data and its organization, and meeting grant and publication requirements related to data sharing. This four-class course is designed to introduce attendees to best practices in data organization and management. Each one-hour class will include lecture, discussion, and practice exercises. This course assumes no prior training in data science. At the end of this course, you will be able to identify resources at Fred Hutch for data management and apply best practices in data organization to your own research projects. Course materials here.

  • Class 1: Data entry and creating spreadsheets
  • Class 2: Organizing data and project files
  • Class 3: Documenting data with metadata
  • Class 4: Data manipulation and reproducibility

Concepts in Machine Learning

This four-class course is designed to introduce attendees to central concepts in machine learning as well as examples of applications in biomedical research. Each one-hour lecture will emphasize conceptual and practical aspects of machine learning paradigms, explore the foundations of underlying mechanisms, and look at current or potential applications through examples or case studies. The course assumes a solid foundation in basic statistics, but does not assume any prior coding experience. At the end of this course, you will be able to understand the core differences between forms of machine learning and consider their application to a variety of problem spaces. This course (or equivalent knowledge/preparation) is intended as a prerequisite for future courses covering machine learning skills in both R and Python. Course materials here.

  • Class 1: Introduction and Conceptual Overview; Machine Learning and Experimental Design
  • Class 2: Supervised Learning via Regression
  • Class 3: Supervised Learning via Classification
  • Class 4: Unsupervised Learning via Dimensionality Reduction, Clustering, and Transfer Learning

Introduction to Git and GitHub

This two-class course (with an optional third class) is designed to introduce attendees to git version control software and GitHub as a repository for code and/or data. Each two-hour session will include brief tutorials interspersed with challenge exercises. The first two sessions assume no prior programming knowledge. At the end of these two sessions, you will be able to use git to track changes to software and other files, and use GitHub to work collaboratively and publish repositories of code and/or data. The optional third session assumes attendees have a basic familiarity with using the command line to navigate through directories and work with files, and will include an overview of the command line interface to access the full functionality of the git software. Course materials here.

  • Class 1: Introduction to version control, git workflow with desktop clients (tracking changes, branching, merging, ignoring things)
  • Class 2: Collaboration and code sharing with GitHub, resolving conflicts
  • Class 3 (optional): git workflow on the command line

Introduction to R

This four-class course is designed to introduce attendees to R statistical programming and its broad applications. Each two-hour session will include brief tutorials interspersed with challenge exercises, and assumes attendees have no prior computer coding experience. At the end of this course, you will be able to use R to import, manipulate, and visualize data. Course materials here.

  • Class 1: R syntax, assigning objects, using functions
  • Class 2: Data types and structures; slicing and subsetting data
  • Class 3: Data manipulation with dplyr
  • Class 4: Data visualization in ggplot2

Intermediate R: Machine Learning

This four-class course introduces participants to the implementation of machine learning methods in R using RStudio. Each two-hour session will include brief tutorials interspersed with challenge exercises, and assumes attendees are familiar with basic R syntax, using packages, and basic data manipulation using tidyverse. The course also assumes a strong foundation in basic statistics as well as prior/concurrent participation in the fredhutch.io course Concepts in Machine Learning (or equivalent experience). At the end of this course, you will be able to apply basic principles of machine learning to research questions and will have established a foundation for further exploration of machine learning techniques. Course materials are being developed here.

  • Class 1: Conceptual Overview; CRISP-DM framework; EDA; Our Tools
  • Class 2: Case Study in Regression
  • Class 3: Case Study in Classification
  • Class 4: Case Study in Deep Learning and Transfer Learning

Introduction to Python

This four-class course is designed to introduce attendees to Python programming and its broad applications. Each two-hour session will include brief tutorials interspersed with challenge exercises, and assumes attendees have no prior computer coding experience. At the end of this course, you will be able to use Python to import, manipulate, and visualize data. Course materials here.

  • Class 1: Intro to Python, Jupyter notebooks, and data types
  • Class 2: Using pandas to explore data frames
  • Class 3: Extracting data from data frames
  • Class 4: Data visualization with ggplot

Intermediate Python: Programming

This four-class course focuses on task automation using Python programming. Each two-hour session will include brief tutorials interspersed with challenge exercises, and assumes participants are familiar with all material in Introduction to Python (basic syntax including variables and functions, importing data, data types and structures, subsetting data). At the end of this course, you will be able to create fully documented and automated workflows to perform data analysis tasks. Course materials here.

  • Class 1: Review of pre-requisites, repeating actions with loops
  • Class 2: Analyzing data from multiple files, conditional statements, creating functions
  • Class 3: Errors and exceptions, defensive programming
  • Class 4: Debugging, modules/packaging for reproducibility
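
As a rough illustration of the topics listed above (loops, analyzing data from multiple files, conditionals, functions, exceptions, and defensive programming), here is a minimal sketch. It is not taken from the course materials; the function names, the `weight` column, and the CSV layout are all hypothetical, chosen only for the example.

```python
import csv
import tempfile
from pathlib import Path


def mean_of_column(csv_path, column):
    """Return the mean of a numeric column in one CSV file."""
    with open(csv_path, newline="") as handle:
        values = [float(row[column]) for row in csv.DictReader(handle)]
    # Defensive check: fail loudly instead of dividing by zero.
    if not values:
        raise ValueError(f"{csv_path} has no data rows")
    return sum(values) / len(values)


def summarize(folder, column):
    """Loop over every CSV file in a folder and collect per-file means."""
    results = {}
    for csv_path in sorted(Path(folder).glob("*.csv")):
        try:
            results[csv_path.name] = mean_of_column(csv_path, column)
        except (KeyError, ValueError) as error:
            # Skip files that lack the column, are empty, or hold non-numeric text.
            print(f"skipping {csv_path.name}: {error}")
    return results


def demo():
    """Build two small hypothetical files and summarize them."""
    with tempfile.TemporaryDirectory() as folder:
        Path(folder, "a.csv").write_text("weight\n1.0\n3.0\n")
        Path(folder, "b.csv").write_text("weight\n")  # header only, no data
        return summarize(folder, "weight")
```

Calling `demo()` summarizes the first file and skips the second with a printed message, which is one simple way to keep a batch job running when a single input is malformed.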

Intermediate Python: Machine Learning

This four-class course introduces participants to the implementation of machine learning methods in Python using Jupyter Notebooks. Each two-hour session will include brief tutorials interspersed with challenge exercises, and assumes attendees are familiar with basic Python syntax, using packages, and basic data manipulation using pandas. The course also assumes a strong foundation in basic statistics as well as prior/concurrent participation in the fredhutch.io course Concepts in Machine Learning (or equivalent experience). At the end of this course, you will be able to apply basic principles of machine learning to research questions and will have established a foundation for further exploration of machine learning techniques. Course materials here.

  • Class 1: Conceptual Overview; CRISP-DM framework; EDA; Our Tools
  • Class 2: Case Study in Regression
  • Class 3: Case Study in Classification
  • Class 4: Case Study in Deep Learning and Transfer Learning

Concepts in RNAseq

This four-class course introduces bulk RNAseq analysis for biomedical research, and is designed for research scientists (lab, clinical, computational) who have no prior experience working with genomic data. Each one-hour session includes lecture and discussion about topics in RNAseq data acquisition and analysis. This course requires participants have a general understanding of the central dogma of molecular biology (DNA->RNA->protein), but assumes no experience handling genomic data or performing computational analyses. By the end of this course, you will be able to identify data types and applications for bulk RNAseq analysis in biomedical research, design statistically robust RNAseq experiments, choose appropriate analytical approaches for RNAseq data, interpret common visualizations and hypothesis tests associated with RNAseq, and connect data types, experimental design, and analysis methods to appropriately frame research questions and understand technical limitations of RNAseq analyses. This course, or equivalent background knowledge, is a pre- or co-requisite for courses performing RNAseq analysis. Course materials here.

  • Class 1: Introduction to RNAseq data and experimental design
  • Class 2: Read mapping and quantification
  • Class 3: Hypothesis testing and visualization
  • Class 4: Interpreting results and applying concepts to other types of RNAseq data

Bulk RNAseq analysis: Unix and R

This four-class course introduces software and analysis methods associated with bulk RNAseq analysis for biomedical research. These genomics-focused materials are designed for research scientists with minimal prior coding experience who are interested in learning to perform their own analyses, as well as computationally proficient staff who are interested in learning best practices for working with research software. Each two-hour session includes Unix and/or R coding tutorials using Hutch computational resources, interspersed with challenge exercises. The Concepts in RNAseq course (or equivalent knowledge) is a pre- or co-requisite for this course. As this class focuses on applying reproducible computational methods (e.g., computer coding) to interrogate bioinformatics data, additional pre-requisites include prior experience with both the Unix shell and R statistical programming. By the end of this course, you will be able to manage data and organize projects associated with RNAseq experiments; recognize and interpret common file formats for genomic data and software appropriate for interacting with such data; validate and assess quality of RNAseq data before, during, and after analysis; quantify RNAseq data at the gene level; and create visualizations and test hypotheses for reporting results. Course materials here.

  • Class 1: Introduction to RNAseq data
  • Class 2: Read mapping and quantification
  • Class 3: Hypothesis testing
  • Class 4: Visualization

Software Carpentry Bootcamp: Unix, Git, and Python

This bootcamp-style course is designed to introduce attendees to a collection of tools useful for reproducible computational research: Unix (bash) shell scripting, version control with Git, and programming with Python. The class consists of four three-hour sessions. Each session will include brief tutorials interspersed with challenge exercises. No prior programming experience is required. Course materials here.

  • Class 1: Task automation with Unix (bash) shell scripting
  • Class 2: Version control with Git
  • Class 3: Programming with Python: data structures, loops, conditionals
  • Class 4: Programming with Python: creating functions, errors and exceptions, defensive programming

Retired and archived training materials

  • [Your toolbox: an overview of Galaxy, R, Python, and the command line]({% post_url 2014-07-13-toolbox %})
  • Introduction to bioinformatics
  • [Feeling cozy on the command line]({% post_url 2017-01-05-command-line-cozy %})
  • [Shell scripting, the Puritan way]({% post_url 2017-02-25-shell-scripting %})
  • [R: first steps]({% post_url 2014-05-20-R %})
  • [R: introductory course material]({% post_url 2014-10-20-introductory-r-course-material %})
  • [Galaxy: introduction]({% post_url 2014-05-09-galaxy %})
  • Inkscape tutorial