The code in this repository is designed for use with single-cell RNAseq data to help determine the cell types present in the dataset.
Perform the following four steps to obtain results:
- Load data
- Perform quality control
- Cluster filtered data
- Compute differential expression among the clusters
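
To make the four steps concrete, here is a minimal, self-contained sketch of the same flow on a toy in-memory count matrix. It is not the code in this repository, which runs these steps with the cloud tools listed below. The data, function names, thresholds, the tiny k-means clusterer, and the log2 fold-change score are all assumptions made for illustration.

```python
import math

# Hypothetical toy data: rows are cells, columns are genes (raw counts).
GENES = ["geneA", "geneB", "geneC", "geneD"]
COUNTS = {
    "cell1": [10, 9, 0, 1],
    "cell2": [12, 8, 1, 0],
    "cell3": [0, 1, 11, 9],
    "cell4": [1, 0, 9, 12],
    "cell5": [0, 0, 1, 0],  # low-quality cell: only one gene detected
}


def load_data():
    """Step 1: load the count matrix (here, just copy the toy data)."""
    return {cell: list(row) for cell, row in COUNTS.items()}


def quality_control(counts, min_genes=2):
    """Step 2: keep cells with at least min_genes genes detected (count > 0)."""
    return {
        cell: row
        for cell, row in counts.items()
        if sum(1 for x in row if x > 0) >= min_genes
    }


def cluster(counts, k=2, iterations=10):
    """Step 3: tiny k-means on log-transformed counts (illustrative only)."""
    cells = sorted(counts)
    points = {c: [math.log1p(x) for x in counts[c]] for c in cells}
    # Deterministic seeding: pick k cells spread evenly over the sorted names.
    centroids = [
        list(points[cells[i * (len(cells) - 1) // max(k - 1, 1)]])
        for i in range(k)
    ]
    labels = {}
    for _ in range(iterations):
        # Assign each cell to its nearest centroid.
        labels = {
            c: min(range(k), key=lambda j: math.dist(points[c], centroids[j]))
            for c in cells
        }
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [points[c] for c in cells if labels[c] == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def differential_expression(counts, labels, genes):
    """Step 4: log2 fold change of mean counts, each cluster vs. all other cells."""
    result = {}
    for j in sorted(set(labels.values())):
        inside = [counts[c] for c in counts if labels[c] == j]
        outside = [counts[c] for c in counts if labels[c] != j]
        if not outside:
            continue  # nothing to compare against when there is one cluster
        for g, name in enumerate(genes):
            mean_in = sum(row[g] for row in inside) / len(inside)
            mean_out = sum(row[g] for row in outside) / len(outside)
            # Pseudocount of 1 avoids division by zero for unexpressed genes.
            result[(j, name)] = math.log2((mean_in + 1) / (mean_out + 1))
    return result


if __name__ == "__main__":
    filtered = quality_control(load_data())
    labels = cluster(filtered)
    scores = differential_expression(filtered, labels, GENES)
    for (cluster_id, gene), lfc in sorted(scores.items()):
        print(cluster_id, gene, round(lfc, 2))
```

Each function maps to one step, so any of them could be replaced by a cloud-backed equivalent without changing the overall shape of the pipeline.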
The analyses here are based on those in https://github.com/broadinstitute/BipolarCell2016 and https://github.com/broadinstitute/single_cell_analysis, ported to tools and techniques available on (but not limited to) Google Cloud Platform.
- All steps occur in the cloud.
- Data loading makes use of Docker, plus dsub via Compute Engine for batch processing.
- The analyses make use of:
  - Standard SQL via BigQuery
  - Apache Beam via Dataflow
  - TensorFlow via Cloud Machine Learning Engine
- We suggest working through the introductory materials for each tool before working with the code in this repository.