Python Implementation of SIMLR for single-cell visualization and analysis

bowang87/SIMLR_PY

SIMLR

This is a Python implementation of SIMLR, the method described in the Nature Methods paper "Visualization and analysis of single-cell RNA-seq data by kernel-based similarity learning".

OVERVIEW

Single-cell RNA-seq technologies enable high-throughput gene expression measurement of individual cells and allow the discovery of heterogeneity within cell populations. Measuring cell-to-cell gene expression similarity is critical to the identification, visualization and analysis of cell populations. However, single-cell data challenge conventional measures of gene expression similarity because of their high levels of noise, outliers and dropouts. We develop a novel similarity-learning framework, SIMLR (Single-cell Interpretation via Multi-kernel LeaRning), which learns an appropriate distance metric from the data for dimension reduction, clustering and visualization. SIMLR separates known subpopulations in single-cell data sets more accurately than existing dimension reduction methods do. Additionally, SIMLR demonstrates high sensitivity and accuracy on high-throughput peripheral blood mononuclear cell (PBMC) data sets generated by the GemCode single-cell technology from 10x Genomics.

IMPLEMENTATIONS

We provide implementations of SIMLR for large-scale single-cell RNA-seq data. For small datasets (e.g., fewer than 3,000 cells), we recommend the MATLAB or R package from https://github.com/BatzoglouLabSU/SIMLR. For large datasets (more than 3,000 cells), we recommend the Python function "SIMLR_LARGE".

This large-scale implementation uses an approximate version of SIMLR to reduce the computational cost.
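To give a sense of why large datasets call for an approximation, the sketch below compares the memory for one dense cell-by-cell similarity matrix with the memory needed when similarities are stored for only a fixed number of neighbours per cell. It is a back-of-the-envelope illustration, not the package's code: the function names and the neighbour count k = 30 are hypothetical, and whether SIMLR_LARGE sparsifies in exactly this way is an assumption (the annoy dependency listed under Requirements suggests approximate nearest-neighbour search).

```python
def dense_similarity_bytes(n_cells, bytes_per_value=8):
    """Bytes for one dense n x n similarity matrix (float64 by default)."""
    return n_cells * n_cells * bytes_per_value


def knn_similarity_bytes(n_cells, k, bytes_per_value=8):
    """Bytes for similarities kept for only k neighbours per cell.

    Counts the stored values only and ignores the neighbour index arrays.
    """
    return n_cells * k * bytes_per_value


# Hypothetical dataset sizes; k=30 is an illustrative choice, not a
# documented default of the package.
for n in (3_000, 30_000, 300_000):
    dense_gb = dense_similarity_bytes(n) / 1e9
    sparse_gb = knn_similarity_bytes(n, k=30) / 1e9
    print(f"{n:>7} cells: dense {dense_gb:9.3f} GB, 30 neighbours {sparse_gb:6.3f} GB")
```

The dense cost grows with the square of the number of cells, while the neighbour-limited cost grows linearly: at 300,000 cells a single dense float64 matrix would need 720 GB, against 72 MB for 30 neighbours per cell.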

DEMO

We provide two demos of large-scale SIMLR usage. In test_largescale.py, we run SIMLR on the Zeisel dataset (3,005 cells) used in our paper.

DEBUG

Please feel free to email us if you have trouble running SIMLR. The correspondence address is bowang87@stanford.edu

Requirements

  • numpy>=1.8
  • scipy>=0.13.2
  • annoy>=1.8
  • sklearn>=0.17
  • fbpca>=1.0
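Before installing, it can help to check which of these minimum versions the current environment already satisfies. The following is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+). The helper names are hypothetical, the version parser handles only plain dotted numbers, and it assumes that the `sklearn` entry refers to the PyPI distribution `scikit-learn`, which provides the `sklearn` module.

```python
import re
from importlib import metadata

# Minimum versions copied from the list above. Assumption: "sklearn" means
# the "scikit-learn" distribution on PyPI.
MINIMUM_VERSIONS = {
    "numpy": "1.8",
    "scipy": "0.13.2",
    "annoy": "1.8",
    "scikit-learn": "0.17",
    "fbpca": "1.0",
}


def parse_version(text):
    """Turn the leading dotted-number part of a version into a tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Compare two versions after padding the shorter one with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def unmet_requirements(minimums=MINIMUM_VERSIONS):
    """Return {distribution name: problem} for missing or outdated packages."""
    problems = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            problems[name] = "not installed"
            continue
        if not meets_minimum(installed, minimum):
            problems[name] = f"found {installed}, need >= {minimum}"
    return problems


if __name__ == "__main__":
    for name, problem in unmet_requirements().items():
        print(f"{name}: {problem}")
```

An empty result from `unmet_requirements()` means every listed package is present at or above its minimum version. Pre-release suffixes such as `rc1` are ignored by this simple parser, so a dedicated version library is the better choice for anything stricter.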

Installation

python setup.py install

or pip install SIMLR

Tutorial

See tests/test_largescale.py for a worked example.
