Single-Cell Remover of Doublets
Python code for identifying doublets in single-cell RNA-seq data. For details and validation of the method, see our preprint on bioRxiv.
For a typical workflow, including interpretation of predicted doublet scores, see the example notebook.
Given a raw (unnormalized) UMI counts matrix counts_matrix with cells as rows and genes as columns, calculate a doublet score for each cell:
import scrublet as scr
scrub = scr.Scrublet(counts_matrix)
doublet_scores, predicted_doublets = scrub.scrub_doublets()
scrub.scrub_doublets() simulates doublets from the observed data and uses a k-nearest-neighbor classifier to calculate a continuous doublet_score (between 0 and 1) for each transcriptome. The score is automatically thresholded to generate predicted_doublets, a boolean array that is True for predicted doublets and False otherwise.
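To make the idea concrete, the following is a minimal NumPy-only sketch of the general approach, not Scrublet's actual implementation. It sums random pairs of observed cells to simulate doublets, then scores each observed cell by the fraction of its k nearest neighbors that are simulated. The function names (simulate_doublets, knn_doublet_scores), the normalization, the brute-force neighbor search, and the fixed 0.5 cutoff are all assumptions made for illustration; the real package differs in these steps and picks its threshold automatically.

```python
import numpy as np


def simulate_doublets(counts, n_sim, rng):
    """Create synthetic doublets by summing the counts of random cell pairs."""
    n_cells = counts.shape[0]
    pairs = rng.integers(0, n_cells, size=(n_sim, 2))
    return counts[pairs[:, 0]] + counts[pairs[:, 1]]


def knn_doublet_scores(counts, n_sim=None, k=15, seed=0):
    """Score each observed cell by the share of simulated doublets among
    its k nearest neighbors (illustrative only, values lie in [0, 1])."""
    rng = np.random.default_rng(seed)
    n_obs = counts.shape[0]
    n_sim = n_obs if n_sim is None else n_sim
    sim = simulate_doublets(counts, n_sim, rng)

    # Stack observed and simulated cells, normalize by total counts, log-transform.
    combined = np.vstack([counts, sim]).astype(float)
    totals = combined.sum(axis=1, keepdims=True)
    totals[totals == 0] = 1.0  # avoid division by zero for empty cells
    X = np.log1p(combined / totals * 1e4)
    is_sim = np.r_[np.zeros(n_obs, dtype=bool), np.ones(n_sim, dtype=bool)]

    # Brute-force squared Euclidean distances from observed cells to all cells.
    obs = X[:n_obs]
    d2 = (obs ** 2).sum(axis=1)[:, None] - 2.0 * obs @ X.T + (X ** 2).sum(axis=1)[None, :]
    d2[np.arange(n_obs), np.arange(n_obs)] = np.inf  # a cell is not its own neighbor

    # Indices of the k closest cells (unordered), then the simulated fraction.
    nn = np.argpartition(d2, k, axis=1)[:, :k]
    return is_sim[nn].mean(axis=1)


# Toy usage: random Poisson counts with cells as rows and genes as columns.
rng = np.random.default_rng(42)
toy_counts = rng.poisson(1.5, size=(200, 50))
toy_scores = knn_doublet_scores(toy_counts, k=10)
toy_predicted = toy_scores > 0.5  # hypothetical fixed cutoff, for illustration only
```

The brute-force distance matrix keeps the sketch dependency-free but scales quadratically with the number of cells, so it is only suitable for small toy inputs.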
To install from source:
git clone https://github.com/swolock/scrublet.git
cd scrublet
pip install -r requirements.txt
pip install --upgrade .
Previous versions can be found here.