REBET: a Method to Determine the Number of Cell Clusters Based on Batch Effect Removal
REBET is a method for determining the number of cell clusters from gene expression profiles.
REBET requires two inputs: (1) a gene expression data matrix with samples in rows and genes in columns, and (2) the maximum number of clusters.
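Many expression matrices are stored with genes in rows and samples in columns, which is the opposite of the orientation described above. The following is a minimal sketch of transposing such a matrix before use. The toy matrix and its gene and cell names are hypothetical and serve only as an illustration.

```r
# Toy genes-by-cells count matrix (hypothetical data: 5 genes x 4 cells).
set.seed(1)
counts <- matrix(
  rpois(20, lambda = 5),
  nrow = 5,
  dimnames = list(paste0("gene", 1:5), paste0("cell", 1:4))
)

# Transpose so that samples (cells) are in rows and genes are in columns.
data <- t(counts)

# Basic sanity checks on the resulting input matrix.
stopifnot(is.matrix(data), is.numeric(data))
dim(data)  # number of samples, then number of genes
```

Checking dim(data) before running the analysis is a cheap way to catch a matrix that is still in genes-by-samples orientation.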
REBET is implemented as an R package, which is freely available for non-commercial use.
Step 1: Download the above REBET package and install it in R (tested on version 4.0.3).
Step 2: Install the R packages that REBET depends on (tested on version 4.0.3): "sva", "flexclust", "SC3", "SingleCellExperiment", "cluster", "infotheo", "scater", "foreach", and "doParallel".
Note: REBET was tested on Linux and Windows.
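The dependencies in Step 2 come from two repositories: "sva", "SC3", "SingleCellExperiment", and "scater" are distributed through Bioconductor, while the others are on CRAN. One possible way to install them is sketched below. It assumes an internet connection and a configured CRAN mirror, and it is not taken from the REBET documentation.

```r
# Dependencies split by repository.
cran_pkgs <- c("flexclust", "cluster", "infotheo", "foreach", "doParallel")
bioc_pkgs <- c("sva", "SC3", "SingleCellExperiment", "scater")

# Install only the CRAN packages that are not already present.
missing_cran <- setdiff(cran_pkgs, rownames(installed.packages()))
if (length(missing_cran) > 0) {
  install.packages(missing_cran)
}

# Bioconductor packages are installed through BiocManager.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
missing_bioc <- setdiff(bioc_pkgs, rownames(installed.packages()))
if (length(missing_bioc) > 0) {
  BiocManager::install(missing_bioc)
}
```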
Using REBET is very simple. Just follow the steps below:
Step 1: Open R or RStudio.
Step 2: In the R command window, run the following command to load the R package: library(REBET)
Step 3: In the R command window, run the following command to open the help page for REBET: ?REBET
Step 4: The help page ends with example code. Copy it into the R command window and run it as follows:
This dataset consists of gene expression values of 21042 genes from 33 samples. The true number of clusters is 7.
Ramsköld, D. et al. (2012). Full-length mRNA-Seq from single-cell levels of RNA and individual circulating tumor cells. Nat. Biotechnol., 30(8), 777–782.
result = REBET(data, Kmax=10)
In general, we first set Kmax to 10. If the estimated optimal number of clusters equals Kmax (here, 10), the estimate may have been capped by this upper bound, so we increase Kmax and run REBET again. The result is a single value: the optimal number of cell clusters estimated by REBET.
The result returned by this example is 7, indicating that REBET accurately estimated the number of cell clusters in the Ramsköld dataset.
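The rule of increasing Kmax whenever the estimate reaches it can be automated. The sketch below is one possible wrapper, not part of REBET: the names estimate_k, kmax_limit, and stub_estimator are hypothetical, and doubling Kmax is an arbitrary choice. It assumes only that the estimator accepts the data and a Kmax argument and returns a single number, as in the call shown above.

```r
# Re-run an estimator with a larger Kmax while the estimate hits the bound.
# `estimator` is any function(data, Kmax) that returns one number.
estimate_k <- function(data, estimator, kmax = 10, kmax_limit = 50) {
  repeat {
    k <- estimator(data, Kmax = kmax)
    # Stop once the estimate is below the bound or the limit is reached.
    if (k < kmax || kmax >= kmax_limit) {
      return(k)
    }
    kmax <- min(kmax * 2, kmax_limit)
  }
}

# Stand-in estimator for demonstration only: it behaves as if the true
# number of clusters were 12, capped at Kmax.
stub_estimator <- function(data, Kmax) min(12, Kmax)

k_hat <- estimate_k(data = NULL, estimator = stub_estimator)
print(k_hat)
```

With the package loaded, the call would presumably be estimate_k(data, REBET). The kmax_limit argument keeps the loop from running indefinitely if the estimate keeps reaching the bound.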
If you have any questions, please do not hesitate to contact us: Hongdong Li, hongdong@csu.edu.cn
If you use this tool, please cite the following work.
ZhaoYu Fang, CuiXiang Lin, YunPei Xu, Hongdong Li, QingSong Xu, REBET: a Method to Determine the Number of Cell Clusters based on Batch Effect Removal, 2021, submitted