
infercnv 10x

Brian Haas edited this page May 11, 2018 · 7 revisions

Leveraging 10x count data in InferCNV

The example below shows how you might generate a log2-transformed matrix of counts per 100k total reads (CP100k) for use with inferCNV, starting from 10x data.

Here, we'll use Seurat to convert 10x count data to a CP100k matrix.

library(Seurat)  # note: the calls below are written against the Seurat v2 API (raw.data, min.genes, cell.names)

# load the 10x count data and build a Seurat object with basic filtering
data = Read10X(data.dir = "10x_data_dir/")
seurat_obj = CreateSeuratObject(raw.data=data, min.cells=3, min.genes=200)

# keep only the cells that passed filtering
counts_matrix = as.matrix(seurat_obj@raw.data[,seurat_obj@cell.names])

# scale each cell (column) by its total count to get counts-per-million
cpm = scale(counts_matrix, center=F, scale=colSums(counts_matrix)/1e6)
cp100k = cpm/10   # convert counts-per-million to counts-per-100k total reads
log2cp100k = log2(cp100k+1)  # log2 transform

# use more palatable column names (cell identifiers)
colnames(log2cp100k) = paste0("cell_", seq_len(ncol(log2cp100k)))

# write the output table
write.table(round(log2cp100k, digits=3), file='cp100k.log2.matrix', quote=F, sep="\t")

The 'cp100k.log2.matrix' file is now ready for use with InferCNV.
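
As a quick sanity check of the arithmetic, the same normalization can be reproduced in base R on a toy matrix, without Seurat. This is only a sketch: the count values, gene names, and temporary output file are made up for illustration, and sweep() is used here in place of the scale() call above (both divide each column by its own total).

```r
# Toy count matrix: 3 genes x 2 cells (made-up values, for illustration only)
counts_matrix <- matrix(c(10, 0, 90,    # cell_1, total = 100
                          5, 5, 40),    # cell_2, total = 50
                        nrow = 3,
                        dimnames = list(c("geneA", "geneB", "geneC"),
                                        c("cell_1", "cell_2")))

# counts-per-100k: divide each column by its total, then multiply by 1e5
cp100k <- sweep(counts_matrix, 2, colSums(counts_matrix), "/") * 1e5
log2cp100k <- log2(cp100k + 1)

# before the log transform, every cell (column) should sum to 100,000
stopifnot(isTRUE(all.equal(unname(colSums(cp100k)), c(1e5, 1e5))))

# write the table the same way as above, then read it back to check the layout
out_file <- tempfile(fileext = ".matrix")   # hypothetical temporary path
write.table(round(log2cp100k, digits = 3), file = out_file, quote = FALSE, sep = "\t")
check <- read.table(out_file, header = TRUE, sep = "\t", row.names = 1,
                    check.names = FALSE)

# genes are rows and cells are columns, as in the matrix that was written
stopifnot(identical(dim(check), dim(log2cp100k)))
```

If either stopifnot() call fails on your own data, the column totals or the written table layout are worth inspecting before passing the file on.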
