Single-cell ATAC-seq related tools and genomics data analysis resources. Tools are sorted by publication date, reviews and most recent publications on top. Unpublished tools are listed at the end of each section. Please, contribute and get in touch! See MDmisc notes for other programming and genomics-related notes. See scRNA-seq_notes for scRNA-seq related resources.
- Preprocessing pipelines
- Integration, Multi-omics methods
- Clustering, visualization
- Technology
- Data
- Miscellaneous
- Review of chromatin accessibility profiling methods (wet-lab, technologies, downstream analysis and tools, applications), both bulk and single cell. DNase-seq, ATAC-seq, MNase-seq, many more. Multi-omics technologies, integrative approaches.
Paper
Minnoye, Liesbeth, Georgi K. Marinov, Thomas Krausgruber, Lixia Pan, Alexandre P. Marand, Stefano Secchia, William J. Greenleaf, et al. “Chromatin Accessibility Profiling Methods.” Nature Reviews Methods Primers, (December 2021), https://doi.org/10.1038/s43586-020-00008-9. Supplementary Table 1 - Commonly used bioinformatics tools for data processing and analysis of bulk and single-cell chromatin accessibility data, https://static-content.springer.com/esm/art%3A10.1038%2Fs43586-020-00008-9/MediaObjects/43586_2020_8_MOESM1_ESM.pdf
- Single-cell multiomics technologies, integration of transcriptome with genome, epigenome, and proteome. Table 1 - summary of technologies. Cell isolation and barcoding. Figure 2 - genome-transcriptome single-cell technologies, Figure 3 - epigenome-transcriptome technologies, Figure 4 - proteome-transcriptome technologies. Figure 5 - overview of computational methods (dimensionality reduction, clustering, network, pseudotime inference, CNV detection), references to reviews. Integrative analysis methods (LIGER, MOFA).
Paper
Lee, Jeongwoo, Do Young Hyeon, and Daehee Hwang. “Single-Cell Multiomics: Technologies and Data Analysis Methods.” Experimental & Molecular Medicine, September 15, 2020. https://doi.org/10.1038/s12276-020-0420-2.
-
scATAC-seq analysis guidelines. Technologies, data preprocessing, peak annotation, QC, matrix building, batch correction, dimensionality reduction, visualization, clustering, cell identity annotation, chromatin accessibility dynamics, motif analysis. Table 1 - summary of 13 pipelines. Tools, methods, databases.
- Baek, Seungbyn, and Insuk Lee. “Single-Cell ATAC Sequencing Analysis: From Data Preprocessing to Hypothesis Generation.” Computational and Structural Biotechnology Journal, (2020)
-
Benchmarking of 10 scATAC-seq analysis methods (brief description of each in Methods) on 10 synthetic (various depth and noise levels) and 3 real datasets. scATAC technology overview, problems. Three clustering methods (K-means, Louvain, hierarchical clustering), adjusted Rand index, adjusted mutual information, homogeneity for benchmarking against gold-standard clustering, Residual Average Gini Index for benchmarking against gene markers (silver standard). SnapATAC, Cusanovich2018, cisTopic perform best overall. R code, Jupyter notebooks
- Chen, Huidong, Caleb Lareau, Tommaso Andreani, Michael E. Vinyard, Sara P. Garcia, Kendell Clement, Miguel A. Andrade-Navarro, Jason D. Buenrostro, and Luca Pinello. “Assessment of Computational Methods for the Analysis of Single-Cell ATAC-Seq Data.” Genome Biology 20, no. 1 (December 2019)
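The adjusted Rand index used above to score clusterings against a gold standard can be sketched in a few lines. This is a minimal, standard-library illustration of the textbook formula, not the benchmarking code from the paper; the function name and the toy labels are made up for the example.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat clusterings of the same cells."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("label vectors must have the same length")
    n = len(labels_true)
    if n < 2:
        # No cell pairs to compare; treat as trivially identical.
        return 1.0

    # Contingency counts: number of cells per (true, predicted) label pair.
    pair_counts = Counter(zip(labels_true, labels_pred))
    true_counts = Counter(labels_true)
    pred_counts = Counter(labels_pred)

    # Number of cell pairs placed together in both, in true, and in predicted.
    sum_pairs = sum(comb(c, 2) for c in pair_counts.values())
    sum_true = sum(comb(c, 2) for c in true_counts.values())
    sum_pred = sum(comb(c, 2) for c in pred_counts.values())

    expected = sum_true * sum_pred / comb(n, 2)
    max_index = 0.5 * (sum_true + sum_pred)
    if max_index == expected:
        # Degenerate case, e.g. both clusterings put all cells in one cluster.
        return 1.0
    return (sum_pairs - expected) / (max_index - expected)


# Hypothetical gold-standard cell types and cluster assignments.
gold = ["B", "B", "T", "T", "Mono", "Mono"]
clusters = [0, 0, 1, 1, 2, 2]
ari = adjusted_rand_index(gold, clusters)
```

The score is 1 when the two partitions agree up to label renaming and is close to 0 (possibly negative) for assignments no better than chance.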
Extended Data Fig. 1: Comparison of supported features from currently available scATAC-seq software., from ArchR paper.
- SnapATAC2 - scATAC-seq processing pipeline. Main improvement - a fast nonlinear dimensionality reduction algorithm, matrix-free spectral clustering, Lanczos algorithm to derive eigenvectors while implicitly using the Laplacian matrix. Four primary modules: preprocessing, embedding/clustering (includes batch correction), functional enrichment analysis, and multi-modal omics analysis. Starts from BAM files, scaling columns with IDF. Outperforms ArchR (LSI), Signac (LSI), cisTopic (LDA), epiScanpy (PCA), PeakVI, scBasset in speed, scalability and precision in resolving cell heterogeneity on synthetic and experimental data from different technologies, species, and tissue types. Applicable to any omics data (scHi-C, scRNA-seq, single-cell methylation). Rust, with Python interface. Benchmarking datasets, Docker image and code to reproduce the analysis.
Paper
Zhang, Kai, Nathan R. Zemke, Ethan J. Armand, and Bing Ren. "SnapATAC2: a fast, scalable and versatile tool for analysis of single-cell omics data." bioRxiv (2023). https://doi.org/10.1101/2023.09.11.557221
- PIC-snATAC - Paired-Insertion-Counting method for snATAC-seq feature characterization. Fragment-based (number of reads in the union of peaks) and insertion-based (number of Tn5 insertions in the appropriate direction) ATAC-seq quantification. Methods and tools overview, contrasting differences. Applied to mouse kidney snATAC-seq data, 10X genomics PBMC, a Bone Marrow Mononuclear Cells (BMMC) multiome dataset. Better resolves cell types, association with gene expression.
Paper
Miao, Zhen, and Junhyong Kim. “Is Single Nucleus ATAC-Seq Accessibility a Qualitative or Quantitative Measurement?” Preprint. Bioinformatics, April 21, 2022. https://doi.org/10.1101/2022.04.20.488960.
-
ArchR - R package for processing and analyzing single-cell ATAC-seq data. Compared to Signac and SnapATAC, has more functionality, faster, handles large (>1M cells) data. Input - BAM files. Efficient HDF5-based storage allows for large dataset processing. Quality control, doublet detection (similar performance to Scrublet), genome-wide 500bp binning and peak identification, assignment to genes using best performing model, dimensionality reduction (optimized Latent Semantic Indexing, multiple iterations of LSI), clustering, overlap enrichment with a compendium of previously published ATAC-seq datasets, trajectory analysis (Slingshot and Monocle 3), integration with scRNA-seq data (Seurat functionality). Code to reproduce the paper, GitHub. Tweet 1, Tweet 2, Tweet 3
- Granja, Jeffrey M., M. Ryan Corces, Sarah E. Pierce, S. Tansu Bagdatli, Hani Choudhry, Howard Y. Chang, and William J. Greenleaf. “ArchR Is a Scalable Software Package for Integrative Single-Cell Chromatin Accessibility Analysis.” Nature Genetics, February 25, 2021
-
SnapATAC - scATAC-seq pipeline for processing, clustering, and motif identification. Snap file format. Genome is binned into equal-size (5kb) windows, binarized with 1/0 for ATAC reads present/absent, Jaccard similarity between cells, normalized to account for sequencing depth (observed over expected method, two others), PCA on the matrix, KNN graph and Louvain clustering to detect communities, tSNE or UMAP for visualization. Motif analysis and GREAT functional enrichment for each cluster. Nystrom algorithm to reduce dimensionality, ensemble approach. Outperforms ChromVAR, LSA, Cicero, cisTopic. Very fast, can be applied to ChIP-seq, scHi-C. SnapTools to work with snap files
- Fang, Rongxin, Sebastian Preissl, Yang Li, Xiaomeng Hou, Jacinta Lucero, Xinxin Wang, Amir Motamedi, et al. “Comprehensive Analysis of Single Cell ATAC-Seq Data with SnapATAC.” Nature Communications 12, no. 1 (February 1, 2021)
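The binarization and cell-to-cell Jaccard similarity steps described for SnapATAC can be illustrated with a toy matrix. This is a standard-library sketch of the idea only; it omits the depth normalization, PCA, and graph clustering steps, and the function names and counts are hypothetical rather than taken from the SnapATAC code.

```python
def binarize(counts):
    """Return, for each cell, the set of bin indices with at least one read."""
    return [{j for j, c in enumerate(row) if c > 0} for row in counts]


def jaccard_matrix(cell_bins):
    """Pairwise Jaccard similarity between cells given their open-bin sets."""
    n = len(cell_bins)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            union = len(cell_bins[i] | cell_bins[j])
            inter = len(cell_bins[i] & cell_bins[j])
            # Cells with no accessible bins get similarity 0 by convention here.
            value = inter / union if union else 0.0
            sim[i][j] = sim[j][i] = value
    return sim


# Toy cell-by-bin read counts (3 cells x 5 genomic bins); values are made up.
counts = [
    [2, 0, 1, 0, 0],
    [1, 0, 3, 0, 1],
    [0, 4, 0, 0, 0],
]
sim = jaccard_matrix(binarize(counts))
```

Because the similarity depends only on presence or absence of reads in a bin, cells with deeper sequencing tend to share more bins by chance, which is why a depth normalization step follows in the pipeline.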
-
scATAC-pro - pipeline for scATAC-seq mapping, QC, peak detection, clustering, TF and GO enrichment analysis, visualization (via VisCello). Compared with Scasat, Cellranger-atac.
- Yu, Wenbao, Yasin Uzun, Qin Zhu, Changya Chen, and Kai Tan. “ScATAC-pro: A Comprehensive Workbench for Single-Cell Chromatin Accessibility Sequencing Data.” Genome Biology 21, no. 1 (December 2020)
-
scOpen - estimating open chromatin status in scATAC-seq experiments, aka imputation/smoothing of extreme sparse matrices. Uses positive-unlabelled learning of matrices to estimate the probability that a region is open in a given cell. The probability matrix can be used as input for downstream analyses (clustering, visualization). Integrated with the footprint transcription factor activity score (scHINT). scOpen estimated matrices tested as input for scABC, chromVAR, cisTopic, Cicero, improve performance
- Li, Zhijian, Christoph Kuppe, Mingbo Cheng, Sylvia Menzel, Martin Zenke, Rafael Kramann, and Ivan G Costa. “ScOpen: Chromatin-Accessibility Estimation of Single-Cell ATAC Data.” Preprint. Bioinformatics, December 5, 2019
-
scABC, single-cell Accessibility Based Clustering - scATAC-seq clustering. Weights cells by a nonlinear transformation of library sizes, then, weighted K-medoids clustering. Input - single-cell mapped reads, and the full set of called peaks. Applied to experimental and synthetic scATAC-seq data, outperforms simple K-means-based clustering, SC3
- Zamanighomi, Mahdi, Zhixiang Lin, Timothy Daley, Xi Chen, Zhana Duren, Alicia Schep, William J. Greenleaf, and Wing Hung Wong. “Unsupervised Clustering and Epigenetic Classification of Single Cells.” Nature Communications 9, no. 1 (December 2018)
-
Cicero - connect distal regulatory elements with target genes (covariance-based, graphical Lasso to compute a regularized covariance matrix) along pseudotime-ordered (Monocle2 or 3) scATAC-seq data. Optionally, adjusts for batch covariates. Applied to the analysis of skeletal myoblast differentiation, sciATAC-seq. R package
- Pliner, Hannah A., Jonathan S. Packer, José L. McFaline-Figueroa, Darren A. Cusanovich, Riza M. Daza, Delasa Aghamirzaie, Sanjay Srivatsan, et al. “Cicero Predicts Cis-Regulatory DNA Interactions from Single-Cell Chromatin Accessibility Data.” Molecular Cell 71, no. 5 (September 2018)
-
ChromVAR - scATAC-seq analysis. Identifying peaks, get a matrix of counts across aggregated peaks, tSNE for clustering, identifying motifs. Integrated with Seurat.
- Schep, Alicia N, Beijing Wu, Jason D Buenrostro, and William J Greenleaf. “ChromVAR: Inferring Transcription-Factor-Associated Accessibility from Single-Cell Epigenomic Data.” Nature Methods 14, no. 10 (August 21, 2017)
-
scATAC-pro - A comprehensive tool for processing, analyzing and visualizing single cell chromatin accessibility sequencing data.
-
ShinyArchR.UiO - R Shiny app for scATAC-seq data analysis and visualization using ArchR. UMAP, clustering, integration with scRNA-seq, functional enrichment analysis. Runs locally. Demo server with a hematopoietic tutorial dataset.
Paper
Sharma, Ankush, Akshay Akshay, Marie Rogne, and Ragnhild Eskeland. “ShinyArchR.UiO: User-Friendly, Integrative and Open-Source Tool for Visualization of Single-Cell ATAC-Seq Data Using ArchR.” Edited by Can Alkan. Bioinformatics 38, no. 3 (January 12, 2022): 834–36. https://doi.org/10.1093/bioinformatics/btab680.
-
ChromSCape - Shiny/R application for single-cell epigenomic data visualization, clustering, differential peak analysis (Wilcoxon, edgeR), linking peaks to genes, pathway enrichment (hypergeometric on MSigDb). Wraps scater, scran, corrects for batch effect using fastMNN from batchelor, determines the optimal number of clusters with ConsensusClusterPlus (2-10 clusters). Input - BAM, BED files, or count matrix. Compared with Cusanovich2018, SnapATAC, cisTopic, EpiScanpy. Multiple datasets. Web demo, GitHub, Code for the paper
- Prompsy, Pacôme, Pia Kirchmeier, Justine Marsolier, Marc Deloger, Nicolas Servant, and Céline Vallot. “Interactive Analysis of Single-Cell Epigenomic Landscapes with ChromSCape.” Nature Communications, (December 2020)
-
cisTopic - R/Bioconductor package for probabilistic modelling of cis-regulatory topics from scATAC-seq. Topic modelling for identification of cell types, enhancers, transcription regulators. Binarizing chromatin accessibility matrix, Latent Dirichlet Allocation (LDA, collapsed Gibbs sampler) and model selection, cell state identification using the topic-cell distributions, explorations of the region-topic distributions.
Paper
Bravo González-Blas, Carmen, Liesbeth Minnoye, Dafni Papasokrati, Sara Aibar, Gert Hulselmans, Valerie Christiaens, Kristofer Davie, Jasper Wouters, and Stein Aerts. “CisTopic: Cis-Regulatory Topic Modeling on Single-Cell ATAC-Seq Data.” Nature Methods 16, no. 5 (May 2019): 397–400. https://doi.org/10.1038/s41592-019-0367-1.
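The topic-modelling idea behind cisTopic (binarized accessibility, LDA fitted with a collapsed Gibbs sampler) can be sketched as follows. This is a small standard-library illustration of generic collapsed Gibbs sampling for LDA, not the cisTopic implementation; the function name, the hyperparameters `alpha` and `beta`, the iteration count, and the toy data are all assumptions chosen for the example, and model selection is not shown.

```python
import random


def lda_collapsed_gibbs(cell_regions, n_regions, n_topics, n_iter=200,
                        alpha=0.5, beta=0.1, seed=0):
    """Collapsed Gibbs sampling for LDA on binarized accessibility.

    cell_regions: list of lists; each inner list holds the indices of the
    regions accessible in that cell (regions play the role of words, cells
    the role of documents).
    Returns (theta, phi): topic-cell and region-topic probability matrices.
    """
    rng = random.Random(seed)
    n_cells = len(cell_regions)
    cell_topic = [[0] * n_topics for _ in range(n_cells)]
    topic_region = [[0] * n_regions for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Random initial topic assignment for every accessible (cell, region).
    assign = []
    for c, regions in enumerate(cell_regions):
        z_c = []
        for r in regions:
            z = rng.randrange(n_topics)
            z_c.append(z)
            cell_topic[c][z] += 1
            topic_region[z][r] += 1
            topic_total[z] += 1
        assign.append(z_c)

    for _ in range(n_iter):
        for c, regions in enumerate(cell_regions):
            for idx, r in enumerate(regions):
                z = assign[c][idx]
                # Remove the current assignment from the counts.
                cell_topic[c][z] -= 1
                topic_region[z][r] -= 1
                topic_total[z] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (cell_topic[c][t] + alpha)
                    * (topic_region[t][r] + beta)
                    / (topic_total[t] + n_regions * beta)
                    for t in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assign[c][idx] = z
                cell_topic[c][z] += 1
                topic_region[z][r] += 1
                topic_total[z] += 1

    # Smoothed point estimates from the final state of the sampler.
    theta = [
        [(n + alpha) / (sum(row) + n_topics * alpha) for n in row]
        for row in cell_topic
    ]
    phi = [
        [(n + beta) / (topic_total[t] + n_regions * beta)
         for n in topic_region[t]]
        for t in range(n_topics)
    ]
    return theta, phi


# Toy data: cells 0-1 open regions 0-2, cells 2-3 open regions 3-5.
cells = [[0, 1, 2], [0, 1, 2], [3, 4, 5], [3, 4, 5]]
theta, phi = lda_collapsed_gibbs(cells, n_regions=6, n_topics=2)
```

Each row of `theta` is a cell's distribution over topics (usable for clustering or visualization of cell states), and each row of `phi` is a topic's distribution over regions (usable for exploring candidate regulatory regions).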
- scOpen - imputation for scATAC-seq data. Regularized NMF via a coordinate descent algorithm on binarized, TF-IDF-transformed ATAC-seq matrix. Tested on simulated (Chen et al. 2019) and four public scATAC-seq datasets against MAGIC, SAVER, scImpute, DCA, scBFA, cisTopic, SCALE, and PCA. Improves recovery of true open chromatin regions, clustering (ARI, silhouette), reduces memory footprint, fast. Improves the performance of downstream state-of-the-art scATAC-seq methods (cisTopic, chromVAR, Cicero). Applied to kidney fibrosis scATAC-seq data, Runx1 discovery. Scripts to reproduce analyses
- Li, Zhijian, Christoph Kuppe, Susanne Ziegler, Mingbo Cheng, Nazanin Kabgani, Sylvia Menzel, Martin Zenke, Rafael Kramann, and Ivan G. Costa. "Chromatin-accessibility estimation of single-cell ATAC data with scOpen." bioRxiv (2021)
- Review of single-cell multi-omics (scATAC-seq, scRNA-seq) integration principles, methods, and tools. Integration of matched and unmatched data, annotated group matching, matching with common features, aligning spaces. Quantitative causal modeling, statistical modeling, latent space inference, consensus of individual inferences (late integration). Integrating multimodal (jointly profiled) omics data. Brief description of technologies, tools. Visualization of multi-omics data, challenges. Table 1 - tools for matched data analysis, with links.
Paper
Miao, Zhen, Benjamin D. Humphreys, Andrew P. McMahon, and Junhyong Kim. “Multi-Omics Integration in the Age of Million Single-Cell Data.” Nature Reviews Nephrology 17, no. 11 (November 2021): 710–24. https://doi.org/10.1038/s41581-021-00463-x.
-
MUON - multimodal data structure to store and compute on multi-omics data. Meta-data can be object-specific or shared. Implemented in Python, MuData objects stored in HDF5 files. Includes Scanpy (omics data handling), MOFA+ (multi-omics factor analysis), neighbor graph analysis methods, visualization using matplotlib and seaborn. Examples on scRNA- and scATAC-seq PBMC data, others. Documentation. Tutorials web, GitHub. Interfaces for R, Julia
- Bredikhin, Danila, Ilia Kats, and Oliver Stegle. “Muon: Multimodal Omics Analysis Framework.” Preprint. Genomics, June 1, 2021.
-
JVis, j-SNE and j-UMAP - joint visualization and clustering of multimodal omics data. Goal is to arrange points (here cells) in low-dimensional space such that similarities observed between points in high-dimensional space are preserved, but in all modalities at the same time. Python implementation. Tweet
- Do, Van Hoan, and Stefan Canzar. “A Generalization of T-SNE and UMAP to Single-Cell Multimodal Omics.” Preprint. Bioinformatics, January 10, 2021
-
MAESTRO - Model-based AnalysEs of Single-cell Transcriptome and RegulOme. Full pipeline for the integrative analysis of scRNA-seq and scATAC-seq data, wraps external tools (STARsolo/minimap2, RSeQC, MACS2, Seurat for normalization, LISA, GIGGLE). From preprocessing, alignment, QC (technology-specific), expression/accessibility quantification to clustering (graph-based and density-based), differential analysis, cell type annotation (CIBERSORT, brain cell signatures), transcription regulator inference (regulatory potential model, using CistromeDB data), integration/cell label transfer (Canonical Correlation Analysis). Handles data from various platforms (with/without barcodes). Outperforms SnapATAC, Cicero, Seurat. Snakemake workflow, HDF5 data format, Conda installation. Tweet.
- Wang, Chenfei, Dongqing Sun, Xin Huang, Changxin Wan, Ziyi Li, Ya Han, Qian Qin, et al. “Integrative Analyses of Single-Cell Transcriptome and Regulome Using MAESTRO.” Genome Biology 21, no. 1 (December 2020)
-
Signac is an extension of Seurat for the analysis, interpretation, and exploration of single-cell chromatin datasets, and integration with scRNA-seq. ChromatinAssay object class, Latent Semantic Indexing and the modified TF-IDF procedure for dimensionality reduction. Applied to the PBMC 10X multiomics dataset and the Brain Initiative Cell Census Network data. Also, the Sinto Python package for processing aligned single-cell data
- Stuart, Tim, Avi Srivastava, Caleb Lareau, and Rahul Satija. “Single-cell chromatin state analysis with Signac,” Nature Methods, 01 November 2021
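The TF-IDF step that precedes Latent Semantic Indexing can be sketched on a toy count matrix. This shows one common log-scaled variant (term frequency per cell, inverse document frequency per peak, a scale factor, then `log1p`); it is an assumption for illustration and is not claimed to reproduce Signac's modified procedure. The function name and the `scale_factor` default are hypothetical. LSI would then apply a truncated SVD to the resulting matrix.

```python
import math


def tf_idf(counts, scale_factor=1e4):
    """Log-scaled TF-IDF of a cell-by-peak count matrix (list of rows)."""
    n_cells = len(counts)
    n_peaks = len(counts[0]) if counts else 0
    # Total counts per cell (term-frequency denominator).
    cell_totals = [sum(row) for row in counts]
    # Total counts per peak across cells (used for the IDF weight).
    peak_totals = [sum(row[j] for row in counts) for j in range(n_peaks)]

    out = []
    for row, total in zip(counts, cell_totals):
        new_row = []
        for j, c in enumerate(row):
            if c == 0 or total == 0:
                # Zero entries stay zero, preserving sparsity.
                new_row.append(0.0)
                continue
            tf = c / total
            idf = n_cells / peak_totals[j]
            new_row.append(math.log1p(tf * idf * scale_factor))
        out.append(new_row)
    return out


# Toy cell-by-peak counts (2 cells x 2 peaks); values are made up.
weighted = tf_idf([[1, 0], [1, 1]])
```

The transform down-weights peaks that are open in most cells and corrects for per-cell depth, so the leading SVD components are less dominated by sequencing depth and ubiquitous peaks.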
-
UnionCom - integration of multi-omics single-cell data using unsupervised topological alignment. Based on GUMA (generalized unsupervised manifold alignment) algorithm. Three steps: 1) embedding each single-cell dataset into the geometric distance matrix; 2) Align distance matrices; 3) Project unmatched features onto common embedding space. Tested on simulated and experimental data (sc-GEM, scNMT). Neighborhood overlap metric for testing, outperforms Seurat, MMD-MA, scAlign.
- Cao, Kai, Xiangqi Bai, Yiguang Hong, and Lin Wan. “Unsupervised Topological Alignment for Single-Cell Multi-Omics Integration.” Preprint. Bioinformatics, February 3, 2020.
-
scAI - integrative analysis of scRNA-seq and scATAC-seq or scMethylation data measured from the same cells (in contrast to different measures sampled from the same cell population). Overview of multi-omics single-cell technologies, methods for data integration in bulk samples and single-cell samples (MATCHER, Seurat, LIGER), sparsity (scATAC-seq is ~99% sparse and nearly binary). Deconvolution of both single-cell matrices into gene loading and locus loading matrices, a cell loading matrix, in which factors K correspond to loadings of gene, locus, and cell in the K-dimensional space. A strategy to reduce over-aggregation. Cell subpopulations identified by Leiden clustering of the cell loading matrix. Visualization of the low-rank matrices with the Sammon mapping. Multi-omics simulation using MOSim, eight scenarios of simulated data, AUROC and Normalized Mutual Information (NMI) assessment of matrix reconstruction quality. Compared with MOFA, Seurat, LIGER. Tested on 8837 mammalian kidney cells scRNA-seq and scATAC-seq data, 77 mouse ESCs scRNA-seq and scMethylation, interpretation.
- Jin, Suoqin, Lihua Zhang, and Qing Nie. “ScAI: An Unsupervised Approach for the Integrative Analysis of Parallel Single-Cell Transcriptomic and Epigenomic Profiles.” Genome Biology 21, no. 1 (December 2020)
-
Harmony - scRNA-seq integration by projecting datasets into a shared embedding where differences between cell clusters are maximized while differences between datasets of origin are minimized, i.e., the focus is on clusters driven by cell differences. Can account for technical and biological covariates. Can integrate scRNA-seq datasets obtained with different technologies, or scRNA- and scATAC-seq, scRNA-seq with spatially-resolved transcriptomics. Local inverse Simpson Index (LISI) to test for dataset- and cell-type-specific clustering. Outperforms MNN, BBKNN, MultiCCA, Scanorama. Memory-efficient, fast, scales to large datasets, included in Seurat. Python version
- Korsunsky, Ilya, Nghia Millard, Jean Fan, Kamil Slowikowski, Fan Zhang, Kevin Wei, Yuriy Baglaenko, Michael Brenner, Po-ru Loh, and Soumya Raychaudhuri. “Fast, Sensitive and Accurate Integration of Single-Cell Data with Harmony.” Nature Methods 16, no. 12 (December 2019)
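The idea behind the local inverse Simpson index can be sketched with a simplified, unweighted version: for each cell, take its k nearest neighbors in the embedding and compute the effective number of distinct labels among them. This is an assumption-laden simplification (plain k-NN with equal weights instead of the distance-weighted neighborhoods of the published metric), and the function names and toy data are hypothetical.

```python
import math


def inverse_simpson(labels):
    """Effective number of distinct labels: 1 / sum_b p_b^2 (labels non-empty)."""
    n = len(labels)
    freqs = {}
    for lab in labels:
        freqs[lab] = freqs.get(lab, 0) + 1
    return 1.0 / sum((c / n) ** 2 for c in freqs.values())


def knn_lisi(embedding, labels, k):
    """Unweighted inverse Simpson index over each cell's k nearest neighbors.

    Requires 1 <= k <= len(embedding) - 1; the cell itself is excluded.
    """
    scores = []
    for i, xi in enumerate(embedding):
        dists = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(embedding) if j != i
        )
        neighbor_labels = [labels[j] for _, j in dists[:k]]
        scores.append(inverse_simpson(neighbor_labels))
    return scores


# Toy 2D embedding with a dataset-of-origin label per cell (values made up).
embedding = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
dataset = ["A", "B", "A", "B", "B", "B"]
scores = knn_lisi(embedding, dataset, k=2)
```

With dataset labels, a score near the number of datasets indicates a well-mixed neighborhood; with cell-type labels, a score near 1 indicates that neighbors share the same cell type.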
-
Seurat v.3 paper. Integration of multiple scRNA-seq and other single-cell omics (spatial transcriptomics, scATAC-seq, immunophenotyping), including batch correction. Anchors as reference to harmonize multiple datasets. Canonical Correlation Analysis (CCA) coupled with Mutual Nearest Neighbors (MNN) to identify shared subpopulations across datasets. CCA to reduce dimensionality, search for MNN in the low-dimensional representation. Shared Nearest Neighbor (SNN) graphs to assess similarity between two cells. Outperforms scmap. Extensive validation on multiple datasets (Human Cell Atlas, STARmap mouse visual cortex spatial transcriptomics, Tabula Muris, 10X Genomics datasets, others in STAR methods). Data normalization, variable feature selection within- and between datasets, anchor identification using CCA (methods), their scoring, batch correction, label transfer, imputation. Methods correspond to details of each Seurat function. Preprocessing of real single-cell data.
- Stuart, Tim, Andrew Butler, Paul Hoffman, Christoph Hafemeister, Efthymia Papalexi, William M Mauck, Marlon Stoeckius, Peter Smibert, and Rahul Satija. “Comprehensive Integration of Single Cell Data.” Preprint. Genomics, November 2, 2018.
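The mutual nearest neighbor search used to find anchors can be sketched once both datasets are in a shared low-dimensional space. This standard-library illustration assumes the coordinates are already available (in the paper they come from CCA) and shows only the pairing rule; anchor scoring and filtering are not included, and the function names and toy points are hypothetical.

```python
import math


def nearest_neighbors(query, reference, k):
    """Set of indices of the k nearest reference points for each query point."""
    result = []
    for q in query:
        order = sorted(
            range(len(reference)), key=lambda j: math.dist(q, reference[j])
        )
        result.append(set(order[:k]))
    return result


def mutual_nearest_neighbors(a, b, k):
    """Pairs (i, j) where b[j] is among a[i]'s k-NN in b and vice versa."""
    a_to_b = nearest_neighbors(a, b, k)
    b_to_a = nearest_neighbors(b, a, k)
    return sorted(
        (i, j) for i in range(len(a)) for j in a_to_b[i] if i in b_to_a[j]
    )


# Toy coordinates of cells from two datasets in a shared 2D space (made up).
dataset_a = [(0.0, 0.0), (5.0, 5.0)]
dataset_b = [(0.1, 0.0), (5.0, 5.2), (9.0, 9.0)]
anchors = mutual_nearest_neighbors(dataset_a, dataset_b, k=1)
```

Requiring the neighbor relation to hold in both directions keeps cells that exist in only one dataset (here the third cell of `dataset_b`) from being forced into a pair.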
-
Single-cell ATAC + RNA co-assay methods - overview of technologies and protocols, references to the original papers
-
Multi-omics methods - Table 1 from Sierant, Michael C., and Jungmin Choi. “Single-Cell Sequencing in Cancer: Recent Applications to Immunogenomics and Multi-Omics Tools.” Genomics & Informatics 16, no. 4 (December 2018)
-
Spatial scATAC-seq technology. Integrates transposase-accessible chromatin profiling in tissue sections with barcoded solid-phase capture to perform spatially resolved epigenomics. Highly concordant with single-nucleus snATAC-seq. Applied to three stages of mouse embryonic development. Enables discovery of regulatory programs via clustering (TFIDF) integration with Visium spatial scRNA-seq. Preprocessing with 10X Genomics’ CellRanger ATAC pipeline (v.2.0.0), STutility R package, ArchR. GSE214991 - spatial ATAC data matrix, Github.
Paper
Llorens-Bobadilla, Enric, Margherita Zamboni, Maja Marklund, Nayanika Bhalla, Xinsong Chen, Johan Hartman, Jonas Frisén, and Patrik L. Ståhl. “Solid-Phase Capture and Profiling of Open Chromatin by Spatial ATAC.” Nature Biotechnology, January 5, 2023. https://doi.org/10.1038/s41587-022-01603-9.
- mtscATAC-seq - mitochondrial scATAC-seq for mtDNA mutation calling and/using mitochondrial chromatin accessibility. Inference of mtDNA heteroplasmy (two or more variants in the same cell), clonal relationships, cell state, chromatin accessibility variation. 10X Genomics, processing whole cells without depleting mitochondria. Computational approach to map reads aligning to NUMTs in the nuclear genome to mtDNA. About 1% reads from NUMTs would be detected, unlikely to confound. Applied to GM11906. Developed the Mitochondrial Genome Analysis Toolkit mgatk to identify clonal substructure in mtscATAC-seq data, variants annotated using MITOMAP database. GSE142745 - mtscATAC-seq data for several studies. GitHub - scripts to reproduce analyses.
Paper
Lareau, Caleb A., Leif S. Ludwig, Christoph Muus, Satyen H. Gohil, Tongtong Zhao, Zachary Chiang, Karin Pelka, et al. “Massively Parallel Single-Cell Mitochondrial DNA Genotyping and Chromatin Profiling.” Nature Biotechnology 39, no. 4 (April 2021): 451–61. https://doi.org/10.1038/s41587-020-0645-6.
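Per-cell heteroplasmy at a single mtDNA position is commonly summarized as the fraction of reads carrying the alternate allele. The sketch below illustrates that calculation only; it is not the mgatk implementation, and the function name, the `min_depth` coverage filter, and the read counts are assumptions made for the example.

```python
def heteroplasmy(alt_counts, ref_counts, min_depth=10):
    """Per-cell alternate-allele fraction; None where coverage is too low."""
    # Guard against division by zero even if min_depth is set to 0.
    threshold = max(min_depth, 1)
    fractions = []
    for alt, ref in zip(alt_counts, ref_counts):
        depth = alt + ref
        fractions.append(alt / depth if depth >= threshold else None)
    return fractions


# Hypothetical read counts for one mtDNA variant across three cells.
alt = [5, 0, 2]
ref = [15, 20, 1]
fractions = heteroplasmy(alt, ref)
```

Cells with similar heteroplasmy profiles across many variants can then be grouped to suggest clonal relationships, which is the kind of substructure the toolkit above is designed to identify.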
- scGET-seq, single-cell genome and epigenome by transposases sequencing technology, uses a hybrid transposase treatment including the canonical Tn5 and TnH recognizing the chromodomain of the heterochromatin protein-1a (HP-1a) that maintains heterochromatin by binding to H3K9me3. Each transposase is differentially barcoded. Probes both open and closed chromatin, better resolves CNVs than scATAC-seq. scGET-seq in NIH-3T3 cells before and after Kdm5c histone demethylase knockdown (impairs H3K9me3 deposition). Chromatin Velocity method that identifies the trajectories of epigenetic modifications. Data on Array Express, scGET analysis scripts and scatACC for custom scATAC analysis.
Paper
Tedesco, Martina. “Chromatin Velocity Reveals Epigenetic Dynamics by Single-Cell Profiling of Heterochromatin and Euchromatin.” Nature Biotechnology, 11 October 2021, https://doi.org/10.1038/s41587-021-01031-1
- SHARE-seq - simultaneous profiling of scRNA-seq and sc-ATAC-seq from the same cells. Built upon SPLiT-seq, a combinatorial indexing method. Confirmed by separate scRNA-seq and scATAC-seq datasets. Chromatin opening precedes transcriptional activation.
Paper
Ma, Sai, Bing Zhang, Lindsay M. LaFave, Andrew S. Earl, Zachary Chiang, Yan Hu, Jiarui Ding, et al. “Chromatin Potential Identified by Shared Single-Cell Profiling of RNA and Chromatin.” Cell 183, no. 4 (November 2020): 1103-1116.e20. https://doi.org/10.1016/j.cell.2020.09.056.
- dscATAC-seq (droplet single-cell assay for transposase-accessible chromatin using sequencing), with combinatorial indexing (dsciATAC-seq, cells are combinatorially barcoded, multiple cells per droplet). After Tn5 transposing (increased concentration), intact nuclei are isolated into droplets. Increased library size, complexity (chromVAR), proportion of TSS/nuclear fragments, high human/mouse specificity. tSNE clustering using the latent semantic indexing (LSI), better resolved clusters, uncorrelated with technical batches. Applied to (1) a reference map of chromatin accessibility in the mouse brain (46,653 cells) and (2) an unbiased map of human hematopoietic states in the bone marrow (60,495 cells), isolated cell populations from bone marrow and blood (52,873 cells), and bone marrow cells in response to stimulation (75,958 cells). Data at GSE123581. Analysis code, computational pipeline BAP.
Paper
Lareau, Caleb A., Fabiana M. Duarte, Jennifer G. Chew, Vinay K. Kartha, Zach D. Burkett, Andrew S. Kohlway, Dmitry Pokholok, et al. “Droplet-Based Combinatorial Indexing for Massive-Scale Single-Cell Chromatin Accessibility.” Nature Biotechnology 37, no. 8 (August 2019): 916–24. https://doi.org/10.1038/s41587-019-0147-6.
- CATLAS, Cis-element ATLAS - sciATAC-seq on 25 human tissue types, approx. 500,000 nuclei, over 750,000 candidate cis-regulatory elements (cCREs) in 54 distinct cell types. Cell- and tissue-specific gene regulatory programs. Analysis of noncoding variant effect on TF binding sites (deltaSVM model, 460 TFs affected, 302 likely causal GWAS variants prioritized). Downloadable data, hg38 coordinates of cCREs, chromatin accessibility matrices aggregated as cell x cCRE, cell x gene (promoter), cell metadata, ontology, UMAP embeddings, bigWig tracks, cCRE to gene linkage data predicted by the Activity-By-Contact (ABC) model. README
Paper
Zhang, Kai, James D. Hocker, Michael Miller, Xiaomeng Hou, Joshua Chiou, Olivier B. Poirion, Yunjiang Qiu, et al. “A Single-Cell Atlas of Chromatin Accessibility in the Human Genome.” Cell, November 2021, S0092867421012794. https://doi.org/10.1016/j.cell.2021.10.024.
- Single-cell epigenomic identification of inherited risk loci in Alzheimer’s and Parkinson’s disease. scATAC-seq data integrated with published HiChIP data. GitHub. GEO GSE147672 - processed scATAC-seq data, BED, bigWig, SummarizedExperiment. WashU session drS3o1n4kJ with scATAC clusters, cell types, neuron subclusters and cell types. Supplementary data with scATAC-seq peaks, neuronal cluster definitions, differential accessibility.
Paper
Corces, M. Ryan, Anna Shcherbina, Soumya Kundu, Michael J. Gloudemans, Laure Frésard, Jeffrey M. Granja, Bryan H. Louie, et al. “Single-Cell Epigenomic Analyses Implicate Candidate Causal Variants at Inherited Risk Loci for Alzheimer’s and Parkinson’s Diseases.” Nature Genetics 52, no. 11 (November 2020): 1158–68. https://doi.org/10.1038/s41588-020-00721-x.
- scRNA-seq and scATAC-seq of normal mammary epithelial cells (MECs, mouse). 4 main clusters, their characteristics. Trajectory analysis, regulatory modules and TFs. Seurat/Signac, Monocle, Cicero, cisTopic, chromVAR, Homer. Processed data: GSE157890.
Paper
Pervolarakis, Nicholas, Quy H. Nguyen, Justice Williams, Yanwen Gong, Guadalupe Gutierrez, Peng Sun, Darisha Jhutty, et al. “Integrated Single-Cell Transcriptomics and Chromatin Accessibility Analysis Reveals Regulators of Mammary Epithelial Cell Identity.” Cell Reports 33, no. 3 (October 2020): 108273. https://doi.org/10.1016/j.celrep.2020.108273.
- Single-cell ATAC-seq, ~100,000 single cells from 13 adult mouse tissues. Two sequencing platforms, good concordance. Filtered data assigned into 85 clusters. Genes associated with the corresponding ATAC sites (Cicero for identification). Differential accessibility. Motif enrichment (Basset CNN). GWAS results enrichment. All data and metadata are available for download as text or rds format
- Cusanovich, Darren A., Andrew J. Hill, Delasa Aghamirzaie, Riza M. Daza, Hannah A. Pliner, Joel B. Berletch, Galina N. Filippova, et al. “A Single-Cell Atlas of In Vivo Mammalian Chromatin Accessibility.” Cell, (August 2018)
-
sciATAC-seq protocols on frozen tissues, protocol 1 from Hocker, James D., et al. "Cardiac cell type–specific gene regulatory programs and disease risk association." Science advances 7.20 (2021), protocol 2 from Preissl, Sebastian, et al. "Single-nucleus analysis of accessible chromatin in developing mouse forebrain reveals cell-type-specific transcriptional regulation." Nature neuroscience, (2018)
-
scATAC-seq lectures by Ming Tang, YouTube. scATAC-seq technique, 12 min, Preprocessing and QC, 19 min, Analysis, 14 min, scATAC-scRNA-seq integration, 24 min