A collection of the library structures and sequences of popular single-cell genomic methods (mainly scRNA-seq).
Make sure you understand the basic configuration of the Illumina libraries, because most single cell sequencing methods are developed to be sequenced on the Illumina platforms. If you are not familiar with the Illumina sequencing libraries, click here to check some general information about Illumina library structures and the nature of library preparation.
The HTML pages listed below contain step-by-step procedures of how the libraries are generated experimentally. For the computational preprocessing pipelines for each method, please see this accompanying ReadTheDocs documentation. For the machine-readable format of the library structure, check seqspec.
Click the following links to view the methods. Notes:
- Index1 (i7) is always sequenced using the bottom strand as template, regardless of the Illumina machine in use. That is why the index sequences are reverse complementary to the primer sequences.
- IMPORTANT: In a dual-index library, how index2 (i5) is sequenced differs between machines. According to the Index Sequencing Guide from Illumina, MiSeq, HiSeq 2000/2500, MiniSeq (Rapid) and NovaSeq 6000 (v1.0) use the bottom strand as template (Forward Strand Workflow), which is why the index sequences are the same as the primer sequences on those machines. iSeq 100, MiniSeq (Standard), NextSeq, HiSeq X, HiSeq 3000/4000 and NovaSeq 6000 (v1.5) use the top strand as template (Reverse Complement Workflow), which is why the index sequences are reverse-complementary to the primer sequences on those machines. All methods listed below use iSeq 100, MiniSeq (Standard), NextSeq, HiSeq X, HiSeq 3000/4000 and NovaSeq 6000 (v1.5) as examples, because this configuration is more frequently used nowadays.
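
The two notes above can be sketched in a few lines of Python. This is only an illustration: the function names and the 8-bp example sequence are hypothetical, and the platform grouping is copied from the notes rather than from any Illumina software.

```python
# Platform groups as listed in the notes above (Illumina Index Sequencing Guide).
FORWARD_STRAND = {
    "MiSeq", "HiSeq 2000/2500", "MiniSeq (Rapid)", "NovaSeq 6000 (v1.0)",
}
REVERSE_COMPLEMENT = {
    "iSeq 100", "MiniSeq (Standard)", "NextSeq", "HiSeq X",
    "HiSeq 3000/4000", "NovaSeq 6000 (v1.5)",
}

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def i7_read(primer_index: str) -> str:
    """Index1 (i7) as read by the sequencer: always the reverse
    complement of the index portion of the primer."""
    return reverse_complement(primer_index)


def i5_read(primer_index: str, platform: str) -> str:
    """Index2 (i5) as read by the sequencer, which depends on the workflow."""
    if platform in FORWARD_STRAND:
        return primer_index.upper()
    if platform in REVERSE_COMPLEMENT:
        return reverse_complement(primer_index)
    raise ValueError(f"Unknown platform: {platform!r}")


# Hypothetical index portion of an i5 primer, for illustration only.
example = "AAGGCTAT"
print(i5_read(example, "MiSeq"))    # same as the primer sequence
print(i5_read(example, "NextSeq"))  # reverse complement of the primer sequence
```

When filling in a sample sheet, the same check applies: decide which workflow the machine uses before entering the i5 sequence.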
-
- SMART-seq family (including SMART-seq, SMART-seq2/3/3xpress and FLASH-seq)
- STRT-seq family (including STRT-seq, STRT-seq-C1 and STRT-seq-2i)
- sci-RNA-seq family (including sci-RNA-seq and sci-RNA-seq3)
- Quartz-seq family (including Quartz-seq and Quartz-seq2)
- CEL-seq family (including CEL-seq and CEL-seq2)
- 10x Chromium Single Cell 3' FeatureBarcoding
- 10x Chromium Single Cell 3' GE V2 - V4
- 10x Chromium Single Cell 3' GE V1
- 10x Chromium Single Cell 5' VDJ
- 10x Chromium Single Cell 5' GE
- SureCell 3' WTA for ddSEQ
- MARS-seq / MARS-seq2.0
- SCRB-seq / mcSCRB-seq
- SPLiT-seq / microSPLiT
- Drop-seq / Seq-Well
- scifi-RNA-seq
- Microwell-seq
- BD Rhapsody
- HyDrop-RNA
- Seq-Well S3
- Tang 2009
- PETRI-seq
- VASA-seq
- FIPRESCI
- PIP-seq
- inDrop
-
- Trac-looping
- MATQ-seq
- ASTAR-seq
- Drop scChIP-seq
- sci-Plex
- sci-CAR-seq
- snmC2T-seq
- MALBAC-DT
- GRID-seq
- ZipSeq
- scCC
- scSPRITE
- TEA-seq/ICICLE-seq
- hsrChST-seq
- TIP-seq
- ECCITE-seq
- ASAP-seq/DOGMA-seq
- PHAGE-ATAC
- Spatial-ATAC-seq
- Spatial C&T
- scTEM-seq
- DBiT-seq
- scGET-seq
- Multi-CUT&Tag
- MulTI-Tag
- TIME-seq
- scSPLAT
- EpiDamID
- scRibo-seq
- sc-end5-seq
- Slide-seq / Slide-seqV2 / Slide-DNA-seq / Slide-tags
- CoTECH
- PairedTag
- GoT-ChA
- Methyl-HiC
- SNuBar-ATAC
- RAISIN RNA-seq & MIRACL-seq
- Microbe-seq
- SEC-seq
- scONE-seq
- BacDrop
- SPEAC-seq
- DisCo
- spinDrop
- sciPlex-ATAC-seq
- SCITO-seq
- snRandom-seq
- LAST-seq
- GAGE-seq
- scCARE-seq
- HiRES
- LiMCA
- nano-CT
- NTT-seq
- BuTT-seq
- M3-seq
- inDrops-2
- scRCAT-seq
- Phospho-seq
- RamDA-seq
- SIMPLE-seq
- scTAPS
- MUSIC
- Direct-seq
- Strand-seq
- Drop-BS
- CROPseq-multi
- ChAIR
- SPATAC-seq
- snapTotal-seq
- easySHARE-seq
- EasySci
- OAK
- BAG DNA RNA
- CAP-seq
- wellDA-seq
- microSPLiT <- need to check updates
- ProBac-seq
The basic chemistry is very similar across these scRNA-seq methods. The main differences among them are summarised in the table below. For a detailed discussion, check the text boxes from our review: From Tissues to Cell Types and Back: Single-Cell Gene Expression Analysis of Tissue Architecture.
Method | Single cell isolation/capture | Where RT happens | 2nd strand synthesis | Full-length cDNA synthesis | Barcode addition | Pooling before library | Library amplification | Gene coverage |
---|---|---|---|---|---|---|---|---|
10x Chromium Single Cell 3' | Droplet | In droplets | TSO | Yes | Barcoded RT primers | Yes | PCR | 3' |
10x Chromium Single Cell 5' | Droplet | In droplets | TSO | Yes | Barcoded TSO primers | Yes | PCR | 5' |
BD Rhapsody | Nanowells | In collection tubes | Random priming and primer extension | No | Barcoded RT primers | Yes | PCR | 3' |
CEL-seq/CEL-seq2 | FACS | In 96w/384w wells | RNase H and DNA pol I | No | Barcoded RT primers | Yes | In vitro transcription | 3' |
Drop-seq | Droplet | In collection tubes | TSO | Yes | Barcoded RT primers | Yes | PCR | 3' |
Illumina Bio-Rad SureCell 3' WTA | Droplet | In droplets | RNase H and DNA pol I | No | Barcoded RT primers | Yes | PCR | 3' |
inDrop | Droplet | In droplets | RNase H and DNA pol I | No | Barcoded RT primers | Yes | In vitro transcription | 3' |
MARS-seq/MARS-seq2.0 | FACS | In 96w/384w wells | RNase H and DNA pol I | No | Barcoded RT primers | Yes | In vitro transcription | 3' |
Microwell-seq | Nanowells | In collection tubes | TSO | Yes | Barcoded RT primers | Yes | PCR | 3' |
Quartz-seq | FACS | In 96w/384w wells | PolyA tailing and primer ligation | Yes in principle | Ligation of barcoded Truseq adapters | No | PCR | 3' |
Quartz-seq2 | FACS | In 96w/384w wells | PolyA tailing and primer ligation | Yes in principle | Barcoded RT primers | Yes | PCR | 3' |
sci-RNA-seq | Not needed | In situ | RNase H and DNA pol I | No | Barcoded RT primers and library PCR with barcoded primers | Yes | PCR | 3' |
sci-RNA-seq3 | Not needed | In situ | RNase H and DNA pol I | No | Barcoded RT primers and hairpin adapters | Yes | PCR | 3' |
scifi-RNA-seq | Droplet multiple cells | In situ | TSO | Yes | Barcoded RT primers and gel bead barcodes | Yes | PCR | 3' |
SCRB-seq/mcSCRB-seq | FACS | In 96w/384w wells | TSO | Yes | Barcoded RT primers | Yes | PCR | 3' |
Seq-Well | Nanowells | In collection tubes | TSO | Yes | Barcoded RT primers | Yes | PCR | 3' |
Seq-Well S3 | Nanowells | In collection tubes | Random priming and primer extension | No | Barcoded RT primers | Yes | PCR | 3' |
SMART-seq/SMART-seq2/SMART-seq3 | FACS or Fluidigm C1 | In 96w/384w wells | TSO | Yes | Library PCR with barcoded primers | No | PCR | full-length |
SPLiT-seq | Not needed | In situ | TSO | Yes | Ligation of barcoded RT primers | Yes | PCR | 3' |
STRT-seq | FACS | In 96w/384w wells | TSO | Yes | Barcoded TSO primers | Yes | PCR | 5' |
STRT-seq-C1 | Fluidigm C1 | In microfluidic chambers | TSO | Yes | Barcoded Tn5 transposase | No | PCR | 5' |
STRT-seq-2i | FACS or dilution | In 9600w wells | TSO | Yes | Barcoded PCR primers and Tn5 transposase | Yes | PCR | 5' |
Tang 2009 | FACS or manual | In 96w/384w wells | PolyA tailing and primer extension | Yes in principle | Ligation of barcoded adaptors | No | PCR | Biased to 3' |
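
For readers who want to filter the table programmatically, here is a minimal sketch that encodes a handful of rows as records. The `Method` class, its field names and the `select` helper are hypothetical, and only a few columns and rows are copied from the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Method:
    name: str
    isolation: str
    second_strand: str
    amplification: str
    coverage: str


# A subset of rows copied from the table above (not the full table).
METHODS = [
    Method("10x Chromium Single Cell 3'", "Droplet", "TSO", "PCR", "3'"),
    Method("10x Chromium Single Cell 5'", "Droplet", "TSO", "PCR", "5'"),
    Method("CEL-seq/CEL-seq2", "FACS", "RNase H and DNA pol I",
           "In vitro transcription", "3'"),
    Method("inDrop", "Droplet", "RNase H and DNA pol I",
           "In vitro transcription", "3'"),
    Method("SMART-seq/SMART-seq2/SMART-seq3", "FACS or Fluidigm C1", "TSO",
           "PCR", "full-length"),
    Method("STRT-seq", "FACS", "TSO", "PCR", "5'"),
]


def select(methods, **criteria):
    """Return the names of methods whose fields match every criterion."""
    return [
        m.name
        for m in methods
        if all(getattr(m, field) == value for field, value in criteria.items())
    ]


print(select(METHODS, amplification="In vitro transcription"))
print(select(METHODS, coverage="5'", second_strand="TSO"))
```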
This is basically Table 1 from our scATAC-seq protocol: A plate-based single-cell ATAC-seq workflow for fast and robust profiling of chromatin accessibility
Method | Tn5 and adaptors | Starting cell number | Tagmentation | Single-cell/nucleus isolation | Library amplification | Barcode addition | Throughput |
---|---|---|---|---|---|---|---|
sci-ATAC-seq/snATAC-seq | Custom-made | 500,000+ | Bulk | FACS or dilution | PCR | Tn5 + PCR barcodes | 10,000 |
scTHS-seq | Custom-made | 500,000+ | Bulk | FACS or dilution | IVT and PCR | Tn5 + PCR barcodes | 10,000 |
Plate_scATAC-seq and Pi-ATAC-seq | Nextera | 5,000+ | Bulk | FACS | PCR | PCR barcodes | 1,000 |
Fluidigm C1 | Nextera | 4,000-20,000 | Single cells | Microfluidics | PCR | PCR barcodes | 100 |
Takara ICELL8 | Nextera | 16,000 | Single cells | Microfluidics | PCR | PCR barcodes | 1,000 |
10x Chromium Single Cell ATAC | Nextera | 800-15,000 | Bulk | Droplets | PCR | PCR barcodes | 10,000 |
Bio-Rad dscATAC-seq | Nextera | 60,000+ | Bulk | Droplets | PCR | PCR barcodes | 10,000 |
Bio-Rad dsciATAC-seq | Custom-made | 600,000+ | Bulk | Droplets | PCR | Tn5 + PCR barcodes | 100,000 |
I was a little overwhelmed by all the single cell methods and got completely lost. To help myself understand them and to make future troubleshooting easier, I started to perform an on-paper library preparation whenever I saw a new single cell method.
Here I borrow from Feynman:
What I cannot create, I do not understand.
If you find this repository useful and would like to cite this resource, please consider citing this repo together with the seqspec paper:
@misc{xi_chen_teichlabscg_lib_structs_2023,
title = {Teichlab/scg\_lib\_structs: {Release} 26th {Oct} 2023},
copyright = {Creative Commons Attribution 4.0 International},
shorttitle = {Teichlab/scg\_lib\_structs},
url = {https://zenodo.org/doi/10.5281/zenodo.10042390},
abstract = {This is the first release to get a DOI so that people can cite the repo.},
urldate = {2023-10-26},
publisher = {Zenodo},
author = {Xi Chen and Patrick Roelli and Darío Hereñú and Pontus Höjer and Tim Stuart},
month = oct,
year = {2023},
doi = {10.5281/ZENODO.10042390},
}
@article{booeshaghi.pachter.Bioinformatics2024,
title = {A Machine-Readable Specification for Genomics Assays},
author = {Booeshaghi, Ali Sina and Chen, Xi and Pachter, Lior},
editor = {Kendziorski, Christina},
year = {2024},
month = mar,
journal = {Bioinformatics},
volume = {40},
number = {4},
pages = {btae168},
issn = {1367-4811},
doi = {10.1093/bioinformatics/btae168},
urldate = {2024-05-01},
abstract = {Motivation: Understanding the structure of sequenced fragments from genomics libraries is essential for accurate read preprocessing. Currently, different assays and sequencing technologies require custom scripts and programs that do not leverage the common structure of sequence elements present in genomics libraries.},
copyright = {https://creativecommons.org/licenses/by/4.0/},
langid = {english}
}
I would be very happy if you go through them and let me know what you think. If you spot any errors, or if I have missed some key methods, feel free to raise an issue in the GitHub repository or contact me directly:
Xi Chen
chenx9@sustech.edu.cn