Filter whitelisted barcodes within edit distance of another whitelisted barcode #138

TomSmithCGAT · 2017-06-19T11:56:20Z

Based on my analysis of droplet scRNA-Seq cell barcodes (e.g. blog post here), I think we should add an option to remove a cell barcode from the automatically generated whitelist if it is within an edit distance threshold of another whitelisted barcode with greater frequency. I believe there is sufficient evidence that error barcodes (arising from INDELs or sequencing errors) may pass the whitelist threshold. We could merge these barcodes into the true barcode from which they derive, but this risks merging two truly different cells. On balance, removing these potential error barcodes seems like the best approach. This is compatible with the current error correction within extract, which is restricted to barcodes not in the whitelist. Thus the steps for whitelist generation and filtering would be:

Parse first 50M reads, extract cell barcodes and generate a whitelist using the knee method
(Optionally) Identify all cell barcodes within an edit distance threshold of exactly one whitelisted barcode
(Optionally) Remove whitelisted barcodes within an edit distance threshold of another whitelisted barcode with greater frequency
Parse all reads, extract cell barcodes and filter reads against the whitelist (with optional correction of cell barcodes not in the whitelist)
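
The optional removal step above could be sketched roughly as follows. This is only an illustration of the idea, not the code used in UMI-tools: the function names (edit_distance, filter_whitelist), the threshold default of 1 and the example counts are all hypothetical, and Levenshtein distance is assumed as the edit distance so that INDELs as well as substitutions are counted.

```python
def edit_distance(a, b):
    """Levenshtein distance: substitutions, insertions and deletions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def filter_whitelist(counts, whitelist, threshold=1):
    """Split a whitelist into kept and removed barcodes.

    A barcode is removed if another whitelisted barcode with a strictly
    greater count lies within `threshold` edits of it. Removed barcodes
    are discarded, not merged into their more frequent neighbour.
    """
    kept, removed = [], []
    for barcode in whitelist:
        near_more_frequent = any(
            counts[other] > counts[barcode]
            and edit_distance(barcode, other) <= threshold
            for other in whitelist
            if other != barcode)
        if near_more_frequent:
            removed.append(barcode)
        else:
            kept.append(barcode)
    return kept, removed


# Hypothetical barcode counts, e.g. from the first reads parsed
counts = {
    "ACGTACGT": 1000,
    "ACGTACGA": 40,    # one substitution away from ACGTACGT
    "TTGGCCAA": 900,
    "GGGGCCCC": 850,
}
whitelist = list(counts)
kept, removed = filter_whitelist(counts, whitelist, threshold=1)
print("kept:", kept)
print("removed:", removed)
```

The all-against-all comparison is quadratic in the whitelist size, which is acceptable for a sketch but would probably need a smarter neighbour search for whitelists with many thousands of barcodes.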

TomSmithCGAT · 2019-02-06T16:05:47Z

This is now available on the master branch and will be in the next release.

TomSmithCGAT self-assigned this Jun 19, 2017

TomSmithCGAT added the enhancement label Jun 19, 2017

TomSmithCGAT added this to the 0.5 milestone Jun 19, 2017

TomSmithCGAT mentioned this issue Jun 19, 2017

{ts} extract cell barcodes #128

Merged

TomSmithCGAT removed this from the 0.5 milestone Oct 13, 2017

TomSmithCGAT added the Next release label Oct 13, 2017

TomSmithCGAT mentioned this issue Jan 24, 2019

{ts} whitelist filter pos errors #309

Merged

TomSmithCGAT closed this as completed Feb 6, 2019
