I have encountered a strange result: the original file contains these 2 reads (subsampled 10x Chromium 5' scRNA-seq public data, mapped with STAR v2.7.10a)
which somehow disappear after deduplication. Across different umi_tools dedup runs, I sometimes see 5 examples of similar read groups and sometimes only 2.
What is stranger is that if I create a test.bam containing just these 2 reads, deduplication always chooses 1 representative read.
Do you have any idea what is going on here? I have read through #458 but still could not figure out why. Is it because I subsampled the data? Thank you so much for your help.
camelest changed the title from "Inconsistent result on different runs of umitools dedup ended up to no representative reads from the same mapping coordinate" to "Inconsistent result of no reads from the same coordinate in UMI-tools dedup" on Nov 28, 2022
umi_tools is not deterministic by default, so different runs can yield different results. There's an open PR to make it deterministic, with links to other issues describing how to make it deterministic in the current version, if you want to read further (#550).
Without seeing the full input and output for all reads with the same alignment coordinates as the reads above, it's not possible to be certain what's happening. However, I expect you have more reads with the same alignment coordinates whose UMIs are similar enough to form a network with more than one possible solution.
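To illustrate what "more than one possible solution" can mean, here is a minimal toy sketch. It is not UMI-tools' actual code or its default method: it uses a simplified greedy grouping in which each UMI absorbs unassigned UMIs within one mismatch, and the UMI sequences, counts, and function names are all made up for illustration. When the counts are tied, the order in which UMIs are visited changes which UMIs end up as representatives, and even how many groups there are.

```python
from itertools import permutations


def hamming(a, b):
    """Number of mismatching positions between two equal-length UMIs."""
    return sum(x != y for x, y in zip(a, b))


def greedy_groups(counts, order):
    """Toy grouping (hypothetical, not UMI-tools' implementation).

    Visit UMIs in `order`; each unassigned UMI becomes a representative
    and absorbs every unassigned UMI within one mismatch of it.
    """
    assigned = set()
    groups = []
    for umi in order:
        if umi in assigned:
            continue
        members = [umi] + [
            other for other in counts
            if other != umi and other not in assigned and hamming(umi, other) == 1
        ]
        assigned.update(members)
        groups.append(members)
    return groups


# Made-up UMIs at one alignment position. All counts are tied, so every
# visiting order is an equally valid "highest count first" order.
counts = {"AAAA": 1, "AAAT": 1, "AATT": 1}

# Collect the representatives chosen under each possible tie-break order.
solutions = {
    tuple(group[0] for group in greedy_groups(counts, order))
    for order in permutations(counts)
}

for representatives in sorted(solutions):
    print(len(representatives), "group(s), representatives:", representatives)
```

In this toy case, visiting AAAT first gives one group, while visiting AAAA or AATT first gives two, so a tie-break that varies between runs would change which reads survive. Whether that is what happens in your data depends on the other reads at the same coordinates.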
Hi, thank you so much for the wonderful tool.
Version: umi_tools v1.1.2
Command: umi_tools dedup --per-cell -I input.bam --extract-umi-method=tag --umi-tag=UR --cell-tag=CR -S output.bam