A SAM file generated from paired-end sequences has the strand suffix (/1 for the forward read, /2 for the reverse read) trimmed from the query names, so both mates share an identical name. The strand information is instead embedded in the flag (2nd column), an integer in which the 7th bit (0x40) marks the forward read and the 8th bit (0x80) marks the reverse read.
For example, flag = 99 (0b1100011) is the forward read, flag = 147 (0b10010011) is the reverse read, flag = 16 (0b10000) is an unpaired sequence.
This information needs to be appended to the query name. Previously, the solution was:
    if flag & (1 << 6):
        qname += '/1'
    elif flag & (1 << 7):
        qname += '/2'
I tested several alternatives. One elegant option performs a single bit operation and yields the correct strand information for both reads:
    if strand := flag >> 6 & 0b11:
        qname = f'{qname}/{strand}'
The variable strand takes one of three values for the flags above: 0 for unpaired, 1 for forward, and 2 for reverse.
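As a quick sanity check, here is a minimal sketch that wraps both variants in functions and runs them on the three example flags from above. The function names `suffix_original` and `suffix_shift` and the query name `read1` are made up for illustration; they are not from the actual codebase.

```python
def suffix_original(qname: str, flag: int) -> str:
    """Original approach: test bit 7 (0x40), then bit 8 (0x80)."""
    if flag & (1 << 6):
        qname += '/1'
    elif flag & (1 << 7):
        qname += '/2'
    return qname


def suffix_shift(qname: str, flag: int) -> str:
    """Alternative: shift out the low 6 bits and keep the next two."""
    # >> binds tighter than &, so this is (flag >> 6) & 0b11
    if strand := flag >> 6 & 0b11:
        qname = f'{qname}/{strand}'
    return qname


# Example flags from the description: forward, reverse, unpaired
for flag in (99, 147, 16):
    print(flag, bin(flag),
          suffix_original('read1', flag),
          suffix_shift('read1', flag))
```

Note that the two variants only agree when at most one of the two bits is set. If both were set, `flag >> 6 & 0b11` would evaluate to 3, while the original code would append `/1`.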
However, the expensive part is the subsequent string concatenation. I couldn't find a way that is more efficient than the original solution.
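For anyone who wants to reproduce the comparison, a rough timing sketch with the standard `timeit` module could look like the following. The query name, the flag value, and the iteration count are arbitrary assumptions, and timings will vary by machine and Python version, so treat the numbers as indicative only.

```python
import timeit

# Hypothetical inputs: a short query name and a forward-read flag
setup = "qname, flag = 'read1', 99"

original = """
q = qname
if flag & (1 << 6):
    q += '/1'
elif flag & (1 << 7):
    q += '/2'
"""

shifted = """
q = qname
if strand := flag >> 6 & 0b11:
    q = f'{q}/{strand}'
"""

n = 100_000  # arbitrary number of repetitions
t_original = timeit.timeit(original, setup=setup, number=n)
t_shifted = timeit.timeit(shifted, setup=setup, number=n)

print(f'original: {t_original:.4f} s for {n} runs')
print(f'shifted:  {t_shifted:.4f} s for {n} runs')
```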