We have downloaded some Illumina PE reads from SRA and we got the CONTRADICT_FASTQ error.
Both R1 and R2 were in Sanger+33 quality format. However, we found that the first read in R1 has the quality symbol K, which is Phred 42. Illumina qualities usually stop at 40, but they can be higher (e.g. in Moleculo sequencing), as described here: https://en.wikipedia.org/wiki/FASTQ_format#Encoding
I think you need to adjust the thresholds in the code below to be more flexible about how high a Q value you allow for SANGER_FASTQ. Maybe change 74 to 80?
Here skewer proceeded until it saw K and decided the encoding was Solexa/Illumina 1.3+/Illumina 1.5+, while we can clearly see that it is Illumina 1.8+.
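To illustrate the kind of flexibility I mean, here is a minimal Python sketch that guesses the offset mainly from the lowest quality character rather than from a hard upper limit of 74. This is not skewer's code: the function name `guess_phred_offset` and the `max_phred33_char` parameter are made up for the example, and the cutoff of 93 is only an assumed generous ceiling for Phred+33 data.

```python
def guess_phred_offset(quality_strings, max_phred33_char=93):
    """Guess the FASTQ quality offset from observed quality characters.

    Returns 33, 64, or None when the observed range fits both encodings.
    max_phred33_char is an assumed upper bound for Phred+33 characters;
    it is a hypothetical parameter, not a value taken from skewer.
    """
    codes = [ord(c) for q in quality_strings for c in q]
    if not codes:
        return None
    lowest, highest = min(codes), max(codes)
    # ';' (ASCII 59) is the lowest character Solexa+64 can emit (Q = -5),
    # so anything below it can only come from a +33 encoding.
    if lowest < 59:
        return 33
    # Characters above the assumed Phred+33 ceiling point to a +64 encoding.
    if highest > max_phred33_char:
        return 64
    # The observed range is compatible with both encodings.
    return None


# 'K' is ASCII 75, i.e. Phred 42 under the +33 offset.
print(ord("K") - 33)
# A read containing both 'K' and a low symbol such as '#' is still +33.
print(guess_phred_offset(["K#5AFJ"]))
```

With a rule like this, a single K in the first read would not flip the decision, because low symbols such as `#` elsewhere in the file can only belong to a +33 encoding.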
HiSeq 3000/4000 and the X series can produce scores which include K.