Why are my q scores dropping so much with --trim adapters?
We are using FLO-MIN114 and R10 chemistry for a cDNA library derived from human RNA.
We noticed a high percentage (>50%) of unusable reads detected by pychopper with the wf-transcriptome workflow, and then found that this could be reduced to <10% if we trimmed only the adapters (thereby keeping the primers).
However, I was surprised to see a significant drop in the quality scores when I turn on --trim adapters:
nextflow run epi2me-labs/wf-basecalling \
    -profile singularity \
    --sample_name $sample_name \
    --input $pod5_dir \
    --dorado_ext pod5 \
    --basecaller_cfg dna_r10.4.1_e8.2_400bps_sup@v5.0.0 \
    --qscore_filter 10 \
    --basecaller_args "--trim adapters" \
    --output_fmt fastq \
    --out_dir $results_folder
While it makes sense for pychopper to work better with the primers present, I can't understand why the basecalling quality drops so much. In the example below, I demonstrate the different q scores from the same sample.
I appreciate any help to understand this!
Felipe
This is because the phred scores for bases in adapter, barcode, and primer regions are typically suppressed compared to bases further into reads. By default, dorado trims all of these components, so when the workflow computes the read quality score from the remaining bases, the value is higher than when --trim adapters is enabled and barcode and primer sequences are left in place.
Dorado itself reports read quality scores having dropped the first 60 quality scores (see e.g. CRFModelConfig.cpp#L41). The workflow component that is responsible for the data behind these graphs does not do this as it does not have knowledge of whether the basecall has been pretrimmed of adapters, barcodes, and primers.
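To make the effect concrete, here is a minimal Python sketch of how a mean read Q score changes when the leading bases are skipped. It is not the code used by dorado or by the workflow: the function name `mean_qscore`, the averaging in error-probability space, the fallback to the whole read for short reads, and the example quality values are all assumptions made for illustration. Only the idea of dropping the first 60 scores comes from the explanation above.

```python
import math


def mean_qscore(quals, skip=0):
    """Mean read Q score from per-base phred scores.

    Averages in error-probability space (an assumption for this sketch).
    If the read is not longer than `skip`, the whole read is used
    (also an assumption).
    """
    if not quals:
        raise ValueError("read has no quality scores")
    used = quals[skip:] if len(quals) > skip else quals
    # Convert each phred score to an error probability, then average.
    mean_err = sum(10 ** (-q / 10) for q in used) / len(used)
    return -10 * math.log10(mean_err)


# Hypothetical read: 80 low-quality leading bases (adapter/primer region)
# followed by 920 higher-quality bases. The values are made up.
quals = [5] * 80 + [20] * 920

whole_read = mean_qscore(quals)           # no knowledge of pretrimming
skip_first_60 = mean_qscore(quals, 60)    # first 60 scores dropped

print(f"whole read:    {whole_read:.2f}")
print(f"skip first 60: {skip_first_60:.2f}")
```

Because the average is taken over error probabilities, a short stretch of low-quality bases at the start of a read pulls the mean down far more than its length alone would suggest, which is consistent with the drop seen when primers are left in place.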
@cjw85 Thanks for your answer. So, in terms of the overall assessment of our sequencing quality, it seems that the top graph would be more representative of the true quality of basecalling, and that the drop in Q score is an artifact caused by the adapters or primers (in this case) rather than poor sequencing quality. We are not multiplexing, so no barcodes. Would you agree with this assessment?