Non-fitted error rates #1135

Closed
GianyA opened this issue Sep 17, 2020 · 4 comments

GianyA commented Sep 17, 2020

Hello there,

I was running the pipeline and, during error rate modeling, I got some quite bizarre results (graphs attached).
forward-err.pdf
reverse-error.pdf

This data was sequenced on an Illumina iSeq and consists of 16S amplicons from a bioreactor.
So, is this kind of deviation normal for error learning, or is it just my data?

Cheers,
Giany.

@benjjneb
Owner

The issue here is that the iSeq uses binned quality scores, rather than the normal quality scores 1-40, and this interacts with error model learning. See this issue for a longer discussion: #791

Short answer, things still seem to work fine as far as we can tell, but there are tweaks you can pursue to improve the monotonicity of the fitted error rates.

@benjjneb
Owner

See also this analysis of DADA2 performance on iSeq data by @ong8181: #1083 (comment)

diriano commented Dec 10, 2020

Hi,
I think I am having a problem with binned quality scores.
I have 56 samples, each with over 400K paired-end reads (2x150 bp, 16S rDNA V4 region) after filtering.
When I ran learnErrors with loessErrfun, R1 was fine and the data could be fit. But learnErrors always failed for R2, no matter which sample I used. The error message is:

139227062 total bases in 926543 reads from 2 samples will be used for learning the error rates.
Error rates could not be estimated (this is usually because of very few reads).
Error in getErrors(err, enforce = TRUE) : Error matrix is NULL.

See the aggregate quality plots for the forward reads of the 56 samples:

image

And reverse reads:

image

From these figures, the quality values seem to be binned.

Below are the plotErrors results for the forward reads, where loessErrfun works. They look OK to me.

image

As the reverse reads did not work with loessErrfun, I tried noqualErrfun, and that was able to finish fitting the data. Here is the plotErrors output:
image

Would it be OK to go on with this learnErrors result for the reverse reads? Can I mix error models, with the forward reads fitted using loessErrfun and the reverse reads using noqualErrfun?

Thanks,
Diego

@benjjneb
Owner

I think I might just use the forward read error model for both forward and reverse reads in this case. The observed data looks very similar, so it should work well enough.
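
A minimal sketch of what that could look like in R, assuming the dada2 package is installed. The function name `denoise_with_forward_model` and the argument names `filtFs` and `filtRs` (character vectors of filtered forward and reverse FASTQ paths) are hypothetical and do not come from this thread. This is one possible way to apply the suggestion, not a tested recipe for this dataset:

```r
# Sketch only: assumes the dada2 package is installed and that `filtFs` and
# `filtRs` (hypothetical names) are character vectors of filtered FASTQ paths.
denoise_with_forward_model <- function(filtFs, filtRs, multithread = TRUE) {
  # Learn the error model from the forward reads only
  # (loessErrfun is the default errorEstimationFunction).
  errF <- dada2::learnErrors(filtFs, multithread = multithread)

  # Denoise both read directions with the forward-read error model.
  ddF <- dada2::dada(filtFs, err = errF, multithread = multithread)
  ddR <- dada2::dada(filtRs, err = errF, multithread = multithread)

  list(err = errF, dadaF = ddF, dadaR = ddR)
}
```

Calling `dada2::plotErrors(result$err, nominalQ = TRUE)` on the returned model is a quick way to check the fit before merging pairs.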
