filterAndTrim #2007

Open
AliciaBalbin opened this issue Aug 27, 2024 · 1 comment

Comments

AliciaBalbin commented Aug 27, 2024

Hi, I would like to ask how "filterAndTrim" works. I run it with my forward and reverse reads, but I think it is only processing the forward reads: in the output I only find my forward reads. However, I am not getting any error. I assume the function does the filtering and then writes the filtered files for filtFs and filtRs directly to the output folder? How can I check this?
I use the following commands:
cpu = 40
out <- filterAndTrim(cutFs, filtFs, cutRs, filtRs, maxN = 0, maxEE = c(1,1), truncQ = 2, rm.phix=TRUE,
minLen = 100, compress = TRUE, multithread = cpu, verbose = TRUE)
out

Besides, I also got a script from a friend in which they do a dereplication before denoising. And indeed your pipeline says "At this step, the core sample inference algorithm is applied to the dereplicated data." Is this step then missing?
#dereplication
derepFs <- derepFastq(filtFs, verbose = TRUE)
#denoise reads
dadaFs <- dada(derepFs, err = errF, multithread = cpu, pool = FALSE)

Thank you very much :)

Owner

benjjneb commented Sep 2, 2024

> Hi, I would like to ask how "filterAndTrim" works. I run it with my forward and reverse reads, but I think it is only processing the forward reads: in the output I only find my forward reads. However, I am not getting any error. I assume the function does the filtering and then writes the filtered files for filtFs and filtRs directly to the output folder? How can I check this?

filterAndTrim looks at each forward-reverse read pair and makes filtering decisions jointly. That is, it keeps or throws away the whole pair, and never e.g. keeps the forward read from the pair while throwing away the reverse read. This keeps the filtered output in matched order.
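As a toy illustration of that joint decision (this is not the dada2 source, and the expected-error values below are made up), a pair survives only when both of its reads pass their own threshold:

```r
# Toy illustration only: made-up expected-error values for four read pairs.
ee_fwd <- c(0.4, 2.1, 0.9, 0.2)
ee_rev <- c(0.8, 0.3, 1.7, 0.5)
max_ee <- c(1, 1)  # same thresholds as maxEE = c(1,1) in the question

# A pair is kept only if BOTH of its reads are at or below their threshold.
keep <- ee_fwd <= max_ee[1] & ee_rev <= max_ee[2]

# Indices of the surviving pairs; the same indices apply to both files,
# which is why the forward and reverse outputs stay in matched order.
kept_pairs <- which(keep)
print(kept_pairs)
```

Here pairs 2 and 3 are dropped as a whole even though one read in each of them would have passed on its own.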

I'm not sure why you are finding only your forward reads, but I would start by inspecting the filepaths you are providing to `filterAndTrim` and making sure they look appropriate. For example, what do `head(cutFs)`, `head(filtFs)`, `head(cutRs)`, `head(filtRs)` look like? Is all as expected? You can also use the `file.exists(filtRs)` function to check if the files exist after running `filterAndTrim`.
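The checks above could be bundled into one small table, roughly like this. It is a minimal sketch in base R: `check_filter_paths` is a hypothetical helper, not a dada2 function, and the `same_output` column is only a guess at one thing that might be worth ruling out (forward and reverse outputs pointing at the same path).

```r
# Hypothetical helper (not part of dada2): report, per sample, which of the
# input and output files exist on disk.
check_filter_paths <- function(cutFs, filtFs, cutRs, filtRs) {
  # All four vectors should describe the same samples in the same order.
  stopifnot(length(cutFs) == length(cutRs),
            length(filtFs) == length(filtRs),
            length(cutFs) == length(filtFs))
  data.frame(
    sample      = basename(cutFs),
    in_fwd      = file.exists(cutFs),
    in_rev      = file.exists(cutRs),
    out_fwd     = file.exists(filtFs),
    out_rev     = file.exists(filtRs),
    same_output = filtFs == filtRs,  # TRUE would mean both outputs share one path
    stringsAsFactors = FALSE
  )
}

# Usage, after filterAndTrim has finished, with the vectors from the question:
# check_filter_paths(cutFs, filtFs, cutRs, filtRs)
```

Any row with `out_fwd` TRUE and `out_rev` FALSE would point to the sample whose reverse output path deserves a closer look.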

> Besides, I also got a script from a friend in which they do a dereplication before denoising. And indeed your pipeline says "At this step, the core sample inference algorithm is applied to the dereplicated data." Is this step then missing?
> #dereplication

No, it isn't missing. A few years ago we implemented dereplication "on the fly" in learnErrors and dada. This is preferred because it can dramatically reduce memory usage: only one sample is loaded into memory at a time, whereas the previous method of running derepFastq explicitly loaded all the samples into memory at once.
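In practice this means the filtered file paths can be handed to `learnErrors` and `dada` directly, without an explicit `derepFastq` call. A minimal sketch, assuming dada2 is installed and `filt_files` is a character vector of filtered FASTQ paths such as `filtFs` from the question; the wrapper `denoise_from_files` is a hypothetical name used only for illustration:

```r
# Hypothetical wrapper (not part of dada2): learn error rates and denoise
# straight from filtered FASTQ files, letting dada2 dereplicate on the fly.
denoise_from_files <- function(filt_files, threads = TRUE) {
  # Error model learned directly from the files.
  err <- dada2::learnErrors(filt_files, multithread = threads)
  # Sample inference on the same files; no derepFastq() step beforehand.
  dada2::dada(filt_files, err = err, multithread = threads, pool = FALSE)
}

# Usage with the forward reads from the question:
# dadaFs <- denoise_from_files(filtFs, threads = cpu)
```

The reverse reads would be handled the same way with `filtRs`.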
