Simplify object management #23

Open
realmichaelhoffert opened this issue Feb 9, 2022 · 1 comment
Labels
enhancement New feature or request

Comments

@realmichaelhoffert

As of version 1.14, DADA2 supports passing directories, rather than vectors of file paths, to its core functions:

Directories containing fastq files (possibly compressed) can now be provided to core dada2 functions instead of a character
vector of the fastq filenames. This functionality is supported by filterAndTrim, learnErrors, dada, mergePairs and derepFastq.
Note, this feature requires fastqs in the provided directory to have standard file extensions: .fastq, .fastq.gz or .fastq.bz2. 

This makes much of the pipeline code that organizes file paths and moves files between locations unnecessary. We could reduce the number of lines of code and make running a dataset simpler. For example, the current pipeline does the following:

# Put filtered reads into separate sub-directories for big data workflow
dir.create(filter.fp)
subF.fp <- file.path(filter.fp, "preprocessed_F")
subR.fp <- file.path(filter.fp, "preprocessed_R")
dir.create(subF.fp)
dir.create(subR.fp)

# Move R1 and R2 from trimmed to separate forward/reverse sub-directories
fnFs.Q <- file.path(subF.fp,  basename(fnFs)) 
fnRs.Q <- file.path(subR.fp,  basename(fnRs))
file.rename(from = fnFs.cut, to = fnFs.Q)
##  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [15] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
file.rename(from = fnRs.cut, to = fnRs.Q)
##  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
## [15] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE

# File parsing; create file names and make sure that forward and reverse files match
filtpathF <- file.path(subF.fp, "filtered") # files go into preprocessed_F/filtered/
filtpathR <- file.path(subR.fp, "filtered") # ...
fastqFs <- sort(list.files(subF.fp, pattern="fastq.gz"))
fastqRs <- sort(list.files(subR.fp, pattern="fastq.gz"))
if(length(fastqFs) != length(fastqRs)) stop("Forward and reverse files do not match.")
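
As a rough sketch of what a directory-based version might look like, assuming dada2 1.14 or later: `filter_from_dirs` is a hypothetical helper (it is not part of dada2 or the current pipeline), the "filtered" sub-directory layout mirrors the code above, and any filtering parameters are passed through via `...` rather than prescribed here.

```r
# Hypothetical helper, sketch only: hands directories straight to
# dada2::filterAndTrim instead of building fastqFs / fastqRs vectors.
filter_from_dirs <- function(dirF, dirR, ...) {
  stopifnot(dir.exists(dirF), dir.exists(dirR))

  # Same sanity check as the current pipeline, limited to the file
  # extensions that the dada2 directory feature requires.
  ext <- "\\.fastq(\\.gz|\\.bz2)?$"
  nF <- length(list.files(dirF, pattern = ext))
  nR <- length(list.files(dirR, pattern = ext))
  if (nF != nR) stop("Forward and reverse files do not match.")

  # Filtered reads go into <dir>/filtered/, as in the existing layout.
  dada2::filterAndTrim(fwd = dirF, filt = file.path(dirF, "filtered"),
                       rev = dirR, filt.rev = file.path(dirR, "filtered"),
                       ...)
}
```

It could then be called as, for example, `filter_from_dirs(subF.fp, subR.fp, maxEE = c(2, 2), multithread = TRUE)`, where the parameter values are placeholders rather than the pipeline's actual settings.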

@cliffbueno added the enhancement (New feature or request) label on Oct 23, 2024
@cliffbueno
Collaborator

I am not going to implement this in version 2.0.0. Our current file system is working fine, so I don't think it's urgent. We'll leave the enhancement tag on it, and it could be something to implement in version 3.0.0 down the road.
