Add the ability to reverse complement sequences before writing them #33

Open
ElDeveloper opened this issue Sep 8, 2017 · 11 comments

@ElDeveloper
Member

ElDeveloper commented Sep 8, 2017

Improvement Description
split_libraries_fastq.py provided a --rev_comp flag that let you reverse complement the sequences before they were written out.

After talking with @wasade, we both agree that such functionality might not be appropriate for q2-quality-filter, but we couldn't figure out where it would fit better, so I'm opening this issue here in the hope that it can be relocated to the proper repo.

@jairideout
Member

q2-demux?

@ElDeveloper
Member Author

Maybe, though then there's no way to reverse complement per-sample FASTQ files.

@jairideout
Member

Are you wanting support for reverse complementing any kind of sequence data stored in .qza files, not just during demultiplexing?

@ElDeveloper
Member Author

Yes. This was the case in QIIME1, since the functionality was available in split_libraries_fastq.py and multiple_split_libraries_fastq.py used split_libraries_fastq.py.

@jairideout
Member

Sounds good! Are you available to work on this for the 2017.9 release?

@ElDeveloper
Member Author

Yes. Where should this be added?

@gregcaporaso
Member

What's the use case for reverse complementing fastq files? I worry that that might lead to assumptions about the quality profiles being violated (e.g., median quality will be positively correlated with sequence position in the resulting files, rather than negatively correlated, which could mess things up downstream). It seems like a reverse complement action might make more sense if it operated on FeatureData[Sequence] instead.
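To make the quality-profile concern concrete, here is a minimal sketch in plain Python (not QIIME 2 or q2-quality-filter code). It assumes Phred+33 quality encoding and reads containing only A, C, G, T and N; the function names and the example read are hypothetical. Reverse complementing a FASTQ record also reverses its quality string, so the low-quality tail of the read ends up at the start.

```python
# Complement table for unambiguous DNA plus N (assumption: no other IUPAC codes).
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement_record(seq, qual):
    """Reverse complement a read and reverse its per-base quality string."""
    if len(seq) != len(qual):
        raise ValueError("sequence and quality must have the same length")
    return seq.translate(_COMPLEMENT)[::-1], qual[::-1]


def phred_scores(qual, offset=33):
    """Decode a quality string into integer Phred scores (Phred+33 assumed)."""
    return [ord(c) - offset for c in qual]


# Hypothetical read whose quality drops towards the 3' end.
seq, qual = "AACGTTTG", "IIIHGF5#"
rc_seq, rc_qual = reverse_complement_record(seq, qual)

print(phred_scores(qual))     # quality falls along the original read
print(phred_scores(rc_qual))  # quality rises along the reverse complement
```

Any downstream step that expects quality to decline with position would see the opposite trend in the reverse-complemented file.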

@ElDeveloper
Member Author

@gregcaporaso, after solving the problem I had with that particular dataset, I don't think I have a use-case anymore. Seems like the only place where this could matter would be with the FeatureData[Sequence] as you mention. I am down to put that in place if you all think this still makes sense.

@ebolyen
Member

ebolyen commented Sep 13, 2017

Since it looks like this isn't immediately necessary, we're dropping it from our sprint plan. We'll keep an eye out for more situations where this could be needed.

@ElDeveloper
Member Author

ElDeveloper commented Sep 13, 2017 via email

@ARW-UBT

ARW-UBT commented Mar 22, 2019

Would the following be a use case?
We continuously analyse fungal ITS amplicons (paired-end data). Due to the length heterogeneity of ITS, it is not possible to join forward and reverse reads for many of them. Therefore, the data cannot be processed as joined reads by
a) q2 quality-filter q-score-joined (followed by q2 deblur denoise-other) or by
b) q2 dada2 denoise-paired.
In such situations, one can process the R1 (forward) reads as single-end data, but this fails for the R2 reads, probably because the 'antisense' R2 sequences are filtered out by q2 deblur denoise-other.

Since independent analyses can be combined in q2, I thought about

  1. reverse-complement the FeatureData[Sequence] artifact
  2. analyse the rc-R2-sequences as single-end data and
  3. finally combine both analysis results...
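
Step 1 above could be sketched in plain Python as follows. This is only an illustration, not an existing QIIME 2 action: the feature sequences are modelled as a dict mapping feature ID to a DNA string, the function names and example IDs are hypothetical, and loading from or saving to a .qza artifact is left out.

```python
# IUPAC nucleotide codes and their complements, position by position.
_IUPAC = "ACGTRYSWKMBDHVN"
_IUPAC_COMP = "TGCAYRSWMKVHDBN"
_TABLE = str.maketrans(_IUPAC + _IUPAC.lower(),
                       _IUPAC_COMP + _IUPAC_COMP.lower())


def reverse_complement(seq):
    """Return the reverse complement of a DNA string with IUPAC codes."""
    unknown = set(seq.upper()) - set(_IUPAC)
    if unknown:
        raise ValueError(f"unsupported characters: {sorted(unknown)}")
    return seq.translate(_TABLE)[::-1]


def reverse_complement_features(seqs):
    """Reverse complement every sequence, keeping the feature IDs unchanged."""
    return {fid: reverse_complement(seq) for fid, seq in seqs.items()}


# Hypothetical R2-derived feature sequences.
r2_features = {"feature-1": "AACGTTTG", "feature-2": "GGRTACN"}
rc_features = reverse_complement_features(r2_features)
print(rc_features)
```

Because there are no quality scores at this stage, the quality-profile concern raised earlier in the thread does not apply to sequences handled this way.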

Best regards
