I just realized a potential pitfall with this pipeline (and similar approaches).
Like me, you may include empty/blank samples (or have real samples with no reads matching the tags and primer sequences). When processing such a sample, Cutadapt eventually has nothing left to work on, producing an empty file after removal of tags and primers. vsearch, however, will simply operate on the (fasta) tmp-file left over from the previous sample, so the final dereplicated file (S00x.fas) for the current empty/blank/negative sample ends up identical to that of the previous "real" sample. This is of course easy to spot in a project with only a few samples, but may go unnoticed in large datasets.
You will need to identify samples that contained no reads matching their actual tags, and remove them before downstream processing. They can be identified, e.g., by searching the logfiles for the sentence "Unable to read from file". For example like this:
grep -c "Unable to read from file" S[0-9][0-9][0-9]*log
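Building on that grep, a small script could move the affected output files aside automatically. This is only a sketch: the log and output file names (`S001.log`, `S001.fas`, etc.) and the `excluded_samples` directory are assumptions here, so adjust them to your own naming scheme before use.

```shell
#!/bin/bash
# Hypothetical helper: find samples whose logs contain the vsearch error,
# then move their dereplicated files aside before downstream processing.
# Assumes logs named S001.log ... and output files S001.fas ... (adjust as needed).
mkdir -p excluded_samples
for log in S[0-9][0-9][0-9]*.log; do
    [ -e "$log" ] || continue                     # glob matched nothing
    if grep -q "Unable to read from file" "$log"; then
        sample="${log%.log}"
        echo "Excluding empty sample: $sample"
        # the .fas file may be a stale copy of the previous sample, so set it aside
        [ -e "${sample}.fas" ] && mv "${sample}.fas" excluded_samples/
    fi
done
```

Run it from the directory holding the per-sample logs and fasta files; anything moved into `excluded_samples/` should then be checked by hand.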
Regards
Tobias
tobiasgf changed the title from "Tak care with empty files" to "Take care with empty files" on Sep 12, 2016
tobiasgf changed the title from "Take care with empty files" to "Take care with samples with no matching tags&primers" on Sep 12, 2016