split-paired-reads.py has nonstandard behavior with orphaned reads #847

camillescott · 2015-02-25T19:35:38Z

By default, split-paired-reads.py mixes orphaned reads into their respective .1 / .2 files, which is a nonstandard behavior -- orphaned reads should be put into their own files. Tools like bowtie and Trinity fail when they encounter orphaned reads mixed into split paired files.
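To make the expected behavior concrete, here is a minimal sketch of splitting interleaved reads so that orphans land in their own bucket instead of the .1 / .2 outputs. This is not khmer's code: `split_pairs` is a hypothetical helper, records are plain `(name, sequence)` tuples, and mates are assumed to share a name prefix and end in `/1` and `/2`.

```python
def split_pairs(records):
    """Split interleaved (name, sequence) records into left, right, orphans.

    Assumption: mates are adjacent, share a name prefix, and end in /1 and /2.
    Anything that does not form such an adjacent pair is treated as an orphan.
    """
    left, right, orphans = [], [], []
    i = 0
    while i < len(records):
        name, _seq = records[i]
        has_next = i + 1 < len(records)
        # A proper pair is a /1 read immediately followed by its /2 mate.
        if (name.endswith("/1") and has_next
                and records[i + 1][0] == name[:-2] + "/2"):
            left.append(records[i])
            right.append(records[i + 1])
            i += 2
        else:
            # Orphans are kept apart rather than mixed into left/right.
            orphans.append(records[i])
            i += 1
    return left, right, orphans


reads = [
    ("r1/1", "ACGT"), ("r1/2", "TTGA"),
    ("r2/1", "GGCC"),                    # mate is missing
    ("r3/1", "AAAA"), ("r3/2", "CCCC"),
]
left, right, orphans = split_pairs(reads)
```

With this layout the left and right lists always have the same length and the same read order, which is the property that downstream tools expecting split paired files rely on.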

ctb · 2015-03-02T13:57:53Z

We talked about this in person and concluded that we cannot change the default until khmer 2.0, but the -p option was added in #818 to force the expected behavior.

Two thoughts:

- Shall we update the documentation to make this clear?
- @mr-c, what's the right way to punt these kinds of issues to the khmer 2.0 release?

ctb · 2015-06-12T10:38:53Z

For 2.0:

- make -p the default in split-paired-reads;
- change to using broken_paired_reader(..., require_paired=True) in split-paired-reads;
- upgrade extract-paired-reads to properly handle streaming input and specification of output files;
- in the error message that results from -p, mention that extract-paired-reads can be used to fix the input.

@camillescott @mr-c the alternative here is to add an option to split-paired-reads to sideline or trash orphans. I think this makes the script too complicated so am -0 on it but would appreciate your thoughts.
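For illustration, the strict behavior proposed in the list above (fail on orphans, and point the user at extract-paired-reads) could look roughly like the sketch below. It does not use khmer's broken_paired_reader. `strict_pairs` and `UnpairedReadsError` are hypothetical names, and the `/1` and `/2` name suffixes are an assumption about the input.

```python
class UnpairedReadsError(ValueError):
    """Raised when the input is not strictly interleaved pairs (hypothetical)."""


def strict_pairs(records):
    """Yield (read1, read2) tuples from interleaved (name, sequence) records.

    Assumption: mates share a name prefix and end in /1 and /2.
    Raises UnpairedReadsError on the first read that has no adjacent mate.
    """
    it = iter(records)
    for first in it:
        second = next(it, None)
        is_pair = (
            second is not None
            and first[0].endswith("/1")
            and second[0] == first[0][:-2] + "/2"
        )
        if not is_pair:
            # The message suggests the fix, as proposed in the list above.
            raise UnpairedReadsError(
                "unpaired read %r found; run extract-paired-reads first to "
                "separate orphans from proper pairs" % first[0]
            )
        yield first, second


pairs = list(strict_pairs([("a/1", "AC"), ("a/2", "GT")]))
```

Failing early keeps the script simple: separating orphans stays the job of extract-paired-reads, and split-paired-reads only ever sees proper pairs.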

ctb · 2015-08-05T16:59:33Z

Closed by #1164.

camillescott added discussion-needed Python known-issue labels Feb 25, 2015

ctb added this to the 2.0 milestone Jun 12, 2015

This was referenced Jun 12, 2015

add support for /dev/stdin and --output-dir to extract-paired-reads #1085

Merged

Update command-line option defaults for khmer 2.0 #1097

Closed

ctb mentioned this issue Jul 10, 2015

Add --output-orphaned option to split-paired-reads.py #1164

Merged

ctb mentioned this issue Jul 19, 2015

Deal with "broken-paired" input/output, for better streaming. #733

Closed

ctb closed this as completed Aug 5, 2015
