fix lowercase actgn input handling. #1435
@@ -61,7 +61,7 @@
 from khmer.kfile import (check_space, check_space_for_graph,
                          check_valid_file_exists, add_output_compression_type,
                          get_file_writer, is_block, describe_file_handle)
-from khmer.utils import write_record, broken_paired_reader
+from khmer.utils import (write_record, broken_paired_reader, ReadBundle)
 from khmer.khmer_logger import (configure_logging, log_info, log_error)
 
 
@@ -168,24 +168,13 @@ def __call__(self, is_paired, read0, read1):
         * if any read's median k-mer count is below desired coverage, keep all;
         * consume and yield kept reads.
         """
+        batch = ReadBundle(read0, read1)
         desired_coverage = self.desired_coverage
 
-        passed_filter = False
-
-        batch = []
-        batch.append(read0)
-        if read1 is not None:
-            batch.append(read1)
-
-        for record in batch:
-            seq = record.sequence.replace('N', 'A')
-            if not self.countgraph.median_at_least(seq, desired_coverage):
-                passed_filter = True
-
-        if passed_filter:
-            for record in batch:
-                seq = record.sequence.replace('N', 'A')
-                self.countgraph.consume(seq)
+        # if any in batch have coverage below desired coverage, consume &yield
+        if not batch.coverages_at_least(self.countgraph, desired_coverage):
+            for record in batch.reads:
+                self.countgraph.consume(record.cleaned_seq)
Cleaner and more concise. I like it!

:) yes, a little complicated on the side of double and triple negatives when you dig into it, but nice and concise now that it's done!
                 yield record
 
 
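The double negative in the new condition can be checked against the old flag-based loop. The sketch below is not khmer code: `FakeCountgraph`, `Read`, `keep_old`, `keep_new`, and the free function `coverages_at_least` are hypothetical stand-ins. The toy graph looks up a per-sequence coverage instead of computing a k-mer median, and `coverages_at_least` is assumed to return true only when every read meets the cutoff, which is what the diff's comment implies.

```python
from collections import namedtuple

# Hypothetical stand-in for a read record; not khmer's record type.
Read = namedtuple("Read", ["name", "sequence"])


class FakeCountgraph:
    """Toy stand-in: coverage is a per-sequence lookup, not a k-mer median."""

    def __init__(self, coverage):
        self.coverage = coverage

    def median_at_least(self, seq, cutoff):
        return self.coverage.get(seq, 0) >= cutoff


def keep_old(graph, reads, cutoff):
    """Old control flow: set a flag if any read is below the cutoff."""
    passed_filter = False
    for record in reads:
        if not graph.median_at_least(record.sequence, cutoff):
            passed_filter = True
    return passed_filter


def coverages_at_least(graph, reads, cutoff):
    """Assumed semantics: True only if every read meets the cutoff."""
    return all(graph.median_at_least(r.sequence, cutoff) for r in reads)


def keep_new(graph, reads, cutoff):
    """New control flow: keep unless all reads already meet the cutoff."""
    return not coverages_at_least(graph, reads, cutoff)


graph = FakeCountgraph({"ACGT": 25, "TTTT": 3})
high = Read("r1", "ACGT")
low = Read("r2", "TTTT")

# Singleton and paired cases: both versions should agree on each one.
cases = [[high], [low], [high, low], [high, high]]
results = [(keep_old(graph, c, 20), keep_new(graph, c, 20)) for c in cases]
```

Under these assumptions the two versions agree on every case: a batch is kept as soon as one of its reads falls below the cutoff.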
I guess I'm fine with moving the read cleaning code to a function rather than a class. (Having the cleaned seq as part of the original record is better organization anyway, IMO.) But then that leaves the question of what the `ReadBundle` class is really for. Just aggregation? If so, we need to update the docs from my last PR to make sure we're clear about what code is doing what.
Agreed!

Yes, the `ReadBundle` class is about aggregation (pairs/singletons of reads). I'll go update the docs.
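To make the split between cleaning and aggregation concrete, here is a minimal sketch in the spirit of the discussion. It is not the khmer implementation: `SimpleRead`, `clean_read`, and `ReadBundleSketch` are hypothetical names, and the cleaning rule (uppercase, then replace `N` with `A`) is inferred from the PR title and the removed `replace('N', 'A')` calls.

```python
class SimpleRead:
    """Hypothetical read record with a name and a raw sequence."""

    def __init__(self, name, sequence):
        self.name = name
        self.sequence = sequence
        self.cleaned_seq = None


def clean_read(read):
    """Attach an uppercased, N-free copy of the sequence to the read."""
    read.cleaned_seq = read.sequence.upper().replace("N", "A")
    return read


class ReadBundleSketch:
    """Aggregates one read (singleton) or two reads (pair)."""

    def __init__(self, *reads):
        # Drop missing mates so a singleton can be passed as (read0, None).
        self.reads = [clean_read(r) for r in reads if r is not None]

    @property
    def num_reads(self):
        return len(self.reads)


pair = ReadBundleSketch(SimpleRead("r/1", "acgtn"), SimpleRead("r/2", "NNGT"))
single = ReadBundleSketch(SimpleRead("s", "ACgt"), None)
```

Here the cleaning lives in a plain function that stores its result on the record, and the bundle only groups reads, so callers can loop over `reads` without checking whether a mate is present.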