expose khmer.Read to python and use ReadParser in scripts #1491
Some more background/thoughts/components of this --
So it's half yard cleaning and half optimization, I think. Make sense? |
Ok, so as a simple minded person I will take that to mean: we want a faster way of doing what Read cleaning in C++ vs python: did some exploring and came across |
Having re-read what |
On Mon, Oct 24, 2016 at 07:04:36AM -0700, Tim Head wrote:
yep, and making this code faster and more sensible is also a +1. |
0bc426e to 96e749d
Two notes to myself:
|
However, replacing |
On Wed, Nov 09, 2016 at 04:11:59AM -0800, Tim Head wrote:
excellent!
Do you think the test should be discarded? We have the test_streaming_io tests |
Probably yes. It is super hard to debug because not only is the deadlock happening in another thread, but in some c++ code that is being called by another script being |
ea9c94d to f7bcbc0
Looking good on a quick skim. One comment - does eliminating the FIFO-based streaming tests drop coverage at all? |
Current coverage is 95.74% (diff: 88.23%)
@@             master    #1491   diff @@
==========================================
  Files            36       36
  Lines          2938     2938
  Methods           0        0
  Messages          0        0
  Branches        448      449     +1
==========================================
- Hits           2815     2813     -2
- Misses           54       55     +1
- Partials         69       70     +1
|
I removed the stuff that attempted to add the We can go back to #1483 if we really want to add the |
No drop in coverage.
|
5e7dcbf to fb4aa0e
On Mon, Nov 14, 2016 at 03:05:32AM -0800, Tim Head wrote:
agreed. |
This should pass on both legacy python and python any minute now. |
Ready for review! @luizirber, @camillescott, @standage, or @ctb |
Very nice; LGTM.
Let's go ahead and merge when the tests pass! @betatim do you think you could go through the various Read-related issues and update them with this new functionality? |
I'll give it a go. Do you have something particular in mind? Otherwise I'll locate Read-related issues by searching, etc. |
On Mon, Nov 14, 2016 at 05:30:35AM -0800, Tim Head wrote:
they're all linked issues, if only transitively :) |
ReadBundles group several Reads together.
The test gets stuck in a deadlock. seqan does subtly different things when reading from a path that is not '-', and this is covered by test_streaming_io.
508af89 to 18eab38
Fix #1098 and related to #1483
After getting started on this I realised I wasn't sure how we wanted to use this. So far I am assuming the use case is to replace utils.ReadBundle as used in, for example, trim-low-abund.py.

Is it really worth it? We will end up turning all the screed Records into read_parser::Reads via various unicode -> bytes stuff, to then do some computation on them. In the current setup we pass the cleaned sequence to get_median_count and friends. Maybe there is something I am missing.

make test
Did it pass the tests?
make clean diff-cover
If it introduces new functionality in scripts/ is it tested?
make format diff_pylint_report cppcheck doc pydocstyle
Is it well formatted?
ChangeLog? http://en.wikipedia.org/wiki/Changelog#Format
changes were made?
tested for streaming IO?)
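
For anyone following the ReadBundle discussion in the description, here is a minimal sketch of the pattern being talked about: group a few reads, keep a cleaned copy of each sequence, and hand the cleaned sequences to a counting function. This is not the khmer code. Record, clean_sequence, ReadBundleSketch and toy_median_count are hypothetical names made up for illustration, and the cleaning rule (upper-case, N replaced by A) is an assumption.

```python
from collections import namedtuple

# Hypothetical stand-in for a screed-style record; only the fields used below.
Record = namedtuple("Record", ["name", "sequence"])


def clean_sequence(seq):
    """Assumed cleaning rule: upper-case and replace ambiguous 'N' with 'A'."""
    return seq.upper().replace("N", "A")


class ReadBundleSketch:
    """Group several reads and keep a cleaned copy of each sequence."""

    def __init__(self, *records):
        # Drop empty slots, e.g. the missing mate of an orphaned read.
        self.reads = [r for r in records if r]
        self.cleaned = [clean_sequence(r.sequence) for r in self.reads]

    @property
    def num_reads(self):
        return len(self.reads)

    @property
    def total_length(self):
        return sum(len(r.sequence) for r in self.reads)

    def coverages(self, median_count):
        # median_count is any callable that maps a cleaned sequence to a number.
        return [median_count(seq) for seq in self.cleaned]


def toy_median_count(seq):
    # Hypothetical placeholder for a real k-mer based count: counts 'A' bases.
    return seq.count("A")
```

In this sketch the counting function only ever sees the cleaned strings, which is the "current setup" the description refers to. A version built on Read objects would need the str to bytes conversion before any counting, and that conversion is the cost the description is asking about.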