[MRG] gather optimizations #615
Codecov Report
@@ Coverage Diff @@
## master #615 +/- ##
==========================================
- Coverage 89.45% 89.38% -0.07%
==========================================
Files 27 27
Lines 4191 4203 +12
Branches 37 39 +2
==========================================
+ Hits 3749 3757 +8
- Misses 442 446 +4
Continue to review full report at Codecov.
LGTM!
I'll add some tests for
Nice work!
A couple of fixes for gather perf problems:

- `remove_many` method to remove hashes from a minhash
- `remove_many` method to remove matches from the query (instead of building a new one). This makes a huge difference for large queries.

I ran `master` with a large metagenome query and got 4 matches in 17h. This PR takes 17 minutes to find the same matches.

Checklist
- `make test` Did it pass the tests?
- `make coverage` Is the new code covered?
- without a major version increment. Changing file formats also requires a major version number increment.
- changes were made?
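
For readers skimming the thread, here is a minimal sketch of the idea in the description: removing matched hashes from the query in place rather than building a new query on each iteration. It is not the code in this PR. `ToyMinHash` and `gather` are hypothetical stand-ins that treat a sketch as a plain set of integer hashes, and the greedy loop is an assumption about how gather proceeds.

```python
class ToyMinHash:
    """Hypothetical stand-in for a minhash sketch: a set of integer hashes."""

    def __init__(self, hashes=()):
        self.hashes = set(hashes)

    def remove_many(self, hashes):
        # Discard the given hashes in place; no new sketch object is built.
        self.hashes.difference_update(hashes)

    def __len__(self):
        return len(self.hashes)


def gather(query, references):
    """Greedy loop: pick the reference sharing the most hashes with what
    remains of the query, record it, then strip those hashes from the query."""
    results = []
    while len(query) > 0:
        best_name, best_overlap = None, set()
        for name, ref in references.items():
            overlap = query.hashes & ref.hashes
            if len(overlap) > len(best_overlap):
                best_name, best_overlap = name, overlap
        if best_name is None:
            break  # nothing left in the query matches any reference
        results.append((best_name, len(best_overlap)))
        # In-place removal (assumed to mirror the PR's approach) instead of
        # constructing a fresh query from the leftover hashes.
        query.remove_many(best_overlap)
    return results


query = ToyMinHash(range(10))
references = {
    "a": ToyMinHash(range(0, 6)),
    "b": ToyMinHash(range(4, 9)),
    "c": ToyMinHash([100]),
}
matches = gather(query, references)
```

Note that this sketch mutates the query passed in, so a caller who still needs the original would have to copy it first.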