More consolidation of Hashtable derived types. #1504
Current coverage is 95.76% (diff: 100%)
@@             master    #1504   diff @@
==========================================
  Files            36       36
  Lines          2938     2952    +14
  Methods           0        0
  Messages          0        0
  Branches        449      449
==========================================
+ Hits           2813     2827    +14
  Misses           55       55
  Partials         70       70
* rename CountingHash to Countgraph throughout
* rename Hashbits to Nodegraph throughout
Ready for initial review, y'all. @betatim @luizirber @camillescott @standage. Might be easier to go commit by commit :(. Still have to check out my modifications to the abundance dist functions, and I'm not sure if I should add tests for Counttable and Nodetable on this PR, but I think most of it is done.
Prefer
…khmer into feature/assembly/junction_count-merge-storage2
…-merge-storage2 A reconciliation branch for the storage/hashgraph refactoring & junction count stuff.
    Counttable * counttable;
} khmer_KCounttable_Object;

static PyMethodDef khmer_counttable_methods[] = {
Do we need this here if it is just empty?
I went both ways on this - it's nice to have it there for if/when we add methods... but yeah, I guess no need.
    0, /*tp_setattro*/
    0, /*tp_as_buffer*/
    Py_TPFLAGS_DEFAULT | Py_TPFLAGS_BASETYPE, /*tp_flags*/
    "hashgraph object", /* tp_doc */
Delete all the following lines
    0, /* tp_new */
};

#define is_hashgraph_obj(v) (Py_TYPE(v) == &khmer_KHashgraph_Type)
delete me
            ++since;
        }
    }
#else
Should we keep this around or delete?
I'd like to see these bits of code detritus go, but I think perhaps that should get its own PR -- there's a fair amount of it.
    // Iterate through the reads and consume their k-mers.
    while (!parser->is_complete( )) {
        BoundedCounterType max_count = 0;
If you move this up to be the first definition in the method you (probably) don't have to pay for a copy when returning it.
OK, but I don't understand why :)
See https://en.wikipedia.org/wiki/Return_value_optimization. It probably isn't that exciting in this case, but for other, bigger, or expensive-to-copy types it might be more interesting.
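To make the return value optimization point concrete, here is a minimal sketch. `Tracked` and `build_counts` are hypothetical names, not khmer code, and `Tracked` stands in for a type that is expensive to copy (unlike `BoundedCounterType`). The pattern compilers can usually optimize is a function in which every return statement names the same local object; whether elision actually happens is up to the compiler, so the checks below only rely on what C++11 and later guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for an expensive-to-copy result type (not a khmer
// type). It counts how often it is copy- or move-constructed.
struct Tracked {
    static int copies;
    static int moves;
    std::vector<int> data;

    Tracked() = default;
    Tracked(const Tracked& other) : data(other.data) { ++copies; }
    Tracked(Tracked&& other) noexcept : data(std::move(other.data)) { ++moves; }
};
int Tracked::copies = 0;
int Tracked::moves = 0;

// Every return statement names the same local object, which is the pattern
// that named return value optimization (NRVO) can be applied to.
Tracked build_counts(std::size_t n)
{
    Tracked result;  // declared once, returned by name
    for (std::size_t i = 0; i < n; ++i) {
        result.data.push_back(static_cast<int>(i));
    }
    return result;   // elided if the compiler applies NRVO, otherwise moved
}

// Self-check: returning a local by name never needs a deep copy in C++11 and
// later; at worst the result is moved (up to twice if no elision happens).
void check_no_copies()
{
    Tracked::copies = 0;
    Tracked::moves = 0;
    Tracked t = build_counts(1000);
    assert(t.data.size() == 1000);
    assert(Tracked::copies == 0);
    assert(Tracked::moves <= 2);
}
```

For a small integral type the copy costs essentially nothing either way, which matches the "probably isn't that exciting in this case" caveat.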
I assume that the substantial bits of code that were moved didn't get modified when you moved them because the tests still pass.
    if (self != NULL) {
        WordLength k = 0;
        PyListObject * sizes_list_o = NULL;
Do you know who owns the reference to the list that comes from PyArg_ParseTuple?
PyArg_ParseTuple doesn't increase the reference count, and presumably the ref count can't decrease to zero while we are using an argument to this function.
Yep (or at least not with the GIL held).
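The ownership question in this thread can be illustrated with a plain C++ analogy. This is not CPython code: `std::shared_ptr` stands in for an object's reference count, a raw pointer stands in for the borrowed reference that the callee receives, and `sum_sizes` and `caller_total` are hypothetical names. The sketch assumes, as in the discussion, that the caller keeps its own reference alive for the whole call.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// The callee receives a non-owning ("borrowed") pointer. It neither
// increments nor decrements any reference count, so it must not keep the
// pointer after it returns.
long sum_sizes(const std::vector<long>* sizes)
{
    long total = 0;
    for (long s : *sizes) {
        total += s;
    }
    return total;
}

// The caller owns the only reference for the duration of the call, so the
// object cannot be destroyed while the callee is using it.
long caller_total()
{
    auto sizes = std::make_shared<std::vector<long>>(std::vector<long>{3, 5, 7});
    const long before = sizes.use_count();
    const long total = sum_sizes(sizes.get());  // borrow, do not share ownership
    assert(sizes.use_count() == before);        // the count is unchanged by the call
    return total;
}
```

If the callee needed to store the object beyond the call, it would have to take its own reference, which in this analogy means copying the `shared_ptr` instead of holding the raw pointer.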
Let's go ahead and push the magic merge button when the build ends! (But please don't squash commits :)
Doh, I was too late :(
wassup @camillescott?
Briefly, this PR:
These changes make a clear distinction between 'tables' and 'graphs' - tables have all of
the basic functionality needed for counting, while graphs support various traversal methods.
This paves the way for:
(a) adding new hash functions, including irreversible ones supporting k > 32 for the *table objects; and
(b) building out a new Counttable CPython object that will support the non-graph operations for k > 32.
(I don't like the 'Counttable' name that much but it fits with Countgraph. I suppose we could do Countstable. But that might engender confusion. Thoughts?)
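The table/graph split described above could be sketched roughly as follows. `ToyCounttable` and `ToyCountgraph` are hypothetical illustrations, not khmer's classes: the toy table stores k-mers as strings in a `std::unordered_map` rather than hashing them into fixed-size tables, and the only traversal shown is a right-neighbor lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical "table": only the counting operations, nothing graph-like.
class ToyCounttable {
public:
    explicit ToyCounttable(std::size_t k) : k_(k) {}
    virtual ~ToyCounttable() = default;

    std::size_t ksize() const { return k_; }

    void count(const std::string& kmer) { ++counts_[kmer]; }

    unsigned get_count(const std::string& kmer) const
    {
        auto it = counts_.find(kmer);
        return it == counts_.end() ? 0u : it->second;
    }

    // Count every k-mer in a sequence; returns the number of k-mers consumed.
    std::size_t consume_string(const std::string& seq)
    {
        std::size_t n = 0;
        for (std::size_t i = 0; i + k_ <= seq.size(); ++i, ++n) {
            count(seq.substr(i, k_));
        }
        return n;
    }

private:
    std::size_t k_;
    std::unordered_map<std::string, unsigned> counts_;
};

// Hypothetical "graph": everything the table does, plus traversal.
class ToyCountgraph : public ToyCounttable {
public:
    using ToyCounttable::ToyCounttable;

    // Return the k-mers present in the table that overlap `kmer` by k-1
    // bases on the right-hand side.
    std::vector<std::string> right_neighbors(const std::string& kmer) const
    {
        assert(kmer.size() == ksize() && !kmer.empty());
        std::vector<std::string> found;
        const std::string suffix = kmer.substr(1);
        for (char base : std::string("ACGT")) {
            const std::string candidate = suffix + base;
            if (get_count(candidate) > 0) {
                found.push_back(candidate);
            }
        }
        return found;
    }
};
```

In this toy version, code that only needs counts can accept a `ToyCounttable&`, while traversal code asks for a `ToyCountgraph&`.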
make test: Did it pass the tests?
make clean diff-cover: If it introduces new functionality in scripts/, is it tested?
make format diff_pylint_report cppcheck doc pydocstyle: Is it well formatted?
without a major version increment. Changing file formats also requires a
major version number increment.
ChangeLog? http://en.wikipedia.org/wiki/Changelog#Format
changes were made?
tested for streaming IO?)