Use reference counting to avoid over-pruning trie #93
Am I correct that this will not work unless the database is empty when it is given to the Trie
class? I.e., we cannot use this in Trinity with our LevelDB database?
Exactly, that's what I was trying to get at with the docstring:
Any idea how to say that more clearly? FWIW, not being able to prune an on-disk DB was already an issue. It's not made worse by the current PR (hopefully made a little better by at least mentioning it in a docstring). In practice, we have only ever used pruning for a fresh in-memory database, or in a context like
I'm inclined to just deprecate and remove the pruning feature since:
ergo... our pruning feature isn't useful and adding complexity to support it doesn't seem worth it.
Two come to mind:
It's possible that we might be able to hide away the
This direction seems preferable.
The current pruner can incorrectly remove a node that is still needed when the same node hash appears more than once in the trie. See the new test: test_hexary_trie_avoid_over_pruning(). Added an incremental reference counter, which prunes a node only when its number of usages drops to zero. It only works on fresh databases; with existing databases, it would still delete required nodes. Also, skip an unnecessary database write, which would now incorrectly increment the reference counter and break the reference counting.
Cool, I noted it as not for external usage, and that it is likely to be deprecated. I also added an issue to change how pruning is enabled, to make it an internal API.
trie/hexary.py
        self._prune_key(prune_key)

    def _prune_key(self, key):
        self.ref_count[key] -= 1
Maybe this should have an existence check: if the key is not in the database, the decrement would leave the refcount negative.
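A minimal sketch of what such an existence check could look like. It assumes `ref_count` is a `defaultdict(int)` and the database is a plain dict; the standalone `prune_key` function and its return value are hypothetical, not the code in this PR.

```python
from collections import defaultdict


def prune_key(db, ref_count, key):
    """Hypothetical guarded variant of _prune_key; returns True if the node was deleted."""
    if key not in db:
        # Nothing is stored under this key, so leave the counter untouched
        # instead of driving it below zero.
        return False
    ref_count[key] -= 1
    if ref_count[key] <= 0:
        # No remaining usages: drop both the node and its counter entry.
        del db[key]
        ref_count.pop(key, None)
        return True
    return False


db = {b"k": b"node"}
ref_count = defaultdict(int, {b"k": 1})
missing = prune_key(db, ref_count, b"absent")  # key was never stored
removed = prune_key(db, ref_count, b"k")       # last usage released
```

With the guard in place, pruning an unknown key is a no-op and never creates a negative counter entry.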
…sion bump sphinx version and set py version rtd uses to 3.8
What was wrong?
Fixes #92
The current pruner can incorrectly remove a node that is still needed
when the same node hash appears more than once in the trie. See the new test:
test_hexary_trie_avoid_over_pruning()
How was it fixed?
Added an incremental reference counter, which only prunes nodes when the
number of usages drops to zero. It only works on fresh databases. With
existing databases, it would still delete required nodes.
Also, it skips an unnecessary database write, which would incorrectly
increment the reference counter.
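To illustrate the idea, here is a rough sketch of reference-counted pruning over an in-memory store. The `RefCountedStore` class, its `put` and `prune` methods, and the use of SHA-256 as a stand-in for the trie's node hash are all assumptions made for this example, not py-trie's actual API.

```python
from collections import defaultdict
import hashlib


class RefCountedStore:
    """Hypothetical in-memory node store keyed by the hash of the node body."""

    def __init__(self):
        self.db = {}
        self.ref_count = defaultdict(int)

    def put(self, node_body: bytes) -> bytes:
        # SHA-256 is only a stand-in hash for this sketch.
        key = hashlib.sha256(node_body).digest()
        self.db[key] = node_body
        self.ref_count[key] += 1  # one more usage of this node
        return key

    def prune(self, key: bytes) -> None:
        self.ref_count[key] -= 1
        if self.ref_count[key] <= 0:
            # Delete only once the last usage is released.
            self.db.pop(key, None)
            del self.ref_count[key]


store = RefCountedStore()
first = store.put(b"leaf")
second = store.put(b"leaf")       # duplicated node: same body, same hash
store.prune(first)
still_present = first in store.db  # one usage remains, so the node survives
store.prune(second)
gone = first not in store.db       # last usage released, node deleted
```

The sketch also shows why the approach only works on fresh databases: a node that was already on disk before counting began would start at a count of zero, so the first prune would delete it even if other parts of the trie still referenced it.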