Hexary trie prunes when it shouldn't #92
As a workaround, you should be able to get it working with:

import rlp
from trie import HexaryTrie

ALL_ZEROS = b'\x00' * 32

def add_values_to_trie(trie, values):
    for index, value in enumerate(values):
        index_key = rlp.encode(index, sedes=rlp.sedes.big_endian_int)
        trie[index_key] = value

def make_trie_root_and_nodes(values):
    kv_store = {}
    trie = HexaryTrie(kv_store)
    with trie.squash_changes() as memory_trie:
        add_values_to_trie(memory_trie, values)
    return trie.root_hash, kv_store

make_trie_root_and_nodes([ALL_ZEROS] * 128)
|
Unfortunately that throws the same exception, because squash_changes() also sets prune=True:

def squash_changes(self):
    scratch_db = ScratchDB(self.db)
    with scratch_db.batch_commit(do_deletes=self.is_pruning):
        memory_trie = type(self)(scratch_db, self.root_hash, prune=True)
        yield memory_trie
    try:
        self.root_node = memory_trie.root_node
    except MissingTrieNode:
        # if the new root node is missing,
        # (or no changes happened in a squash trie where the old root node was missing),
        # then we shouldn't crash here
        self.root_hash = memory_trie.root_hash

If you run your workaround, but with 129 hashes instead of 128, it throws the same exception:

make_trie_root_and_nodes([ALL_ZEROS] * 129)
|
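To illustrate why turning pruning off on the outer trie may not help here, below is a minimal sketch of a scratch layer that buffers writes and deletes and only applies them to the backing store on exit. ScratchLayer and deleted_key_is_hidden are hypothetical names made up for this sketch; this is not py-trie's ScratchDB, and it assumes that a key deleted inside the batch becomes unreadable immediately, regardless of do_deletes.

```python
from contextlib import contextmanager


class ScratchLayer:
    """Hypothetical stand-in for a scratch DB (not py-trie's ScratchDB)."""

    def __init__(self, backing):
        self.backing = backing
        self.pending = {}     # writes buffered until the batch commits
        self.deleted = set()  # keys deleted inside the batch

    def __getitem__(self, key):
        # Assumption: a key deleted in the scratch layer is hidden at once,
        # even though the backing store still holds it.
        if key in self.deleted:
            raise KeyError(key)
        if key in self.pending:
            return self.pending[key]
        return self.backing[key]

    def __setitem__(self, key, value):
        self.deleted.discard(key)
        self.pending[key] = value

    def __delitem__(self, key):
        self.pending.pop(key, None)
        self.deleted.add(key)

    @contextmanager
    def batch_commit(self, do_deletes):
        yield self
        # Writes are always flushed; deletes only when requested.
        self.backing.update(self.pending)
        if do_deletes:
            for key in self.deleted:
                self.backing.pop(key, None)


def deleted_key_is_hidden(do_deletes):
    """Delete a key inside the batch and report whether reads then fail."""
    backing = {b"node": b"payload"}
    scratch = ScratchLayer(backing)
    with scratch.batch_commit(do_deletes=do_deletes) as db:
        del db[b"node"]
        try:
            db[b"node"]
            hidden = False
        except KeyError:
            hidden = True
    return hidden, b"node" in backing
```

Under that assumption, a trie that prunes inside the batch can lose a node it still needs before the commit happens, so do_deletes=False only protects the backing store, not reads made during the batch.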
Ah, great. Off-by-one will get you every time. I have it reproduced locally, and will take a look tomorrow. |
Yup, when storing identical values, with pruning on, this is bound to happen. I don't see a clean, simple way around this problem. Roughly, it's that you can have multiple subtrees that are identical, so removing one kills the other. In the following tree, if different letters are different hashes:
Removing
Reference counting seems like the obvious solution to this. I don't think it needs to be an enormous change, although sometimes you don't know until you dig in, and I don't expect it to be a tiny change either. It will have a performance impact as well, especially if you want to enable pruning when starting from a non-empty trie (which means you have to store the reference counts in the database).
You shouldn't run into this if you are always adding unique values, so we may add some kind of uniqueness option that would allow you to prune at the same speed as today by ignoring ref-counting. |
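A rough sketch of the idea, not the actual fix: in a content-addressed store, two identical subtrees share one database key, so pruning on behalf of one parent removes the node for the other. NaiveStore, RefCountedStore and survives_one_prune are hypothetical names for illustration only, and sha256 stands in for the trie's real node hash to keep the sketch stdlib-only.

```python
import hashlib


def node_hash(encoded_node: bytes) -> bytes:
    # Stand-in for the trie's node hash; sha256 is an assumption of this sketch.
    return hashlib.sha256(encoded_node).digest()


class NaiveStore:
    """Content-addressed store that deletes on the first prune request."""

    def __init__(self):
        self.db = {}

    def put(self, encoded_node: bytes) -> bytes:
        key = node_hash(encoded_node)
        self.db[key] = encoded_node
        return key

    def prune(self, key: bytes) -> None:
        self.db.pop(key, None)


class RefCountedStore(NaiveStore):
    """Same store, but a node is deleted only when its last reference goes."""

    def __init__(self):
        super().__init__()
        self.refs = {}

    def put(self, encoded_node: bytes) -> bytes:
        key = super().put(encoded_node)
        self.refs[key] = self.refs.get(key, 0) + 1
        return key

    def prune(self, key: bytes) -> None:
        remaining = self.refs.get(key, 0) - 1
        if remaining > 0:
            self.refs[key] = remaining
        else:
            self.refs.pop(key, None)
            self.db.pop(key, None)


def survives_one_prune(store) -> bool:
    # Two parents store an identical leaf, so both end up with the same hash.
    leaf = b"leaf:" + b"\x00" * 32
    first = store.put(leaf)
    second = store.put(leaf)
    assert first == second
    store.prune(first)         # one parent drops its reference
    return second in store.db  # can the other parent still read the node?
```

survives_one_prune(NaiveStore()) comes back False while survives_one_prune(RefCountedStore()) comes back True. The extra refs dictionary is also where the cost shows up: for a trie that starts non-empty, those counts would have to be persisted alongside the nodes.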
Also @hyperevo you mentioned that this was happening in py-evm:
I am not sure how this could happen, because you aren't allowed to insert duplicates of the same transaction into the block. Was this happening even without a duplicate value? |
I expect #93 to resolve the problem. It would be great if you could pull my branch to test it out. |
Thanks for such a quick response! I will try your fix asap. |
Sorry for the delay. I got caught up with other work. I tried your solution and it works. Thank you for that fix! |
What was wrong?
When I try to create a trie from a list of identical hashes, using make_trie_root_and_nodes from py-evm, I get the exception: trie.exceptions.MissingTrieNode: Trie database is missing hash.
This only occurs if the HexaryTrie has prune=True.
It looks like a leaf was pruned when it shouldn't have been.
The exception occurs on the 129th iteration of the loop.
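One detail about the 129th iteration that may or may not be related: with RLP integer encoding, indexes 0 through 127 all encode to a one-byte key, and index 128 is the first to need two bytes. The encoder below is a hand-rolled sketch for short integers only, not the rlp library, and rlp_encode_uint and key_lengths are hypothetical helper names.

```python
def rlp_encode_uint(value: int) -> bytes:
    """Hand-rolled RLP encoding of a non-negative integer below 2**440."""
    if value < 0:
        raise ValueError("RLP integers must be non-negative")
    # Big-endian bytes with no leading zeros; zero becomes the empty string.
    payload = value.to_bytes((value.bit_length() + 7) // 8, "big")
    if len(payload) == 1 and payload[0] < 0x80:
        return payload  # a single byte below 0x80 encodes as itself
    if len(payload) > 55:
        raise ValueError("long-string form is left out of this sketch")
    # Short string form: a length prefix followed by the payload.
    return bytes([0x80 + len(payload)]) + payload


def key_lengths(count: int) -> set:
    """Distinct key lengths produced for indexes 0..count-1."""
    return {len(rlp_encode_uint(i)) for i in range(count)}


print(rlp_encode_uint(0))    # b'\x80'
print(rlp_encode_uint(127))  # b'\x7f'
print(rlp_encode_uint(128))  # b'\x81\x80'
```

So key_lengths(128) is {1} and key_lengths(129) is {1, 2}: the 129th insert is the first one whose key is longer than the keys already in the trie.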
What did you expect it to do?
Expected it to add the hashes to the trie database like normal so that I can get a root hash and nodes.
Code to reproduce the error