trie: fix a temporary memory leak in the memcache #17111
Merged
This PR is a 3-line fix. The rest is just a repro/verification.
We've received reports from time to time that Geth closes with:
ERROR[...] Dangling trie nodes after full cleanup
We've been hacking on the memcache quite a lot, so I'm unsure whether the old issues are the same as the one this PR fixes, but this one specifically addresses an issue where certain nodes remain in the memory cache even though they have no references left.
The issue happens when a trie node exists on disk (never loaded or already committed) and is recreated by the trie (a short node split into a full node, and then merged back into a short node). This causes the node to be inserted into the memcache with a potentially invalid parent count (0). Dereferencing this trie node from memory will first underflow its counter, wrapping it to MAXINT, so it is never cleaned out.
The node still remains part of the flush-list, so it will eventually be pushed out to disk, but in the meantime it's a dangling node. The PR fixes it by adding an explicit check for the 0-parent case.
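To illustrate the failure mode described here, the following is a minimal sketch and not the actual Geth code. The types `cachedNode` and `memCache` and the methods `derefUnchecked` and `derefChecked` are hypothetical names invented for this example. It assumes the parent count is an unsigned integer, so decrementing it from 0 wraps around to the maximum value instead of going negative.

```go
package main

import "fmt"

// cachedNode is a hypothetical, simplified stand-in for a memcache entry.
type cachedNode struct {
	parents uint32 // number of cached nodes referencing this one
}

// memCache is a hypothetical reference-counted node cache keyed by hash.
type memCache struct {
	nodes map[string]*cachedNode
}

// newCache returns a cache holding a single node with zero parents,
// mimicking a node that was re-inserted after already existing on disk.
func newCache(hash string) *memCache {
	return &memCache{nodes: map[string]*cachedNode{hash: {parents: 0}}}
}

// derefUnchecked decrements unconditionally. With parents == 0 the
// unsigned counter wraps to its maximum value, so the node is never
// seen as unreferenced and stays in the cache.
func (c *memCache) derefUnchecked(hash string) {
	n, ok := c.nodes[hash]
	if !ok {
		return
	}
	n.parents--
	if n.parents == 0 {
		delete(c.nodes, hash)
	}
}

// derefChecked only decrements when there is something to decrement,
// which is the kind of explicit 0-parent check the PR describes.
func (c *memCache) derefChecked(hash string) {
	n, ok := c.nodes[hash]
	if !ok {
		return
	}
	if n.parents > 0 {
		n.parents--
	}
	if n.parents == 0 {
		delete(c.nodes, hash)
	}
}

func main() {
	leaky := newCache("node")
	leaky.derefUnchecked("node")
	_, dangling := leaky.nodes["node"]
	fmt.Println("unchecked: node still cached:", dangling)

	fixed := newCache("node")
	fixed.derefChecked("node")
	_, dangling = fixed.nodes["node"]
	fmt.Println("checked: node still cached:", dangling)
}
```

In this sketch the unchecked variant leaves the node in the map with a wrapped counter, while the checked variant removes it. The real memcache has more state (such as the flush-list mentioned above), which is left out here.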