Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

trie: fix a temporary memory leak in the memcache #17111

Merged
merged 1 commit into from
Jul 3, 2018

Conversation

karalabe
Copy link
Member

@karalabe karalabe commented Jul 2, 2018

This PR is a 3 liner fix. The rest is just a repro/verification.

We've received from time to time reports that Geth closes with ERROR[...] Dangling trie nodes after full cleanup. We've been hacking on the memcache for quite a lot, so I'm unsure if the old issues are the same as the one this PR fixes, but this one addesses specifically an issue where certain nodes remain in the memory cache even though they have no more references left.

The issue happens when a trie node exists on disk (never loaded or already committed), and it's recreated by the trie again (short node split into full, and them merged back into short). This will cause the node to be inserted into the memcache, with a potentially invalid parent count (0). Dereferencing this trie node form memory will first overflow its counter to MAXINT, thus never cleaning it out.

The node still remains part of the flush-list, so it will eventually be pushed out to disk, but in the mean time it's a dangling node. The PR fixes it by adding an explicit check for the 0-parent case.

@karalabe karalabe added this to the 1.8.12 milestone Jul 2, 2018
@karalabe karalabe requested a review from fjl July 2, 2018 09:30
Copy link
Contributor

@fjl fjl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand how a node with zero parents can get into the Database, but the fix looks OK. Maybe the root cause is a missing dereference call?

@karalabe
Copy link
Member Author

karalabe commented Jul 3, 2018

The node is already in the database, and the same node is recreated (not loaded). That will result in it being inserted into the memcache again.

@karalabe karalabe merged commit 67a7857 into ethereum:master Jul 3, 2018
@emile
Copy link
Contributor

emile commented Jul 20, 2018

I'm trying out Streaming chain tracers but get an error on any commit from 67a7857 onwards when this fix was merged:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x18 pc=0x69cb7e]

goroutine 130 [running]:
github.com/ethereum/go-ethereum/trie.(*Database).reference(...)
        github.com/ethereum/go-ethereum/trie/database.go:423
github.com/ethereum/go-ethereum/trie.(*Database).Reference(0xc420648870, 0x64044303454d7ed6, 0x855343477142ae25, 0xe6dda1c0db60b87a, 0xf32b533acdb5414b, 0x0, 0x0, 0x0, 0x0)
        github.com/ethereum/go-ethereum/trie/database.go:412 +0x18e
github.com/ethereum/go-ethereum/eth.(*PrivateDebugAPI).traceChain.func2(0xc42008fc80, 0xc42020a890, 0xc42032a1b0, 0xc4205ce240, 0xbecc99cb06ef0fd6, 0x1e8ead4f0, 0x1b20400, 0xc42008fce0, 0xc42055c4c0, 0x0, ...)
        github.com/ethereum/go-ethereum/eth/api_tracer.go:295 +0x894
created by github.com/ethereum/go-ethereum/eth.(*PrivateDebugAPI).traceChain
        github.com/ethereum/go-ethereum/eth/api_tracer.go:220 +0x698

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants