-
Notifications
You must be signed in to change notification settings - Fork 912
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix slow memleak in DEVELOPER mode #4931
Fix slow memleak in DEVELOPER mode #4931
Conversation
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Example output on my production machine, running v0.10.2rc1-9-g3d8293d: ``` Top sizes by name backtrace: 1172021248 bytes, 4578208 items wallet/txfilter.c:111:struct bitcoin_outpoint: 49584636 bytes, 1377351 items lightningd/plugin.c:1651:char[]: 17261056 bytes, 16 items lightningd/log.c:282:char[]: 16653908 bytes, 838086 items wally_tal: 13390257 bytes, 122503 items lightningd/log.c:250:struct log_entry[]: 7340032 bytes, 1 items lightningd/jsonrpc.c:939:char[]: 6979648 bytes, 214 items lightningd/log.c:440:u8[]: 3278359 bytes, 1338 items common/memleak.c:51:struct memleak_notleak: 2216763 bytes, 2216763 items Top sizes by path ['']: 2579337788 bytes, 1 items ['', 'lightningd/lightningd.c:103:struct lightningd']: 831485962 bytes, 1 items ['', 'lightningd/lightningd.c:103:struct lightningd', 'wallet/wallet.c:77:struct wallet']: 756169631 bytes, 1 items ['', 'lightningd/lightningd.c:103:struct lightningd', 'wallet/wallet.c:77:struct wallet', 'wallet/txfilter.c:133:struct outpointfilter']: 756166227 bytes, 2 items ['', 'lightningd/lightningd.c:103:struct lightningd', 'wallet/wallet.c:77:struct wallet', 'wallet/txfilter.c:133:struct outpointfilter', 'wallet/txfilter.c:134:struct outpointset']: 706581191 bytes, 2 items ['', 'ccan/ccan/tal/link/link.c:40:struct linkable']: 457580181 bytes, 1 items ['', 'ccan/ccan/tal/link/link.c:40:struct linkable', 'lightningd/log.c:236:struct log_book']: 430307262 bytes, 1 items ['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/chaintopology.c:1061:struct chain_topology']: 48054597 bytes, 1 items ['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/chaintopology.c:1061:struct chain_topology', 'lightningd/chaintopology.c:853:struct block']: 47076437 bytes, 1 items ['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/chaintopology.c:1061:struct chain_topology', 'lightningd/chaintopology.c:853:struct block', 'bitcoin/block.c:210:struct bitcoin_tx *[]']: 47034285 bytes, 1 items ['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/plugin.c:77:struct plugins']: 17569217 bytes, 1 items ['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/plugin.c:77:struct plugins', 'lightningd/plugin.c:246:struct plugin', 'lightningd/plugin.c:1651:char[]']: 16784384 bytes, 5 items ['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/chaintopology.c:1061:struct chain_topology', 'lightningd/chaintopology.c:853:struct block', 'bitcoin/block.c:210:struct bitcoin_tx *[]', 'bitcoin/tx.c:605:struct bitcoin_tx']: 13730127 bytes, 452 items ['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/jsonrpc.c:1189:struct jsonrpc']: 7745890 bytes, 1 items ['', 'ccan/ccan/tal/link/link.c:40:struct linkable', 'lightningd/log.c:236:struct log_book', 'lightningd/log.c:250:struct log_entry[]']: 7340032 bytes, 1 items ['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/chaintopology.c:1061:struct chain_topology', 'lightningd/chaintopology.c:853:struct block', 'bitcoin/block.c:210:struct bitcoin_tx *[]', 'bitcoin/tx.c:605:struct bitcoin_tx', 'wally_tal']: 4480328 bytes, 166 items ['', 'tmpctx']: 0 bytes, 1 items ```
We now reach into the uintmap, so this is unnecesary. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
a464fee
to
7cf4838
Compare
static void log_prefix_drop(struct log_prefix *lp) | ||
{ | ||
if (--lp->refcnt == 0) | ||
tal_free(lp); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like Valgrind considered this as invalid behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and it's right: access after free. We need to do this a bit differently, will address tomorrow!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For sure this is a very stupid question, but I'm missing the point of the code that makes valid happy again.
I know, it is stupid (maybe)
Do proper refcounting on log prefixes; previously we kept them around, which is fine, but the extra notleak() and backtrace in developer mode could get quite heavy (I ended up with 1G of backtraces!). This is mainly due to creating one on every JSONRPC command, and running clboss. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au> Changelog-Fixed: lightningd: remove slow memory leak in DEVELOPER builds.
This lets us mark it directly. Get rid of long-unused "notleaks" member of struct lightningd too. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This happened in my tal_dump(), and I couldn't see how we ended up with object having more than one "backtrace". Adding asserts that we never added a second backtrace didn't trigger. Finally I wondered if we were tal_steal() backtraces, and sure enough we do that blinding in one place: libwally wrapping. So fix that. Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
7cf4838
to
867b7da
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can not wait until tomorrow to review this PR, this memory stuff is so cool.
ack 867b7da
Valgrind is happy again, and CI looks like an unrelated failure, some daemon dead during the test.
Wondering if we try to downgrade the number of parallel tests in the CI and see if the memory consumption is the problem?
*shivers in 20 years of Assembly, C, and C++ coding* |
1G of backtraces seems a bit excessive after 1 month of running!