Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix slow memleak in DEVELOPER mode #4931

Merged
merged 6 commits into from
Nov 27, 2021

Conversation

rustyrussell
Copy link
Contributor

1G of backtraces seems a bit excessive after 1 month of running!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Example output on my production machine, running v0.10.2rc1-9-g3d8293d:

```
Top sizes by name
backtrace: 1172021248 bytes, 4578208 items
wallet/txfilter.c:111:struct bitcoin_outpoint: 49584636 bytes, 1377351 items
lightningd/plugin.c:1651:char[]: 17261056 bytes, 16 items
lightningd/log.c:282:char[]: 16653908 bytes, 838086 items
wally_tal: 13390257 bytes, 122503 items
lightningd/log.c:250:struct log_entry[]: 7340032 bytes, 1 items
lightningd/jsonrpc.c:939:char[]: 6979648 bytes, 214 items
lightningd/log.c:440:u8[]: 3278359 bytes, 1338 items
common/memleak.c:51:struct memleak_notleak: 2216763 bytes, 2216763 items
Top sizes by path
['']: 2579337788 bytes, 1 items
['', 'lightningd/lightningd.c:103:struct lightningd']: 831485962 bytes, 1 items
['', 'lightningd/lightningd.c:103:struct lightningd', 'wallet/wallet.c:77:struct wallet']: 756169631 bytes, 1 items
['', 'lightningd/lightningd.c:103:struct lightningd', 'wallet/wallet.c:77:struct wallet', 'wallet/txfilter.c:133:struct outpointfilter']: 756166227 bytes, 2 items
['', 'lightningd/lightningd.c:103:struct lightningd', 'wallet/wallet.c:77:struct wallet', 'wallet/txfilter.c:133:struct outpointfilter', 'wallet/txfilter.c:134:struct outpointset']: 706581191 bytes, 2 items
['', 'ccan/ccan/tal/link/link.c:40:struct linkable']: 457580181 bytes, 1 items
['', 'ccan/ccan/tal/link/link.c:40:struct linkable', 'lightningd/log.c:236:struct log_book']: 430307262 bytes, 1 items
['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/chaintopology.c:1061:struct chain_topology']: 48054597 bytes, 1 items
['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/chaintopology.c:1061:struct chain_topology', 'lightningd/chaintopology.c:853:struct block']: 47076437 bytes, 1 items
['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/chaintopology.c:1061:struct chain_topology', 'lightningd/chaintopology.c:853:struct block', 'bitcoin/block.c:210:struct bitcoin_tx *[]']: 47034285 bytes, 1 items
['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/plugin.c:77:struct plugins']: 17569217 bytes, 1 items
['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/plugin.c:77:struct plugins', 'lightningd/plugin.c:246:struct plugin', 'lightningd/plugin.c:1651:char[]']: 16784384 bytes, 5 items
['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/chaintopology.c:1061:struct chain_topology', 'lightningd/chaintopology.c:853:struct block', 'bitcoin/block.c:210:struct bitcoin_tx *[]', 'bitcoin/tx.c:605:struct bitcoin_tx']: 13730127 bytes, 452 items
['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/jsonrpc.c:1189:struct jsonrpc']: 7745890 bytes, 1 items
['', 'ccan/ccan/tal/link/link.c:40:struct linkable', 'lightningd/log.c:236:struct log_book', 'lightningd/log.c:250:struct log_entry[]']: 7340032 bytes, 1 items
['', 'lightningd/lightningd.c:103:struct lightningd', 'lightningd/chaintopology.c:1061:struct chain_topology', 'lightningd/chaintopology.c:853:struct block', 'bitcoin/block.c:210:struct bitcoin_tx *[]', 'bitcoin/tx.c:605:struct bitcoin_tx', 'wally_tal']: 4480328 bytes, 166 items
['', 'tmpctx']: 0 bytes, 1 items
```
We now reach into the uintmap, so this is unnecesary.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Comment on lines +73 to +77
static void log_prefix_drop(struct log_prefix *lp)
{
if (--lp->refcnt == 0)
tal_free(lp);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like Valgrind considered this as invalid behavior?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, and it's right: access after free. We need to do this a bit differently, will address tomorrow!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For sure this is a very stupid question, but I'm missing the point of the code that makes valid happy again.

I know, it is stupid (maybe)

Do proper refcounting on log prefixes; previously we kept them around,
which is fine, but the extra notleak() and backtrace in developer mode
could get quite heavy (I ended up with 1G of backtraces!).  This is
mainly due to creating one on every JSONRPC command, and running
clboss.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: lightningd: remove slow memory leak in DEVELOPER builds.
This lets us mark it directly.

Get rid of long-unused "notleaks" member of struct lightningd too.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
This happened in my tal_dump(), and I couldn't see how we ended up
with object having more than one "backtrace".  Adding asserts that we
never added a second backtrace didn't trigger.

Finally I wondered if we were tal_steal() backtraces, and sure enough
we do that blinding in one place: libwally wrapping.  So fix that.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Copy link
Contributor

@vincenzopalazzo vincenzopalazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can not wait until tomorrow to review this PR, this memory stuff is so cool.

ack 867b7da

Valgrind is happy again, and CI looks like an unrelated failure, some daemon dead during the test.

Wondering if we try to downgrade the number of parallel tests in the CI and see if the memory consumption is the problem?

@ZmnSCPxj
Copy link
Contributor

memory stuff is so cool.

*shivers in 20 years of Assembly, C, and C++ coding*

@rustyrussell rustyrussell merged commit c1f534a into ElementsProject:master Nov 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants