Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Log filtering by node id. #5349

Merged

Conversation

rustyrussell
Copy link
Contributor

No description provided.

@rustyrussell
Copy link
Contributor Author

Ack 9d0e1d8

Trivial rebase on master.

@cdecker
Copy link
Member

cdecker commented Jul 4, 2022

One spurious change that points to a non-existing file, and some musings re efficiency. At some point we'll have reinvented the structured logging ideas from a decade ago, and we'll just let the operator give us an expression to filter by and a printf string to format the various components xD

cdecker
cdecker previously requested changes Jul 4, 2022
Copy link
Member

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That spurious change needs to be removed

lightningd/log.c Show resolved Hide resolved
doc/index.rst Outdated
@@ -114,3 +114,4 @@ Core Lightning Documentation
lightning-withdraw <lightning-withdraw.7.md>
lightningd <lightningd.8.md>
lightningd-config <lightningd-config.5.md>
lightningd-json <lightningd-json.7.md>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file does not exist.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, was on a different branch, it didn't get re-regenerated when I switched. Gah! thanks

@rustyrussell rustyrussell force-pushed the guilt/log-filtering branch 2 times, most recently from ab09252 to 7edd0ef Compare July 6, 2022 01:43
@rustyrussell rustyrussell force-pushed the guilt/log-filtering branch 2 times, most recently from bb399dd to 0b5aa93 Compare July 7, 2022 03:20
This expands update-mocks to be able to handle (simple!) missing
symbols which are not functions.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
That's useful for "tell me everything about this node" debugging.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Fixes: ElementsProject#5348
Changelog-Added: lightningd: `log-level=debug:<partial-nodeid>` supported to get debug-level logs for everything about a peer.
Looking at the CI logs, it seems like it took over 5 seconds, so
the unilateral close occurred instead of the expected rejection
of the WIRE_SHUTDOWN reply.  Make it bulletproof.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Since we now log directly, we don't prepend the prefix ourselves, making it really
hard to tell *which* lightningd the log applies to!

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Noticed by log-prefix default (""), which causes bad JSON:

```
    Malformed JSON reply '{"jsonrpc":"2.0","id":0,"result":{...
    "log-prefix":,"log-file":"log"} }
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
…cted.

It actually only sets the prefix for the lightningd core log messages;
the other logs have their own prefix.

Make it a real, process-wide prefix which actually goes in front of the timestamp.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Changelog-Fixed: options: `log-prefix` now correctly prefixes *all* log messages.
Since subd is temporarily unreferenced, and our leak detection goes up
to the highest unreferenced parent, it complains about openingd->conn.

The correct fix is to catch this where we detach the subd:

```
DEBUG:root:{
  "id": 1,
  "result": {
    "leaks": [
      {
        "value": "0x556e0445d8f8",
        "label": "ccan/ccan/io/io.c:91:struct io_conn",
        "backtrace": [
          "ccan/ccan/tal/tal.c:442 (tal_alloc_)",
          "ccan/ccan/io/io.c:91 (io_new_conn_)",
          "lightningd/subd.c:774 (new_subd)",
          "lightningd/subd.c:828 (new_channel_subd_)",
          "lightningd/opening_control.c:872 (peer_start_openingd)",
          "lightningd/peer_control.c:1311 (peer_active)",
          "lightningd/connect_control.c:458 (connectd_msg)",
          "lightningd/subd.c:557 (sd_msg_read)",
          "lightningd/subd.c:357 (read_fds)",
          "ccan/ccan/io/io.c:59 (next_plan)",
          "ccan/ccan/io/io.c:407 (do_plan)",
          "ccan/ccan/io/io.c:417 (io_ready)",
          "ccan/ccan/io/poll.c:453 (io_loop)",
          "lightningd/io_loop_with_timers.c:22 (io_loop_with_timers)",
          "lightningd/lightningd.c:1182 (main)",
          "../csu/libc-start.c:308 (__libc_start_main)"
        ],
        "parents": [
          "lightningd/lightningd.c:107:struct lightningd"
        ]
      }
    ]
  }
}
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Because we expire cache every flush, in DEVELOPER mode that can happen in
just over a second.  If gossipd takes a while to process the gossip,
this can mean we actually forget we received it from the peer.

Easiest fix is to run this test in non-DEVELOPER mode.

```       # With DEVELOPER, this is long enough for gossip flush.
        time.sleep(2)
>       assert not l3.daemon.is_in_log(r'\[OUT\] 0100')
E       AssertionError: assert not '2022-06-30T06:00:31.031Z 022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d59-connectd: [OUT] 0100294834e94aa2407ace65373c84c2a8056d67a6778c22bd43f03a9000d8a01f075c3996ab6c88951bd9805adbfde78292022d7514ffcec37d5e466ef5ebd235c87ffa8a8567904502cd65ae913dbb8014509f442a6e37b00fd59fa3a06811a4965f45df1612d3a32b2566001430cfec4346cc021bd0a1bf6a87417dbc29e0715f16c4f9dc5a01bff07368ba8f54cf9813c3fc8f9f8b8c8005da2a18021f322a9a168af8f82d9deba49ced51bb26a9abc001bbad314a15baccc87c685e9302533c2e86a7b9dfe90bd4cb6be5c2da375be8c4a535a63cacff0544957a34ca865fdb61f9198d19dfd990a0fcbf8de39fa2c0fc54435c16e74df2b49fe3b6e905d599000006226e46111a0b59caaf126043eb5bbf28c34f3a5e332a1fc7b2b73cf188910f0000670000010001022d223620a359a47ff7f7ac447c85c46c923da53389221a0054c11c1e3ca31d590266e4598d1d3c415f572a8488830b60f7e744ed9235eb0b1ba93283b315c0351802e3bd38009866c9da8ec4aa99cc4ea9c6c0dd46df15c61ef0ce1f271291714e5702324266de8403b3ab157a09f1f784d587af61831c998c151bcc21bb74c2b2314b'
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
Caused a crash in CI, reproduced under valgrind by calling
any_channel_by_scid from io_poll_lightningd:

```
==2422524== Conditional jump or move depends on uninitialised value(s)
==2422524==    at 0x12C98D: any_channel_by_scid (channel.c:606)
==2422524==    by 0x14FF75: io_poll_lightningd (lightningd.c:682)
==2422524==    by 0x225FDE: io_loop (poll.c:420)
==2422524==    by 0x14A914: io_loop_with_timers (io_loop_with_timers.c:22)
==2422524==    by 0x150C4E: main (lightningd.c:1193)
==2422524==  Uninitialised value was created by a heap allocation
==2422524==    at 0x483B7F3: malloc (in /usr/lib/x86_64-linux-gnu/valgrind/vgpreload_memcheck-amd64-linux.so)
==2422524==    by 0x234F61: allocate (tal.c:250)
==2422524==    by 0x235522: tal_alloc_ (tal.c:428)
==2422524==    by 0x12B500: new_unsaved_channel (channel.c:203)
==2422524==    by 0x13B77A: json_openchannel_init (dual_open_control.c:2610)
==2422524==    by 0x14C78D: command_exec (jsonrpc.c:630)
==2422524==    by 0x14CD9F: rpc_command_hook_final (jsonrpc.c:765)
==2422524==    by 0x181DDA: plugin_hook_call_ (plugin_hook.c:278)
==2422524==    by 0x14D198: plugin_hook_call_rpc_command (jsonrpc.c:853)
==2422524==    by 0x14D6A0: parse_request (jsonrpc.c:957)
==2422524==    by 0x14DAFE: read_json (jsonrpc.c:1054)
==2422524==    by 0x2231C8: next_plan (io.c:59)
```

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
@rustyrussell
Copy link
Contributor Author

Rebase on master, hoping to fix prototest flake!

Copy link
Collaborator

@vincenzopalazzo vincenzopalazzo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 81a7699

Today looks like no the day of the ssh!

Copy link
Member

@cdecker cdecker left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 81a7699

@rustyrussell rustyrussell merged commit 7a7e43c into ElementsProject:master Jul 9, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants