Metrics server is listening on IPv6 interface and not shutting down #5768

Closed
nflaig opened this issue Jul 17, 2023 · 3 comments
Labels
meta-bug Issues that identify a bug and require a fix.

Comments

Member

nflaig commented Jul 17, 2023

Describe the bug

It looks like the Lodestar beacon node metrics server starts listening on an IPv6 interface.

devops@Ubuntu-2204-jammy-amd64-base:~/goerli/lodestar$ lsof -i :8008
COMMAND     PID   USER   FD   TYPE    DEVICE SIZE/OFF NODE NAME
node    2301537 devops  102u  IPv6 973623086      0t0  TCP localhost.localdomain:8008->localhost.localdomain:56578 (ESTABLISHED)
node    2301537 devops  128u  IPv6 973428666      0t0  TCP *:8008 (LISTEN)
devops@Ubuntu-2204-jammy-amd64-base:~/goerli/lodestar$ sudo netstat -ltnp | grep -w ':8008'
tcp6       0      0 :::8008                 :::*                    LISTEN      2301537/node

It is also no longer possible to shut down the process; even force closing does not help. I had to use kill -9 to stop the process, as a plain kill did nothing.

Jul-17 17:08:54.000[]                 info: Synced - slot: 6092144 - head: 0xe00e…5e94 - exec-block: valid(9361504 0x8772…) - finalized: 0x7710…765c:190377 - peers: 51
^CJul-17 17:09:01.307[]                 info: Stopping gracefully, use Ctrl+C again to force process exit
Jul-17 17:09:06.176[]                 info: Synced - slot: 6092145 - head: (slot -1) 0xe00e…5e94 - exec-block: valid(9361504 0x8772…) - finalized: 0x7710…765c:190377 - peers: 50
^CJul-17 17:09:10.028[]                 info: Forcing process exit
^C^C^C^C^C./lodestar: line 7: 2357120 Killed                  node --trace-deprecation --max-old-space-size=4096 ./packages/cli/bin/lodestar.js "$@

Expected behavior

The metrics server should not be listening on IPv6, and the beacon node should shut down cleanly when receiving SIGTERM / SIGINT.
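To make the expected shutdown behavior concrete, here is a minimal sketch of the pattern the log lines above suggest: the first signal starts a graceful stop, and a second one forces exit. It assumes a plain Node.js HTTP server; `startServer`, `closeServer`, and `installShutdownHandler` are hypothetical names for illustration and are not Lodestar's actual code. `server.closeAllConnections()` requires Node.js 18.2 or newer.

```typescript
import {createServer} from "node:http";
import type {Server} from "node:http";

/** Hypothetical stand-in for the metrics server: loopback only, OS-chosen port. */
function startServer(): Promise<Server> {
  const server = createServer((_req, res) => res.end("ok"));
  return new Promise((resolve, reject) => {
    server.once("error", reject);
    server.listen(0, "127.0.0.1", () => resolve(server));
  });
}

/** Stop accepting connections and resolve once the server has fully closed. */
function closeServer(server: Server): Promise<void> {
  return new Promise((resolve, reject) => {
    server.close((err) => (err ? reject(err) : resolve()));
    // Drop open sockets so close() is not held up by lingering clients.
    server.closeAllConnections();
  });
}

/** Run `stop` on the first SIGINT/SIGTERM; a second signal forces exit. */
function installShutdownHandler(stop: () => Promise<void>): void {
  let stopping = false;
  const onSignal = (): void => {
    if (stopping) {
      process.exit(1);
    }
    stopping = true;
    stop().then(
      () => process.exit(0),
      () => process.exit(1)
    );
  };
  process.on("SIGINT", onSignal);
  process.on("SIGTERM", onSignal);
}
```

A caller would wire these together with something like `installShutdownHandler(() => closeServer(server))`. If `stop` never settles, only the second signal ends the process, which is why a stuck shutdown step is worth guarding with a timeout.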

Steps to reproduce

Run with default network and listen options, with metrics enabled:

./lodestar beacon \
    --dataDir /home/devops/goerli/data/beacon \
    --metrics \
    --execution.urls http://localhost:8551 \
    --jwt-secret /home/devops/goerli/data/jwtsecret \
    --logLevel info \
    --network goerli \
    --checkpointSyncUrl "https://beaconstate-goerli.chainsafe.io/"

The metrics server listening on IPv6 happens consistently, but I cannot reproduce the process failing to shut down every time.

Additional context

Related PR that likely introduced the problem

Operating system

Linux

Lodestar version or commit hash

ec81531

nflaig added the meta-bug label Jul 17, 2023
Member Author

nflaig commented Jul 18, 2023

@wemeetagain Looks like the metrics server listening on IPv6 is unrelated; it also happens when checking out the previous commit e3eb055.

Still trying to find out what was going on with the process hanging. I have never before seen a case where not even force closing helps.

Maybe it was just an issue on my server, although it happened multiple times. I will try to figure out whether this really only started happening after merging the IPv6 changes.

Possibly a worker thread that is not shut down correctly? It could keep the main process alive without receiving any process signals itself, which could explain why process.exit and kill did nothing.
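If a lingering worker thread is indeed the cause (only a hypothesis at this point, not a confirmed diagnosis), one defensive option is to terminate workers explicitly during shutdown and stop waiting after a deadline. The sketch below uses Node's `worker_threads` API; `terminateWithTimeout` is a hypothetical helper name and not part of Lodestar.

```typescript
import {Worker} from "node:worker_threads";

/**
 * Ask a worker to terminate and wait at most `timeoutMs` for it to do so.
 * Resolves to true if the worker terminated in time, false on timeout.
 */
async function terminateWithTimeout(worker: Worker, timeoutMs: number): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timedOut = new Promise<boolean>((resolve) => {
    timer = setTimeout(() => resolve(false), timeoutMs);
  });
  // worker.terminate() resolves once the worker thread has stopped.
  const terminated = worker.terminate().then(() => true);
  try {
    return await Promise.race([terminated, timedOut]);
  } finally {
    // Clear the timer so it does not itself keep the event loop alive.
    if (timer !== undefined) clearTimeout(timer);
  }
}
```

A `false` result tells the shutdown sequence that the worker is stuck, so it can log that fact and fall back to a forced exit instead of hanging silently.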

Member Author

nflaig commented Jul 18, 2023

@wemeetagain As you suspected in standup, this is unrelated to the IPv6 changes, which makes sense; it would be hard to explain this based on the IPv6 changes that were made.

The metrics server still running was just a coincidence: the shutdown sequence was not fully executed due to #5775, and I had done some more testing with --network.useWorker true over the weekend.

nflaig closed this as completed Jul 18, 2023
Member Author

nflaig commented Jul 18, 2023

Just leaving some more context here:

It looks like, by default, if no address is provided, the metrics server listens on an IPv6 interface (at least on my server):

devops@Ubuntu-2204-jammy-amd64-base:~/goerli$ lsof -i :8008
COMMAND     PID   USER   FD   TYPE    DEVICE SIZE/OFF NODE NAME
node    3980719 devops  112u  IPv6 987155205      0t0  TCP localhost.localdomain:8008->localhost.localdomain:39660 (CLOSE_WAIT)
node    3980719 devops  128u  IPv6 987130593      0t0  TCP *:8008 (LISTEN)

Whereas the REST API server, for which I set --rest.address "0.0.0.0", is listening on IPv4 as expected:

devops@Ubuntu-2204-jammy-amd64-base:~/goerli$ lsof -i :9596
COMMAND     PID   USER   FD   TYPE    DEVICE SIZE/OFF NODE NAME
node    2388951 devops   26u  IPv4 987129616      0t0  TCP localhost.localdomain:47462->localhost.localdomain:9596 (ESTABLISHED)
node    3980719 devops  113u  IPv4 987155117      0t0  TCP Ubuntu-2204-jammy-amd64-base:9596->172.26.0.4:39304 (CLOSE_WAIT)
node    3980719 devops  118u  IPv4 987139371      0t0  TCP Ubuntu-2204-jammy-amd64-base:9596->172.27.0.2:53520 (CLOSE_WAIT)
node    3980719 devops  129u  IPv4 987137350      0t0  TCP *:9596 (LISTEN)

This is a bit surprising to me, as the default host/address should be 127.0.0.1. The metrics server is also listening on a network interface that is externally reachable.

We should only expose the server externally if the listen address is explicitly set to something like 0.0.0.0.
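For reference, this matches documented Node.js behavior: when `server.listen()` is called without a host, the server binds to the unspecified IPv6 address (`::`) if IPv6 is available, and to `0.0.0.0` otherwise. The sketch below assumes the metrics server is ultimately a plain Node.js HTTP server, which this thread does not confirm; `boundAddress` is a hypothetical helper that reports which address a server actually bound to.

```typescript
import {createServer} from "node:http";
import type {AddressInfo} from "node:net";

/** Start a throwaway HTTP server, report the address it bound to, then close it. */
async function boundAddress(host?: string): Promise<AddressInfo> {
  const server = createServer((_req, res) => res.end("ok"));
  await new Promise<void>((resolve, reject) => {
    server.once("error", reject);
    // Port 0 asks the OS for any free port.
    if (host === undefined) {
      server.listen(0, () => resolve()); // no host: Node picks "::" or "0.0.0.0"
    } else {
      server.listen(0, host, () => resolve()); // explicit host, e.g. loopback only
    }
  });
  // For TCP servers, address() returns an AddressInfo object.
  const info = server.address() as AddressInfo;
  await new Promise<void>((resolve) => server.close(() => resolve()));
  return info;
}
```

Calling `boundAddress("127.0.0.1")` reports a loopback-only bind, while `boundAddress()` reports a wildcard address. Passing the default host explicitly to `listen()` is therefore what keeps the server off externally reachable interfaces.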
