StaticNodes / TrustedNodes useless in PoA Setup #23210
Comments
This seems odd to me. Could you post the error message you are getting? Static/trusted nodes are merely suggestions. They should not result in any errors if they are offline. My best guess is that parsing the enode IDs fails, which should be apparent from the error message. |
@karalabe thanks a lot for your help. I know for a fact that the parsing goes well. The message is:
EDIT: Just FYI I have found a workaround - sleeping for 5 seconds before booting geth. This way, it gives 5s for all the containers to spin up. Geth crashes a few times, the containers are restarted, and when a happy coincidence of all the nodes being up happens, things connect. |
Bump - any chance that connecting to peers specified in |
Bump? |
Anyone please? Is it that I didn't phrase the issue correctly, or that there isn't any interest in addressing it? |
Any update on this issue? I'm really interested in using a domain name rather than an IP; it's more flexible when the IP can't be accessed. |
+1 |
I think this should be fixed, I don't see a reason to refuse to start if TrustedNodes/StaticNodes are offline. |
I tried to repro with the following config, but I couldn't
It never fails to start up, even with NoDiscover set to false, it will try to staticdial and everything. I will close this issue for now. If this still happens to you, please reopen it or open another issue with some more logs and a config so that we can reproduce it. Thanks for submitting! |
@MariusVanDerWijden using the
INFO [11-26|10:03:42.816] Starting Geth on Ethereum mainnet...
INFO [11-26|10:03:42.817] Bumping default cache on mainnet provided=1024 updated=4096
Fatal: config.toml, line 81: (p2p.Config.StaticNodes) lookup idontexist.vasconcelos.sh: no such host |
Even fixing the config parsing, the whole |
@MariusVanDerWijden that's why you can't reproduce:
➜ go-ethereum git:(v1.14.12) ✗ dig +short myname.com
64.190.63.222 |
What, the DNS name I chose at random really had a domain behind it? Ah, okay, I managed to repro it now. |
@MariusVanDerWijden I could submit a PR for dynamic DNS resolution of static/trusted nodes, but I am not sure whether the ideal implementation would be refactoring all the needed parts, implementing a new field (DynamicStaticNodes?), or implementing a flag like |
We kinda discussed it yesterday with the team and came to the conclusion that this issue is a low priority for us and does not warrant a bigger refactor. If you feel strongly that this should be fixed, I would suggest going down the refactoring route rather than the worker route. |
@MariusVanDerWijden opened #30822 |
Background
We are running a set of sealers and nodes in a docker swarm. Upon restarting the cluster, discovery between nodes doesn't work - so we have to add each node and sealer to each other node and sealer.
Initially (and when we were running Geth 1.8) we had an extra container, in charge of connecting to the HTTP RPC, querying each node, and registering it with all others using their HTTP RPC as well. The reason for this is that docker swarm doesn't guarantee that an IP address will be the same for the same container - so our extra container had a script to "scrape" IP addresses and build enode:// URLs to distribute across the nodes.
Problem Statement
More recently, we updated to Geth 1.10 and were pleasantly surprised to discover that enode:// specifications can now accept DNS names (so we wouldn't need this scraper container). We tried it, and having an enode://...@dns_name@nodiscover=1 works great.
So we decided that instead of having a "scraping service" in an extra container, we would connect our cluster using static-nodes.json (which didn't work, apparently deprecated) and then using a geth.toml file.
At this point, our TOML file looked pretty much like this:
Unfortunately, this attempt fell short because Geth refuses to boot if any of the StaticNodes / TrustedNodes is unreachable - so there's a bit of a catch-22 situation here when restarting the whole cluster.
Suggestion 1
It would be quite nice to have unreachable StaticNodes and/or TrustedNodes produce a warning rather than a FATAL error - this way the node would boot, fail to contact the nodes, and retry later on.
Current State of Things
We didn't stop just here. We made a script such as this one:
We were hopeful that there would somehow be an option to tell Geth to run this script. We discovered that we cannot really do that unless we use geth console or geth attach - which doesn't work in our case, since we would still need to run this manually after the cluster has started.
Suggestion 2
Maybe allow for a script to be executed when geth is started without console / attach, so that admin.addPeer / admin.addTrustedPeer could be used.
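Until something like Suggestion 2 exists, the same effect can be approximated from outside the node by calling the `admin_addPeer` JSON-RPC method once geth is up. The sketch below is a rough illustration only: it assumes the node exposes the admin API over HTTP, the endpoint and enode URL are placeholders, and `addPeerBody` is a hypothetical helper name.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"os"
)

// rpcRequest is a JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string   `json:"jsonrpc"`
	ID      int      `json:"id"`
	Method  string   `json:"method"`
	Params  []string `json:"params"`
}

// addPeerBody (hypothetical helper) encodes an admin_addPeer call for one enode URL.
func addPeerBody(enode string) ([]byte, error) {
	return json.Marshal(rpcRequest{
		JSONRPC: "2.0",
		ID:      1,
		Method:  "admin_addPeer",
		Params:  []string{enode},
	})
}

func main() {
	// Assumed values: adjust to the actual RPC endpoint and peer.
	endpoint := "http://127.0.0.1:8545"
	enode := "enode://abcd@sealer1:30303"

	body, err := addPeerBody(enode)
	if err != nil {
		fmt.Fprintln(os.Stderr, "encode failed:", err)
		return
	}
	resp, err := http.Post(endpoint, "application/json", bytes.NewReader(body))
	if err != nil {
		// The node may simply not be up yet; a caller could retry later.
		fmt.Fprintln(os.Stderr, "node not reachable yet:", err)
		return
	}
	defer resp.Body.Close()
	fmt.Println("HTTP status:", resp.Status)
}
```

Swapping the method name for `admin_addTrustedPeer` would cover the trusted-peer case. Running this in a loop from a sidecar container is close to the scraper setup described in Background, except that it passes DNS names instead of scraped IP addresses.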