Network stops producing blocks after upgrade from v0.45.x to v0.46.0-rc1 #12041
Comments
RIGOR! Thank you. Anything we can do to help?
Can you confirm that
@cmwaters Yes,
Here are the complete logs after restarting the node with the new binary. Block proposer validator: https://pastebin.com/A5f3ZsY3
Note that in this test the upgrade height was kept at 120.
Ok so looking a bit deeper at If you move the list of peers from The other thing I can try to do is add logs to check that the addresses are being added in a local testnet |
I wondered about the same things that @cmwaters did. Basically, even when a network loses consensus, you can often get the nodes to reconnect to one another and get consensus back.
@kaustubhkapatral do you mind dumping a copy of the tendermint |
@cmwaters It was the default config file generated with |
Ok, we made some breaking changes to the config file from v0.34 to v0.35 which require the user to adjust their file. You can see the notes in our upgrading document here: https://github.com/tendermint/tendermint/blob/master/UPGRADING.md#config-changes-1. To accompany this, we also made a tool
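To make the config change concrete, here is a minimal sketch, assuming the v0.35 change in question is the switch from snake_case to hyphen-case key names described in the linked upgrading document. It lists keys in a config.toml that still contain underscores and so may need adjusting. The function name `snake_case_keys` and the sample config are hypothetical; this is not the tool mentioned above.

```python
import re

# Matches a "key = value" line and captures the key name.
KEY_RE = re.compile(r"^\s*([A-Za-z0-9_-]+)\s*=")


def snake_case_keys(config_text):
    """Return config keys that still use underscores (v0.34 style).

    Assumption: v0.35 expects hyphen-case keys, so any key containing
    an underscore is a candidate for manual adjustment.
    """
    keys = []
    for line in config_text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # skip comment lines
        match = KEY_RE.match(line)
        if match and "_" in match.group(1):
            keys.append(match.group(1))
    return keys


# Illustrative sample only; not taken from the reporter's config file.
sample = """
[p2p]
persistent_peers = "nodeid@host:26656"
# old_comment_key = 1
max-connections = 64
"""
print(snake_case_keys(sample))
```

A check like this only flags candidates; the upgrading document remains the reference for which keys were renamed or removed.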
Just an FYI: we've decided to downgrade to v0.34.x for the SDK v0.46 release (including a prioritized mempool).
@alexanderbez could you share more reasoning for that? If we won't have tendermint 0.35 in SDK v0.46, then it would be great to have an SDK 0.47 release with tendermint 0.35
This discussion is taking place with various teams testing out v0.35 and I just don't have the cognitive bandwidth to recap everything. In short, v0.35 is not stable enough for us to garner confidence to release SDK v0.46 with it.
what's the ETA for updating tendermint 0.34 to add tx prioritization and Cosmos SDK update?
We already have the PRs in progress. We just need the Tendermint team's blessing/approval (which is...not going so great right now). I want to have this released next week at the latest.
closing as 0.35 isn't used |
Summary of Bug
After completing the software upgrade using the fix and instructions provided here: #12028, a multi-node network stops producing blocks once the upgrade handler is applied. All the nodes in the network lose their p2p connections and do not attempt to dial the node addresses specified in persistent_peers.
The log snippet posted above was taken from the validator that was the proposer of the block at upgrade height + 1. It made no attempt to establish a peer connection with the rest of the nodes and stalled at that point.
The log snippet posted above was present in the rest of the validator nodes of the network.
The number of p2p connections on every node was verified using
curl localhost:26657/net_info | jq .result.n_peers
which returned 0 in all cases.
The migration performed by the upgrade handler was verified by observing the logs.
This issue does not occur on a localnet with a one-node network.
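The peer-count check above can be repeated across several nodes with a short script. This is a minimal sketch, assuming each node exposes the same /net_info RPC endpoint used in the curl command; the helper names (`peer_count`, `fetch_peer_count`, `isolated_nodes`) and any RPC URLs passed in are hypothetical.

```python
import json
import urllib.request


def peer_count(net_info_body):
    """Extract n_peers from the body of a /net_info response."""
    result = json.loads(net_info_body)["result"]
    # n_peers may be encoded as a string ("0") or a number; int() accepts both.
    return int(result["n_peers"])


def fetch_peer_count(rpc_url, timeout=5.0):
    """Query one node's RPC endpoint, e.g. http://localhost:26657."""
    with urllib.request.urlopen(f"{rpc_url}/net_info", timeout=timeout) as resp:
        return peer_count(resp.read().decode("utf-8"))


def isolated_nodes(rpc_urls):
    """Return the RPC URLs of nodes that report zero peers."""
    return [url for url in rpc_urls if fetch_peer_count(url) == 0]
```

Calling `isolated_nodes` with the RPC addresses of all validators would, in the situation described here, return every address, since each node reported 0 peers.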
Version
#12028
Steps to Reproduce
cc @alexanderbez @marbar3778
For Admin Use