[Bug]: [Cosmovisor] Upgrade is applied immediately when using the --upgrade-height flag #19227
Comments
Can you dump the upgrade info here too? And the sdk versions as well. |
@julienrbrt
Pre-upgrade |
Thanks, last question, what gives gaiad status just before add-upgrade ? |
@julienrbrt
|
Weird, then it should be parsed properly: https://github.com/cosmos/cosmos-sdk/blob/main/tools/cosmovisor/scanner.go#167 |
I also got bitten by that issue! |
:/ I am re-prioritizing cosmovisor work. I'll re-investigate this. |
I am observing the same behaviour with dydxv4 cosmos scheduled for Jul 25th 2024, 07:50:07+00:00 UTC (in 10 hours). Does anyone happen to know if a downgrade is an option? |
I have encountered this issue with the Warden testnet as well.
After executing |
I faced this yesterday during the Neutron emergency upgrade on 2 nodes. It might be because I used the --force flag, as there was already an upgrade-info.json file there. @julienrbrt can we get this prioritised please? This is quite dangerous. |
Seeing this same issue on the Cosmos Provider testnet. The issue seems to be random: I have two identical nodes, both used with horcrux; one worked fine, the other tried applying the upgrade immediately. |
Experiencing the same issue on the current theta and provider testnets. |
Looking at this again right now! |
@julienrbrt I see a potential problem here, in scanner.go:
result, err := exec.Command(fw.currentBin, "status").Output() //nolint:gosec // we want to execute the status command
if err != nil {
	return 0, err
}

// file exist but too early in height
currentHeight, _ := fw.checkHeight()
if currentHeight != 0 && currentHeight < info.Height {
	return false
}
Avoid launching the node binary like that as much as possible, especially without any command-line options. A more thorough way to get data from the running node is to parse config/config.toml once, find its current RPC endpoint, and send an RPC request each time you need to get the current block height. E.g. some of my shell code:
# Function to get the local HTTP RPC endpoint from the node's configuration
# TODO: This is currently limited to Neutron. Cosmos SDK exposes a different "config" command (`config get config rpc.laddr`), will need to change this function to recognize that
get_local_rpc_endpoint() { "$NODE_BINARY" config --home "$NODE_DIR" | jq -er .node | sed s/tcp/http/; }
# Function to send a call to the HTTP RPC endpoint
call_rpc_endpoint() { curl -s --connect-timeout 5 --max-time 20 --retry 5 --retry-delay 0 --retry-max-time 5 "$1/$2" | jq -er .result; }
# Function to get the latest block height via HTTP RPC
get_latest_block_height() { call_rpc_endpoint "$1" "status?" | jq -er .sync_info.latest_block_height; }
# Function to check if we have proper access to the local node
validate_local_node() {
LOCAL_RPC_ENDPOINT=""
# Check if the node's binary is readable and executable and if the node's directory exists and is readable
if [ -d "$NODE_DIR" ] && [ -r "$NODE_DIR" ] && [ -x "$NODE_BINARY" ] && [ -r "$NODE_BINARY" ]; then
LOCAL_RPC_ENDPOINT=$(get_local_rpc_endpoint); [ -z "$LOCAL_RPC_ENDPOINT" ] && print_error_and_exit "node_error"
return 0;
else return 1; fi
}
# Main function to compare heights between the local node and the remote RPC
compare_heights() {
local_height=$(get_latest_block_height "$LOCAL_RPC_ENDPOINT")
remote_height=$(get_latest_block_height "$REMOTE_RPC_ENDPOINT")
{ [ -z "$local_height" ] || [ -z "$remote_height" ]; } && print_error_and_exit "curl_error"
delta=$((remote_height - local_height))
[ "$delta" -le "$NODE_LAG" ] && \
{ printf "Node %s is not lagging, delta with %s is %s, local height is %s, remote height is %s\n" \
"$LOCAL_RPC_ENDPOINT" "$REMOTE_RPC_ENDPOINT" "$delta" "$local_height" "$remote_height"; exit 0; } || \
{ printf "Node %s is lagging, delta with %s is %s, local height is %s, remote height is %s!\n" \
"$LOCAL_RPC_ENDPOINT" "$REMOTE_RPC_ENDPOINT" "$delta" "$local_height" "$remote_height"; exit 1; }
}
This is significantly faster because it avoids launching another executable, even if for a moment (I'm still doing that but you can use something like a mandatory DAEMON_RPC_ENDPOINT variable instead). It is compatible with nodes that run the RPC server on a different port, which |
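A minimal sketch of the "parse config/config.toml once" idea from the comment above, in plain POSIX shell with awk and no jq. It assumes the CometBFT-style layout where the RPC listen address is the `laddr` key inside the `[rpc]` section. The function name `rpc_endpoint_from_config` is made up for illustration and is not part of cosmovisor or of the script quoted above.

```shell
#!/bin/sh
# Hypothetical helper: print the HTTP RPC endpoint found in a config.toml.
# Assumption: the listen address is the `laddr` key inside the [rpc] section,
# e.g.  laddr = "tcp://127.0.0.1:26657"
rpc_endpoint_from_config() {
    awk '
        # Track whether the current section header is exactly [rpc].
        /^[ \t]*\[/ { in_rpc = ($0 ~ /^[ \t]*\[rpc\][ \t]*$/); next }
        # Only the plain laddr key counts; grpc_laddr and pprof_laddr do not match.
        in_rpc && /^[ \t]*laddr[ \t]*=/ {
            sub(/^[^"]*"/, "")      # drop everything up to the opening quote
            sub(/".*$/, "")         # drop the closing quote and anything after it
            sub(/^tcp:/, "http:")   # same rewrite as the sed s/tcp/http/ above
            print
            exit
        }
    ' "$1"
}
```

Called as `rpc_endpoint_from_config "$NODE_DIR/config/config.toml"`, it prints nothing when no `[rpc]` `laddr` is found, so the caller can treat an empty result as an error, the same way `validate_local_node` does above.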
@julienrbrt when can we expect cosmovisor v1.6.0 which I assume will include that fix? |
Hey, now: https://github.com/cosmos/cosmos-sdk/releases/tag/tools%2Fcosmovisor%2Fv1.6.0 |
@julienrbrt unfortunately this has not solved the problem. I just tried this on a Neutron manual upgrade. |
Hi, |
EDIT: Scratch that, @julienrbrt, I see where I messed up. While I did upgrade cosmovisor, I didn't restart the process before running the Thank you. |
Is there an existing issue for this?
What happened?
Cosmovisor restarts with the new binary immediately after running add-upgrade with the --upgrade-height flag.
Cosmos SDK Version
Cosmovisor v1.5.0
How to reproduce?
gaiad
cosmovisor add-upgrade v14 gaiad --upgrade-height 300
at any point before reaching block 300.
I used this script to run steps 2 and 3:
Upgrade info:
Pre-upgrade
cosmos_sdk_version: v0.45.16 (Gaia v13.0.2)
Post-upgrade
cosmos_sdk_version: v0.45.16 (Gaia v14.0.0-rc0)
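For reference, the behaviour this report expects can be written down as a small check: the switch to the new binary should happen only once the node's height is known and has reached the scheduled height. The sketch below is plain POSIX shell, not cosmovisor's actual code. `upgrade_due` is a hypothetical name, and treating an unknown height as "not yet" is an assumption about what a safe default would look like.

```shell
#!/bin/sh
# Hypothetical guard: succeed (exit status 0) only when the current height is
# a known positive integer that has reached the scheduled upgrade height.
# An empty, zero, or non-numeric height is treated as "unknown", so the
# upgrade is NOT considered due (fail closed).
upgrade_due() {
    current_height=$1
    upgrade_height=$2
    case $current_height in
        ''|*[!0-9]*) return 1 ;;   # unknown or malformed current height
    esac
    case $upgrade_height in
        ''|*[!0-9]*) return 1 ;;   # malformed scheduled height
    esac
    [ "$current_height" -gt 0 ] && [ "$current_height" -ge "$upgrade_height" ]
}
```

For example, `upgrade_due "$height" 300 && echo "height reached"` prints nothing while `$height` is empty, zero, or below 300.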