This is a proposal to move the version check from leader campaign to startup.
We hit a case where the cluster lost its PD leader:
the pd binary was built without a version tag
all 3 pd nodes were upgraded to this wrong pd build
after the upgrade completed, all 3 pd nodes went into a crash loop during leader campaign, so the cluster lost the PD leader and stopped functioning.
Here is the pd log for the panic during leader campaign:
{"level":"INFO","time":"2024/03/26 00:12:16.368 +00:00","caller":"versioninfo.go:89","message":"Welcome to Placement Driver (PD)"}
{"level":"INFO","time":"2024/03/26 00:12:16.368 +00:00","caller":"versioninfo.go:90","message":"PD","release-version":"62227fb4c"}
...
{"level":"INFO","time":"2024/03/26 00:43:45.706 +00:00","caller":"server.go:1670","message":"campaign PD leader ok","campaign-leader-name":"pd-1"}
{"level":"FATAL","time":"2024/03/26 00:43:46.950 +00:00","caller":"versioninfo.go:61","message":"version string is illegal","error":"[PD:semver:ErrSemverNewVersion]62227fb4c is not in dotted-tri format: 62227fb4c is not in dotted-tri format","errorVerbose":"[PD:semver:ErrSemverNewVersion]62227fb4c is not in dotted-tri format: 62227fb4c is not in dotted-tri format\ngithub.com/pingcap/errors.AddStack\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20211224045212-9687c2b0f87c/errors.go:174\ngithub.com/pingcap/errors.(*Error).GenWithStackByCause\n\t/go/pkg/mod/github.com/pingcap/errors@v0.11.5-0.20211224045212-9687c2b0f87c/normalize.go:307\ngithub.com/tikv/pd/pkg/versioninfo.ParseVersion\n\t/mnt/tidb/pd/pkg/versioninfo/versioninfo.go:52\ngithub.com/tikv/pd/pkg/versioninfo.MustParseVersion\n\t/mnt/tidb/pd/pkg/versioninfo/versioninfo.go:59\ngithub.com/tikv/pd/server.CheckPDVersion\n\t/mnt/tidb/pd/server/util.go:40\ngithub.com/tikv/pd/server.(*Server).campaignLeader\n\t/mnt/tidb/pd/server/server.go:1743\ngithub.com/tikv/pd/server.(*Server).leaderLoop\n\t/mnt/tidb/pd/server/server.go:1639\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1650","stack":"github.com/tikv/pd/pkg/versioninfo.MustParseVersion\n\t/mnt/tidb/pd/pkg/versioninfo/versioninfo.go:61\ngithub.com/tikv/pd/server.CheckPDVersion\n\t/mnt/tidb/pd/server/util.go:40\ngithub.com/tikv/pd/server.(*Server).campaignLeader\n\t/mnt/tidb/pd/server/server.go:1743\ngithub.com/tikv/pd/server.(*Server).leaderLoop\n\t/mnt/tidb/pd/server/server.go:1639"}
{"level":"WARN","time":"2024/03/26 00:43:51.984 +00:00","caller":"member.go:250","message":"the pd leader has not changed, delete and campaign again","old-pd-leader":"name:\"pd-1\" member_id:1438954984562261702 peer_urls:\"https://infra-tidb-pd-shopping-catalog-prod-0a019086.ec2.pin220.com:2380\" client_urls:\"https://infra-tidb-pd-shopping-catalog-prod-0a019086.ec2.pin220.com:2379\" "}
{"level":"INFO","time":"2024/03/26 00:43:51.986 +00:00","caller":"server.go:1632","message":"skip campaigning of pd leader and check later","server-name":"pd-1","etcd-leader-id":626574301973153734,"member-id":1438954984562261702}
{"level":"INFO","time":"2024/03/26 00:43:52.187 +00:00","caller":"server.go:1632","message":"skip campaigning of pd leader and check later","server-name":"pd-1","etcd-leader-id":626574301973153734,"member-id":1438954984562261702}
{"level":"INFO","time":"2024/03/26 00:43:52.388 +00:00","caller":"server.go:1632","message":"skip campaigning of pd leader and check later","server-name":"pd-1","etcd-leader-id":626574301973153734,"member-id":1438954984562261702}
{"level":"INFO","time":"2024/03/26 00:43:58.012 +00:00","caller":"server.go:1607","message":"pd leader has changed, try to re-campaign a pd leader"}
{"level":"INFO","time":"2024/03/26 00:43:58.012 +00:00","caller":"server.go:1632","message":"skip campaigning of pd leader and check later","server-name":"pd-1","etcd-leader-id":626574301973153734,"member-id":1438954984562261702}
{"level":"INFO","time":"2024/03/26 00:44:04.477 +00:00","caller":"server.go:1607","message":"pd leader has changed, try to re-campaign a pd leader"}
{"level":"INFO","time":"2024/03/26 00:44:04.477 +00:00","caller":"server.go:1644","message":"start to campaign PD leader","campaign-leader-name":"pd-1"}
{"level":"INFO","time":"2024/03/26 00:44:04.481 +00:00","caller":"leadership.go:181","message":"check campaign resp","resp":{"header":{"cluster_id":8850434198915930927,"member_id":16443876602637797343,"revision":55937640,"raft_term":342},"succeeded":true,"responses":[{"Response":{"ResponsePut":{"header":{"revision":55937640}}}}]}}
{"level":"INFO","time":"2024/03/26 00:44:04.481 +00:00","caller":"server.go:1670","message":"campaign PD leader ok","campaign-leader-name":"pd-1"}
…7981)
close#7978
Move the release version check to before startup so that an invalid version is detected as early as possible.
Signed-off-by: JmPotato <ghzpotato@gmail.com>
Co-authored-by: ti-chi-bot[bot] <108142056+ti-chi-bot[bot]@users.noreply.github.com>
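
A minimal Go sketch of the idea behind this change: validate the release version once at startup, so that a binary with a bad version string fails immediately instead of only after it has won a leader campaign. This is not PD's actual code. validateReleaseVersion and startup are hypothetical names, and the hand-rolled dotted-tri check only stands in for the semver parsing that appears in the stack trace above.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// validateReleaseVersion is a hypothetical helper. It returns an error
// unless v looks like MAJOR.MINOR.PATCH, optionally prefixed with "v" and
// optionally followed by "-prerelease" or "+build" metadata.
func validateReleaseVersion(v string) error {
	core := strings.TrimPrefix(v, "v")
	// Ignore pre-release and build metadata for this simplified check.
	if i := strings.IndexAny(core, "-+"); i >= 0 {
		core = core[:i]
	}
	parts := strings.Split(core, ".")
	if len(parts) != 3 {
		return fmt.Errorf("%s is not in dotted-tri format", v)
	}
	for _, p := range parts {
		if _, err := strconv.ParseUint(p, 10, 64); err != nil {
			return fmt.Errorf("%s is not in dotted-tri format: %w", v, err)
		}
	}
	return nil
}

// startup is a hypothetical stand-in for server startup. The version check
// runs first, before the node would join the cluster or campaign for
// leadership, so a bad build never reaches the leader loop.
func startup(releaseVersion string) error {
	if err := validateReleaseVersion(releaseVersion); err != nil {
		return fmt.Errorf("refusing to start: %w", err)
	}
	// ... join the cluster and enter the leader loop here (omitted) ...
	return nil
}

func main() {
	// "62227fb4c" is the release-version reported in the log above;
	// "v7.5.0" is only an illustrative well-formed tag.
	for _, v := range []string{"62227fb4c", "v7.5.0"} {
		if err := startup(v); err != nil {
			fmt.Println(err)
			continue
		}
		fmt.Println(v, "accepted, continuing startup")
	}
}
```

With a check like this, a rolling upgrade would stop at the first node that fails to start, while the remaining nodes still run the old build and can keep a PD leader.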
Enhancement Task