server: use same initialcluster config to restart joined member #1279
Conversation
server/join.go
Outdated
@@ -73,8 +83,20 @@ func PrepareJoinCluster(cfg *Config) error {
return errors.New("join self is forbidden")
}

// Cases with data directory.
filePath := cfg.DataDir + "/join"
Suggested change:
- filePath := cfg.DataDir + "/join"
+ filePath := filepath.Join(cfg.DataDir, "join")
server/join.go
Outdated
@@ -138,7 +161,20 @@ func PrepareJoinCluster(cfg *Config) error {
initialCluster = strings.Join(pds, ",")
cfg.InitialCluster = initialCluster
cfg.InitialClusterState = embed.ClusterStateFlagExisting
return nil
err = os.Mkdir(cfg.DataDir, privateDirMode)
Do we need to use MkdirAll here?
PTAL
server/join.go
Outdated
@@ -103,6 +126,9 @@ func PrepareJoinCluster(cfg *Config) error {

existed := false
for _, m := range listResp.Members {
if len(m.Name) == 0 {
return errors.New("exsist a member that the join is not completed")
How about "there is a member that has not been joined successfully"?
PTAL @CaitinChen
@disksing "there is a member that has not joined successfully"
server/join.go
Outdated
return nil
err = os.MkdirAll(cfg.DataDir, privateDirMode)
if err != nil && !os.IsExist(err) {
return err
WithStack?
server/join.go
Outdated
return err
}

for i := 0; i < retryTimes; i++ {
Maybe we don't need to retry. It's ok to exit directly.
server/join.go
Outdated
}
break
}
return err
WithStack?
@disksing are there any plans to put this into the next RC (assuming there is one)? This issue shows up frequently in our DinD setup, so we could also use this to produce a docker image specifically for that setup.
I think we can include it in RC5. /cc @nolouch
* server, client: fix hanging problem when etcd failed to start (#1267)
* server: use same initialcluster config to restart joined member (#1279)
* fix server build
* pdctl: cherry pick bugfixes (#1298, #1299, #1308)
* server/api: fix the issue about `regions/check` API (#1311)
* fix join build
* fix pdctl build
* fix region test
* fix warnings
What problem does this PR solve?
Details are in pingcap/tidb-operator#126.
Because of some problems in etcd when joining a member, the prepare-to-join step may fail. We also generate the join config for the joining member dynamically, which may cause the problem:
What is changed and how it works?
Use the same join config after the join succeeds.
Check List
Tests