server, client: fix hanging problem when etcd failed to start #1267

Merged
merged 3 commits into from
Oct 12, 2018

Conversation

disksing
Contributor

What problem does this PR solve?

Sometimes etcd gets stuck during startup. When this happens, the PD port is already listening, but the server cannot serve any requests. The client (tidb) also hangs, because the pd-server never returns a response.

What is changed and how it works?

Add a context timeout in 2 places:

  1. pd server startEtcd. If etcd cannot start in time, pd-server reports an error and exits.
  2. pd client initClusterID. If a pd server does not respond, the client queries the next pd server instance.

TODO: update tidb vendor.

Check List

Tests

  • Integration test

Related changes

  • Need to cherry-pick to the release branch
  • Need to update the documentation

@disksing disksing added the needs-cherry-pick-release-2.1 The PR needs to cherry pick to release-2.1 branch. label Oct 11, 2018
@disksing
Contributor Author

/run-all-tests

@disksing disksing merged commit a6b3a6c into tikv:master Oct 12, 2018
@disksing disksing deleted the start-etcd-timeout branch October 12, 2018 02:26
disksing added a commit to oh-my-tidb/pd that referenced this pull request Oct 23, 2018
@nolouch nolouch added the needs-cherry-pick-release-2.0 The PR needs to cherry pick to release-2.0 branch. label Nov 13, 2018
disksing added a commit to oh-my-tidb/pd that referenced this pull request Nov 14, 2018
disksing added a commit that referenced this pull request Nov 14, 2018
* server, client: fix hanging problem when etcd failed to start (#1267)

* server: use same initialcluster config to restart joined member (#1279)

* fix server build

* pdctl: cherry pick bugfixes (#1298, #1299, #1308)

* server/api: fix the issue about `regions/check` API (#1311)

* fix join build

* fix pdctl build

* fix region test

* fix warnings
Labels
needs-cherry-pick-release-2.0 The PR needs to cherry pick to release-2.0 branch. needs-cherry-pick-release-2.1 The PR needs to cherry pick to release-2.1 branch.
3 participants