Doubt when set up two peer urls. #9054
Comments
I tried to reproduce the situation under various conditions, but failed.
Hi @fallblank, do you mean that you cannot reproduce the issue, or that you can? Can you please post the exact flags/ENV you are passing to etcd to start these nodes so we can try to reproduce?
I add members to the cluster one by one through the V2 members API.
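For readers unfamiliar with that step, here is a minimal sketch of how a member with two peer URLs could be added through the v2 members API (a POST to /v2/members with a JSON `peerURLs` list). The host names and ports are hypothetical placeholders, not the ones from this cluster, and the request is only built, not sent; TLS client configuration is left out.

```python
import json
import urllib.request


def build_add_member_request(client_url, peer_urls):
    """Build (but do not send) a v2 members API request that adds one member."""
    body = json.dumps({"peerURLs": list(peer_urls)}).encode("utf-8")
    return urllib.request.Request(
        url=client_url.rstrip("/") + "/v2/members",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoints, for illustration only.
req = build_add_member_request(
    "https://etcd-0.example.svc:2379",
    ["https://etcd-3-a.example.svc:2380", "https://etcd-3-b.example.svc:2380"],
)
# urllib.request.urlopen(req) would send the request to a running cluster.
print(req.get_method(), req.full_url)
```

Both peer URLs travel in the same `peerURLs` list, so the existing members learn about both addresses when the new member is registered.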
God! I reproduced it!
@fallblank nice work. Have you tried a more recent 3.1.x version? The issue may already be resolved there. But yes, please post logs.
Sorry for the delay. All logs are below. ------dividing line ---------- ------dividing line ----------
When I run the health check, I get: member 580f92ffca2eedb8 is healthy: got healthy result from https://ETCD-172-10-2-91192-168-2-3-0.example.svc:2379
but on the leader node, it prints:
After I restart the leader node process, everything works and the new node can be added normally.
Is there a link to the relevant release notes?
@fallblank Just a general comment: thanks for the logs, I hope to review them soon. In the meantime, could you try to replicate this against the latest release, v3.1.11?
Duplicate of #8383. Should be fixed in v3.3. Please try https://github.com/coreos/etcd/releases/tag/v3.3.0-rc.0 and reopen if the problem still remains.
I set up two peer URLs on each etcd node in my cluster. For some reason, one of the peer URLs cannot be reached by the other nodes.
Normally etcd switches to the reachable one and prints logs like this:
2017-12-20 02:21:18.068065 I | etcdserver/membership: added member a9efae717f838092 [https://ETCD-206-8-2-9160-10-2-31-0.grm-etcd.manage.svc.cluster.grm:11006 https://ETCD-206-8-2-9160-10-2-31-1.grm-etcd.manage.svc.cluster.grm:11006] to cluster a10f21f554347d1a
2017-12-20 02:21:18.068134 I | rafthttp: starting peer a9efae717f838092...
2017-12-20 02:21:18.068181 I | rafthttp: started HTTP pipelining with peer a9efae717f838092
2017-12-20 02:21:18.068522 I | rafthttp: started streaming with peer a9efae717f838092 (writer)
2017-12-20 02:21:18.073874 I | rafthttp: started streaming with peer a9efae717f838092 (writer)
2017-12-20 02:21:18.074419 I | rafthttp: started peer a9efae717f838092
2017-12-20 02:21:18.074449 I | rafthttp: started streaming with peer a9efae717f838092 (stream MsgApp v2 reader)
2017-12-20 02:21:18.074476 I | rafthttp: added peer a9efae717f838092
2017-12-20 02:21:18.074500 I | rafthttp: started streaming with peer a9efae717f838092 (stream Message reader)
2017-12-20 02:21:18.646646 I | rafthttp: peer a9efae717f838092 became active
2017-12-20 02:21:18.646686 I | rafthttp: established a TCP streaming connection with peer a9efae717f838092 (stream Message writer)
2017-12-20 02:21:18.647062 I | rafthttp: established a TCP streaming connection with peer a9efae717f838092 (stream MsgApp v2 writer)
2017-12-20 02:21:21.175899 E | rafthttp: failed to dial a9efae717f838092 on stream MsgApp v2 (dial tcp 160.10.2.31:11006: i/o timeout)
2017-12-20 02:21:21.175947 I | rafthttp: peer a9efae717f838092 became inactive
2017-12-20 02:21:21.293895 I | rafthttp: peer a9efae717f838092 became active
2017-12-20 02:21:21.293949 I | rafthttp: established a TCP streaming connection with peer a9efae717f838092 (stream Message reader)
2017-12-20 02:21:21.299375 I | rafthttp: established a TCP streaming connection with peer a9efae717f838092 (stream MsgApp v2 reader)
From the logs above, member a9efae717f838092, which was added through the member API, eventually becomes active, although its state changes along the way: active, then inactive, then active again.
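To make that sequence easier to check against a longer log, the transitions can be pulled out with a small script. This is only a sketch that assumes the "peer <id> became active/inactive" line format shown above; the function name `peer_transitions` is made up for illustration.

```python
import re

# Matches lines such as:
# 2017-12-20 02:21:18.646646 I | rafthttp: peer a9efae717f838092 became active
STATE_RE = re.compile(
    r"^(?P<ts>\S+ \S+) [A-Z] \| rafthttp: "
    r"peer (?P<peer>[0-9a-f]+) became (?P<state>active|inactive)$"
)


def peer_transitions(lines, peer_id):
    """Return [(timestamp, state), ...] for one peer, in log order."""
    out = []
    for line in lines:
        m = STATE_RE.match(line.strip())
        if m and m.group("peer") == peer_id:
            out.append((m.group("ts"), m.group("state")))
    return out


sample = [
    "2017-12-20 02:21:18.646646 I | rafthttp: peer a9efae717f838092 became active",
    "2017-12-20 02:21:21.175899 E | rafthttp: failed to dial a9efae717f838092 "
    "on stream MsgApp v2 (dial tcp 160.10.2.31:11006: i/o timeout)",
    "2017-12-20 02:21:21.175947 I | rafthttp: peer a9efae717f838092 became inactive",
    "2017-12-20 02:21:21.293895 I | rafthttp: peer a9efae717f838092 became active",
]
states = [state for _, state in peer_transitions(sample, "a9efae717f838092")]
print(states)  # ['active', 'inactive', 'active']
```

A peer whose last recorded state is "inactive" is the case described next.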
In the abnormal case, the member does not become active again after it is marked inactive. The log is:
2017-12-16 07:28:36.178307 I | etcdserver/membership: added member 6248f0f145a2047 [https://ETCD-206-8-2-9160-10-2-31-0.grm-etcd.manage.svc.cluster.grm:11006 https://ETCD-206-8-2-9160-10-2-31-1.grm-etcd.manage.svc.cluster.grm:11006] to cluster a10f21f554347d1a
2017-12-16 07:28:36.178383 I | rafthttp: starting peer 6248f0f145a2047...
2017-12-16 07:28:36.178406 I | rafthttp: started HTTP pipelining with peer 6248f0f145a2047
2017-12-16 07:28:36.179127 I | rafthttp: started streaming with peer 6248f0f145a2047 (writer)
2017-12-16 07:28:36.179152 I | rafthttp: started peer 6248f0f145a2047
2017-12-16 07:28:36.179170 I | rafthttp: started streaming with peer 6248f0f145a2047 (writer)
2017-12-16 07:28:36.179184 I | rafthttp: added peer 6248f0f145a2047
2017-12-16 07:28:36.179204 I | rafthttp: started streaming with peer 6248f0f145a2047 (stream MsgApp v2 reader)
2017-12-16 07:28:36.179284 I | rafthttp: started streaming with peer 6248f0f145a2047 (stream Message reader)
2017-12-16 07:28:40.592501 I | rafthttp: peer 6248f0f145a2047 became active
2017-12-16 07:28:40.593396 I | rafthttp: established a TCP streaming connection with peer 6248f0f145a2047 (stream Message writer)
2017-12-16 07:28:40.593565 I | rafthttp: established a TCP streaming connection with peer 6248f0f145a2047 (stream MsgApp v2 writer)
2017-12-16 07:28:40.605012 I | rafthttp: established a TCP streaming connection with peer 6248f0f145a2047 (stream Message reader)
2017-12-16 07:28:40.605340 I | rafthttp: established a TCP streaming connection with peer 6248f0f145a2047 (stream MsgApp v2 reader)
2017-12-16 07:28:41.249490 E | rafthttp: failed to write 6248f0f145a2047 on pipeline (dial tcp 160.10.2.31:11006: i/o timeout)
2017-12-16 07:28:41.249533 I | rafthttp: peer 6248f0f145a2047 became inactive
// other logs follow, but the peer never changes back to active
I am puzzled about what conditions cause a member to be marked inactive, and why 6248f0f145a2047 didn't change back to active as in the first example.
The etcd version is 3.1.0.
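The behaviour I expected, as in the first example, can be modelled roughly as follows. This is a simplified sketch of rotating between a member's peer URLs when one cannot be dialed; it is not taken from etcd's source, and the class name, function name, and URLs are hypothetical.

```python
class PeerURLPicker:
    """Hypothetical round-robin picker over one member's peer URLs."""

    def __init__(self, urls):
        if not urls:
            raise ValueError("at least one peer URL is required")
        self.urls = list(urls)
        self.picked = 0

    def pick(self):
        """Return the URL currently in use."""
        return self.urls[self.picked]

    def unreachable(self, url):
        # Rotate only if the failed URL is the one currently in use, so a
        # stale failure report does not skip past a working URL.
        if url == self.urls[self.picked]:
            self.picked = (self.picked + 1) % len(self.urls)


def first_reachable(picker, can_dial):
    """Try each URL at most once; return the first one can_dial accepts, else None."""
    for _ in range(len(picker.urls)):
        url = picker.pick()
        if can_dial(url):
            return url
        picker.unreachable(url)
    return None


# Hypothetical peer URLs: the first is unreachable, the second works.
urls = ["https://peer-0.example:2380", "https://peer-1.example:2380"]
picker = PeerURLPicker(urls)
chosen = first_reachable(picker, lambda u: u == urls[1])
print(chosen)  # https://peer-1.example:2380
```

Under this model a dial timeout on the first URL should only cause a short inactive period before the second URL is used, which matches the first log but not the second.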