Stop responding to ping requests before master abdication (#27329)
When the current master node is shutting down, it sends a leave request to the other nodes so that they can eagerly start a fresh master election. Unfortunately, it was still possible for the master node that was shutting down to respond to ping requests, possibly influencing the election decision as it still appeared as an active master in the ping responses. This commit ensures that UnicastZenPing does not respond to ping requests once it's been closed. ZenDiscovery.doStop() continues to ensure that the pinging component is first closed before it triggers a master election.

Closes #27328
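
The pattern described above (refuse to answer pings once the component is closed, and have the sender treat that refusal as an expected failure) can be sketched in isolation. This is a minimal illustration under stated assumptions, not the Elasticsearch implementation: `ClosablePingService`, `ServiceClosedException`, `handlePing`, and `isExpectedFailure` are hypothetical names, and `ServiceClosedException` merely plays the role that `AlreadyClosedException` plays in the diff.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for a ping component; NOT Elasticsearch's UnicastZenPing.
final class ClosablePingService implements AutoCloseable {

    /** Hypothetical exception, playing the role of AlreadyClosedException in the diff. */
    static final class ServiceClosedException extends RuntimeException {
        ServiceClosedException(String message) {
            super(message);
        }
    }

    private final String clusterName;
    private final AtomicBoolean closed = new AtomicBoolean(false);

    ClosablePingService(String clusterName) {
        this.clusterName = clusterName;
    }

    /** Receiver side: answers a ping, or throws once the service has been closed. */
    String handlePing(String requestClusterName) {
        if (closed.get()) {
            // Mirrors the new guard at the top of messageReceived(...)
            throw new ServiceClosedException("node is shutting down");
        }
        if (!clusterName.equals(requestClusterName)) {
            // Assumption: this sketch simply rejects pings from other clusters
            throw new IllegalStateException("mismatched cluster name: " + requestClusterName);
        }
        return "pong from " + clusterName;
    }

    @Override
    public void close() {
        closed.set(true);
    }

    /** Sender side: true if the failure only means the peer is shutting down. */
    static boolean isExpectedFailure(Throwable failure) {
        return failure instanceof ServiceClosedException
            || failure.getCause() instanceof ServiceClosedException;
    }
}
```

In this sketch, a sender that catches a failure for which `isExpectedFailure` returns `true` would log it quietly and move on, which corresponds to the widened `handleException` condition in the first hunk.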
ywelsch committed Nov 13, 2017
1 parent d4b7f7a commit 70b62a2
Showing 2 changed files with 15 additions and 1 deletion.
@@ -575,7 +575,8 @@ public void handleResponse(UnicastPingResponse response) {

 @Override
 public void handleException(TransportException exp) {
-    if (exp instanceof ConnectTransportException || exp.getCause() instanceof ConnectTransportException) {
+    if (exp instanceof ConnectTransportException || exp.getCause() instanceof ConnectTransportException ||
+        exp.getCause() instanceof AlreadyClosedException) {
         // ok, not connected...
         logger.trace((Supplier<?>) () -> new ParameterizedMessage("failed to connect to {}", node), exp);
     } else if (closed == false) {
@@ -608,6 +609,9 @@ class UnicastPingRequestHandler implements TransportRequestHandler<UnicastPingRe

 @Override
 public void messageReceived(UnicastPingRequest request, TransportChannel channel) throws Exception {
+    if (closed) {
+        throw new AlreadyClosedException("node is shutting down");
+    }
     if (request.pingResponse.clusterName().equals(clusterName)) {
         channel.sendResponse(handlePingRequest(request));
     } else {
@@ -258,6 +258,16 @@ protected Version getVersion() {
     assertPingCount(handleD, handleA, 0);
     assertPingCount(handleD, handleB, 0);
     assertPingCount(handleD, handleC, 3);
+
+    zenPingC.close();
+    handleD.counters.clear();
+    logger.info("ping from UZP_D after closing UZP_C");
+    pingResponses = zenPingD.pingAndWait().toList();
+    // check that node does not respond to pings anymore after the ping service has been closed
+    assertThat(pingResponses.size(), equalTo(0));
+    assertPingCount(handleD, handleA, 0);
+    assertPingCount(handleD, handleB, 0);
+    assertPingCount(handleD, handleC, 3);
 }

 public void testUnknownHostNotCached() throws ExecutionException, InterruptedException {