
Transfer leadership when establishLeadership fails #5247

Merged Jun 19, 2019 · 16 commits

Changes from 7 commits
1 change: 1 addition & 0 deletions agent/agent.go
@@ -1131,6 +1131,7 @@ func (a *Agent) consulConfig() (*consul.Config, error) {
}

// Setup the loggers
base.LogLevel = a.config.LogLevel
base.LogOutput = a.LogOutput

// This will set up the LAN keyring, as well as the WAN and any segments
3 changes: 3 additions & 0 deletions agent/consul/config.go
@@ -147,6 +147,9 @@ type Config struct {
// leader election.
ReconcileInterval time.Duration

// LogLevel is the level of the logs to write. Defaults to "INFO".
LogLevel string

// LogOutput is the location to write logs to. If this is not set,
// logs will go to stderr.
LogOutput io.Writer
35 changes: 28 additions & 7 deletions agent/consul/leader.go
@@ -77,7 +77,10 @@ func (s *Server) monitorLeadership() {
leaderLoop.Add(1)
go func(ch chan struct{}) {
defer leaderLoop.Done()
s.leaderLoop(ch)
err := s.leaderLoop(ch)
if err != nil {
s.leadershipTransfer()
Member commented:

What if leadership transfer fails? We exit the leader loop anyway and no longer act as the leader, but does raft still think we are the leader? That seems like a bad case to be in - roughly the same as the bug this is meant to be fixing although we at least made an attempt at telling raft we didn't want to be leader any more.

I can think of a couple of possible options here:

  • Make a method in Raft lib that allows the leader to "StepDown" - it will stop running heartbeats etc. but not stop being a voter - basically, force a leader into follower state.
  • Keep trying the leadership Transfer indefinitely because the cluster is broken until it works anyway.
  • Keep retrying for a limited length of time and then crash the whole process to force a step down.
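The third option can be sketched roughly as follows. This is a hypothetical illustration, not code from the PR; `transfer` is a stand-in for raft's `LeadershipTransfer` call:

```go
package main

import (
	"fmt"
	"time"
)

// transferFn stands in for raft's LeadershipTransfer call (hypothetical;
// in the PR the real call is s.raft.LeadershipTransfer().Error()).
type transferFn func() error

// boundedTransfer retries the leadership transfer a limited number of times,
// then reports failure so the caller can crash the whole process to force a
// step-down (the third option above).
func boundedTransfer(transfer transferFn, retries int, wait time.Duration) bool {
	for i := 1; i <= retries; i++ {
		err := transfer()
		if err == nil {
			return true
		}
		fmt.Printf("transfer attempt %d/%d failed: %v\n", i, retries, err)
		time.Sleep(wait)
	}
	return false // caller would now exit the process to force a step-down
}
```
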

@hanshasselberg (Member, Author) commented on Jun 13, 2019:

I made a couple of changes to my PR, so that leaderLoop is only left if leadershipTransfer was successful. I think that mitigates the issues you are talking about.

What happens now is that in case leadershipTransfer fails, the agent stays in leaderLoop and waits until ReconcileInterval to retry establishLeadership (and transferLeadership if that fails again).
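The loop semantics described here can be sketched like this; `establish` and `transfer` are hypothetical stand-ins, the interval wait is elided, and `maxRounds` bounds the sketch only:

```go
package main

// retryEstablish sketches the behavior described above (stand-ins, not the
// PR's code): the leader loop is only left when leadership is established or
// a transfer succeeds; if both fail, stay in the loop and retry after the
// reconcile interval.
func retryEstablish(establish, transfer func() error, maxRounds int) bool {
	for round := 0; round < maxRounds; round++ {
		if err := establish(); err == nil {
			return true // established: keep acting as leader
		}
		if err := transfer(); err == nil {
			return false // leadership transferred away: safe to leave the loop
		}
		// both failed: wait out the interval (elided) and try again
	}
	return false
}
```
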

Member commented:

I think this is OK although ReconcileInterval is 60 seconds which seems a lot to wait around for when we know the cluster is down. 🤔

Should we instead consider looping indefinitely retrying every 5 seconds or something?

I think this is way better than before, and eventually it should recover, so it's good; just wondering if it's easy to make it better quicker?

@hanshasselberg (Member, Author) commented:

I reset the interval to 5 seconds when transfer leadership fails so that it retries establishLeadership again faster than before.

}
}(weAreLeaderCh)
s.logger.Printf("[INFO] consul: cluster leadership acquired")

@@ -126,9 +129,23 @@
}
}

func (s *Server) leadershipTransfer() {
retryCount := 3
for i := 0; i < retryCount; i++ {
future := s.raft.LeadershipTransfer()
if err := future.Error(); err != nil {
s.logger.Printf("[ERR] consul: failed to transfer leadership attempt %d/%d: %v", i+1, retryCount, err)
} else {
s.logger.Printf("[INFO] consul: successfully transferred leadership attempt %d/%d", i+1, retryCount)
break
}

}
}

// leaderLoop runs as long as we are the leader to run various
// maintenance activities
func (s *Server) leaderLoop(stopCh chan struct{}) {
func (s *Server) leaderLoop(stopCh chan struct{}) error {
// Fire a user event indicating a new leader
payload := []byte(s.config.NodeName)
for name, segment := range s.LANSegments() {
@@ -178,7 +195,7 @@ RECONCILE:
if err := s.revokeLeadership(); err != nil {
s.logger.Printf("[ERR] consul: failed to revoke leadership: %v", err)
}
goto WAIT
return err
@hanshasselberg (Member, Author) commented on Jun 7, 2019:

This is the first use case: we couldn't establish leadership, and instead of retrying after the interval in WAIT, an error is returned, which leads to a raft leadership transfer.

}
establishedLeader = true
defer func() {
@@ -204,7 +221,7 @@ WAIT:
// down.
select {
case <-stopCh:
return
return nil
default:
}

@@ -213,17 +230,21 @@ WAIT:
for {
select {
case <-stopCh:
return
return nil
case <-s.shutdownCh:
return
return nil
case <-interval:
goto RECONCILE
case member := <-reconcileCh:
s.reconcileMember(member)
case index := <-s.tombstoneGC.ExpireCh():
go s.reapTombstones(index)
case errCh := <-s.reassertLeaderCh:
errCh <- reassert()
err := reassert()
errCh <- err
if err != nil {
return err
@hanshasselberg (Member, Author) commented:

This is the second use case: we couldn't reassert, and instead of retrying after the interval, an error is returned, which leads to a raft leadership transfer.

Member commented:

Hmmm, when does this happen? I don't recall the specifics and wonder if by failing and stepping down we might end up causing new leadership stability issues?

@hanshasselberg (Member, Author) commented:

// reassertLeaderCh is used to signal the leader loop should re-run
// leadership actions after a snapshot restore.

This happens after snapshot restore. Since the agent revokes leadership and immediately tries to establish it again, there is the possibility that it fails. When it does we are in the same situation as above - raft thinks this agent is a leader, but consul disagrees.
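The reassert handshake described above can be sketched like this; a simplified stand-in for the PR's leader loop, not its actual code:

```go
package main

// serveReassert sketches the reassertLeaderCh handshake: the leader loop runs
// the reassert function, replies to the requester with the result, and (with
// this PR) exits the loop on error so leadership can be transferred.
func serveReassert(reassertCh chan chan error, stopCh chan struct{}, reassert func() error) error {
	for {
		select {
		case <-stopCh:
			return nil // clean shutdown of the loop
		case errCh := <-reassertCh:
			err := reassert()
			errCh <- err // requester is blocked waiting on this reply
			if err != nil {
				return err // previously the loop kept running despite the error
			}
		}
	}
}
```
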

}
}
}
}
8 changes: 7 additions & 1 deletion agent/consul/server.go
@@ -31,6 +31,7 @@ import (
"github.com/hashicorp/consul/sentinel"
"github.com/hashicorp/consul/tlsutil"
"github.com/hashicorp/consul/types"
"github.com/hashicorp/go-hclog"
"github.com/hashicorp/raft"
raftboltdb "github.com/hashicorp/raft-boltdb"
"github.com/hashicorp/serf/serf"
@@ -540,7 +541,12 @@ func (s *Server) setupRaft() error {

// Make sure we set the LogOutput.
s.config.RaftConfig.LogOutput = s.config.LogOutput
s.config.RaftConfig.Logger = s.logger
raftLogger := hclog.New(&hclog.LoggerOptions{
Name: "raft",
Level: hclog.LevelFromString(s.config.LogLevel),
Output: s.config.LogOutput,
})
s.config.RaftConfig.Logger = raftLogger
@hanshasselberg (Member, Author) commented:

Raft's logging changed and we need to provide an hclog logger now.


// Versions of the Raft protocol below 3 require the LocalID to match the network
// address of the transport.
21 changes: 6 additions & 15 deletions go.mod
@@ -9,7 +9,6 @@ replace github.com/hashicorp/consul/sdk => ./sdk
require (
github.com/Azure/go-ansiterm v0.0.0-20170929234023-d6e3b3328b78 // indirect
github.com/Azure/go-autorest v10.15.3+incompatible // indirect
github.com/DataDog/datadog-go v0.0.0-20160329135253-cc2f4770f4d6 // indirect
github.com/Jeffail/gabs v1.1.0 // indirect
github.com/Microsoft/go-winio v0.4.3 // indirect
github.com/NYTimes/gziphandler v1.0.1
@@ -18,16 +17,13 @@ require (
github.com/SermoDigital/jose v0.0.0-20180104203859-803625baeddc // indirect
github.com/StackExchange/wmi v0.0.0-20180116203802-5d049714c4a6 // indirect
github.com/armon/circbuf v0.0.0-20150827004946-bbbad097214e
github.com/armon/go-metrics v0.0.0-20180917152333-f0300d1749da
github.com/armon/go-metrics v0.0.0-20190430140413-ec5e00d3c878
github.com/armon/go-radix v0.0.0-20180808171621-7fddfc383310
github.com/asaskevich/govalidator v0.0.0-20180319081651-7d2e70ef918f // indirect
github.com/beorn7/perks v0.0.0-20180321164747-3a771d992973 // indirect
github.com/bitly/go-hostpool v0.0.0-20171023180738-a3a6125de932 // indirect
github.com/bmizerany/assert v0.0.0-20160611221934-b7ed37b82869 // indirect
github.com/boltdb/bolt v1.3.1 // indirect
github.com/cenkalti/backoff v2.1.1+incompatible // indirect
github.com/circonus-labs/circonus-gometrics v0.0.0-20161109192337-d17a8420c36e // indirect
github.com/circonus-labs/circonusllhist v0.0.0-20161110002650-365d370cc145 // indirect
github.com/containerd/continuity v0.0.0-20181203112020-004b46473808 // indirect
github.com/coredns/coredns v1.1.2
github.com/denisenkom/go-mssqldb v0.0.0-20180620032804-94c9c97e8c9f // indirect
@@ -58,12 +54,11 @@ require (
github.com/hashicorp/go-checkpoint v0.0.0-20171009173528-1545e56e46de
github.com/hashicorp/go-cleanhttp v0.5.1
github.com/hashicorp/go-discover v0.0.0-20190403160810-22221edb15cd
github.com/hashicorp/go-hclog v0.0.0-20180402200405-69ff559dc25f // indirect
github.com/hashicorp/go-hclog v0.9.1
github.com/hashicorp/go-memdb v0.0.0-20180223233045-1289e7fffe71
github.com/hashicorp/go-msgpack v0.5.4
github.com/hashicorp/go-msgpack v0.5.5
github.com/hashicorp/go-multierror v1.0.0
github.com/hashicorp/go-plugin v0.0.0-20180331002553-e8d22c780116
github.com/hashicorp/go-retryablehttp v0.0.0-20180531211321-3b087ef2d313 // indirect
github.com/hashicorp/go-rootcerts v1.0.0
github.com/hashicorp/go-sockaddr v1.0.0
github.com/hashicorp/go-syslog v1.0.0
@@ -76,7 +71,7 @@ require (
github.com/hashicorp/mdns v1.0.1 // indirect
github.com/hashicorp/memberlist v0.1.4
github.com/hashicorp/net-rpc-msgpackrpc v0.0.0-20151116020338-a14192a58a69
github.com/hashicorp/raft v1.0.1-0.20190409200437-d9fe23f7d472
github.com/hashicorp/raft v1.1.0
github.com/hashicorp/raft-boltdb v0.0.0-20150201200839-d1e82c1ec3f1
github.com/hashicorp/serf v0.8.2
github.com/hashicorp/vault v0.10.3
@@ -89,7 +84,6 @@ require (
github.com/kr/text v0.1.0
github.com/lib/pq v0.0.0-20180523175426-90697d60dd84 // indirect
github.com/lyft/protoc-gen-validate v0.0.0-20180911180927-64fcb82c878e // indirect
github.com/matttproud/golang_protobuf_extensions v1.0.1 // indirect
github.com/miekg/dns v1.0.14
github.com/mitchellh/cli v1.0.0
github.com/mitchellh/copystructure v0.0.0-20160804032330-cdac8253d00f
@@ -103,13 +97,10 @@ require (
github.com/opencontainers/image-spec v1.0.1 // indirect
github.com/opencontainers/runc v0.1.1 // indirect
github.com/ory/dockertest v3.3.4+incompatible // indirect
github.com/pascaldekloe/goe v0.0.0-20180627143212-57f6aae5913c
github.com/pascaldekloe/goe v0.1.0
github.com/patrickmn/go-cache v0.0.0-20180527043350-9f6ff22cfff8 // indirect
github.com/pkg/errors v0.8.1
github.com/prometheus/client_golang v0.0.0-20180328130430-f504d69affe1
github.com/prometheus/client_model v0.0.0-20171117100541-99fa1f4be8e5 // indirect
github.com/prometheus/common v0.0.0-20180326160409-38c53a9f4bfc // indirect
github.com/prometheus/procfs v0.0.0-20180408092902-8b1c2da0d56d // indirect
github.com/prometheus/client_golang v0.9.2
github.com/ryanuber/columnize v0.0.0-20160712163229-9b3edd62028f
github.com/ryanuber/go-glob v0.0.0-20170128012129-256dc444b735 // indirect
github.com/shirou/gopsutil v0.0.0-20181107111621-48177ef5f880