Allow raft TrailingLogs to be configured. #6186

Merged: 3 commits merged into master from trailing-logs on Jul 23, 2019
Conversation

@banks (Member) commented on Jul 22, 2019

This fixes pathological cases where the write throughput and snapshot size are both so large that more than 10k log entries are written in the time it takes to restore the snapshot from disk. In that case, a follower that restarts can never catch up with leader replication again: it enters a loop of downloading a full snapshot and restoring it, only to find that the snapshot is already out of date and the leader has truncated its logs, so a new snapshot is sent, and so on.

In general if you need to adjust this, you are probably abusing Consul for purposes outside its design envelope and should reconsider your usage to reduce data size and/or write volume.
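
For intuition, here's a back-of-envelope sketch of the failure condition described above. The numbers are illustrative assumptions, not measurements from any report: a follower gets stuck whenever the entries written during a snapshot restore exceed the leader's retained trailing logs.

```go
package main

import "fmt"

func main() {
	// Illustrative numbers only; these are assumptions, not figures from this PR.
	restoreSeconds := 120.0 // time a follower takes to restore the snapshot from disk
	writesPerSec := 150.0   // leader write throughput, in raft log entries per second
	trailingLogs := 10240.0 // hashicorp/raft's default TrailingLogs (the "10k" above)

	written := restoreSeconds * writesPerSec // entries appended while the restore runs
	if written > trailingLogs {
		// The leader has truncated past the follower's position before the
		// restore finishes, so another full snapshot is sent and the cycle
		// repeats. Raising TrailingLogs above this product breaks the loop.
		fmt.Printf("stuck: %.0f entries written during restore > %.0f trailing logs\n",
			written, trailingLogs)
	}
}
```

Raising TrailingLogs above the worst-case restore time multiplied by the write rate gives a restarting follower room to catch up from the log instead of looping on snapshots.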

Notes

This is the minimal change needed to make that situation recoverable. There are nicer solutions we could add later with Raft library changes, e.g. dynamically deciding how much to truncate based on current recovery progress, but those changes are much more involved to test and reason about, as well as to backport to existing versions. This is a low-risk, simple change we can potentially backport to older versions in case users are stuck on those and hitting this situation in production with no other way to resolve it.

Questions

Should we document this config? If we do, I think we should warn against its use like the other raft tunables, but this one seems especially like a crutch to work around deployments that are using Consul in cases it's not well suited for. On the other hand, if you are in that situation, not being able to self-discover that a solution exists also seems kinda silly.

@banks requested a review from a team on Jul 22, 2019 14:20
@mkeeler (Member) left a comment

This all looks good. Should we also set the previous 10k limit here:

```go
leader_lease_timeout = "` + raft.LeaderLeaseTimeout.String() + `"
```

@freddygv (Contributor) commented on Jul 22, 2019

This should probably be documented with a warning. Here's a potential one based on your original comment:

> This should only be adjusted when followers cannot catch up to the leader due to a large snapshot size and high write throughput. However, consider reducing write throughput or the amount of data stored in Consul first. Consul is likely under a load it was not designed to handle.

It seems unlikely that knowing this setting exists would encourage people to get into the situation where they need it. If they're already in that situation, posting the warning with alternative courses of action may help get them out of it.

By the way, I think @mkeeler mentioned that we should also improve the logging around installing snapshots, so that users can diagnose the situation if it does come up.

@banks (Member, Author) commented on Jul 22, 2019

@mkeeler I looked, but right now we don't default any of the other raft configs: we only set them conditionally when they are non-zero in the Consul config, so we accept Raft's defaults implicitly. I.e. I did the same as we do for -raft-snapshot-threshold and friends.
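
A minimal sketch of that pattern, with hypothetical struct and field names standing in for Consul's real runtime config (which differs): each raft tunable overrides hashicorp/raft's default only when the operator actually set it.

```go
package main

import (
	"time"

	"github.com/hashicorp/raft"
)

// AgentConfig is a hypothetical stand-in for Consul's runtime config;
// the real struct and field names differ.
type AgentConfig struct {
	RaftSnapshotThreshold int
	RaftSnapshotInterval  time.Duration
	RaftTrailingLogs      int
}

// applyRaftTunables overrides hashicorp/raft's defaults only for values
// the operator set, i.e. the "non-zero" pattern described above.
func applyRaftTunables(agent AgentConfig) *raft.Config {
	cfg := raft.DefaultConfig()
	if agent.RaftSnapshotThreshold != 0 {
		cfg.SnapshotThreshold = uint64(agent.RaftSnapshotThreshold)
	}
	if agent.RaftSnapshotInterval != 0 {
		cfg.SnapshotInterval = agent.RaftSnapshotInterval
	}
	if agent.RaftTrailingLogs != 0 { // the option added by this PR
		cfg.TrailingLogs = uint64(agent.RaftTrailingLogs)
	}
	return cfg
}
```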

@banks (Member, Author) commented on Jul 22, 2019

I added docs. I also realised the other raft tunables were documented as CLI flags out of habit but are not actually valid CLI flags, so I've moved them, preserving the old anchor names too.

@freddygv (Contributor) left a comment

LGTM, just one small comment inline

Review thread on website/source/docs/agent/options.html.md (outdated, resolved)
@pearkes (Contributor) left a comment

Makes sense as part of 1.5.3 👍

@pearkes added this to the 1.5.3 milestone on Jul 22, 2019
@banks merged commit f38da47 into master on Jul 23, 2019
@banks deleted the trailing-logs branch on Jul 23, 2019 14:20