
kv: detect lease transfer and back off in DistSender #32877

Merged

Conversation

ajwerner
Contributor

@ajwerner ajwerner commented Dec 5, 2018

This PR addresses a problem which could lead to very long stalls in range
throughput when a lease transfer occurs under load. As soon as the
current lease holder begins a lease transfer, it rejects all future requests
to the range with a NotLeaseHolderError which contains the new lease
information. As soon as this happens, the new lease holder immediately begins
receiving requests but is not able to service them until it processes
the raft command that makes it the lease holder. Until it applies that command, it
returns NotLeaseHolderError with the previous lease information. Prior to this
change, the DistSender would immediately retry the request at the node indicated
in the most recent NotLeaseHolderError it had received. This leads to a tight
loop of requests bouncing between the current lease holder and the new lease
holder, which is unaware of the pending transfer (as observed in #22837). The
amount of load generated by this traffic can grind raft progress to a complete
halt, with the author observing multi-minute durations for the new node to
process a raft Ready and hundreds of milliseconds to process a single command.
Fortunately, the DistSender can detect when this situation is occurring and
back off accordingly.

This change detects that a replica is in the midst of a lease transfer by
noticing that it continues to receive NotLeaseHolderErrors without observing
a new lease sequence number. In this case, the DistSender backs off exponentially
until it succeeds, fails, or observes a new lease sequence.

Fixes #22837, Fixes #32367

Release note: None

@ajwerner ajwerner requested review from tbg, nvanbenschoten and a team December 5, 2018 23:16
@cockroach-teamcity
Member

This change is Reviewable

Member

@nvanbenschoten nvanbenschoten left a comment


:lgtm: but @tbg should give this a pass as well because he just reworked some of this.

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained


pkg/kv/dist_sender.go, line 97 at r1 (raw file):

		Unit:        metric.Unit_COUNT,
	}
	metaDistSenderInTransferBackoffsErrCount = metric.Metadata{

Might want to work the word "lease" in here.


pkg/kv/dist_sender_test.go, line 648 at r1 (raw file):

		},
	}
	for i, c := range []struct {

These tests are nice.

@ajwerner ajwerner force-pushed the ajwerner/dist-sender-backoff-in-transfer branch from 11e78d2 to 89d349a Compare December 6, 2018 21:15
Contributor Author

@ajwerner ajwerner left a comment


Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale)


pkg/kv/dist_sender.go, line 97 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Might want to work the word "lease" in here.

Done.

Member

@tbg tbg left a comment


:lgtm:

Kind of amazing this managed to back things up for minutes. You were running with a large number of clients, right?

Reviewed 2 of 2 files at r2.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)

Contributor Author

@ajwerner ajwerner left a comment


TFYR! Concurrency was set to 4096

Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (and 1 stale)

@ajwerner
Contributor Author

ajwerner commented Dec 7, 2018

bors r+

craig bot pushed a commit that referenced this pull request Dec 7, 2018
32877: kv: detect lease transfer and back off in DistSender r=ajwerner a=ajwerner


Co-authored-by: Andrew Werner <ajwerner@cockroachlabs.com>
@craig
Contributor

craig bot commented Dec 7, 2018

Build succeeded

@craig craig bot merged commit 89d349a into cockroachdb:master Dec 7, 2018