-
Notifications
You must be signed in to change notification settings - Fork 4k
kvserver: enable loosely coupled raft log truncation #76902
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
sumeerbhola
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first two commits are from #76215 and #76793
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @nvanbenschoten, and @tbg)
e569ce7 to
1345618
Compare
sumeerbhola
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebased. The non test changes are tiny.
Most of the test diffs are indentation changes due to running the kvserver server-side tests both with and without loosely-coupled truncation. The client-side kvserver tests are only run with whatever is the default truncation behavior.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @nvanbenschoten, and @tbg)
tbg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't reviewed the tests yet, have a suggestion on how to simplify the prod code. Will review tests first thing tomorrow.
Reviewed 3 of 13 files at r1, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @nvanbenschoten, and @sumeerbhola)
pkg/kv/kvserver/raft_log_truncator.go, line 394 at r1 (raw file):
if runTruncation { if err := t.stopper.RunAsyncTask( t.ambientCtx, "raft-log-truncation", t.durabilityAdvanced); err != nil {
Consider pulling the t.mu.runningTruncation = false out. It is nice to have it all in one place.
Near the top of the method:
runTruncation := false
done := func() {}
t.mu.Lock()
[...]
runTruncation = true
t.mu.runningTruncation = true
done := func() {
t.mu.Lock()
if !t.mu.runningTruncation { panic("...") }
t.mu.runningTruncation = false
t.mu.Unlock()
}
[...]
for runTruncation {
if err := t.stopper.RunAsyncTask(...., func(ctx context.Context) {
defer done()
for {
t.durabilityAdvanced(ctx)
t.mu.Lock()
queued := t.mu.queuedDurabilityCB
t.mu.queuedDurabilityCB = false
t.mu.Unlock()
if !queued { return }
}
}); err != nil {
done()
return
}
}
That way runningTruncation is only accessed in this method.
pkg/kv/kvserver/raft_log_truncator.go, line 435 at r1 (raw file):
t.tryEnactTruncations(ctx, rangeID, reader) select { case <-shouldQuiesce:
Why do you need this? Were you seeing hangs? tryEnactTruncations should also be sensitive to quiescence so this seems superfluous.
pkg/kv/kvserver/raft_log_truncator.go, line 443 at r1 (raw file):
} } t.mu.Lock()
This part would go away with my other suggestion.
1345618 to
c0095c5
Compare
sumeerbhola
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR!
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @nvanbenschoten, and @tbg)
pkg/kv/kvserver/raft_log_truncator.go, line 394 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Consider pulling the
t.mu.runningTruncation = falseout. It is nice to have it all in one place.Near the top of the method:
runTruncation := false done := func() {} t.mu.Lock() [...] runTruncation = true t.mu.runningTruncation = true done := func() { t.mu.Lock() if !t.mu.runningTruncation { panic("...") } t.mu.runningTruncation = false t.mu.Unlock() } [...] for runTruncation { if err := t.stopper.RunAsyncTask(...., func(ctx context.Context) { defer done() for { t.durabilityAdvanced(ctx) t.mu.Lock() queued := t.mu.queuedDurabilityCB t.mu.queuedDurabilityCB = false t.mu.Unlock() if !queued { return } } }); err != nil { done() return } }That way
runningTruncationis only accessed in this method.
Much better! Done.
pkg/kv/kvserver/raft_log_truncator.go, line 435 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
Why do you need this? Were you seeing hangs?
tryEnactTruncationsshould also be sensitive to quiescence so this seems superfluous.
I saw a test fail with a pebble.tableCacheShard leak, which suggests the test shutdown without waiting for this goroutine that was holding open a pebble.Iterator. It seemed wise to not have stray goroutines that are not linked to a stopper, hence the stopper plumbing.
This particular change to check shouldQuiesce was just to play nice -- my reading of the stopper contract was that it waits for all such async tasks after closing this channel. So may as well check after processing each range.
pkg/kv/kvserver/raft_log_truncator.go, line 443 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
This part would go away with my other suggestion.
Done
tbg
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 10 of 13 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @nvanbenschoten, and @sumeerbhola)
pkg/kv/kvserver/raft_log_truncator.go, line 435 at r1 (raw file):
hence the stopper plumbing.
Sorry, that wasn't the question. Was just asking about checking quiescing here. It doesn't seem necessary, since this won't block shutdown. Mild preference for removal to keep the code clean.
pkg/kv/kvserver/testdata/raft_log_truncator, line 405 at r2 (raw file):
# Test case to ensure that truncator is reading only flushed engine state when # deciding to enact truncations. create-replica id=3 trunc-index=20 last-log-entry=30
It's odd to see last-log-entry but then the log ends at 29. I think this should be last-log-entry=29 (i.e. make it inclusive).
pkg/kv/kvserver/testdata/raft_log_truncator, line 438 at r2 (raw file):
# Move RaftAppliedState enough to allow first truncation, but don't flush that # change. write-raft-applied-index id=3 raft-applied-index=22 no-flush=true
What ensures that a flush doesn't accidentally happen?
The const override that disabled loosely coupled truncation is removed. There is additional testing of both the end-to-end path and the truncator. Informs cockroachdb#36262 Release note: None
c0095c5 to
1bb58e5
Compare
sumeerbhola
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @erikgrinaker, @nvanbenschoten, and @tbg)
pkg/kv/kvserver/raft_log_truncator.go, line 435 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
hence the stopper plumbing.
Sorry, that wasn't the question. Was just asking about checking quiescing here. It doesn't seem necessary, since this won't block shutdown. Mild preference for removal to keep the code clean.
I've left the code in place and added
// Check if the stopper is quiescing. This isn't strictly necessary, but
// if there are a huge number of ranges that need to be truncated, this
// will cause us to stop faster.
pkg/kv/kvserver/testdata/raft_log_truncator, line 405 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
It's odd to see
last-log-entrybut then the log ends at29. I think this should belast-log-entry=29(i.e. make it inclusive).
Done
pkg/kv/kvserver/testdata/raft_log_truncator, line 438 at r2 (raw file):
Previously, tbg (Tobias Grieger) wrote…
What ensures that a flush doesn't accidentally happen?
The initial memtable size is 256KB, and doubles for each new memtable. Even that initial size is much larger than anything we do in this test between explicit flushes.
I've added a comment to the test
// The initial engine memtable size is 256KB, and doubles for each new
// memtable. Even the initial size is much larger than anything we do
// in this test between explicit flushes. Hence we can rely on the
// fact that no-flush will actually be respected, and we won't
// encounter an unexpected flush.
|
bors r=tbg |
|
bors r=tbg |
|
Build failed (retrying...): |
|
Build failed (retrying...): |
|
Build succeeded: |
The const override that disabled loosely coupled truncation
is removed. There is additional testing of both the end-to-end
path and the truncator.
Informs #36262
Release note: None