storagebase: add CheckIngestedStats helper, call after all bulk-ingest ops #35251
This helper is intended to be called on the span(s) in which a caller has been doing bulk-ingestion, to trigger consistency checks on the range(s) in that span. This might be a good idea on its own, but it also has the side effect of fixing up any MVCC stats inaccuracies that might have been introduced by estimates during ingestion.

It is probably not critical to actually call this: eventually the consistency queue will run anyway and fix up those stats if needed. However, if we know we'll call it right away -- specifically before we mark the ingesting operation as completed and turn those ranges over to real traffic that might expect correct-ish stats -- we might be able to make coarser, cheaper estimates, for instance during SST ingestion, where the current, accurate stats recomputation is expensive.

See cockroachdb#35231.

Release note: None
nvb
left a comment
Reviewed 7 of 7 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @danhhz and @dt)
pkg/ccl/importccl/read_import_proc.go, line 710 at r1 (raw file):
// If somehow we only ingested one key...
if ingestSpan.Key.Equal(ingestSpan.EndKey) {
How will this happen? We should never have an ingestSpan where Key == EndKey.
pkg/ccl/importccl/sst_writer_proc.go, line 195 at r1 (raw file):
    log.Errorf(ctx, "failed to scatter span %s: %s", roachpb.PrettyPrintKey(nil, end), pErr)
}
if k, cur := sst.span.Key, ingestSpan.Key; cur == nil || k.Compare(cur) < 0 {
Here and above: can we use Span.Combine?
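For illustration only, the manual start-key and end-key bookkeeping quoted above amounts to growing one span so that it covers every ingested span. The sketch below uses a simplified stand-in Span type and a hypothetical combine function; it is not the roachpb implementation, and treating a span with a nil Key as "nothing ingested yet" is an assumption that mirrors the cur == nil check in the quoted code.

```go
package main

import (
	"bytes"
	"fmt"
)

// Span is a simplified stand-in for roachpb.Span (assumption for this sketch).
type Span struct {
	Key, EndKey []byte
}

// combine is a hypothetical helper returning the smallest span that covers
// both a and b. A span with a nil Key is treated as "nothing yet".
func combine(a, b Span) Span {
	if a.Key == nil {
		return b
	}
	if b.Key == nil {
		return a
	}
	out := a
	if bytes.Compare(b.Key, out.Key) < 0 {
		out.Key = b.Key
	}
	if bytes.Compare(b.EndKey, out.EndKey) > 0 {
		out.EndKey = b.EndKey
	}
	return out
}

func main() {
	var ingestSpan Span // starts empty, grows as SSTs are ingested
	for _, sst := range []Span{
		{Key: []byte("c"), EndKey: []byte("d")},
		{Key: []byte("a"), EndKey: []byte("b")},
		{Key: []byte("e"), EndKey: []byte("f")},
	} {
		ingestSpan = combine(ingestSpan, sst)
	}
	fmt.Printf("%s %s\n", ingestSpan.Key, ingestSpan.EndKey) // prints "a f"
}
```

Folding each ingested span into a single accumulator keeps the min/max comparisons in one place instead of repeating them at every call site.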
pkg/storage/storagebase/bulk_adder.go, line 50 at r1 (raw file):
    return nil
}
if ingested.Key.Equal(ingested.EndKey) {
Again, is this possible?
dt
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @danhhz, @dt, and @nvanbenschoten)
pkg/ccl/importccl/read_import_proc.go, line 710 at r1 (raw file):
Previously, nvanbenschoten (Nathan VanBenschoten) wrote…
How will this happen? We should never have an ingestSpan where Key == EndKey.
I thought I answered that in the comment on the check, but if the ingestion job ingested exactly one key (e.g. IMPORT of a single row)?
nvb
left a comment
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @danhhz and @dt)
pkg/ccl/importccl/read_import_proc.go, line 710 at r1 (raw file):
Previously, dt (David Taylor) wrote…
I thought I answered that in the comment on the check, but if the ingestion job ingested exactly one key (e.g. IMPORT of a single row)?
Then shouldn't EndKey be either nil or Key.Next already? Span.EndKey should always be exclusive.
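A minimal sketch of the point about exclusive end keys, using simplified stand-ins for roachpb.Key and roachpb.Span (the types, SingleKeySpan, and ContainsKey below are assumptions written for this example, not the library's code). It relies on Key.Next producing the immediately following key by appending a zero byte, and it does not model the nil-EndKey form of a single-key span.

```go
package main

import (
	"bytes"
	"fmt"
)

// Key is a simplified stand-in for roachpb.Key (assumption for this sketch).
type Key []byte

// Next returns the key that sorts immediately after k:
// the same bytes followed by a single zero byte.
func (k Key) Next() Key {
	next := make(Key, len(k)+1) // zero-filled, so the last byte is 0x00
	copy(next, k)
	return next
}

// Span is a simplified stand-in for roachpb.Span: [Key, EndKey).
type Span struct {
	Key, EndKey Key
}

// SingleKeySpan is a hypothetical helper that builds a span covering exactly k.
func SingleKeySpan(k Key) Span {
	return Span{Key: k, EndKey: k.Next()}
}

// ContainsKey reports whether k falls in the half-open interval [s.Key, s.EndKey).
func (s Span) ContainsKey(k Key) bool {
	return bytes.Compare(k, s.Key) >= 0 && bytes.Compare(k, s.EndKey) < 0
}

func main() {
	k := Key("a")
	degenerate := Span{Key: k, EndKey: k} // Key == EndKey: covers nothing
	single := SingleKeySpan(k)            // [a, a\x00): covers only "a"

	fmt.Println(degenerate.ContainsKey(k)) // prints "false"
	fmt.Println(single.ContainsKey(k))     // prints "true"
}
```

Under this half-open convention, a span whose Key equals its EndKey is empty, which is why a one-row ingestion would be expected to report Key.Next as its end key rather than the key itself.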
Closing this in favor of a core-driven change here, since it sounds like there're plans for how to actually clear