backup: always write to cloud storage in processor #68468
@pbardea should we also drop |
Yep, that sounds reasonable. I would think that 16 MB should be small enough for a sane default?
Reviewed 1 of 1 files at r1, 2 of 2 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aliher1911)
Some revisions here that probably deserve a re-review (sorry!):
Nice bug spotting! LGTM with nits
3417466 to c6d79da
Locality-partitioned backups were intended to reduce cross-regional file writes. Thus, if the processor is writing a file, it should do so to the region that matches its locality. Due to PartitionSpans, this should generally be the same node and thus same region as the leaseholder, or at least where the leaseholder was during planning. Release note: none.
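The locality matching described in this commit message can be sketched as follows. This is only an illustration, not the processor's actual code: `pickDestination`, the tier strings, and the example URIs are all hypothetical, and the "most specific tier wins" rule is an assumption made for the sketch.

```go
package main

import "fmt"

// pickDestination (hypothetical helper) returns the URI registered for one of
// the node's locality tiers, checking the most specific (last) tier first.
// It falls back to defaultURI when no tier has a matching destination.
func pickDestination(nodeTiers []string, byLocality map[string]string, defaultURI string) string {
	for i := len(nodeTiers) - 1; i >= 0; i-- {
		if uri, ok := byLocality[nodeTiers[i]]; ok {
			return uri
		}
	}
	return defaultURI
}

func main() {
	// Example destinations keyed by locality tier; all values are made up.
	byLocality := map[string]string{
		"region=us-west": "s3://example-west",
		"region=us-east": "s3://example-east",
	}
	node := []string{"region=us-west", "zone=us-west-1a"}
	fmt.Println(pickDestination(node, byLocality, "s3://example-default"))
}
```

A processor running on the node that holds the data would then write its file to the destination in its own region instead of a remote one.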
The BACKUP processor should pick its target file size, at which it closes a file and opens a new one, independently from the KV's target Export file size. The latter is influenced by factors such as request duration and memory footprint, which are not relevant to the backup file as it is streamed out to cloud storage. Release note: none.
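To make the decoupling concrete, here is a minimal sketch of a sink that rolls over to a new file once a processor-side threshold is reached, regardless of how large each incoming export chunk is. The names `fileSink`, `newFileSink`, and `writeAll` are hypothetical, and the in-memory `closed` slice merely stands in for files written to cloud storage.

```go
package main

import (
	"bytes"
	"fmt"
)

// fileSink (hypothetical) accumulates export chunks and closes the current
// file once it holds at least target bytes. The target is chosen by the
// processor and is unrelated to the size of the chunks it receives.
type fileSink struct {
	target int
	cur    bytes.Buffer
	closed [][]byte // stand-in for files flushed to cloud storage
}

func newFileSink(target int) *fileSink {
	return &fileSink{target: target}
}

// write appends a chunk and rolls over to a new file if the target is reached.
func (s *fileSink) write(chunk []byte) {
	s.cur.Write(chunk)
	if s.cur.Len() >= s.target {
		s.flush()
	}
}

// flush closes the current file, if it is non-empty, and starts a new one.
func (s *fileSink) flush() {
	if s.cur.Len() == 0 {
		return
	}
	out := make([]byte, s.cur.Len())
	copy(out, s.cur.Bytes())
	s.closed = append(s.closed, out)
	s.cur.Reset()
}

// writeAll feeds every chunk through a sink and returns the closed files.
func writeAll(target int, chunks [][]byte) [][]byte {
	s := newFileSink(target)
	for _, c := range chunks {
		s.write(c)
	}
	s.flush()
	return s.closed
}

func main() {
	chunk := []byte("abcdef") // 6 bytes per export chunk
	files := writeAll(16, [][]byte{chunk, chunk, chunk, chunk, chunk})
	for i, f := range files {
		fmt.Printf("file %d: %d bytes\n", i, len(f))
	}
}
```

With a 16-byte target and 6-byte chunks, the sink closes a file after the third chunk and puts the remaining two chunks into a second file.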
This removes the bulkio.backup.proxy_file_writes.enabled setting and always behaves as if it were set to true. Release note (sql change): the setting bulkio.backup.proxy_file_writes.enabled is no longer needed to enable proxied writes, which are now the default.
Reviewable status: complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @dt and @nvanbenschoten)
pkg/ccl/backupccl/backup_processor.go, line 386 at r6 (raw file):
    FileSummaries: make([]RowCount, 0),
    }
    for i, file := range res.Files {
Maybe it is worth adding a comment that we always expect a single file here and that the loop is kept for historical reasons? I think there are a few checks here, like completedSpans below, that give the impression that there could be more than a single entry, so a comment in at least one place could make it easier to understand.
With header.TargetBytes = 1 we expect each request to return at most one file, but the loop, kept here for legacy reasons, may suggest otherwise. Logging a warning should help document what we expect while also informing us if we're mistaken. Release note: none.
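The idea of a warning that doubles as documentation can be sketched like this. It is an illustrative stand-in rather than the code in backup_processor.go: `exportedFile` and `summarize` are hypothetical, and the standard library `log` package replaces whatever logging the processor actually uses.

```go
package main

import "log"

// exportedFile is a hypothetical stand-in for one file in an export response.
type exportedFile struct {
	Path string
	Rows int
}

// summarize still loops over every returned file, but logs a warning when the
// response holds more than the single file we expect. It reports the total
// row count and whether the expectation was violated.
func summarize(files []exportedFile) (totalRows int, unexpected bool) {
	if len(files) > 1 {
		unexpected = true
		log.Printf("unexpected multi-file response: got %d files, expected at most 1", len(files))
	}
	for _, f := range files {
		totalRows += f.Rows
	}
	return totalRows, unexpected
}

func main() {
	rows, unexpected := summarize([]exportedFile{{Path: "a.sst", Rows: 3}})
	log.Printf("rows=%d unexpected=%t", rows, unexpected)
}
```

Keeping the loop means a surprising multi-file response is still processed correctly, while the log line tells us the assumption no longer holds.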
Reviewable status: complete! 0 of 0 LGTMs obtained (and 2 stale) (waiting on @aliher1911, @dt, and @nvanbenschoten)
pkg/ccl/backupccl/backup_processor.go, line 386 at r6 (raw file):
Previously, aliher1911 (Oleg) wrote…
Maybe it is worth adding a comment that we always expect a single file here and that the loop is kept for historical reasons? I think there are a few checks here, like completedSpans below, that give the impression that there could be more than a single entry, so a comment in at least one place could make it easier to understand.
Added a logged warning if len > 1 (I figure that documents the expectation as a comment would, but also informs us if it's incorrect?)
TFTRs! bors r+
Build succeeded
Release note (sql change): the setting bulkio.backup.proxy_file_writes.enabled is no longer needed to enable proxied writes, which are now the default.
Also, while I'm here, de-couple the target file size used by the processor (which streams its writes out to cloud storage) from the target file size of ExportRequest (which is now always in-memory, and twice over at that, on both the KV and SQL side of the RPC).