xtrabackup: Better support for large datasets #5065
Conversation
looks good so far
Force-pushed from e7421f0 to 904e3aa
I added some more changes to hopefully fix this for long restores too. I'll retry the large test DB with both this and #5066 in a custom build. I also changed how we search for the replication position in the xtrabackup log, since in my first test it found …
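For illustration, here's a minimal sketch in Go of that kind of log scan. It is not the Vitess implementation: the regexp, the log line format it assumes, and the last-match-wins rule are all assumptions made for the example.

```go
package backup

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// posRe matches the GTID reported near the end of the xtrabackup log.
// The exact line format is an assumption based on typical xtrabackup
// output, e.g.: MySQL binlog position: ... GTID of the last change '...'
var posRe = regexp.MustCompile(`GTID of the last change '([^']*)'`)

// findReplicationPosition returns the last match in the captured log,
// so a stray earlier occurrence (e.g. echoed configuration) doesn't win.
func findReplicationPosition(output string) (string, error) {
	var found string
	scanner := bufio.NewScanner(strings.NewReader(output))
	for scanner.Scan() {
		if m := posRe.FindStringSubmatch(scanner.Text()); m != nil {
			found = m[1]
		}
	}
	if found == "" {
		return "", fmt.Errorf("no replication position found in xtrabackup output")
	}
	return found, nil
}
```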
This is needed for long-running backups so that the xtrabackup process doesn't block after the write buffer fills up. It's also nice for checking in on progress during a long upload. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
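As a sketch of why this matters: if nothing drains the child's stderr pipe, the kernel's pipe buffer (often 64KB) fills and xtrabackup blocks on its next write. A hypothetical relay using only the Go standard library might look like this:

```go
package main

import (
	"bufio"
	"log"
	"os/exec"
)

func main() {
	// Illustrative invocation only; the real arguments differ.
	cmd := exec.Command("xtrabackup", "--backup")
	stderr, err := cmd.StderrPipe()
	if err != nil {
		log.Fatal(err)
	}
	if err := cmd.Start(); err != nil {
		log.Fatal(err)
	}
	// Relay each line as it arrives instead of buffering the whole
	// output; this is what keeps the child from stalling and also
	// surfaces progress during a long upload.
	scanner := bufio.NewScanner(stderr)
	for scanner.Scan() {
		log.Printf("xtrabackup: %s", scanner.Text())
	}
	if err := cmd.Wait(); err != nil {
		log.Fatal(err)
	}
}
```

Note that the pipe must be fully drained before calling Wait, which is why the read loop sits between Start and Wait.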
Force-pushed from 904e3aa to b56856b
The backup side of this worked well on our 250GB/shard test DB, which takes 1.5hrs to back up. I'll report back once I've tested the restore side.
Direct write didn't use Infof() so there was no timestamp. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
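A tiny illustration of the fix, using the standard library log package rather than Vitess's logger (an assumption made to keep the example self-contained): a direct write to stderr bypasses the logger's formatting, so the line carries no timestamp.

```go
package main

import (
	"fmt"
	"log"
	"os"
)

func main() {
	// Direct write: no logger involvement, so no timestamp prefix.
	fmt.Fprintln(os.Stderr, "xtrabackup stderr: restore in progress")
	// Through the logger: the default flags prepend date and time.
	log.Printf("xtrabackup stderr: restore in progress")
}
```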
To avoid requiring 2x disk space upon restore. Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
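A hedged sketch of the restore-side change, assuming the engine shells out to xtrabackup: --move-back and --copy-back are real xtrabackup options, while the surrounding wiring and paths here are illustrative only.

```go
package main

import (
	"log"
	"os/exec"
)

// restore shells out to xtrabackup. --copy-back leaves the prepared
// backup in place and duplicates it into the data dir; --move-back
// renames the files instead, so the restore needs roughly 1x rather
// than 2x the dataset on disk.
func restore(backupDir string) error {
	cmd := exec.Command("xtrabackup",
		"--move-back", // previously --copy-back
		"--target-dir="+backupDir,
	)
	out, err := cmd.CombinedOutput()
	log.Printf("xtrabackup output:\n%s", out)
	return err
}

func main() {
	if err := restore("/tmp/restored-backup"); err != nil {
		log.Fatal(err)
	}
}
```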
Force-pushed from 09e709d to af13447
This should be ready for review now. I ended up broadening the scope of this PR to generally supporting large backups/restores with xtrabackup. I've tested it on our 250GB/shard keyspace and it passed.

The data striping seems to get us back to parity with backup/restore times for the same keyspace using the built-in backup engine (which compresses and uploads each file in the data dir independently). Before adding striping, xtrabackup was between 2x and 8x slower on my test keyspace because decompression and upload/download were single-threaded.

The xbstream format could technically support parallel compression/decompression on its own without striping, but not without extra disk space to store the compressed and decompressed content of a given file at the same time. You either risk running out of disk space and failing to restore, or you run with extra disk space that's only used during restore and is wasted otherwise. Also, without striping, even parallel xbstream would still bottleneck into a single destination file upload/download. Blob stores like S3 and GCS get better throughput across multiple files than for a single file.
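A condensed sketch of the striping idea, not the actual Vitess code; the helper name, block size parameter, and round-robin scheme shown here are assumptions made for the example.

```go
package backup

import "io"

// stripe deals src out across dsts in fixed-size blocks, round-robin.
// Each dst can then be compressed and uploaded by its own goroutine,
// and restore reverses the round-robin to reassemble the original
// xbstream.
func stripe(dsts []io.Writer, src io.Reader, blockSize int) error {
	buf := make([]byte, blockSize)
	for i := 0; ; i = (i + 1) % len(dsts) {
		n, err := io.ReadFull(src, buf)
		if n > 0 {
			if _, werr := dsts[i].Write(buf[:n]); werr != nil {
				return werr
			}
		}
		switch err {
		case nil:
			continue
		case io.EOF, io.ErrUnexpectedEOF:
			return nil // source drained; the final block may be short
		default:
			return err
		}
	}
}
```

In practice each dst would be a compressor feeding its own concurrent upload, which is what restores the multi-file parallelism the comment above describes.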
Nice work! LGTM.
Changes to improve the Vitess xtrabackup integration for use with large datasets:

Use move-back instead of copy-back so the disk doesn't need 2x the space to restore. We download backups from remote storage on every restore, so there's no need to keep a copy of the original downloaded files on local disk.

Fixes #5063
Signed-off-by: Anthony Yeh <enisoc@planetscale.com>
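As a closing companion to the striping sketch earlier in the thread, here is a hypothetical restore-side helper (again not the Vitess implementation) that reverses the round-robin to reassemble the original xbstream.

```go
package backup

import "io"

// destripe interleaves blockSize chunks from srcs into dst, mirroring
// the round-robin order used when the stripes were written.
func destripe(dst io.Writer, srcs []io.Reader, blockSize int) error {
	buf := make([]byte, blockSize)
	for i := 0; ; i = (i + 1) % len(srcs) {
		n, err := io.ReadFull(srcs[i], buf)
		if n > 0 {
			if _, werr := dst.Write(buf[:n]); werr != nil {
				return werr
			}
		}
		switch err {
		case nil:
			continue
		case io.EOF, io.ErrUnexpectedEOF:
			return nil // stripes end together at the tail of the stream
		default:
			return err
		}
	}
}
```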