-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Integrated BlobDB for backup/restore support #8129
Conversation
@akankshamahajan15 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
This unfortunately means you have to read an entire blob file in the database to compute the checksum to do an incremental backup, i.e. when the file might already be backed up. Perhaps I can chat with you & @ltamasi about possible options |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've decided to get this moving even though there might be some follow-up, perhaps based on further discussion with Levi. (It seems like blob files are now essentially table files, immutable k-v data, without the benefits of calling them table files or handy features of our sst file formats like table properties. Why?)
Nevertheless, looking pretty good with some revisions requested.
Even without input from Levi, I suspect we can build a proposal to add extensible metadata (perhaps building on the "table properties" framework) to blob files in an already forward-compatible way. If I understand correctly, it should be OK for blob files to have arbitrary unreferenced K-V entries. We could pick some random 128-bit value as a special key, and if it is the first key in the blob file, then it holds some extra metadata, including db session id. For SST files, table properties come at the end so they can include summary statistics, but with blob files there's no apparent way to iterate backwards from the end, especially in the presence of compressed values. So putting this extra metadata at the beginning would be somewhat limiting but could at least solve some problems like db session id.
Especially considering I'm touching some of these same files, I don't think it's worth blocking this PR for sorting all of that out. (If we're able to solve the limitations before release, we can make those revisions.)
Thanks for the PR @akankshamahajan15 and thanks for taking a look @pdillinger ! I'll review this next week. Regarding some of the points above: we have a dedicated blob file format because the requirements for blob files and table files are very different. We always read blobs from blob files using an offset that's known in advance, which makes a flat file format well suited for this case. On the other hand, most of the features that the SST format provides, including the index block, filter block, even the binary searchable data blocks would constitute unnecessary overhead if we were to use them for blobs since we never look up blobs in blob files by key. Having said that, we do have plans to improve the blob file format, including adding the db session id so the new backup file naming scheme can be used. |
SST is not a format, but a collection of formats with different trade-offs and requirements. They all represent immutable and enumerable key-value data. When I run sst_dump --help, all of the available operations seem applicable to blob files, moreso than some of the sst file formats that e.g. do not support compression. So why can't blob data be in a format that we call an SST file format, so that we don't have to update all this RocksDB infrastructure to make special cases for blob files that are not really special cases, except in the inner workings of the LSM? Are you planning to make (integrated) blob files appendable rather than just immutable? I know that backup support (as in this PR) would need significant re-work to make that work, unless you always close out blob files on taking a checkpoint. It might make the job easier if you're planning another FileType for not-yet-closed blob files, but that would further call into question why to keep closed blob files as a separate FileType from sst files. |
And that's where the similarities with blob files end. One huge difference, which we've touched on earlier, is that tables/SSTs (any kind) expose an interface where you can look up a value (or multiple values) by key. Blob files don't support that and thus it makes no sense to shoehorn them into the table framework.
They're conceptually different from tables and are actually not even part of the LSM tree, see above.
No, we have no such plans. |
By "inner workings of LSM" I mean rocksdb::Version. Best I can tell, it does not use FileType to distinguish files, except some cases looking for kDescriptorFile in version_set.cc. We use FileType to configure checksum handoff, but if both blob and sst files are written by compaction, it seems likely you would use the same setting for both. So what would be the complication if blob files were kTableFile? (In this hypothetical, think of kTableFile as kKvDataFile.) The only cases I could find where that might cause issues, because we might "go backwards" from a file to its Version metadata, are these: DBImpl::PurgeObsoleteFiles - Seems like we can keep JobContext as it is but unify the set of file numbers between both sst and blob files, so that we can detect if a kTableFile is unreferenced. Does not appear to be a challenge. DBImpl::DeleteFile - Looks up Version metadata for named file. In case of kTableFile, could check for blob metadata if fail to find lsm/sst metadata. Does not appear to be a challenge. Can you convince me that having separate FileType for blob files solves more problems than it creates? I don't know that we can necessarily reverse course. I'm just trying to understand the design decision that made work like this PR necessary. I am particularly concerned about designs that introduce cross-cutting concerns throughout the codebase because they create compounding complexity for further development. The concern of whether a k-v data file is sorted/indexed and in the LSM or is auxiliary to the LSM seems like one that could be hidden by default. https://en.wikipedia.org/wiki/Separation_of_concerns |
OK now I'm not sure what's bothering you: that blob files are not SSTs or that they have their own Anyways, once again, a blob file is not yet another implementation of the table concept; it is a different concept altogether, and thus forcibly merging the two would be poor OO modeling. There are many many fundamental differences between the two (like how tables participate in the LSM tree while blobs do not, how they are handled during compaction, how tables are deleted in one shot vs how blob files are garbage collected gradually) but from the standpoint of file formats, the key difference is that they represent different data structures. A table is a dictionary (actually, a sorted dictionary) that maps keys to values, supports efficient lookups by key, as well as forward and backward iteration etc. A blob file, on the other hand, is more like a vector of values that you query using an exact byte range. In general, it does not have to support key lookups or iteration at all. You mentioned that SST is a "collection of formats"; actually, on the long run, I can see blob files becoming another collection of formats. For example, we could have an alternative format that is not sorted or one that does not even contain the keys (only the values) without touching anything else in the system. The handling of the two types is analogous only in low-level code that deals with them as "just files in a file system" (like the places you mentioned, or backup/restore). My rough estimate is that this is ~10% of the code dealing with tables/blob files. Designing around this ~10% at the expense of the ~90% would be the tail wagging the dog. P.S. FWIW, both the blob file format and the |
Running db stress test currently (enable backup in blob file) and will upload the changes in separate PR. |
e395ada
to
7a88b1b
Compare
@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing. |
Perhaps the level of copy-pasting test code to deal with covering the distinction bothers me more than the distinction itself. Supposing we get a third kind of immutable k-v data file, more copy-pasting of tests bothers me more than a few
Was backward compatibility with stackable BlobDB a goal of integrated BlobDB? |
@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing. |
9345694
to
2737368
Compare
@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing. |
Updated test suite. Removed individual Integrated blobdb test units and enabled |
@akankshamahajan15 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
2737368
to
2c5d5f0
Compare
@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing. |
@akankshamahajan15 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @akankshamahajan15 ! Looks good, just some minor comments, please see below
@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing. |
38b18e8
to
b189bc8
Compare
@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing. |
Summary: Add support for blob files for backup/restore like table files. Since DB session ID is currently not supported for blob files (there is no place to store it in the header), so for blob files uses the kLegacyCrc32cAndFileSize naming scheme even if share_files_with_checksum_naming is set to kUseDbSessionId. Test Plan: Add new test units Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags:
b189bc8
to
25d1a2b
Compare
@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing. |
@akankshamahajan15 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator. |
@akankshamahajan15 merged this pull request in d52b520. |
Summary: Add support for blob files for backup/restore like table files. Since DB session ID is currently not supported for blob files (there is no place to store it in the header), so for blob files uses the kLegacyCrc32cAndFileSize naming scheme even if share_files_with_checksum_naming is set to kUseDbSessionId. Pull Request resolved: facebook/rocksdb#8129 Test Plan: Add new test units Reviewed By: ltamasi Differential Revision: D27408510 Pulled By: akankshamahajan15 fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e Signed-off-by: Changlong Chen <levisonchen@live.cn>
Summary: Add support for blob files for backup/restore like table files. Since DB session ID is currently not supported for blob files (there is no place to store it in the header), so for blob files uses the kLegacyCrc32cAndFileSize naming scheme even if share_files_with_checksum_naming is set to kUseDbSessionId. Pull Request resolved: facebook/rocksdb#8129 Test Plan: Add new test units Reviewed By: ltamasi Differential Revision: D27408510 Pulled By: akankshamahajan15 fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e Signed-off-by: Changlong Chen <levisonchen@live.cn>
Summary: Add support for blob files for backup/restore like table files. Since DB session ID is currently not supported for blob files (there is no place to store it in the header), so for blob files uses the kLegacyCrc32cAndFileSize naming scheme even if share_files_with_checksum_naming is set to kUseDbSessionId. Pull Request resolved: facebook/rocksdb#8129 Test Plan: Add new test units Reviewed By: ltamasi Differential Revision: D27408510 Pulled By: akankshamahajan15 fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e Signed-off-by: Changlong Chen <levisonchen@live.cn>
Summary: Add support for blob files for backup/restore like table files. Since DB session ID is currently not supported for blob files (there is no place to store it in the header), so for blob files uses the kLegacyCrc32cAndFileSize naming scheme even if share_files_with_checksum_naming is set to kUseDbSessionId. Pull Request resolved: facebook/rocksdb#8129 Test Plan: Add new test units Reviewed By: ltamasi Differential Revision: D27408510 Pulled By: akankshamahajan15 fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e Signed-off-by: Changlong Chen <levisonchen@live.cn>
Summary: Add support for blob files for backup/restore like table files. Since DB session ID is currently not supported for blob files (there is no place to store it in the header), so for blob files uses the kLegacyCrc32cAndFileSize naming scheme even if share_files_with_checksum_naming is set to kUseDbSessionId. Pull Request resolved: facebook/rocksdb#8129 Test Plan: Add new test units Reviewed By: ltamasi Differential Revision: D27408510 Pulled By: akankshamahajan15 fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e Signed-off-by: Changlong Chen <levisonchen@live.cn>
Summary: Add support for blob files for backup/restore like table files. Since DB session ID is currently not supported for blob files (there is no place to store it in the header), so for blob files uses the kLegacyCrc32cAndFileSize naming scheme even if share_files_with_checksum_naming is set to kUseDbSessionId. Pull Request resolved: facebook/rocksdb#8129 Test Plan: Add new test units Reviewed By: ltamasi Differential Revision: D27408510 Pulled By: akankshamahajan15 fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e Signed-off-by: Changlong Chen <levisonchen@live.cn>
Summary: Add support for blob files for backup/restore like table files. Since DB session ID is currently not supported for blob files (there is no place to store it in the header), so for blob files uses the kLegacyCrc32cAndFileSize naming scheme even if share_files_with_checksum_naming is set to kUseDbSessionId. Pull Request resolved: facebook/rocksdb#8129 Test Plan: Add new test units Reviewed By: ltamasi Differential Revision: D27408510 Pulled By: akankshamahajan15 fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e Signed-off-by: Changlong Chen <levisonchen@live.cn>
Summary: Add support for blob files for backup/restore like table files. Since DB session ID is currently not supported for blob files (there is no place to store it in the header), so for blob files uses the kLegacyCrc32cAndFileSize naming scheme even if share_files_with_checksum_naming is set to kUseDbSessionId. Pull Request resolved: facebook/rocksdb#8129 Test Plan: Add new test units Reviewed By: ltamasi Differential Revision: D27408510 Pulled By: akankshamahajan15 fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e Signed-off-by: Changlong Chen <levisonchen@live.cn>
Summary: Add support for blob files for backup/restore like table files. Since DB session ID is currently not supported for blob files (there is no place to store it in the header), so for blob files uses the kLegacyCrc32cAndFileSize naming scheme even if share_files_with_checksum_naming is set to kUseDbSessionId. Pull Request resolved: facebook/rocksdb#8129 Test Plan: Add new test units Reviewed By: ltamasi Differential Revision: D27408510 Pulled By: akankshamahajan15 fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e Signed-off-by: Changlong Chen <levisonchen@live.cn>
Summary: Add support for blob files for backup/restore like table files. Since DB session ID is currently not supported for blob files (there is no place to store it in the header), so for blob files uses the kLegacyCrc32cAndFileSize naming scheme even if share_files_with_checksum_naming is set to kUseDbSessionId. Pull Request resolved: facebook/rocksdb#8129 Test Plan: Add new test units Reviewed By: ltamasi Differential Revision: D27408510 Pulled By: akankshamahajan15 fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e Signed-off-by: Changlong Chen <levisonchen@live.cn>
Summary: Add support for blob files for backup/restore like table files. Since DB session ID is currently not supported for blob files (there is no place to store it in the header), so for blob files uses the kLegacyCrc32cAndFileSize naming scheme even if share_files_with_checksum_naming is set to kUseDbSessionId. Pull Request resolved: facebook/rocksdb#8129 Test Plan: Add new test units Reviewed By: ltamasi Differential Revision: D27408510 Pulled By: akankshamahajan15 fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e Signed-off-by: Changlong Chen <levisonchen@live.cn>
Summary: Add support for blob files for backup/restore like table files.
Since DB session ID is currently not supported for blob files (there is no place to store it in
the header), so for blob files uses the
kLegacyCrc32cAndFileSize naming scheme even if
share_files_with_checksum_naming is set to kUseDbSessionId.
Test Plan: Add new test units
Reviewers:
Subscribers:
Tasks:
Tags: