Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrated BlobDB for backup/restore support #8129

Closed

Conversation

akankshamahajan15
Copy link
Contributor

Summary: Add support for blob files for backup/restore like table files.
Since DB session ID is currently not supported for blob files (there is no place to store it in
the header), so for blob files uses the
kLegacyCrc32cAndFileSize naming scheme even if
share_files_with_checksum_naming is set to kUseDbSessionId.

Test Plan: Add new test units

Reviewers:

Subscribers:

Tasks:

Tags:

@facebook-github-bot
Copy link
Contributor

@akankshamahajan15 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@pdillinger
Copy link
Contributor

so for blob files uses the kLegacyCrc32cAndFileSize naming scheme even if share_files_with_checksum_naming is set to kUseDbSessionId.

This unfortunately means you have to read an entire blob file in the database to compute the checksum to do an incremental backup, i.e. when the file might already be backed up. Perhaps I can chat with you & @ltamasi about possible options

@pdillinger pdillinger self-requested a review March 31, 2021 22:50
Copy link
Contributor

@pdillinger pdillinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've decided to get this moving even though there might be some follow-up, perhaps based on further discussion with Levi. (It seems like blob files are now essentially table files, immutable k-v data, without the benefits of calling them table files or handy features of our sst file formats like table properties. Why?)

Nevertheless, looking pretty good with some revisions requested.

Even without input from Levi, I suspect we can build a proposal to add extensible metadata (perhaps building on the "table properties" framework) to blob files in an already forward-compatible way. If I understand correctly, it should be OK for blob files to have arbitrary unreferenced K-V entries. We could pick some random 128-bit value as a special key, and if it is the first key in the blob file, then it holds some extra metadata, including db session id. For SST files, table properties come at the end so they can include summary statistics, but with blob files there's no apparent way to iterate backwards from the end, especially in the presence of compressed values. So putting this extra metadata at the beginning would be somewhat limiting but could at least solve some problems like db session id.

Especially considering I'm touching some of these same files, I don't think it's worth blocking this PR for sorting all of that out. (If we're able to solve the limitations before release, we can make those revisions.)

HISTORY.md Outdated Show resolved Hide resolved
include/rocksdb/utilities/backupable_db.h Outdated Show resolved Hide resolved
include/rocksdb/utilities/backupable_db.h Outdated Show resolved Hide resolved
include/rocksdb/utilities/backupable_db.h Show resolved Hide resolved
utilities/backupable/backupable_db.cc Outdated Show resolved Hide resolved
utilities/backupable/backupable_db_test.cc Outdated Show resolved Hide resolved
utilities/backupable/backupable_db_test.cc Outdated Show resolved Hide resolved
utilities/backupable/backupable_db_test.cc Outdated Show resolved Hide resolved
utilities/backupable/backupable_db_test.cc Outdated Show resolved Hide resolved
@ltamasi
Copy link
Contributor

ltamasi commented Apr 1, 2021

Thanks for the PR @akankshamahajan15 and thanks for taking a look @pdillinger ! I'll review this next week.

Regarding some of the points above: we have a dedicated blob file format because the requirements for blob files and table files are very different. We always read blobs from blob files using an offset that's known in advance, which makes a flat file format well suited for this case. On the other hand, most of the features that the SST format provides, including the index block, filter block, even the binary searchable data blocks would constitute unnecessary overhead if we were to use them for blobs since we never look up blobs in blob files by key. Having said that, we do have plans to improve the blob file format, including adding the db session id so the new backup file naming scheme can be used.

@pdillinger
Copy link
Contributor

Regarding some of the points above

SST is not a format, but a collection of formats with different trade-offs and requirements. They all represent immutable and enumerable key-value data. When I run sst_dump --help, all of the available operations seem applicable to blob files, moreso than some of the sst file formats that e.g. do not support compression.

So why can't blob data be in a format that we call an SST file format, so that we don't have to update all this RocksDB infrastructure to make special cases for blob files that are not really special cases, except in the inner workings of the LSM?

Are you planning to make (integrated) blob files appendable rather than just immutable? I know that backup support (as in this PR) would need significant re-work to make that work, unless you always close out blob files on taking a checkpoint. It might make the job easier if you're planning another FileType for not-yet-closed blob files, but that would further call into question why to keep closed blob files as a separate FileType from sst files.

@ltamasi
Copy link
Contributor

ltamasi commented Apr 2, 2021

SST is not a format, but a collection of formats with different trade-offs and requirements. They all represent immutable and enumerable key-value data.

And that's where the similarities with blob files end. One huge difference, which we've touched on earlier, is that tables/SSTs (any kind) expose an interface where you can look up a value (or multiple values) by key. Blob files don't support that and thus it makes no sense to shoehorn them into the table framework.

So why can't blob data be in a format that we call an SST file format, so that we don't have to update all this RocksDB infrastructure to make special cases for blob files that are not really special cases, except in the inner workings of the LSM?

They're conceptually different from tables and are actually not even part of the LSM tree, see above.

Are you planning to make (integrated) blob files appendable rather than just immutable? I know that backup support (as in this PR) would need significant re-work to make that work, unless you always close out blob files on taking a checkpoint. It might make the job easier if you're planning another FileType for not-yet-closed blob files, but that would further call into question why to keep closed blob files as a separate FileType from sst files.

No, we have no such plans.

@pdillinger
Copy link
Contributor

By "inner workings of LSM" I mean rocksdb::Version. Best I can tell, it does not use FileType to distinguish files, except some cases looking for kDescriptorFile in version_set.cc. We use FileType to configure checksum handoff, but if both blob and sst files are written by compaction, it seems likely you would use the same setting for both.

So what would be the complication if blob files were kTableFile? (In this hypothetical, think of kTableFile as kKvDataFile.) The only cases I could find where that might cause issues, because we might "go backwards" from a file to its Version metadata, are these:

DBImpl::PurgeObsoleteFiles - Seems like we can keep JobContext as it is but unify the set of file numbers between both sst and blob files, so that we can detect if a kTableFile is unreferenced. Does not appear to be a challenge.

DBImpl::DeleteFile - Looks up Version metadata for named file. In case of kTableFile, could check for blob metadata if fail to find lsm/sst metadata. Does not appear to be a challenge.

Can you convince me that having separate FileType for blob files solves more problems than it creates?

I don't know that we can necessarily reverse course. I'm just trying to understand the design decision that made work like this PR necessary. I am particularly concerned about designs that introduce cross-cutting concerns throughout the codebase because they create compounding complexity for further development. The concern of whether a k-v data file is sorted/indexed and in the LSM or is auxiliary to the LSM seems like one that could be hidden by default. https://en.wikipedia.org/wiki/Separation_of_concerns

@ltamasi
Copy link
Contributor

ltamasi commented Apr 5, 2021

OK now I'm not sure what's bothering you: that blob files are not SSTs or that they have their own FileType constant but I think the former pretty much implies the latter?

Anyways, once again, a blob file is not yet another implementation of the table concept; it is a different concept altogether, and thus forcibly merging the two would be poor OO modeling. There are many many fundamental differences between the two (like how tables participate in the LSM tree while blobs do not, how they are handled during compaction, how tables are deleted in one shot vs how blob files are garbage collected gradually) but from the standpoint of file formats, the key difference is that they represent different data structures. A table is a dictionary (actually, a sorted dictionary) that maps keys to values, supports efficient lookups by key, as well as forward and backward iteration etc. A blob file, on the other hand, is more like a vector of values that you query using an exact byte range. In general, it does not have to support key lookups or iteration at all. You mentioned that SST is a "collection of formats"; actually, on the long run, I can see blob files becoming another collection of formats. For example, we could have an alternative format that is not sorted or one that does not even contain the keys (only the values) without touching anything else in the system.

The handling of the two types is analogous only in low-level code that deals with them as "just files in a file system" (like the places you mentioned, or backup/restore). My rough estimate is that this is ~10% of the code dealing with tables/blob files. Designing around this ~10% at the expense of the ~90% would be the tail wagging the dog.

P.S. FWIW, both the blob file format and the kBlobFile constant date back to 2017, and yeah, they're not going anywhere.

@akankshamahajan15
Copy link
Contributor Author

Running db stress test currently (enable backup in blob file) and will upload the changes in separate PR.

@facebook-github-bot
Copy link
Contributor

@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing.

@pdillinger
Copy link
Contributor

Perhaps the level of copy-pasting test code to deal with covering the distinction bothers me more than the distinction itself. Supposing we get a third kind of immutable k-v data file, more copy-pasting of tests bothers me more than a few || type == kAnotherDataFile.

... and the kBlobFile constant date back to 2017, and yeah, they're not going anywhere.

Was backward compatibility with stackable BlobDB a goal of integrated BlobDB?

HISTORY.md Outdated Show resolved Hide resolved
include/rocksdb/utilities/backupable_db.h Outdated Show resolved Hide resolved
utilities/backupable/backupable_db.cc Outdated Show resolved Hide resolved
utilities/backupable/backupable_db_test.cc Outdated Show resolved Hide resolved
utilities/backupable/backupable_db_test.cc Outdated Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing.

@akankshamahajan15
Copy link
Contributor Author

akankshamahajan15 commented Apr 7, 2021

Updated test suite. Removed individual Integrated blobdb test units and enabled enable_blob_files by default in constructor of backupable test.
PS: I had to change one test FileSizeForIncrement. Rest of the tests are passing with blob files enabled.

@facebook-github-bot
Copy link
Contributor

@akankshamahajan15 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@akankshamahajan15 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@ltamasi ltamasi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @akankshamahajan15 ! Looks good, just some minor comments, please see below

HISTORY.md Outdated Show resolved Hide resolved
include/rocksdb/file_system.h Outdated Show resolved Hide resolved
utilities/backupable/backupable_db.cc Outdated Show resolved Hide resolved
utilities/backupable/backupable_db_test.cc Outdated Show resolved Hide resolved
utilities/backupable/backupable_db_test.cc Outdated Show resolved Hide resolved
@facebook-github-bot
Copy link
Contributor

@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing.

Summary: Add support for blob files for backup/restore like table files.
Since DB session ID is currently not supported for blob files (there is no place to store it in
the header), so for blob files uses the
kLegacyCrc32cAndFileSize naming scheme even if
share_files_with_checksum_naming is set to kUseDbSessionId.

Test Plan: Add new test units

Reviewers:

Subscribers:

Tasks:

Tags:
Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
Summary:

Test Plan:

Reviewers:

Subscribers:

Tasks:

Tags:
@facebook-github-bot
Copy link
Contributor

@akankshamahajan15 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@akankshamahajan15 has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@akankshamahajan15 merged this pull request in d52b520.

levichen94 pushed a commit to bytedance/terarkdb that referenced this pull request Jun 18, 2021
Summary:
Add support for blob files for backup/restore like table files.
    Since DB session ID is currently not supported for blob files (there is no place to store it in
    the header), so for blob files uses the
    kLegacyCrc32cAndFileSize naming scheme even if
    share_files_with_checksum_naming is set to kUseDbSessionId.

Pull Request resolved: facebook/rocksdb#8129

Test Plan: Add new test units

Reviewed By: ltamasi

Differential Revision: D27408510

Pulled By: akankshamahajan15

fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e
Signed-off-by: Changlong Chen <levisonchen@live.cn>
levichen94 pushed a commit to bytedance/terarkdb that referenced this pull request Jun 18, 2021
Summary:
Add support for blob files for backup/restore like table files.
    Since DB session ID is currently not supported for blob files (there is no place to store it in
    the header), so for blob files uses the
    kLegacyCrc32cAndFileSize naming scheme even if
    share_files_with_checksum_naming is set to kUseDbSessionId.

Pull Request resolved: facebook/rocksdb#8129

Test Plan: Add new test units

Reviewed By: ltamasi

Differential Revision: D27408510

Pulled By: akankshamahajan15

fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e
Signed-off-by: Changlong Chen <levisonchen@live.cn>
levichen94 pushed a commit to bytedance/terarkdb that referenced this pull request Jun 18, 2021
Summary:
Add support for blob files for backup/restore like table files.
    Since DB session ID is currently not supported for blob files (there is no place to store it in
    the header), so for blob files uses the
    kLegacyCrc32cAndFileSize naming scheme even if
    share_files_with_checksum_naming is set to kUseDbSessionId.

Pull Request resolved: facebook/rocksdb#8129

Test Plan: Add new test units

Reviewed By: ltamasi

Differential Revision: D27408510

Pulled By: akankshamahajan15

fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e
Signed-off-by: Changlong Chen <levisonchen@live.cn>
levichen94 pushed a commit to bytedance/terarkdb that referenced this pull request Jun 18, 2021
Summary:
Add support for blob files for backup/restore like table files.
    Since DB session ID is currently not supported for blob files (there is no place to store it in
    the header), so for blob files uses the
    kLegacyCrc32cAndFileSize naming scheme even if
    share_files_with_checksum_naming is set to kUseDbSessionId.

Pull Request resolved: facebook/rocksdb#8129

Test Plan: Add new test units

Reviewed By: ltamasi

Differential Revision: D27408510

Pulled By: akankshamahajan15

fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e
Signed-off-by: Changlong Chen <levisonchen@live.cn>
levichen94 pushed a commit to bytedance/terarkdb that referenced this pull request Jun 18, 2021
Summary:
Add support for blob files for backup/restore like table files.
    Since DB session ID is currently not supported for blob files (there is no place to store it in
    the header), so for blob files uses the
    kLegacyCrc32cAndFileSize naming scheme even if
    share_files_with_checksum_naming is set to kUseDbSessionId.

Pull Request resolved: facebook/rocksdb#8129

Test Plan: Add new test units

Reviewed By: ltamasi

Differential Revision: D27408510

Pulled By: akankshamahajan15

fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e
Signed-off-by: Changlong Chen <levisonchen@live.cn>
mm304321141 pushed a commit to bytedance/terarkdb that referenced this pull request Jun 23, 2021
Summary:
Add support for blob files for backup/restore like table files.
    Since DB session ID is currently not supported for blob files (there is no place to store it in
    the header), so for blob files uses the
    kLegacyCrc32cAndFileSize naming scheme even if
    share_files_with_checksum_naming is set to kUseDbSessionId.

Pull Request resolved: facebook/rocksdb#8129

Test Plan: Add new test units

Reviewed By: ltamasi

Differential Revision: D27408510

Pulled By: akankshamahajan15

fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e
Signed-off-by: Changlong Chen <levisonchen@live.cn>
mm304321141 pushed a commit to bytedance/terarkdb that referenced this pull request Jun 23, 2021
Summary:
Add support for blob files for backup/restore like table files.
    Since DB session ID is currently not supported for blob files (there is no place to store it in
    the header), so for blob files uses the
    kLegacyCrc32cAndFileSize naming scheme even if
    share_files_with_checksum_naming is set to kUseDbSessionId.

Pull Request resolved: facebook/rocksdb#8129

Test Plan: Add new test units

Reviewed By: ltamasi

Differential Revision: D27408510

Pulled By: akankshamahajan15

fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e
Signed-off-by: Changlong Chen <levisonchen@live.cn>
levichen94 pushed a commit to bytedance/terarkdb that referenced this pull request Jun 24, 2021
Summary:
Add support for blob files for backup/restore like table files.
    Since DB session ID is currently not supported for blob files (there is no place to store it in
    the header), so for blob files uses the
    kLegacyCrc32cAndFileSize naming scheme even if
    share_files_with_checksum_naming is set to kUseDbSessionId.

Pull Request resolved: facebook/rocksdb#8129

Test Plan: Add new test units

Reviewed By: ltamasi

Differential Revision: D27408510

Pulled By: akankshamahajan15

fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e
Signed-off-by: Changlong Chen <levisonchen@live.cn>
levichen94 pushed a commit to bytedance/terarkdb that referenced this pull request Jun 25, 2021
Summary:
Add support for blob files for backup/restore like table files.
    Since DB session ID is currently not supported for blob files (there is no place to store it in
    the header), so for blob files uses the
    kLegacyCrc32cAndFileSize naming scheme even if
    share_files_with_checksum_naming is set to kUseDbSessionId.

Pull Request resolved: facebook/rocksdb#8129

Test Plan: Add new test units

Reviewed By: ltamasi

Differential Revision: D27408510

Pulled By: akankshamahajan15

fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e
Signed-off-by: Changlong Chen <levisonchen@live.cn>
levichen94 pushed a commit to bytedance/terarkdb that referenced this pull request Sep 14, 2021
Summary:
Add support for blob files for backup/restore like table files.
    Since DB session ID is currently not supported for blob files (there is no place to store it in
    the header), so for blob files uses the
    kLegacyCrc32cAndFileSize naming scheme even if
    share_files_with_checksum_naming is set to kUseDbSessionId.

Pull Request resolved: facebook/rocksdb#8129

Test Plan: Add new test units

Reviewed By: ltamasi

Differential Revision: D27408510

Pulled By: akankshamahajan15

fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e
Signed-off-by: Changlong Chen <levisonchen@live.cn>
levichen94 pushed a commit to bytedance/terarkdb that referenced this pull request Sep 14, 2021
Summary:
Add support for blob files for backup/restore like table files.
    Since DB session ID is currently not supported for blob files (there is no place to store it in
    the header), so for blob files uses the
    kLegacyCrc32cAndFileSize naming scheme even if
    share_files_with_checksum_naming is set to kUseDbSessionId.

Pull Request resolved: facebook/rocksdb#8129

Test Plan: Add new test units

Reviewed By: ltamasi

Differential Revision: D27408510

Pulled By: akankshamahajan15

fbshipit-source-id: b27434d189a639ef3e6ad165c61a143a2daaf06e
Signed-off-by: Changlong Chen <levisonchen@live.cn>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants