Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce options.check_flush_compaction_key_order #7467

Closed
wants to merge 8 commits into from

Conversation

siying
Copy link
Contributor

@siying siying commented Sep 30, 2020

Summary:
Introduce an new option options.check_flush_compaction_key_order, by default set to true, which checks key order of flush and compaction, and fail the operation if the order is violated.
Also did minor refactor hash checking code, which consolidates the hashing logic to a vlidation class, where the key ordering logic is added.

Test Plan: Add unit tests to validate the check can catch reordering in flush and compaction, and can be properly disabled.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@riversand963 riversand963 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @siying for the PR. LGTM with a few minor comments.

paranoid_hash_ =
Hash64(value.data(), value.size(), paranoid_hash_);
}
if (enable_order_check_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel that if we enable hash check, then we should also enable order check since its overhead is relatively low.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The order check is likely to be more expensive than hash check. It needs to make a copy to every key, which is not cheap.

meta.file_creation_time = current_time;
sub_compact->outputs.emplace_back(
std::move(meta), cfd->internal_comparator(),
/*enable_order_check*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/*enable_order_check=*/

db/builder.cc Outdated
uint64_t paranoid_hash = 0;
OutputValidator output_validator(
internal_comparator,
/*enable_order_check*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/*enable_order_check=*/

ASSERT_OK(db_->SetOptions({{"check_flush_compaction_key_order", "false"}}));
DBImpl* dbi = static_cast_with_check<DBImpl>(db_);

SyncPoint::GetInstance()->SetCallBack(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: maybe clear all callbacks after the test.

icmp_(_icmp), enable_order_check_(_enable_order_check), enable_hash_(_enable_hash) {}
Status Add(const Slice& key, const Slice& value) {
if (enable_hash_) {
// Generate a rolling 64-bit hash of the key and values
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious, any plan to make the hash algorithm configurable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The hash is not exposed to users, unless the hash is not good enough I don't see a reason to make it configerable.

@riversand963
Copy link
Contributor

hmm, the test failures seem legit and need be fixed.

db/builder.cc Outdated
}
s = it->status();
if (s.ok() && check_hash != paranoid_hash) {
if (s.ok() && output_validator.GetHash() != file_validator.GetHash()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might be better to move this check inside the OutputValidator, so that others can be added later.

}
if (enable_order_check_) {
TEST_SYNC_POINT_CALLBACK("OutputValidator::Add:order_check", /*arg=*/nullptr);
if (key.size() < 8) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any chance of making "8" a constant? Much easier to find and understand...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do.

Comment on lines 19 to 20
bool _enable_order_check, bool _enable_hash) :
icmp_(_icmp), enable_order_check_(_enable_order_check), enable_hash_(_enable_hash) {}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought the coding standard only used an underscore prefix when assigning to structs, not classes?

Comment on lines 21 to 22
Status Add(const Slice& key, const Slice& value) {
if (enable_hash_) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move the implementation into a C++ file instead of a header and clean up the includes?

@@ -375,6 +375,7 @@ class MemTableIterator : public InternalIterator {
PERF_COUNTER_ADD(next_on_memtable_count, 1);
assert(Valid());
iter_->Next();
TEST_SYNC_POINT_CALLBACK("MemTableIterator::Next:0", iter_);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just curious as to what this has to do with this change?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find a good way to fake a memtable that generates out of order keys. If there is a better way, will do it.

Status s;
delete db_;
db_ = nullptr;
s = DestroyDB(dbname_, options);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this status be asserted?

Comment on lines 1292 to 1293
sub_compact->AddToBuilder(kv.first.Encode(), kv.second)
.PermitUncheckedError();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you need to check this Status to see out of order keys?

@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@ajkr ajkr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome!

sub_compact->AddToBuilder(kv.first.Encode(), kv.second,
paranoid_file_checks_);
// Range tombstone is not supported by output validator yet.
sub_compact->builder->Add(kv.first.Encode(), kv.second);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This fixes the problem with paranoid_file_checks + DeleteRange(), right? @mrambacher would it solve the problem to extract this into a separate PR that we can easily backport to 6.12 and 6.13?

Summary:
Introduce an new option options.check_flush_compaction_key_order, by default set to true, which checks key order of flush and compaction, and fail the operation if the order is violated.
Also did minor refactor hash checking code, which consolidates the hashing logic to a vlidation class, where the key ordering logic is added.

Test Plan: Add unit tests to validate the check can catch reordering in flush and compaction, and can be properly disabled.
@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@siying has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

This pull request has been merged in 7508175.

codingrhythm pushed a commit to SafetyCulture/rocksdb that referenced this pull request Mar 5, 2021
Summary:
Introduce an new option options.check_flush_compaction_key_order, by default set to true, which checks key order of flush and compaction, and fail the operation if the order is violated.
Also did minor refactor hash checking code, which consolidates the hashing logic to a vlidation class, where the key ordering logic is added.

Pull Request resolved: facebook#7467

Test Plan: Add unit tests to validate the check can catch reordering in flush and compaction, and can be properly disabled.

Reviewed By: riversand963

Differential Revision: D24010683

fbshipit-source-id: 8dd6292d2cda8006054e9ded7cfa4bf405f0527c
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants