Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Track WAL in MANIFEST: add option track_and_verify_wals_in_manifest #7275

Closed
wants to merge 27 commits into from
Closed

Track WAL in MANIFEST: add option track_and_verify_wals_in_manifest #7275

wants to merge 27 commits into from

Conversation

ghost
Copy link

@ghost ghost commented Aug 18, 2020

This option determines whether WALs will be tracked in MANIFEST and verified on recovery.

Test Plan:
db_options_test
options_test

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@@ -22,6 +22,7 @@ struct ImmutableDBOptions {
bool create_missing_column_families;
bool error_if_exists;
bool paranoid_checks;
bool check_wal;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIt: per our team discussion, maybe it's better to name it should_check_wal? cc @pdillinger

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On further reflection (re naming in #7214), I am more critical of re-using verb member variable names in classes (as accessors) than having them in structs to begin with. This struct already has a lot of verb names. ¯_(ツ)_/¯

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I do think it's worth thinking about a better name. verify_wals? verify_wals_with_manifest?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think check_wal delivers the meaning, there are other "checks", such as "paranoid_checks", "check_sst_file_size", etc.

Copy link
Author

@ghost ghost Oct 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@riversand963 @pdillinger does track_and_verify_wals_in_manifest sound good to you?

// No matter whether this is true or false, the WAL information are always
// tracked by MANIFEST.
// Default: false
bool check_wal = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why wouldn't this be true by default if we always pay the ongoing cost of tracking, and the only check is during recovery?

When / why would someone want to use false?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the user does not even sync WALs, then the user might skip checking the WALs since there is no guarantee of WAL's existence or completeness on recovery.

@@ -374,6 +374,14 @@ struct DBOptions {
// Default: true
bool paranoid_checks = true;

// If true, check on-disk WALs against WAL information stored in MANIFEST
// during recovery.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"check" is not very descriptive to me. How about something like "If true, fail recovery if there is an inconsistency between WAL information in MANIFEST and actual WALs on disk"?

What sort of corruption might this catch that's not otherwise caught? (Don't we always want to report corruption?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, do we quietly ignore extra/untracked WALs or fail in that case?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing or corrupted WAL will be reported

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still not clear to me how extra/untracked WALs will be handled. Can a DB switch between using track_and_verify_wals_in_manifest=true and false?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, switches between true and false can happen. The verification will only verify those tracked WALs.

@ghost
Copy link
Author

ghost commented Oct 6, 2020

@riversand963 actually, I think Peter's suggestion makes sense. Since we always track WAL in MANIFEST, it doesn't make too much sense to disable checking WALs during recovery. This PR may be unnecessary. What's your thought?

@ghost ghost removed request for ajkr and ltamasi October 6, 2020 03:58
@riversand963
Copy link
Contributor

@riversand963 actually, I think Peter's suggestion makes sense. Since we always track WAL in MANIFEST, it doesn't make too much sense to disable checking WALs during recovery. This PR may be unnecessary. What's your thought?

It sounds a good idea to always check WAL.
Another question, would some users want to disable WAL tracking sometimes for performance considerations? If so, the wal verification logic should keep this in mind.

@ghost
Copy link
Author

ghost commented Oct 7, 2020

After offline discussion, we'll keep this check_wal option, if it's false, then WALs are not tracked in MANIFEST and won't be checked on recovery.

@ghost ghost changed the title Track WAL in MANIFEST: add option check_wal Track WAL in MANIFEST: add option track_and_verify_wals_in_manifest Oct 7, 2020
@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

Copy link
Contributor

@pdillinger pdillinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even though the new name is kind of long, I think it's clearer and therefore better.

I don't know this area well enough to endorse or reject the overall strategy, but my previous concerns are sufficiently addressed. Some additional comments / clarification requested.

@@ -374,6 +374,14 @@ struct DBOptions {
// Default: true
bool paranoid_checks = true;

// If true, check on-disk WALs against WAL information stored in MANIFEST
// during recovery.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's still not clear to me how extra/untracked WALs will be handled. Can a DB switch between using track_and_verify_wals_in_manifest=true and false?

// during recovery.
//
// Default: false
bool track_and_verify_wals_in_manifest = false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this option is not currently connected to anything (we don't have diff stacks unfortunately), please drop in something like

// FIXME(cheng): This option is part of a work in progress and does not yet work

in case through some mixup or unexpected absence from work, a release only has some of your changes

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@cheng-chang has updated the pull request. You must reimport the pull request before landing.

Copy link
Contributor

@facebook-github-bot facebook-github-bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Cheng-Chang has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@Cheng-Chang merged this pull request in 12b78e4.

codingrhythm pushed a commit to SafetyCulture/rocksdb that referenced this pull request Mar 5, 2021
…acebook#7275)

Summary:
This option determines whether WALs will be tracked in MANIFEST and verified on recovery.

Pull Request resolved: facebook#7275

Test Plan:
db_options_test
options_test

Reviewed By: pdillinger

Differential Revision: D23181418

Pulled By: cheng-chang

fbshipit-source-id: 5dd1cdc166f3dfc1c93c094df4a2f7734e3b4547
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants