Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable secondary test with sst truncation deletion; API clarification #13395

Closed
wants to merge 1 commit into from

Conversation

hx235
Copy link
Contributor

@hx235 hx235 commented Feb 12, 2025

Context/Summary:
Secondary DB relies on open file descriptor of the shared SST file in primary DB to continue being able to read the file even if that file is deleted in the primary DB. However, this won't work if the file is truncated instead of deleted, which triggers an "truncated block read" corruption in stress test on secondary db reads. Truncation can happen if RocksDB implementation of SSTFileManager and bytes_max_delete_chunk>0 are used. This PR is to disable such testing combination in stress test and clarify the related API.

Test:

  • Manually repro-ed with below UT. I'm in favor of not including this UT in the codebase as it should be self-evident from the API comment now about the incompatiblity. Secondary DB is in a direction of being replaced by Follower so we should minimize edge-case tests for code with no functional change for a to-be-replaced functionality.
TEST_F(DBSecondaryTest, IncompatibleWithPrimarySSTTruncation) {
  Options options;
  options.env = env_;
  options.disable_auto_compactions = true;
  options.sst_file_manager.reset(NewSstFileManager(
      env_, nullptr /*fs*/, "" /*trash_dir*/, 2024000 /*rate_bytes_per_sec*/,
      true /*delete_existing_trash*/, nullptr /*status*/,
      0.25 /*max_trash_db_ratio*/, 1129 /*bytes_max_delete_chunk*/));
  Reopen(options);

  ASSERT_OK(Put("key1", "old_value"));
  ASSERT_OK(Put("key2", "old_value"));
  ASSERT_OK(Flush());
  ASSERT_OK(Put("key1", "new_value"));
  ASSERT_OK(Put("key3", "new_value"));
  ASSERT_OK(Flush());

  Options options1;
  options1.env = env_;
  options1.max_open_files = -1;
  Reopen(options);
  OpenSecondary(options1);
  ASSERT_OK(db_secondary_->TryCatchUpWithPrimary());

  ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->SetCallBack(
      "DeleteScheduler::DeleteTrashFile:Fsync", [&](void*) {
        std::string value;
        Status s = db_secondary_->Get(ReadOptions(), "key2", &value);
        assert(s.IsCorruption());
        assert(s.ToString().find("truncated block read") !=
            std::string::npos);
      });
  ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->EnableProcessing();

  ASSERT_OK(db_->CompactRange(CompactRangeOptions(), nullptr, nullptr));

  ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->DisableProcessing();
  ROCKSDB_NAMESPACE::SyncPoint::GetInstance()->ClearAllCallBacks();
}
  • Monitor future stress test

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@jowlyzhang jowlyzhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for the quick fix.

// WARNING: Secondary databases cannot read shared SST files that have been
// truncated in the primary database. To avoid compatibility issues, users
// should refrain from using features in the primary database that can cause
// truncation, such as setting `bytes_max_delete_chunk > 0` when invoking
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: rate_bytes_per_sec is another setting that enables slow deletion:

if (rate_bytes_per_sec_.load() <= 0 ||

These two both need to be > 0 in order for ftruncate to be used for file deletion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@hx235 hx235 force-pushed the fix_secondary_truncate branch from edc6e55 to 2bae302 Compare February 12, 2025 17:51
@facebook-github-bot
Copy link
Contributor

@hx235 has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@hx235 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot
Copy link
Contributor

@hx235 merged this pull request in f6b2cdd.

Copy link
Contributor

@archang19 archang19 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks for making this change! hopefully the secondary crash tests are more quiet

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants