Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RefreshOptions method to support live config changes #294

Merged
merged 11 commits into from
Feb 11, 2023
108 changes: 107 additions & 1 deletion db/db_impl/db_impl.cc
Original file line number Diff line number Diff line change
Expand Up @@ -277,6 +277,8 @@ DBImpl::DBImpl(const DBOptions& options, const std::string& dbname,
periodic_task_functions_.emplace(
PeriodicTaskType::kRecordSeqnoTime,
[this]() { this->RecordSeqnoToTimeMapping(); });
periodic_task_functions_.emplace(PeriodicTaskType::kRefreshOptions,
[this]() { this->RefreshOptions(); });
#endif // ROCKSDB_LITE

versions_.reset(new VersionSet(dbname_, &immutable_db_options_, file_options_,
Expand Down Expand Up @@ -837,6 +839,15 @@ Status DBImpl::StartPeriodicTaskScheduler() {
return s;
}
}
if (mutable_db_options_.refresh_options_sec > 0) {
udi-speedb marked this conversation as resolved.
Show resolved Hide resolved
Status s = periodic_task_scheduler_.Register(
PeriodicTaskType::kRefreshOptions,
periodic_task_functions_.at(PeriodicTaskType::kRefreshOptions),
mutable_db_options_.refresh_options_sec);
if (!s.ok()) {
return s;
}
}

Status s = periodic_task_scheduler_.Register(
PeriodicTaskType::kFlushInfoLog,
Expand Down Expand Up @@ -1135,6 +1146,82 @@ void DBImpl::FlushInfoLog() {
LogFlush(immutable_db_options_.info_log);
}

#ifndef ROCKSDB_LITE
// Periodically checks to see if the new options should be loaded into the
// process. log.
void DBImpl::RefreshOptions() {
if (shutdown_initiated_) {
return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not under the lock of the db's mutex_. Should you (to verify the shutdown_initiated_ doesn't become true after you check)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly the same code as under the other PeriodicTasks (like RefreshStats). I am not positive of the rules for the mutex_, but followed the other examples.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I do hope that there is no issue with this approach that everybody in relying on and just copying. It would be preferable to verify that there is no issue with this approach.

std::string new_options_file = mutable_db_options_.refresh_options_file;
if (new_options_file.empty()) {
new_options_file = "Options.new";
}
if (new_options_file[0] != kFilePathSeparator) {
new_options_file = NormalizePath(immutable_db_options_.db_paths[0].path +
kFilePathSeparator + new_options_file);
}
TEST_SYNC_POINT("DBImpl::RefreshOptions::Start");
Status s = fs_->FileExists(new_options_file, IOOptions(), nullptr);
TEST_SYNC_POINT_CALLBACK("DBImpl::RefreshOptions::FileExists", &s);
if (!s.ok()) {
return;
}
ROCKS_LOG_INFO(immutable_db_options_.info_log,
"Refreshing Options from file: %s\n",
new_options_file.c_str());

ConfigOptions cfg_opts;
cfg_opts.ignore_unknown_options = true;
cfg_opts.mutable_options_only = true;
RocksDBOptionsParser op;
s = op.Parse(cfg_opts, new_options_file, fs_.get());
TEST_SYNC_POINT_CALLBACK("DBImpl::RefreshOptions::Parse", &s);
if (!s.ok()) {
ROCKS_LOG_WARN(immutable_db_options_.info_log,
"Failed to parse Options file (%s): %s\n",
new_options_file.c_str(), s.ToString().c_str());
} else if (!op.db_opt_map()->empty()) {
s = SetDBOptions(*(op.db_opt_map()));
TEST_SYNC_POINT_CALLBACK("DBImpl::RefreshOptions::SetDBOptions", &s);
if (!s.ok()) {
ROCKS_LOG_WARN(immutable_db_options_.info_log,
"Failed to refresh DBOptions, Aborting: %s\n",
s.ToString().c_str());
}
}
if (s.ok()) {
int idx = 0;
for (const auto& cf_opt_map : *(op.cf_opt_maps())) {
if (!cf_opt_map.empty()) {
const auto& cf_name = (*op.cf_names())[idx];
auto cfd = versions_->GetColumnFamilySet()->GetColumnFamily(cf_name);
udi-speedb marked this conversation as resolved.
Show resolved Hide resolved
if (cfd == nullptr) {
ROCKS_LOG_WARN(immutable_db_options_.info_log,
"RefreshOptions failed locating CF: %s\n",
cf_name.c_str());
} else if (!cfd->IsDropped()) {
s = SetCFOptionsImpl(cfd, cf_opt_map);
TEST_SYNC_POINT_CALLBACK("DBImpl::RefreshOptions::SetCFOptions", &s);
if (!s.ok()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does failure to update a cf's options mean we end up with a partial update?
We may have updated some of the cf-s and we'll skip the rest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If updating the DBOptions fails, the entire process fails.

If any of the ColumnFamilyOptions updates fails, the DBOptions will be updated (if changed) and any other ColumnFamilyOptions will still be updated. It is extremely difficult to roll back to the previous version of the options, as you cannot tell if any background task (such as compaction) started during the refresh options process. Because of this, the code prints an error if the options upgrade fails but continues anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the short answer is yes?
Why not validate the options before applying them with the column family?
There is a static ValidateOptions() method for a CF.

ROCKS_LOG_WARN(immutable_db_options_.info_log,
"Failed to refresh CFOptions for CF %s: %s\n",
cf_name.c_str(), s.ToString().c_str());
}
}
}
idx++;
}
}
s = fs_->DeleteFile(new_options_file, IOOptions(), nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case we have failed deleting this file (which may or may not be a valid options file that we have just successfully updated or failed to update), does this mean we will repeat the process as long as the file is there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if we fail to delete the file, we will repeat the process. If we wish, we can add a follow-up PR to check the file modification time or something like that to make sure it is newer than the last time we tried. However, this requires the addition of "state" (to store the last modification time) that I did not want to add at this point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Please open a separate issue to handle this in the future.

TEST_SYNC_POINT_CALLBACK("DBImpl::RefreshOptions::DeleteFile", &s);
ROCKS_LOG_INFO(immutable_db_options_.info_log,
"RefreshOptions Complete, deleting options file %s: %s\n",
new_options_file.c_str(), s.ToString().c_str());
TEST_SYNC_POINT("DBImpl::RefreshOptions::Complete");
}
#endif // ROCKSDB_LITE

Status DBImpl::TablesRangeTombstoneSummary(ColumnFamilyHandle* column_family,
int max_entries_to_print,
std::string* out_str) {
Expand Down Expand Up @@ -1186,7 +1273,14 @@ Status DBImpl::SetOptions(
cfd->GetName().c_str());
return Status::InvalidArgument("empty input");
}
return SetCFOptionsImpl(cfd, options_map);
#endif // ROCKSDB_LITE
}

#ifndef ROCKSDB_LITE
Status DBImpl::SetCFOptionsImpl(
ColumnFamilyData* cfd,
const std::unordered_map<std::string, std::string>& options_map) {
MutableCFOptions new_options;
Status s;
Status persist_options_status;
Expand Down Expand Up @@ -1235,8 +1329,8 @@ Status DBImpl::SetOptions(
}
LogFlush(immutable_db_options_.info_log);
return s;
#endif // ROCKSDB_LITE
}
#endif // ROCKSDB_LITE

Status DBImpl::SetDBOptions(
const std::unordered_map<std::string, std::string>& options_map) {
Expand Down Expand Up @@ -1344,6 +1438,18 @@ Status DBImpl::SetDBOptions(
new_options.stats_persist_period_sec);
}
}
if (s.ok()) {
if (new_options.refresh_options_sec == 0) {
s = periodic_task_scheduler_.Unregister(
PeriodicTaskType::kRefreshOptions);
} else {
s = periodic_task_scheduler_.Register(
udi-speedb marked this conversation as resolved.
Show resolved Hide resolved
PeriodicTaskType::kRefreshOptions,
periodic_task_functions_.at(PeriodicTaskType::kRefreshOptions),
new_options.refresh_options_sec);
}
}

mutex_.Lock();
if (!s.ok()) {
return s;
Expand Down
11 changes: 10 additions & 1 deletion db/db_impl/db_impl.h
Original file line number Diff line number Diff line change
Expand Up @@ -1190,6 +1190,11 @@ class DBImpl : public DB {
// record current sequence number to time mapping
void RecordSeqnoToTimeMapping();

#ifndef ROCKSDB_LITE
// Checks if the options should be updated
void RefreshOptions();
#endif // ROCKSDB_LITE

// Interface to block and signal the DB in case of stalling writes by
// WriteBufferManager. Each DBImpl object contains ptr to WBMStallInterface.
// When DB needs to be blocked or signalled by WriteBufferManager,
Expand Down Expand Up @@ -1781,7 +1786,11 @@ class DBImpl : public DB {
ColumnFamilyHandle** handle);

Status DropColumnFamilyImpl(ColumnFamilyHandle* column_family);

#ifndef ROCKSDB_LITE
Status SetCFOptionsImpl(
ColumnFamilyData* cfd,
const std::unordered_map<std::string, std::string>& options_map);
#endif // ROCKSDB_LITE
// Delete any unneeded files and stale in-memory entries.
void DeleteObsoleteFiles();
// Delete obsolete files and log status and information of file deletion
Expand Down
Loading