Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add RefreshOptions method to support live config changes #294

Merged
merged 11 commits into from
Feb 11, 2023

Conversation

mrambacher
Copy link
Contributor

Added a method and options to support reading configuration changes to options periodically while a process is running. Every "refresh_options_sec", the periodic task scheduler runs a task to check if the "refresh_options_file" exists. If it does, the file is loaded as an options file and the mutable options from the file are updated in the running process (via SetDBOptions and SetOptions). Once updated, the refresh file is deleted. If an error occurs, it will be written to the log file.

Note that the file is read as an Options file and must be a valid Options file. This means that the file must have a DBOptions section and at least one ColumnFamilyOptions section. Only options that are mutable can be in this file -- immutable options will cause an error and abort the processing.

Additionally, note that TableFactory options must be specified as part of the ColumnFamilyOptions. For example, to change the block size, the "table_factory.block_size" option should be set in the appropriate ColumnFamilyOptions. Options set in a TableFactory section will be ignored.

Currently tested manually via db_bench. Unit tests will be forthcoming.

Allows for configuration changes to be updated by reading an Options file periodically.
@udi-speedb
Copy link
Contributor

udi-speedb commented Dec 14, 2022

@mrambacher - Is there an issue for this PR?
If not, please create one.
Either way, please link the issue with this PR. Thanks

@mrambacher - Could you please handle?

@udi-speedb
Copy link
Contributor

Why is this feature not available in RocksDB Lite?
Actually, for my edification, what should be the capabilities / restrictions on RocksDB lite in general?
Are you aware of any resource specifying this?

@udi-speedb
Copy link
Contributor

I think we should add support for this feature in db_stress and crash testing

options/db_options.cc Show resolved Hide resolved
@@ -130,6 +130,8 @@ struct MutableDBOptions {
uint64_t delete_obsolete_files_period_micros;
unsigned int stats_dump_period_sec;
unsigned int stats_persist_period_sec;
unsigned int refresh_options_sec;
std::string refresh_options_file;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again. I know that all of the other options are not initialized at the declaration point and are initialized in the ctor-initializer list, but why not init here and guarantee initialization in all ctor-s without having to add the line or in case somebody forgets?

include/rocksdb/options.h Show resolved Hide resolved
void DBImpl::RefreshOptions() {
if (shutdown_initiated_) {
return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are not under the lock of the db's mutex_. Should you (to verify the shutdown_initiated_ doesn't become true after you check)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exactly the same code as under the other PeriodicTasks (like RefreshStats). I am not positive of the rules for the mutex_, but followed the other examples.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I do hope that there is no issue with this approach that everybody in relying on and just copying. It would be preferable to verify that there is no issue with this approach.

db/db_impl/db_impl.cc Show resolved Hide resolved
db/db_impl/db_impl.cc Show resolved Hide resolved
cf_name.c_str());
} else if (!cfd->IsDropped()) {
s = SetCFOptionsImpl(cfd, cf_opt_map);
if (!s.ok()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does failure to update a cf's options mean we end up with a partial update?
We may have updated some of the cf-s and we'll skip the rest.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If updating the DBOptions fails, the entire process fails.

If any of the ColumnFamilyOptions updates fails, the DBOptions will be updated (if changed) and any other ColumnFamilyOptions will still be updated. It is extremely difficult to roll back to the previous version of the options, as you cannot tell if any background task (such as compaction) started during the refresh options process. Because of this, the code prints an error if the options upgrade fails but continues anyway.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the short answer is yes?
Why not validate the options before applying them with the column family?
There is a static ValidateOptions() method for a CF.

idx++;
}
}
s = fs_->DeleteFile(new_options_file, IOOptions(), nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In case we have failed deleting this file (which may or may not be a valid options file that we have just successfully updated or failed to update), does this mean we will repeat the process as long as the file is there?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, if we fail to delete the file, we will repeat the process. If we wish, we can add a follow-up PR to check the file modification time or something like that to make sure it is newer than the last time we tried. However, this requires the addition of "state" (to store the last modification time) that I did not want to add at this point.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Please open a separate issue to handle this in the future.

db/db_impl/db_impl.cc Show resolved Hide resolved
@mrambacher
Copy link
Contributor Author

Why is this feature not available in RocksDB Lite? Actually, for my edification, what should be the capabilities / restrictions on RocksDB lite in general? Are you aware of any resource specifying this?

The PeriodicTaskScheduler is not available in LITE mode. I do not know why. I do not know how things were decided LITE or not.

- Add command line flag to db_stress_tool and db_bench

- Set the default to check for options updates every hour (3600 seconds).
void DBImpl::RefreshOptions() {
if (shutdown_initiated_) {
return;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I do hope that there is no issue with this approach that everybody in relying on and just copying. It would be preferable to verify that there is no issue with this approach.

db/db_impl/db_impl.cc Show resolved Hide resolved
cf_name.c_str());
} else if (!cfd->IsDropped()) {
s = SetCFOptionsImpl(cfd, cf_opt_map);
if (!s.ok()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, the short answer is yes?
Why not validate the options before applying them with the column family?
There is a static ValidateOptions() method for a CF.

idx++;
}
}
s = fs_->DeleteFile(new_options_file, IOOptions(), nullptr);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. Please open a separate issue to handle this in the future.

db/db_options_test.cc Outdated Show resolved Hide resolved
db/db_options_test.cc Outdated Show resolved Hide resolved
// Test with a file that contains no DBOptions section
ASSERT_OK(CreateFile(fs, options.refresh_options_file,
"[CFOptions \"default\"]\n", false));
TEST_SYNC_POINT("DBOptionsTest::WaitForUpdates");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer to see that a file that contains just an empty DBOptions + CFOptions passes parsing and then test files that contains just one of them.

db/db_options_test.cc Outdated Show resolved Hide resolved

SyncPoint::GetInstance()->DisableProcessing();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few comments about the unit tests:

  • You are using the name Options.New as the file name in the test while the default is Options.new. It might be preferable to pick a name that will be different also in case insensitive file systems (Windows?).
  • I suggest adding tests for the following cases:
    • That the file is indeed deleted after a refresh cycle?
    • The specified file in not found during a refresh cycle (refresh_options_file is not empty)
    • refresh_options_file is empty but the file by the default name Options.new is found as expected and updates the options
    • The file specified is a binary file
    • Multiple refreshes, each time with a different contents of the same file
    • Multiple refreshes, the name of the file changes
    • Having options for a CF that doesn't exist
    • An invalid file name (invalid characters in the path)

#ifndef ROCKSDB_LITE
options.refresh_options_sec = FLAGS_refresh_options_sec;
options.refresh_options_file = FLAGS_refresh_options_file;
#endif // ROCKSDB_LITE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will allow the db_stress user to set a refresh rate & file. However, to actually stress test this feature we need to be able to actually create options files to be used by the production code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that we need such a tool. I hope to work on one in the near future under a separate PR (for general use as well as this stress test).

- Add tests for different/default refresh options file;
- Added check that options file is deleted on completion
- Add tests with multiple refreshes of the same file
@mrambacher mrambacher self-assigned this Jan 6, 2023
udi-speedb
udi-speedb previously approved these changes Jan 17, 2023
@Guyme Guyme linked an issue Jan 18, 2023 that may be closed by this pull request
@Yuval-Ariel
Copy link
Contributor

@mrambacher , it seems the checks have failed.. how is this in QA?

@udi-speedb
Copy link
Contributor

@Yuval-Ariel I have already reviewed this. Has anything changed that needs another review?

@Yuval-Ariel
Copy link
Contributor

@udi-speedb all the approved reviews were removed for some reason.. and the merging needs at least one approval.

@udi-speedb
Copy link
Contributor

Oh, I see. OK. Evil @maxb-io modified (per my request) the policy so that whenever a push is made to an already approved PR, the approval is revoked and the PR needs to be approved again. This is to verify that if anyone updates an already approved PR (usually just squashing / rebasing), it will not go unnoticed and retain its approved state in case an (not less evil) person pushed real content.

udi-speedb
udi-speedb previously approved these changes Feb 7, 2023
- Missed periodic task after changing the default
- Options settable missed as does not run on the Mac
- Fixed race condition on writing/checking files in db_options_test
@Yuval-Ariel Yuval-Ariel merged commit b76dd95 into speedb-io:main Feb 11, 2023
ayulas pushed a commit that referenced this pull request Feb 26, 2023
Added a method and options to support reading configuration changes to options periodically while a process is running. Every "refresh_options_sec", the periodic task scheduler runs a task to check if the "refresh_options_file" exists. If it does, the file is loaded as an options file and the mutable options from the file are updated in the running process (via SetDBOptions and SetOptions). Once updated, the refresh file is deleted. If an error occurs, it will be written to the log file.

Note that the file is read as an Options file and must be a valid Options file. This means that the file must have a DBOptions section and at least one ColumnFamilyOptions section. Only options that are mutable can be in this file -- immutable options will cause an error and abort the processing.

Additionally, note that TableFactory options must be specified as part of the ColumnFamilyOptions. For example, to change the block size, the "table_factory.block_size" option should be set in the appropriate ColumnFamilyOptions. Options set in a TableFactory section will be ignored.

Currently tested manually via db_bench. Unit tests will be forthcoming.
Yuval-Ariel pushed a commit that referenced this pull request May 1, 2023
Added a method and options to support reading configuration changes to options periodically while a process is running. Every "refresh_options_sec", the periodic task scheduler runs a task to check if the "refresh_options_file" exists. If it does, the file is loaded as an options file and the mutable options from the file are updated in the running process (via SetDBOptions and SetOptions). Once updated, the refresh file is deleted. If an error occurs, it will be written to the log file.

Note that the file is read as an Options file and must be a valid Options file. This means that the file must have a DBOptions section and at least one ColumnFamilyOptions section. Only options that are mutable can be in this file -- immutable options will cause an error and abort the processing.

Additionally, note that TableFactory options must be specified as part of the ColumnFamilyOptions. For example, to change the block size, the "table_factory.block_size" option should be set in the appropriate ColumnFamilyOptions. Options set in a TableFactory section will be ignored.

Currently tested manually via db_bench. Unit tests will be forthcoming.
Yuval-Ariel pushed a commit that referenced this pull request May 4, 2023
Added a method and options to support reading configuration changes to options periodically while a process is running. Every "refresh_options_sec", the periodic task scheduler runs a task to check if the "refresh_options_file" exists. If it does, the file is loaded as an options file and the mutable options from the file are updated in the running process (via SetDBOptions and SetOptions). Once updated, the refresh file is deleted. If an error occurs, it will be written to the log file.

Note that the file is read as an Options file and must be a valid Options file. This means that the file must have a DBOptions section and at least one ColumnFamilyOptions section. Only options that are mutable can be in this file -- immutable options will cause an error and abort the processing.

Additionally, note that TableFactory options must be specified as part of the ColumnFamilyOptions. For example, to change the block size, the "table_factory.block_size" option should be set in the appropriate ColumnFamilyOptions. Options set in a TableFactory section will be ignored.

Currently tested manually via db_bench. Unit tests will be forthcoming.
udi-speedb pushed a commit that referenced this pull request Nov 13, 2023
Added a method and options to support reading configuration changes to options periodically while a process is running. Every "refresh_options_sec", the periodic task scheduler runs a task to check if the "refresh_options_file" exists. If it does, the file is loaded as an options file and the mutable options from the file are updated in the running process (via SetDBOptions and SetOptions). Once updated, the refresh file is deleted. If an error occurs, it will be written to the log file.

Note that the file is read as an Options file and must be a valid Options file. This means that the file must have a DBOptions section and at least one ColumnFamilyOptions section. Only options that are mutable can be in this file -- immutable options will cause an error and abort the processing.

Additionally, note that TableFactory options must be specified as part of the ColumnFamilyOptions. For example, to change the block size, the "table_factory.block_size" option should be set in the appropriate ColumnFamilyOptions. Options set in a TableFactory section will be ignored.

Currently tested manually via db_bench. Unit tests will be forthcoming.
udi-speedb pushed a commit that referenced this pull request Nov 15, 2023
Added a method and options to support reading configuration changes to options periodically while a process is running. Every "refresh_options_sec", the periodic task scheduler runs a task to check if the "refresh_options_file" exists. If it does, the file is loaded as an options file and the mutable options from the file are updated in the running process (via SetDBOptions and SetOptions). Once updated, the refresh file is deleted. If an error occurs, it will be written to the log file.

Note that the file is read as an Options file and must be a valid Options file. This means that the file must have a DBOptions section and at least one ColumnFamilyOptions section. Only options that are mutable can be in this file -- immutable options will cause an error and abort the processing.

Additionally, note that TableFactory options must be specified as part of the ColumnFamilyOptions. For example, to change the block size, the "table_factory.block_size" option should be set in the appropriate ColumnFamilyOptions. Options set in a TableFactory section will be ignored.

Currently tested manually via db_bench. Unit tests will be forthcoming.
udi-speedb pushed a commit that referenced this pull request Dec 4, 2023
Added a method and options to support reading configuration changes to options periodically while a process is running. Every "refresh_options_sec", the periodic task scheduler runs a task to check if the "refresh_options_file" exists. If it does, the file is loaded as an options file and the mutable options from the file are updated in the running process (via SetDBOptions and SetOptions). Once updated, the refresh file is deleted. If an error occurs, it will be written to the log file.

Note that the file is read as an Options file and must be a valid Options file. This means that the file must have a DBOptions section and at least one ColumnFamilyOptions section. Only options that are mutable can be in this file -- immutable options will cause an error and abort the processing.

Additionally, note that TableFactory options must be specified as part of the ColumnFamilyOptions. For example, to change the block size, the "table_factory.block_size" option should be set in the appropriate ColumnFamilyOptions. Options set in a TableFactory section will be ignored.

Currently tested manually via db_bench. Unit tests will be forthcoming.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Live Config Changes
3 participants