-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add RefreshOptions method to support live config changes #294
Conversation
Allows for configuration changes to be updated by reading an Options file periodically.
@mrambacher - Is there an issue for this PR? @mrambacher - Could you please handle? |
Why is this feature not available in RocksDB Lite? |
I think we should add support for this feature in db_stress and crash testing |
@@ -130,6 +130,8 @@ struct MutableDBOptions { | |||
uint64_t delete_obsolete_files_period_micros; | |||
unsigned int stats_dump_period_sec; | |||
unsigned int stats_persist_period_sec; | |||
unsigned int refresh_options_sec; | |||
std::string refresh_options_file; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again. I know that all of the other options are not initialized at the declaration point and are initialized in the ctor-initializer list, but why not init here and guarantee initialization in all ctor-s without having to add the line or in case somebody forgets?
void DBImpl::RefreshOptions() { | ||
if (shutdown_initiated_) { | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are not under the lock of the db's mutex_. Should you (to verify the shutdown_initiated_ doesn't become true after you check)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is exactly the same code as under the other PeriodicTasks (like RefreshStats). I am not positive of the rules for the mutex_, but followed the other examples.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I do hope that there is no issue with this approach that everybody in relying on and just copying. It would be preferable to verify that there is no issue with this approach.
cf_name.c_str()); | ||
} else if (!cfd->IsDropped()) { | ||
s = SetCFOptionsImpl(cfd, cf_opt_map); | ||
if (!s.ok()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does failure to update a cf's options mean we end up with a partial update?
We may have updated some of the cf-s and we'll skip the rest.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If updating the DBOptions fails, the entire process fails.
If any of the ColumnFamilyOptions updates fails, the DBOptions will be updated (if changed) and any other ColumnFamilyOptions will still be updated. It is extremely difficult to roll back to the previous version of the options, as you cannot tell if any background task (such as compaction) started during the refresh options process. Because of this, the code prints an error if the options upgrade fails but continues anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the short answer is yes?
Why not validate the options before applying them with the column family?
There is a static ValidateOptions() method for a CF.
idx++; | ||
} | ||
} | ||
s = fs_->DeleteFile(new_options_file, IOOptions(), nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case we have failed deleting this file (which may or may not be a valid options file that we have just successfully updated or failed to update), does this mean we will repeat the process as long as the file is there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if we fail to delete the file, we will repeat the process. If we wish, we can add a follow-up PR to check the file modification time or something like that to make sure it is newer than the last time we tried. However, this requires the addition of "state" (to store the last modification time) that I did not want to add at this point.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Please open a separate issue to handle this in the future.
The PeriodicTaskScheduler is not available in LITE mode. I do not know why. I do not know how things were decided LITE or not. |
- Add command line flag to db_stress_tool and db_bench - Set the default to check for options updates every hour (3600 seconds).
void DBImpl::RefreshOptions() { | ||
if (shutdown_initiated_) { | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. I do hope that there is no issue with this approach that everybody in relying on and just copying. It would be preferable to verify that there is no issue with this approach.
cf_name.c_str()); | ||
} else if (!cfd->IsDropped()) { | ||
s = SetCFOptionsImpl(cfd, cf_opt_map); | ||
if (!s.ok()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, the short answer is yes?
Why not validate the options before applying them with the column family?
There is a static ValidateOptions() method for a CF.
idx++; | ||
} | ||
} | ||
s = fs_->DeleteFile(new_options_file, IOOptions(), nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Please open a separate issue to handle this in the future.
db/db_options_test.cc
Outdated
// Test with a file that contains no DBOptions section | ||
ASSERT_OK(CreateFile(fs, options.refresh_options_file, | ||
"[CFOptions \"default\"]\n", false)); | ||
TEST_SYNC_POINT("DBOptionsTest::WaitForUpdates"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to see that a file that contains just an empty DBOptions + CFOptions passes parsing and then test files that contains just one of them.
|
||
SyncPoint::GetInstance()->DisableProcessing(); | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments about the unit tests:
- You are using the name Options.New as the file name in the test while the default is Options.new. It might be preferable to pick a name that will be different also in case insensitive file systems (Windows?).
- I suggest adding tests for the following cases:
- That the file is indeed deleted after a refresh cycle?
- The specified file in not found during a refresh cycle (refresh_options_file is not empty)
- refresh_options_file is empty but the file by the default name Options.new is found as expected and updates the options
- The file specified is a binary file
- Multiple refreshes, each time with a different contents of the same file
- Multiple refreshes, the name of the file changes
- Having options for a CF that doesn't exist
- An invalid file name (invalid characters in the path)
#ifndef ROCKSDB_LITE | ||
options.refresh_options_sec = FLAGS_refresh_options_sec; | ||
options.refresh_options_file = FLAGS_refresh_options_file; | ||
#endif // ROCKSDB_LITE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will allow the db_stress user to set a refresh rate & file. However, to actually stress test this feature we need to be able to actually create options files to be used by the production code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree that we need such a tool. I hope to work on one in the near future under a separate PR (for general use as well as this stress test).
- Add tests for different/default refresh options file; - Added check that options file is deleted on completion - Add tests with multiple refreshes of the same file
@mrambacher , it seems the checks have failed.. how is this in QA? |
@Yuval-Ariel I have already reviewed this. Has anything changed that needs another review? |
@udi-speedb all the approved reviews were removed for some reason.. and the merging needs at least one approval. |
Oh, I see. OK. Evil @maxb-io modified (per my request) the policy so that whenever a push is made to an already approved PR, the approval is revoked and the PR needs to be approved again. This is to verify that if anyone updates an already approved PR (usually just squashing / rebasing), it will not go unnoticed and retain its approved state in case an (not less evil) person pushed real content. |
- Missed periodic task after changing the default - Options settable missed as does not run on the Mac - Fixed race condition on writing/checking files in db_options_test
Added a method and options to support reading configuration changes to options periodically while a process is running. Every "refresh_options_sec", the periodic task scheduler runs a task to check if the "refresh_options_file" exists. If it does, the file is loaded as an options file and the mutable options from the file are updated in the running process (via SetDBOptions and SetOptions). Once updated, the refresh file is deleted. If an error occurs, it will be written to the log file. Note that the file is read as an Options file and must be a valid Options file. This means that the file must have a DBOptions section and at least one ColumnFamilyOptions section. Only options that are mutable can be in this file -- immutable options will cause an error and abort the processing. Additionally, note that TableFactory options must be specified as part of the ColumnFamilyOptions. For example, to change the block size, the "table_factory.block_size" option should be set in the appropriate ColumnFamilyOptions. Options set in a TableFactory section will be ignored. Currently tested manually via db_bench. Unit tests will be forthcoming.
Added a method and options to support reading configuration changes to options periodically while a process is running. Every "refresh_options_sec", the periodic task scheduler runs a task to check if the "refresh_options_file" exists. If it does, the file is loaded as an options file and the mutable options from the file are updated in the running process (via SetDBOptions and SetOptions). Once updated, the refresh file is deleted. If an error occurs, it will be written to the log file. Note that the file is read as an Options file and must be a valid Options file. This means that the file must have a DBOptions section and at least one ColumnFamilyOptions section. Only options that are mutable can be in this file -- immutable options will cause an error and abort the processing. Additionally, note that TableFactory options must be specified as part of the ColumnFamilyOptions. For example, to change the block size, the "table_factory.block_size" option should be set in the appropriate ColumnFamilyOptions. Options set in a TableFactory section will be ignored. Currently tested manually via db_bench. Unit tests will be forthcoming.
Added a method and options to support reading configuration changes to options periodically while a process is running. Every "refresh_options_sec", the periodic task scheduler runs a task to check if the "refresh_options_file" exists. If it does, the file is loaded as an options file and the mutable options from the file are updated in the running process (via SetDBOptions and SetOptions). Once updated, the refresh file is deleted. If an error occurs, it will be written to the log file. Note that the file is read as an Options file and must be a valid Options file. This means that the file must have a DBOptions section and at least one ColumnFamilyOptions section. Only options that are mutable can be in this file -- immutable options will cause an error and abort the processing. Additionally, note that TableFactory options must be specified as part of the ColumnFamilyOptions. For example, to change the block size, the "table_factory.block_size" option should be set in the appropriate ColumnFamilyOptions. Options set in a TableFactory section will be ignored. Currently tested manually via db_bench. Unit tests will be forthcoming.
Added a method and options to support reading configuration changes to options periodically while a process is running. Every "refresh_options_sec", the periodic task scheduler runs a task to check if the "refresh_options_file" exists. If it does, the file is loaded as an options file and the mutable options from the file are updated in the running process (via SetDBOptions and SetOptions). Once updated, the refresh file is deleted. If an error occurs, it will be written to the log file. Note that the file is read as an Options file and must be a valid Options file. This means that the file must have a DBOptions section and at least one ColumnFamilyOptions section. Only options that are mutable can be in this file -- immutable options will cause an error and abort the processing. Additionally, note that TableFactory options must be specified as part of the ColumnFamilyOptions. For example, to change the block size, the "table_factory.block_size" option should be set in the appropriate ColumnFamilyOptions. Options set in a TableFactory section will be ignored. Currently tested manually via db_bench. Unit tests will be forthcoming.
Added a method and options to support reading configuration changes to options periodically while a process is running. Every "refresh_options_sec", the periodic task scheduler runs a task to check if the "refresh_options_file" exists. If it does, the file is loaded as an options file and the mutable options from the file are updated in the running process (via SetDBOptions and SetOptions). Once updated, the refresh file is deleted. If an error occurs, it will be written to the log file. Note that the file is read as an Options file and must be a valid Options file. This means that the file must have a DBOptions section and at least one ColumnFamilyOptions section. Only options that are mutable can be in this file -- immutable options will cause an error and abort the processing. Additionally, note that TableFactory options must be specified as part of the ColumnFamilyOptions. For example, to change the block size, the "table_factory.block_size" option should be set in the appropriate ColumnFamilyOptions. Options set in a TableFactory section will be ignored. Currently tested manually via db_bench. Unit tests will be forthcoming.
Added a method and options to support reading configuration changes to options periodically while a process is running. Every "refresh_options_sec", the periodic task scheduler runs a task to check if the "refresh_options_file" exists. If it does, the file is loaded as an options file and the mutable options from the file are updated in the running process (via SetDBOptions and SetOptions). Once updated, the refresh file is deleted. If an error occurs, it will be written to the log file. Note that the file is read as an Options file and must be a valid Options file. This means that the file must have a DBOptions section and at least one ColumnFamilyOptions section. Only options that are mutable can be in this file -- immutable options will cause an error and abort the processing. Additionally, note that TableFactory options must be specified as part of the ColumnFamilyOptions. For example, to change the block size, the "table_factory.block_size" option should be set in the appropriate ColumnFamilyOptions. Options set in a TableFactory section will be ignored. Currently tested manually via db_bench. Unit tests will be forthcoming.
Added a method and options to support reading configuration changes to options periodically while a process is running. Every "refresh_options_sec", the periodic task scheduler runs a task to check if the "refresh_options_file" exists. If it does, the file is loaded as an options file and the mutable options from the file are updated in the running process (via SetDBOptions and SetOptions). Once updated, the refresh file is deleted. If an error occurs, it will be written to the log file.
Note that the file is read as an Options file and must be a valid Options file. This means that the file must have a DBOptions section and at least one ColumnFamilyOptions section. Only options that are mutable can be in this file -- immutable options will cause an error and abort the processing.
Additionally, note that TableFactory options must be specified as part of the ColumnFamilyOptions. For example, to change the block size, the "table_factory.block_size" option should be set in the appropriate ColumnFamilyOptions. Options set in a TableFactory section will be ignored.
Currently tested manually via db_bench. Unit tests will be forthcoming.