Add an option to trigger flush when the number of range deletions reach a threshold #11358
Conversation
Force-pushed from 2287f58 to 857b8eb.
Thanks for making the change and it's very helpful for our use case (#11407)!
@@ -170,6 +172,11 @@ size_t MemTable::ApproximateMemoryUsage() {
}

bool MemTable::ShouldFlushNow() {
Is it possible to log the reason for the flush? Maybe consider adding the reason to FlushReason.
Okay, let me see if I can identify where to generate the flush reason.
Force-pushed from 857b8eb to b44ef52.
Hi @yao-xiao-github, @cbi42, I made the requested changes. I couldn't figure out where to set the FlushReason.
Force-pushed from b44ef52 to e7949f0.
Thanks for making the changes. I also tried to expose the FlushReason but didn't find a good way; all the memtable-related flushes seem to be put under the same reason. @cbi42, could you provide some hints? Or is it possible to put an info log here?
Hi - I also don't see an easy way to add a FlushReason for this change, especially because ShouldFlushNow() was previously used exclusively to check whether a memtable was full. Maybe we can just log the number of range tombstones here for now (Line 883 in a5909f8).
Also, could you add a unit test to check that flush is triggered correctly based on the new option?
Don't have perms to +1 this PR, but LGTM.
Patched an 8.1.1 checkout, then ran against a dataset that was stuck in recovery, OOM'ing every time because of too many range deletes (Yao's case). Ran the patched binary with all defaults and it still OOM'd. Then I set memtable_max_range_deletions=1000, recompiled, and retried. Lots of flushes, but I was able to open the database.
include/rocksdb/options.h (outdated)
@@ -331,6 +331,11 @@ struct ColumnFamilyOptions : public AdvancedColumnFamilyOptions {
  // Default: nullptr
  std::shared_ptr<SstPartitionerFactory> sst_partitioner_factory = nullptr;

  // Automatic flush after range deletions count in memtable hits this limit.
  // helps with workloads having lot of range deletes.
Suggestion: if making another edit, you might say more than "helps"; it is a guard against accumulating so many range deletes in the memtable that you get into a state where you cannot flush without OOM'ing.
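To make the suggested wording concrete, here is a hypothetical sketch of how the option and its comment might read. The struct name is invented for illustration and this is not the upstream header text:

```cpp
#include <cstdint>

// Hypothetical stand-in for the real ColumnFamilyOptions field; the
// comment follows the reviewer's suggested framing rather than the
// original "helps with workloads" wording.
struct ColumnFamilyOptionsSketch {
  // When > 0, trigger a memtable flush once the memtable holds this many
  // range deletions. This guards against accumulating so many range
  // tombstones in one memtable that it can no longer be flushed (or the
  // DB recovered) without running out of memory.
  // Default: 0 (disabled)
  uint32_t memtable_max_range_deletions = 0;
};
```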
Force-pushed from 2222b8e to a1e333c.
@cbi42 Could you please take another look? Thanks!
Hi @cbi42. Apparently GitHub forbids opening PRs for a maintainer to edit on repos associated with an organization. At @vrdhn's request I made you a contributor on the repo. Let me know if that works for you. Otherwise, we can create a new PR from a repo associated with a personal account instead.
Force-pushed from 0a99266 to 956cb69.
@cbi42 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@vrdhn has updated the pull request. You must reimport the pull request before landing.
db/memtable.h (outdated)
  if (memtable_max_range_deletions_ > 0 &&
      memtable_max_range_deletions_reached_ == false &&
      val >= (uint64_t)memtable_max_range_deletions_) {
    memtable_max_range_deletions_reached_ = true;
This feels like a data race. What about comparing num_range_deletes_ against memtable_max_range_deletions_ in ShouldFlushNow()?
Related (similar scenario where the flag started with one value and could only ever flip once to the other value): #4801
Updated to check num_range_deletes_ against memtable_max_range_deletions_ in ShouldFlushNow().
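The race-free shape the thread converged on can be sketched with a minimal self-contained model. The class and member names here only mimic RocksDB's and this is not the actual implementation: the write path just bumps an atomic counter, and ShouldFlushNow() compares that counter against the limit, so there is no separate "reached" flag to race on.

```cpp
#include <atomic>
#include <cstdint>

// Minimal model of the pattern: an atomic range-deletion counter
// checked against the configured limit at flush-decision time.
class MemTableModel {
 public:
  explicit MemTableModel(int64_t max_range_deletions)
      : memtable_max_range_deletions_(max_range_deletions) {}

  // Called from the write path for each range deletion.
  void AddRangeDelete() {
    num_range_deletes_.fetch_add(1, std::memory_order_relaxed);
  }

  // Called when deciding whether to switch/flush the memtable.
  bool ShouldFlushNow() const {
    // A limit of 0 (or negative) disables the feature.
    return memtable_max_range_deletions_ > 0 &&
           num_range_deletes_.load(std::memory_order_relaxed) >=
               static_cast<uint64_t>(memtable_max_range_deletions_);
    // (Size-based flush checks omitted in this sketch.)
  }

 private:
  const int64_t memtable_max_range_deletions_;
  std::atomic<uint64_t> num_range_deletes_{0};
};
```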
Force-pushed from 4c6d625 to bdfe899.
Looks great!
Add an option to trigger flush when the number of range deletions reach a threshold (#11358)

Summary: Add a mutable column family option `memtable_max_range_deletions`. When non-zero, RocksDB will try to flush the current memtable after it has at least `memtable_max_range_deletions` range deletions. Java API is added and crash test is updated accordingly to randomly enable this option.

Pull Request resolved: facebook/rocksdb#11358

Test Plan:
* New unit test: `DBRangeDelTest.MemtableMaxRangeDeletions`
* Ran crash test `python3 ./tools/db_crashtest.py whitebox --simple --memtable_max_range_deletions=20` and saw logs showing flushed memtables usually with 20 range deletions.

Reviewed By: ajkr
Differential Revision: D46582680
Pulled By: cbi42
fbshipit-source-id: f23d6fa8d8264ecf0a18d55c113ba03f5e2504da
Add a mutable column family option `memtable_max_range_deletions`. When non-zero, RocksDB will try to flush the current memtable after it has at least `memtable_max_range_deletions` range deletions. Java API is added and crash test is updated accordingly to randomly enable this option.

Test plan:
* New unit test: `DBRangeDelTest.MemtableMaxRangeDeletions`
* Ran crash test `python3 ./tools/db_crashtest.py whitebox --simple --memtable_max_range_deletions=20` and saw logs showing flushed memtables usually with 20 range deletions.