-
Notifications
You must be signed in to change notification settings - Fork 6.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for FastLRUCache in stress and crash tests. #10081
Conversation
db_stress_tool/db_stress_gflags.cc
Outdated
@@ -346,6 +346,10 @@ DEFINE_int32( | |||
DEFINE_bool(use_clock_cache, false, | |||
"Replace default LRU block cache with clock cache."); | |||
|
|||
DEFINE_bool( | |||
use_fast_lru_cache, false, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be better to use an enum to specify the desired cache type, rather than separate boolean flags for fast_lru_cache and clock_cache. I know currently we don't test the clock cache because its broken, but when we fix it it would be hard to enforce the mutual exclusion in db_crashtest.py.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should also resolve interaction with --secondary_cache_uri option while FastLRU doesn't support secondary cache. Even if nothing fails, I'm worried about diluting the effectiveness of the secondary cache testing if half the time we aren't actually testing with secondary cache.
I noticed almost this exact problem the other day and fixed it for db_bench and cache_bench in #8998. It would be trivial to do the same work for db_stress as well. |
OK, yes I think that #8998 makes sense as the eventual way to configure Cache by string for db_stress, db_bench, etc. but we need something temporary to unblock some other work. |
For the time being, a good option would be to sanitize in db_crashtest.py. We already do that for a bunch of options in |
include/rocksdb/cache_type.h
Outdated
@@ -0,0 +1,39 @@ | |||
// Copyright (c) 2011-present, Facebook, Inc. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I created a separate file for cache types and conversions because they will soon be used in cache bench and db bench.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
include/rocksdb
is the public API. FastLRUCache is not intended to be exposed in the public API (at least not yet) and an enum like this in the public API runs counter to the idea of Cache being Customizable (eventually): that custom implementations are, as much as possible, given equal treatment to built-in implementations.
Even if we make this an internal-only enum, I don't see what an enum gets us if we're always just doing string -> if/else -> enum -> switch/if-else on that enum, when we could just do if-else on the string.
It looks like we already have a Cache::CreateFromString function that I think we should be using here, except we need the id=
feature from #10084. Let me see if I can fast-track #10084 to untangle the parallel development mess here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pdillinger Are you suggesting waiting for #10084 before merging this PR, or change this PR to use string directly instead of enum and use Cache::CreateFromString later when that's ready?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anand1976 I talked to Peter and we agreed to do the latter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@guidotag has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
Stress tests can run with the experimental FastLRUCache. Crash tests randomly choose between LRUCache and FastLRUCache.
Since only LRUCache supports a secondary cache, we validate the
--secondary_cache_uri
and--cache_type
flags---when--secondary_cache_uri
is set, the--cache_type
is set tolru_cache
.Test plan:
make -j24 CRASH_TEST_EXT_ARGS=—duration=960 blackbox_crash_test_with_atomic_flush
. The cache type should sometimes befast_lru_cache
.make -j24 CRASH_TEST_EXT_ARGS="--duration=960 --secondary_cache_uri=x" blackbox_crash_test_with_atomic_flush
multiple times. The test will always be aborted (which is okay). Check that the cache type is alwayslru_cache
.