-
Notifications
You must be signed in to change notification settings - Fork 466
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow to scan keys with the suffix #2222
Conversation
@baobaomaomeng You can add a config to set the optimal number. We can talk about the default value later. |
I've now added some tests,add support hsacn zscan and SSCAN, and the final step is to add configuration options. I'm not sure how to add configuration options, nor do I know what a good default number of search times would be. I feel like I need some help in this area. |
@baobaomaomeng That's great, thank you for your efforts. Regarding how to add configuration items, you can refer to this PR. The default number may be any number(such as 100,000). We can have the Reviewers confirm the final value, this modification is quite simple. |
One point that confuses me is why the limit in the scan operator is not a configuration item, and now I need to add the maximum number to scan at once into the configuration items. |
@baobaomaomeng Hi, can you resolve the conflicts? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good work! Thank you for your contribution.
Can you please resolve the code conflicts first? I will continue to review the code after resolving the conflicts.
src/commands/scan_base.h
Outdated
@@ -83,6 +89,7 @@ class CommandScanBase : public Commander { | |||
std::string cursor_; | |||
std::string prefix_; | |||
int limit_ = 20; | |||
int pm_ = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you use enum instead of it?
Can you change and rename it to prefix_mode_
?
src/commands/cmd_server.cc
Outdated
@@ -878,7 +878,7 @@ class CommandScan : public CommandScanBase { | |||
|
|||
std::vector<std::string> keys; | |||
std::string end_key; | |||
auto s = redis_db.Scan(key_name, limit_, prefix_, &keys, &end_key); | |||
auto s = redis_db.Scan(key_name, limit_, srv->GetConfig()->max_scan_num, prefix_, &keys, &end_key, pm_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code looks ugly. Can we have some good way to deal with config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no clue about this, do you have any suggestions?
src/storage/redis_db.cc
Outdated
if (cnt_false >= scan_false_limit) { | ||
break; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not put those codes to begin?
if (!scan_match_check()) {
cnt_false++;
if (cnt_false < scan_false_limit) {
continue;
} else {
break;
}
}
src/storage/redis_db.cc
Outdated
if (keys->empty() && cnt_false >= scan_false_limit) { | ||
end_cursor->append(user_key); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may be a situation where some keys have been found, but cnt_false >= scan_false_limit
.
src/storage/redis_db.cc
Outdated
rocksdb::Status Database::Scan(const std::string &cursor, uint64_t scan_success_limit, uint64_t scan_false_limit, | ||
const std::string &fix, std::vector<std::string> *keys, std::string *end_cursor, | ||
const int pm) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you change the param fix
to a more meaningful name? Can you keep the name the same as the header file?
98cff58
to
65a8cd5
Compare
.gitignore
Outdated
@@ -40,9 +40,13 @@ version.h | |||
.idea | |||
.vscode | |||
.cache | |||
.gitignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why it's ignored?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sorry, that was my oversight.
.gitignore
Outdated
test | ||
testdb | ||
|
||
build | ||
cmake-build-* | ||
debug-build | ||
test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duplicated test
@@ -187,7 +187,27 @@ func ScanTest(t *testing.T, rdb *redis.Client, ctx context.Context) { | |||
require.Equal(t, []string{"1", "10", "foo", "foobar"}, keys) | |||
}) | |||
|
|||
t.Run("ZSCAN with PATTERN", func(t *testing.T) { | |||
t.Run("HSCAN with subkey PATTERN", func(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems we need more test cases
@@ -39,14 +41,20 @@ class CommandScanBase : public Commander { | |||
|
|||
return ParseAdditionalFlags<true>(parser); | |||
} | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please don't introduce random changes.
.gitignore
Outdated
testdb | ||
|
||
build | ||
cmake-build-* | ||
debug-build | ||
test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
duplicate line, it has been added in previous section.
@@ -152,6 +153,7 @@ Config::Config() { | |||
{"pidfile", true, new StringField(&pidfile, kDefaultPidfile)}, | |||
{"max-io-mb", false, new IntField(&max_io_mb, 0, 0, INT_MAX)}, | |||
{"max-bitmap-to-string-mb", false, new IntField(&max_bitmap_to_string_mb, 16, 0, INT_MAX)}, | |||
{"max-scan-num", false, new IntField(&max_scan_num, 2, 0, INT_MAX)}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems 2
is too smaller to be as the max_scan_num
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's just for test easy,jihuayu say they will determine the final value.
src/storage/redis_db.cc
Outdated
@@ -309,10 +314,12 @@ rocksdb::Status Database::Keys(const std::string &prefix, std::vector<std::strin | |||
return rocksdb::Status::OK(); | |||
} | |||
|
|||
rocksdb::Status Database::Scan(const std::string &cursor, uint64_t limit, const std::string &prefix, | |||
std::vector<std::string> *keys, std::string *end_cursor, RedisType type) { | |||
rocksdb::Status Database::Scan(const std::string &cursor, uint64_t scan_success_limit, uint64_t scan_false_limit, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scan_matched_limt
and scan_mismatched_limit
should be a better name in this scenario.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The parameter of the SCAN command is too long now, it's time to group the scan parameters into a struct.
src/storage/redis_db.cc
Outdated
rocksdb::Status Database::Scan(const std::string &cursor, uint64_t scan_success_limit, uint64_t scan_false_limit, | ||
const std::string &match_str, std::vector<std::string> *keys, std::string *end_cursor, | ||
RedisType type, const MatchType match_mode) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have some questions about your design.
But before that, could you give a little introduction about your general design and supported patterns in the PR description?
Then we can have a detailed review on your code.
Refer to Redis glob-style patterns:
Supported glob-style patterns:
h?llo
matches hello, hallo and hxllo
h*llo
matches hllo and heeeello
h[ae]llo
matches hello and hallo, but not hillo
h[^e]llo
matches hallo, hbllo, ... but not hello
h[a-b]llo
matches hallo and hbllo
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just added two matching patterns.
such as
*llo matches ello,hello,eeeello
l matches l,ll,llo,ello,hello
And,i add scan_false_time limit to config,beacasue if we use suffix,substring match, we need scan one by one, and the potentially long time may be unacceptable. so I add this argument in scan,zsan,sscan,hscan command。I found that the three scan methods to get the last scanned key are all different, which is not convenient for me to modify,so i add a string value in CommandSubkeyScanBase to represented the last key that was scanned。
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've been thinking about how to design a scalable scan, but I don't have any ideas. This might be a bit difficult for me. I think I need some guidance on the thought process right now.
In my opinion, we need a well-designed solution for implementing Redis glob-style patterns that is scalable, rather than just handling specific cases. If the design is not scalable, we may have to remove all modifications when aiming to support the complete pattern cases. |
Thank you for your review, but this is not consistent with the goal I communicated at the beginning. I now need to rewrite my code, and I think I need to consider how to write it.Could you give me some reference suggestions? |
closed by #2608. |
After modifying the code, I'm encountering some strange issues. I don't understand why this test is failing.
At the same time, there's a bit of a performance issue. I'm not sure about the optimal number of scans for substring matching and suffix matching, I haven't set a limit yet. In the next couple of days, I'll try to add support for HSCAN, SSCAN, and ZSCAN.