Add command parser #1032

PragmaTwice · 2022-10-23T15:20:49Z

Partially follows #598.

The current CommandParser is a prototype with only a handful of methods that are usable right now (this should not block merging).

To demonstrate the use of CommandParser, I rewrote the Parse implementation for a few commands in redis_cmd.cc. It is easy to see that the amount of code is reduced several-fold and that the parsing logic can be expressed in just a few lines, especially for commands with complex syntax.

Other changes: macro GET_OR_RET is added to status.h to simplify status-related code flow via statement expression.
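
For readers unfamiliar with the statement-expression trick, here is a minimal sketch of how such a macro can work. It is not the code from status.h: `Result`, `TRY_GET`, `ParseInt` and `AddStrings` are hypothetical names made up for this illustration, and the `({ ... })` syntax is a GNU extension (GCC and Clang), not standard C++.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <optional>
#include <string>
#include <utility>

// Toy "value or error" type for illustration; NOT the Status/StatusOr
// classes from status.h.
template <typename T>
struct Result {
  std::optional<T> value;  // set on success
  std::string error;       // set on failure
  bool ok() const { return value.has_value(); }
};

// Hypothetical macro in the spirit of GET_OR_RET. It relies on the GNU
// statement-expression extension: on failure it returns the error from the
// enclosing function, otherwise the expression yields the unwrapped value.
#define TRY_GET(expr)                                  \
  ({                                                   \
    auto res_ = (expr);                                \
    if (!res_.ok()) return {std::nullopt, res_.error}; \
    std::move(*res_.value);                            \
  })

// Parses a whole string as an int, reporting an error otherwise.
Result<int> ParseInt(const std::string& s) {
  try {
    std::size_t used = 0;
    int v = std::stoi(s, &used);
    if (used != s.size()) return {std::nullopt, "not an integer"};
    return {v, ""};
  } catch (const std::exception&) {
    return {std::nullopt, "not an integer"};
  }
}

// Adds two textual integers; a parse error in either one propagates to the
// caller without an explicit if/return at the call site.
Result<int> AddStrings(const std::string& a, const std::string& b) {
  int x = TRY_GET(ParseInt(a));
  int y = TRY_GET(ParseInt(b));
  return {x + y, ""};
}
```

The point of the sketch is the call site in `AddStrings`: the error check and early return are hidden inside the macro, so the happy path reads as a plain expression.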

torwig · 2022-10-23T18:00:25Z

tests/gocase/unit/type/strings/strings_test.go

-		require.ErrorContains(t, rdb.Do(ctx, "SET", "foo", "bar", "exat", "0").Err(), "invalid expire time")
-		require.ErrorContains(t, rdb.Do(ctx, "SET", "foo", "bar", "pxat", "1234xyz").Err(), "not an integer")
-		require.ErrorContains(t, rdb.Do(ctx, "SET", "foo", "bar", "pxat", "0").Err(), "invalid expire time")
+		require.ErrorContains(t, rdb.Do(ctx, "SET", "foo", "bar", "ex", "1234xyz").Err(), "non-integer")


@PragmaTwice Really nice job!
These error messages (and others like ERR wrong number of arguments) were written that way to be consistent with the Redis protocol.

127.0.0.1:6379> set foo bar ex 1234tyg
(error) ERR value is not an integer or out of range
127.0.0.1:6379> set foo bar ex 0
(error) ERR invalid expire time in 'set' command
127.0.0.1:6379>

So I'm not sure if it's correct to change them.

PragmaTwice · 2022-10-24

I think it is hard to keep all error messages the same as Redis (there are already many differences between Redis and kvrocks error messages, including the two in your comment), and doing so may make the development of kvrocks unnecessarily complex. In fact, I see almost no difference between "syntax error" and "wrong number of arguments", or between "encounter non-integer characters" and "not an integer".

torwig · 2022-10-27

Yes, you are right, it's not easy to keep identical error messages. Since Redis doesn't have error codes, I was wondering whether Redis clients parse error messages to get something useful from them, or just signal error/no-error. Are error messages considered part of the Redis protocol?

PragmaTwice · 2022-10-28

I think it is not possible to be compatible with every Redis error, since many errors already differ from Redis, and some of them are kvrocks-only. So if clients do parse them, they cannot get the right message anyway. Redis also does not guarantee that it will keep old error messages in new versions, so I do not think it is necessary to keep error messages identical to Redis.

src/commands/command_parser.h

src/commands/redis_cmd.cc

tisonkun

Comments inline. After a walk-through, I still have some concerns about the macro trick. The rest generally looks good to me.

src/common/status.h

git-hulk · 2022-10-25T09:45:30Z

@PragmaTwice After looking through the PR, I am a bit worried that it will take most developers (myself included) some time to understand how to use it and how it works. For Redis command arguments, there are only three argument types:

string
bool, like NX/EX/PX and so on
number (int/float), like TTL and score

So I'm wondering if we can simplify the parser API like below:

while (token = parser.next()) {
  switch (tolower(token)) {
    case "ex":
      status = parser.expect<int>(&ttl);
      break;
    case "px":
      status = parser.expect<int64_t>(&ttl_ms);
      break;
    ...
  }
}

That way, users only need to care about the current token and what is expected next.

PragmaTwice · 2022-10-25T10:02:43Z

> @PragmaTwice After looking through the PR, I am a bit worried that it will take most developers (myself included) some time to understand how to use it and how it works. For Redis command arguments, there are only three argument types:
>
> string
> bool, like NX/EX/PX and so on
> number (int/float), like TTL and score
>
> So I'm wondering if we can simplify the parser API like below:
>
> while (token = parser.next()) {
>   switch (tolower(token)) {
>     case "ex":
>       status = parser.expect<int>(&ttl);
>       break;
>     case "px":
>       status = parser.expect<int64_t>(&ttl_ms);
>       break;
>     ...
>   }
> }
>
> That way, users only need to care about the current token and what is expected next.

I think there are several problems we would need to handle in the sample code:

We cannot always advance: for example, to parse (EX v1) | (PX v2) | v3, we must first peek at the token (EX or PX) and only then advance; otherwise we may lose v3. A parser that advances at every step is severely limited in what it can parse.
We need a way to propagate errors: this is where the sample code is idealized; error handling needs to be abstracted.
We need a way to reject conflicting flags in the same group: for example, to parse [EX a | PX b] | [X | Y], we need to reject inputs like EX v PX v, X Y, or EX v X PX v, and accept EX v EX v, EX v X, or Y PX v.
If we still need a pattern like this:

int v;
int real_v;
status = parser.expectInt(&v);
if(!status) return ...;
real_v = handle(v);

rather than what this PR provides:

auto real_v = handle(GET_OR_RET(parser.TakeInt()));

then I think it will be hard to make use of many of the abstractions that modern C++ provides.

Simplifying code means doing good abstraction, and of course good abstraction has a learning cost, but I still feel that the current abstraction is intuitive:

parser.Good(): checks whether there are still elements left to parse
parser.EatICaseFlag(str, flag): matches a specific flag token and advances on success. It can be learned from this example.
parser.TakeInt() or parser.TakeStr(): consumes the next token as an integer or a string
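
To make the three calls concrete, here is a rough, self-contained sketch. `MiniParser` and `ParseTtlMs` are hypothetical names invented for this illustration, not the CommandParser from this PR. The semantics shown (peek before advancing, and refusing a flag that conflicts with one recorded earlier) are an assumption based on the discussion in this thread.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <exception>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Toy parser that mimics the method names discussed above (assumed semantics).
class MiniParser {
 public:
  explicit MiniParser(std::vector<std::string> tokens) : tokens_(std::move(tokens)) {}

  // True while there are tokens left to parse.
  bool Good() const { return pos_ < tokens_.size(); }

  // Peeks at the next token. If it equals `name` (ignoring case) and no
  // different flag was recorded earlier, records `name` and advances.
  bool EatICaseFlag(const std::string& name, std::string& flag) {
    if (!Good() || !EqualICase(tokens_[pos_], name)) return false;
    if (!flag.empty() && flag != name) return false;  // conflict: do not advance
    flag = name;
    ++pos_;
    return true;
  }

  // Consumes the next token as an integer; nullopt if missing or malformed.
  std::optional<long long> TakeInt() {
    if (!Good()) return std::nullopt;
    const std::string& tok = tokens_[pos_];
    try {
      std::size_t used = 0;
      long long v = std::stoll(tok, &used);
      if (used != tok.size()) return std::nullopt;
      ++pos_;
      return v;
    } catch (const std::exception&) {
      return std::nullopt;
    }
  }

 private:
  static bool EqualICase(const std::string& a, const std::string& b) {
    if (a.size() != b.size()) return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
      if (std::tolower(static_cast<unsigned char>(a[i])) !=
          std::tolower(static_cast<unsigned char>(b[i]))) {
        return false;
      }
    }
    return true;
  }

  std::vector<std::string> tokens_;
  std::size_t pos_ = 0;
};

// Parses zero or more `EX seconds | PX millis` options and returns the TTL in
// milliseconds, or nullopt on a syntax error (including EX mixed with PX).
std::optional<long long> ParseTtlMs(std::vector<std::string> args) {
  MiniParser parser(std::move(args));
  std::string flag;  // remembers which of EX/PX was seen first
  long long ttl_ms = 0;
  while (parser.Good()) {
    if (parser.EatICaseFlag("EX", flag)) {
      auto v = parser.TakeInt();
      if (!v) return std::nullopt;
      ttl_ms = *v * 1000;
    } else if (parser.EatICaseFlag("PX", flag)) {
      auto v = parser.TakeInt();
      if (!v) return std::nullopt;
      ttl_ms = *v;
    } else {
      return std::nullopt;  // unknown token or conflicting flag
    }
  }
  return ttl_ms;
}
```

With this sketch, EX 1 EX 2 is accepted (the later value wins) while EX 1 PX 2 is rejected, which matches the accept/reject cases listed earlier in this comment.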

And I think there is a big question:

If we still want to extract each token and process it manually (handling every condition by hand), then I think we do not need a parsing framework that formalizes our command parsing procedure using parsing techniques.

git-hulk · 2022-10-25T10:39:07Z

@PragmaTwice Thanks for your explanation.

> We cannot always advance: for example, to parse (EX v1) | (PX v2) | v3, we must first peek at the token (EX or PX) and only then advance; otherwise we may lose v3. A parser that advances at every step is severely limited in what it can parse.

Yes, I got your point. What if we use the parser to iterate over all tokens instead of only flags? I will take the ZADD command as an example:

while (token = parser.next()) {
  switch (token) {
    case "NX":
      _flags = nx;
      break;
    case "INCR":
      _flags = incr;
      break;
    default:
      break;
  }
}
while (parser.has_next()) {
  status = parser.expect<double>();
  parser.next();
  status = parser.expect<string>();
}

> We need a way to propagate errors: this is where the sample code is idealized; error handling needs to be abstracted.

Yes, it's just a rough idea that I haven't thought through carefully.

> We need a way to reject conflicting flags in the same group: for example, to parse [EX a | PX b] | [X | Y], we need to reject inputs like EX v PX v, X Y, or EX v X PX v, and accept EX v EX v, EX v X, or Y PX v.

In my opinion, the parser should only care about how to iterate and whether the type (or range) is right. Whether those flags are mutually exclusive is better handled outside the parser; otherwise the parser will become more and more complex.
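
One possible reading of "handle it outside the parser" is a separate check that runs after all tokens have been consumed. The sketch below is only an illustration of that idea: `AtMostOneOf` and `FlagsAreConsistent` are hypothetical helpers, and the flag groups are taken from the [EX | PX] and [X | Y] example quoted above.

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Returns true when at most one flag from `group` occurs in `seen`.
bool AtMostOneOf(const std::set<std::string>& seen, const std::vector<std::string>& group) {
  auto hits = std::count_if(group.begin(), group.end(),
                            [&seen](const std::string& f) { return seen.count(f) > 0; });
  return hits <= 1;
}

// Hypothetical post-parse validation for the two groups [EX | PX] and [X | Y].
// `seen` is the set of flag names collected while iterating over the tokens.
bool FlagsAreConsistent(const std::set<std::string>& seen) {
  return AtMostOneOf(seen, {"EX", "PX"}) && AtMostOneOf(seen, {"X", "Y"});
}
```

Because `seen` is a set, a repeated flag such as EX v EX v still passes, while EX v PX v or X Y fails. The trade-off is that every command has to collect its flags and call such a check itself.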

> Simplifying code means doing good abstraction, and of course good abstraction has a learning cost

Agreed. What I mean is whether there is a more intuitive way to achieve this, so that developers can use it with a lower learning cost.

PragmaTwice · 2022-10-25T10:53:02Z

> In my opinion, the parser should only care about how to iterate and whether the type (or range) is right. Whether those flags are mutually exclusive is better handled outside the parser; otherwise the parser will become more and more complex.

In this PR, I added only about 5 lines of code to solve this problem (it is very common in Redis commands, appearing in almost every command with an optional flag), and this simplified the code hugely (removing SO MUCH duplicated code related to this logic). So I do not think it is unnecessary in the parsing framework.

I think a parser should care about all of the parsing logic, because every piece of it affects whether the parser should advance or stay put.

git-hulk · 2022-10-25T11:45:52Z

> > In my opinion, the parser should only care about how to iterate and whether the type (or range) is right. Whether those flags are mutually exclusive is better handled outside the parser; otherwise the parser will become more and more complex.
>
> In this PR, I added only about 5 lines of code to solve this problem (it is very common in Redis commands, appearing in almost every command with an optional flag), and this simplified the code hugely (removing SO MUCH duplicated code related to this logic). So I do not think it is unnecessary in the parsing framework.
>
> I think a parser should care about all of the parsing logic, because every piece of it affects whether the parser should advance or stay put.

Yes, the parsing framework truly removes a lot of duplicated code. My question is whether we can reduce the learning cost if we expect all commands to depend on it. As for whether the parsing framework should care about every piece of logic, I have no solid argument right now, so I think we can leave it as it is.

PragmaTwice · 2022-10-25T12:32:50Z

@git-hulk There is an example in the unit tests that parses a command with the syntax [ HELLO i1 v1 | HI v2 ] [X i2 | Y] (where i1 and i2 are integers and v1 and v2 are strings), and I think it demonstrates how to use the CommandParser well. I think the interface provided by the current framework can be quickly understood from this example.

git-hulk · 2022-10-26T01:14:33Z

> [...] command with the syntax [ HELLO i1 v1 | HI v2 ] [X i2 | Y] (where i1 and i2 are integers and v1 and v2 are strings), and I think it demonstrates how to use the CommandParser well. I think the interface provided by the current framework can be quickly understood from this example.

@PragmaTwice Thank you! I'll take another pass.

git-hulk · 2022-10-26T14:45:25Z

To be honest, I still find it a bit hard to understand the implementation well (maybe I should learn more about C++ templates), especially the part about the exclusive flag. I'm very happy to see this move forward if other folks feel good about it.

PragmaTwice · 2022-10-27T14:47:24Z

> To be honest, I still find it a bit hard to understand the implementation well (maybe I should learn more about C++ templates), especially the part about the exclusive flag. I'm very happy to see this move forward if other folks feel good about it.

I think if the API is clear, intuitive, and easy enough to understand, then developers may not need to care about or understand the implementation details.
A classic Redis command parsing scenario is demonstrated in the code snippet below, and it can be seen that the parsing process is relatively intuitive.

https://github.com/apache/incubator-kvrocks/blob/85ae20ddff43bb71e8370b7a2b19c51c746b6871/tests/cppunit/command_parser_test.cc#L35-L50

PragmaTwice · 2022-10-30T08:02:42Z

Hi everyone, any new thoughts on this PR?

tisonkun

I think it's good to go as long as @PragmaTwice you'll drive the development of the command parsing effort - perhaps the one filed as #794.

If anyone who later works on this domain has further thoughts, it's viable to make an enhancement proposal. This change is not a one-way decision.

torwig

LGTM.
@PragmaTwice Thank you for your effort. Maybe later today I'll have a chance to use the new parser in action.

tanruixiang

LGTM. After understanding how to use it I think it is concise enough.

PragmaTwice · 2022-10-30T14:20:08Z

Thanks all. Merging...

PragmaTwice added 5 commits October 23, 2022 15:41

init command parser

70ec928

fix errors

9aebbcb

remove not-in-use method

672cecf

more commands

96bba43

fix

3673d31

PragmaTwice requested review from git-hulk, ShooterIT and caipengbo October 23, 2022 15:20

PragmaTwice added 3 commits October 23, 2022 23:21

Merge branch 'unstable' into command-parser

cafd72d

add unit test

f3e510a

Merge branch 'command-parser' of github.com:PragmaTwice/incubator-kvr…

3951c21

…ocks into command-parser

PragmaTwice requested review from tanruixiang and tisonkun October 23, 2022 15:40

PragmaTwice added 2 commits October 24, 2022 00:05

fix

48c5057

fix error

dfcc5cb

torwig reviewed Oct 23, 2022

add parse move unittest

05b6891

This was referenced Oct 24, 2022

Tracking issue for new command parsing framework #794

Closed

Flaky test on expire precision #983

Closed

PragmaTwice added 2 commits October 24, 2022 23:23

Merge branch 'unstable' of https://github.com/apache/incubator-kvrocks …

e8bc110

…into command-parser

Merge branch 'unstable' into command-parser

85ae20d

tisonkun reviewed Oct 25, 2022

src/common/status.h

PragmaTwice requested review from torwig and tisonkun October 27, 2022 15:25

tisonkun approved these changes Oct 30, 2022

torwig approved these changes Oct 30, 2022

Merge branch 'unstable' into command-parser

f925a67

tanruixiang approved these changes Oct 30, 2022

PragmaTwice merged commit b83212e into apache:unstable Oct 30, 2022

tanruixiang mentioned this pull request Nov 13, 2022

feat: add support for Redis sintercard command #1116

Closed
