-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
evalengine: numbers #9623
evalengine: numbers #9623
Conversation
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Matt Lord <mattalord@gmail.com>
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Vicent Marti <vmg@strn.cat>
expected := | ||
"[[VARBINARY(\"\\x01\") INT64(1)] " + | ||
"[VARBINARY(\"Hello Gopher!\") INT64(3)] " + | ||
"[VARBINARY(\"\\xa5\") INT64(2)] " + | ||
"[VARBINARY(\"\\xc2l\\xaa\\x1a^\\xb9@\\x96Қ\\x1b\\xec\") INT64(4)]]" | ||
assert.Equal(t, expected, fmt.Sprintf("%v", result.Rows)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mattlord I've cherry picked your test from #9616 here. I did have to change the expected output. Two notes:
-
Because of the context they're in, the hex values and numbers are always coerced to
VARBINARY
during evaluation, so their original types are lost. AFAICT this matches MySQL's behavior -- I hope it won't be a problem in practice. -
Since the resulting rows are
order by id
, they should come as1, 3, 2, 4
, since that's the proper ordering of the values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Internal transformations should be fine. I did a quick manual test with this branch and the results match up with what I had seen with PlanValue before:
$ mysql customer -e "select id, hex_keyspace_id, shard from customer.binary_vdx where id = 0x997901"
+------+-----------------+-------+
| id | hex_keyspace_id | shard |
+------+-----------------+-------+
| �y | 997901 | 80- |
+------+-----------------+-------+
$ mysql customer -e "select id, hex_keyspace_id, shard from customer.binary_vdx where id = x'997901'"
+------+-----------------+-------+
| id | hex_keyspace_id | shard |
+------+-----------------+-------+
| �y | 997901 | 80- |
+------+-----------------+-------+
$ mysql customer -e "select id, hex_keyspace_id, shard from customer.binary_md5_vdx where id = 0x997901"
+------+----------------------------------+-------+
| id | hex_keyspace_id | shard |
+------+----------------------------------+-------+
| �y | 2d2da332b59f56aeb235ecb36d639440 | -80 |
+------+----------------------------------+-------+
vitess@003f6cbe061b:/vt/local$ mysql customer -e "select id, hex_keyspace_id, shard from customer.binary_md5_vdx where id = x'997901'"
+------+----------------------------------+-------+
| id | hex_keyspace_id | shard |
+------+----------------------------------+-------+
| �y | 2d2da332b59f56aeb235ecb36d639440 | -80 |
+------+----------------------------------+-------+
$ mysql customer -e "select id, hex_keyspace_id, shard from customer.xxhash_vdx where id = 0x997901"
+------+------------------+-------+
| id | hex_keyspace_id | shard |
+------+------------------+-------+
| �y | ac29dcaa4a0b4527 | 80- |
+------+------------------+-------+
vitess@003f6cbe061b:/vt/local$ mysql customer -e "select id, hex_keyspace_id, shard from customer.xxhash_vdx where id = x'997901'"
+------+------------------+-------+
| id | hex_keyspace_id | shard |
+------+------------------+-------+
| �y | ac29dcaa4a0b4527 | 80- |
+------+------------------+-------+
THANK YOU! 😍
YES to release-13.0. |
Signed-off-by: Vicent Marti <vmg@strn.cat>
Signed-off-by: Vicent Marti <vmg@strn.cat>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
Description
Heyyooo you thought that tagging RC1 would stop me BUT I CANT BE STOPPED.
Here's a battery of bug fixes (no new functionality) to the way the evaluation engine handles numbers. This is divided in broadly two fixes:
Hexadecimal literals were not being handled at all
This was an oversight because they've been supported by the parser forever, and the evaluation engine would just report them as unsupported. This was causing a lot of plans for gen4 to show up as unsupported, when the plan was actually fully supported except for the hex identifier.
Implementing support for these literals was not trivial, because their semantics change whenever they're used in a numeric or non-numeric context, but once I got them working properly it greatly increased my confidence in a lot more numerical operations which we were supported but which I hadn't tested well enough yet. I've expanded these tests significantly and the results are good.
As a result of these changes, several query plans that were previously unsupported are now supported. 👌
Negative numbers
"But I thought we already handled negative numbers?" Not quite. The
evalengine
was actually blissfully unaware of negative numbers, because of an optimization in thesqlparser
which was not quite correct.A helper method on
ast_funcs.go
in the parser would modify the resulting AST for the parse, so that when a numeric literal was preceded by a-
, the-
would be inserted as part of the literal (instead of emitting an unary "minus" operator). Likewise, when two unary minus operators appeared in a row, the same helper inast_funcs.go
would just optimize them away (since they cancel each other).It turns out that neither of these optimizations is valid at all. The unary minus operator needs to appear in the AST for evaluation to be valid because the unary minus operator can change the types of the numerical identifier it applies to. Using
-
before a large enoughBIGINT
can cause it to be promoted to decimal if it's negative is too small to fit in the data type, and likewise for unsigneds, floats, and all kinds of janky stuff.So, this PR removes that optimization and makes it so
sqlparser
always emits unary minus nodes for all literals, which now we always handle gracefully.Interestingly, this has also fixed several other query plans which were previously failing, more specifically, query plans that involved literals with very large numbers. We now handle those properly, reducing the amount of unsupported plans in our tests and improving the quality of some of the generated plans.
The only remaining question is whether this needs a backport to the RC1 branch. My hunch is "yes", it fixes real bugs that have been here forever and that would be nice to ship in the upcoming release. It increases the amount of supported plans for gen4, and it increases our confidence in the behavior of the new
evalengine
.I'll discuss this further on today's open hour.
Related Issue(s)
Checklist
Deployment Notes