Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use decoded hex string when calculating the keyspace ID #9277

Merged
merged 4 commits into from
Nov 29, 2021

Conversation

mattlord
Copy link
Contributor

@mattlord mattlord commented Nov 23, 2021

Description

This is a follow-up to #9118

There we normalized the incoming x'A1' or 0xA1 string literal token that represents binary data as hex so that we can pass it on to MySQL as a BindVar.

However, this means that when calculating the Keyspace ID in the vindex functions we need to decode the value again (as we do when parsing the incoming query) so that we're doing the keyspace range check using a decoded representation of the bytes that end up being stored in the column in MySQL.

This is accomplished here and here by properly managing the HexVal and HexNum BindVars:

  • The "raw" value (Value.val / Value.Raw() and its variants) is "internal" and represents what was seen in the SQL query (e.g. x'997902' or 0x997902)
  • The ToBytes value (Value.ToBytes()) returns the bytes that mysql would return when querying the column (e.g. �y).
    For example:
    $ mysql -u root -h 127.0.0.1 --binary-as-hex=false mydb -e "drop table if exists varbt; create table varbt (c1 varbinary(100)); insert into varbt values (x'997902'),(0x997902); select c1, hex(c1) from varbt"
    +------+---------+
    | c1   | hex(c1) |
    +------+---------+
    | �y   | 997902  |
    | �y   | 997902  |
    +------+---------+
    

We then see the exact same results for the vindex functions using the x'A1' format seen before normalizing hex values.

🆕 With this PR we also add support for using the 0xA1 format with vindexes which did NOT work before:

  • 11.0:

    $ mysql customer -e "select id, hex_keyspace_id, shard from customer.binary where id = 0x997901"
    ERROR 1105 (HY000) at line 1: unsupported: where clause for vindex function must be of the form id = <val> (rhs is not a value)
    
    $ mysql customer -e "select id, hex_keyspace_id, shard from customer.binary_md5 where id = 0x997901"
    ERROR 1105 (HY000) at line 1: unsupported: where clause for vindex function must be of the form id = <val> (rhs is not a value)
    
    $ mysql customer -e "select id, hex_keyspace_id, shard from customer.xxhash where id = 0x997901"
    ERROR 1105 (HY000) at line 1: unsupported: where clause for vindex function must be of the form id = <val> (rhs is not a value)
    
  • This branch:

    $ mysql customer -e "select id, hex_keyspace_id, shard from customer.binary where id = 0x997901"
    +------+-----------------+-------+
    | id   | hex_keyspace_id | shard |
    +------+-----------------+-------+
    | �y  | 997901          | 80-   |
    +------+-----------------+-------+
    
    $ mysql customer -e "select id, hex_keyspace_id, shard from customer.binary_md5 where id = 0x997901"
    +------+----------------------------------+-------+
    | id   | hex_keyspace_id                  | shard |
    +------+----------------------------------+-------+
    | �y  | 2d2da332b59f56aeb235ecb36d639440 | -80   |
    +------+----------------------------------+-------+
    
    $ mysql customer -e "select id, hex_keyspace_id, shard from customer.xxhash where id = 0x997901"
    +------+------------------+-------+
    | id   | hex_keyspace_id  | shard |
    +------+------------------+-------+
    | �y  | ac29dcaa4a0b4527 | 80-   |
    +------+------------------+-------+
    

So now for vindexes we are properly treating x'A1' and 0xA1 as equivalent. This is proper because while the string literal in the SQL query differs the same bytes are stored in MySQL for both. You can see a complete user test here.

Related Issue(s)

Fixes: #9275

Checklist

@mattlord mattlord self-assigned this Nov 23, 2021
@mattlord mattlord force-pushed the BinaryVindexHexVal branch 4 times, most recently from 6d5e75d to b92b7dc Compare November 23, 2021 05:04
@mattlord mattlord changed the title Use decoded hex value when calculating the keyspace ID Use decoded hex string when calculating the keyspace ID Nov 23, 2021
@mattlord
Copy link
Contributor Author

mattlord commented Nov 23, 2021

Example vindex function results from 11.0 before we started normalizing hex values (after this PR main produces the same results):

$ mysql customer -e "select id, hex_keyspace_id, shard from customer.binary where id = x'997901'"
+------+-----------------+-------+
| id   | hex_keyspace_id | shard |
+------+-----------------+-------+
| �y  | 997901          | 80-   |
+------+-----------------+-------+

$ mysql customer -e "select id, hex_keyspace_id, shard from customer.binary_md5 where id = x'997901'"
+------+----------------------------------+-------+
| id   | hex_keyspace_id                  | shard |
+------+----------------------------------+-------+
| �y  | 2d2da332b59f56aeb235ecb36d639440 | -80   |
+------+----------------------------------+-------+

$ mysql customer -e "select id, hex_keyspace_id, shard from customer.xxhash where id = x'997901'"
+------+------------------+-------+
| id   | hex_keyspace_id  | shard |
+------+------------------+-------+
| �y  | ac29dcaa4a0b4527 | 80-   |
+------+------------------+-------+

$ mysql customer -e "select id, hex_keyspace_id, shard from customer.xxhash where id = \"hi there\""
+----------+------------------+-------+
| id       | hex_keyspace_id  | shard |
+----------+------------------+-------+
| hi there | 8b42c06c4e1344e4 | 80-   |
+----------+------------------+-------+

@mattlord mattlord force-pushed the BinaryVindexHexVal branch 2 times, most recently from c2ec4d6 to ac36806 Compare November 23, 2021 17:09
@mattlord mattlord force-pushed the BinaryVindexHexVal branch 2 times, most recently from c9195fd to 133b0b1 Compare November 23, 2021 17:34
go/sqltypes/value.go Outdated Show resolved Hide resolved
go/sqltypes/value.go Outdated Show resolved Hide resolved
go/sqltypes/value.go Outdated Show resolved Hide resolved
go/vt/sqlparser/ast_funcs.go Outdated Show resolved Hide resolved
@mattlord mattlord force-pushed the BinaryVindexHexVal branch 4 times, most recently from f5672be to ea394c9 Compare November 23, 2021 19:31
@mattlord mattlord force-pushed the BinaryVindexHexVal branch 8 times, most recently from a5aeaa6 to 61cb238 Compare November 23, 2021 23:18
Copy link
Member

@frouioui frouioui left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks good to me! :)

Before merging, let's see if 61cb238 resolves @harshit-gangal 's comment

go/sqltypes/value.go Outdated Show resolved Hide resolved
go/sqltypes/value_test.go Outdated Show resolved Hide resolved
go/sqltypes/value_test.go Outdated Show resolved Hide resolved
@mattlord mattlord force-pushed the BinaryVindexHexVal branch 3 times, most recently from 5f2d970 to 55dcac1 Compare November 24, 2021 18:58
@mattlord
Copy link
Contributor Author

mattlord commented Nov 24, 2021

Here's a full user test:

make docker_local && ./docker/local/run.sh

./201_customer_tablets.sh ; ./202_move_tables.sh ; sleep 10; ./203_switch_reads.sh ; ./204_switch_writes.sh ; ./205_clean_commerce.sh ; ./301_customer_sharded.sh ; ./302_new_shards.sh ; ./303_reshard.sh ; sleep 10; ./304_switch_reads.sh ; ./305_switch_writes.sh ; ./306_down_shard_0.sh

vtctlclient GetVSchema customer > /tmp/customer.json

vtctlclient ApplyVSchema -vschema "$(jq '. | .vindexes += {"binary":{"type":"binary"}} | .tables += {"btest":{"columnVindexes":[{"column":"id","name":"binary"}]}}' < /tmp/customer.json)" customer

mysql customer -e "create table btest (id varbinary(50), name varchar(50))"
mysql customer -e "insert into btest (id, name) values (0xF901, 'test'),(0x1901, 'test'),(x'1801', 'test'),(x'F801', 'test')"

mysql customer -e "select hex(id) from btest"
mysql customer/-80 -e "select hex(id) from btest"
mysql customer/80- -e "select hex(id) from btest"

Results:

vitess@7a8847da946f:/vt/local$ mysql customer -e "select hex(id) from btest"
+---------+
| hex(id) |
+---------+
| 1901    |
| 1801    |
| F901    |
| F801    |
+---------+
vitess@7a8847da946f:/vt/local$ mysql customer/-80 -e "select hex(id) from btest"
+---------+
| hex(id) |
+---------+
| 1901    |
| 1801    |
+---------+
vitess@7a8847da946f:/vt/local$ mysql customer/80- -e "select hex(id) from btest"
+---------+
| hex(id) |
+---------+
| F901    |
| F801    |
+---------+

This is done by properly managing the HexVal BindVar:
  - The "raw" value is what was seen in the query
  - The ToBytes value returns they bytes that mysql would return

Signed-off-by: Matt Lord <mattalord@gmail.com>
Signed-off-by: Matt Lord <mattalord@gmail.com>
This is needed as with full hex value support we may need to
escape the hex value string which can produce an error.

Signed-off-by: Matt Lord <mattalord@gmail.com>
Copy link
Member

@harshit-gangal harshit-gangal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Signed-off-by: Matt Lord <mattalord@gmail.com>
@deepthi
Copy link
Member

deepthi commented Nov 29, 2021

Very well-written description. Thank you for taking the time to write it up along with the code changes.

@deepthi deepthi merged commit aa739a3 into vitessio:main Nov 29, 2021
@deepthi deepthi deleted the BinaryVindexHexVal branch November 29, 2021 17:17
mattlord pushed a commit to planetscale/vitess that referenced this pull request Nov 29, 2021
Use decoded hex string when calculating the keyspace ID

Signed-off-by: Matt Lord <mattalord@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Binary Hash VIndex Function With HexVals Always Uses First Shard
4 participants