Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use a left join to make sure that tables with tablespace=innodb_system are included in the schema #12672

Merged
merged 2 commits into from
Mar 27, 2023

Conversation

kovyrin
Copy link
Contributor

@kovyrin kovyrin commented Mar 20, 2023

Description

To make sure the behaviour of the schema engine is the same for 5.7 and 8.0 in terms of handling of tables that have innodb tablespace set to innodb_system (those are not included in information_schema.innodb_tablespaces). This should make sure that vstreamer works as expected when some of the tables in a keyspace have been created with tablespace innodb_system option (because of https://bugs.mysql.com/bug.php?id=110383 or some other reason).

This issue exists in 14.x and the fix needs backporting AFAIU.

Related Issue(s)

Fixes #12669

Checklist

  • "Backport to:" labels have been added if this change should be back-ported
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on the CI
  • Documentation is not required

…m are not skipped when loading the schema (fixes vitessio#12669)

Signed-off-by: Oleksiy Kovyrin <oleksiy.kovyrin@shopify.com>
@kovyrin
Copy link
Contributor Author

kovyrin commented Mar 20, 2023

Not sure how to (or if I should) write a test for this, but tested it locally as follows:

  1. Checked out this branch locally
  2. Built Vitess
  3. Deployed a test cluster locally
  4. Created a test table in the commerce keyspace
  5. Used grpcurl to connect to vstream API
  6. Inserted data into the test table
  7. Observed vstream responses in the grpcurl console

Copy link
Contributor

@mattlord mattlord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me -- we want all tables, even if they don't have an associated dedicated tablespace (I would guess this is also an issue when using general tablespaces, as that's what the system_tablespace is). Thank you for this, @kovyrin ! ❤️

mattlord

This comment was marked as outdated.

Copy link
Contributor

@mattlord mattlord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could add a minimal test by e.g. creating the product table here in the system_tablespace: https://github.com/vitessio/vitess/blob/main/go/test/endtoend/vreplication/config_test.go#L37

@kovyrin make sense? Please let me know if you need any help.

@harshit-gangal maybe you have a better idea though? Schema Tracker tests may be better, e.g. here: https://github.com/vitessio/vitess/blob/main/go/test/endtoend/vtgate/schematracker/sharded/schema.sql

@kovyrin with this patch:

diff --git a/go/test/endtoend/vreplication/vstream_test.go b/go/test/endtoend/vreplication/vstream_test.go
index 1098632d87..2b3eaf4bbd 100644
--- a/go/test/endtoend/vreplication/vstream_test.go
+++ b/go/test/endtoend/vreplication/vstream_test.go
@@ -190,7 +190,7 @@ const vschemaUnsharded = `
 }
 `
 const schemaSharded = `
-create table customer(cid int, name varbinary(128), primary key(cid))  CHARSET=utf8mb4;
+create table customer(cid int, name varbinary(128), primary key(cid)) TABLESPACE innodb_system CHARSET=utf8mb4;
 `
 const vschemaSharded = `
 {

This fails on main:

$  go test -timeout 10m -v -failfast -run TestVStreamStopOnReshardTrue vitess.io/vitess/go/test/endtoend/vreplication

So that could serve as the test for your PR.

@kovyrin
Copy link
Contributor Author

kovyrin commented Mar 20, 2023

Having a very hard time running any of the existing tests locally. It reliably fails to start mysql both in vreplication end to end tests and in vtgate/schematracker :-(

One example:

❯ go test -timeout 30s -run ^TestVStreamFailover$ vitess.io/vitess/go/test/endtoend/vreplication
VTDATAROOT is vreple2e_268107
E0320 16:46:12.474533   72873 vtctldclient_process.go:101] CreateKeyspace does not support the --sidecar-db-name flag in vtctl version 15; ignoring...
E0320 16:46:13.084229   72873 vtorc_process.go:92] configuration - {
	"Debug": true,
	"ListenAddress": ":0",
	"MySQLTopologyUser": "orc_client_user",
	"MySQLTopologyPassword": "orc_client_user_password",
	"MySQLReplicaUser": "vt_repl",
	"MySQLReplicaPassword": "",
	"RecoveryPeriodBlockSeconds": 1
}
--- FAIL: TestVStreamFailover (14.75s)
    cluster_test.go:554: exit status 1 :: Unable to start mysql server for &{zone1-100 vttablet vreple2e_268107/tmp/vt_0000000100_querylog.txt 100 zone1-0000000100  16100 17091 0 {vtctl vtctl  etcd2 127.0.0.1:2379 /vitess/global 127.0.0.1:2379 / 15} vreple2e_268107/tmp 127.0.0.1 product replica 5 file vreple2e_268107/backups grpc-queryservice,grpc-tabletmanager,grpc-updatestream,grpc-throttler http://127.0.0.1:15000 vreple2e_268107/vt_0000000100 http://127.0.0.1:16100/debug/vars http://127.0.0.1:16100/queryz http://127.0.0.1:16100/debug/status_details false NOT_SERVING  0 PRIMARY  utf8mb4 http://127.0.0.1:16100/debug/consolidations [--queryserver-config-schema-reload-time 5 --enable-lag-throttler --heartbeat_enable --heartbeat_interval 250ms] <nil> <nil>}
FAIL
FAIL	vitess.io/vitess/go/test/endtoend/vreplication	15.366s
FAIL

Another one:

❯ go test -timeout 30s -run ^TestNewTable$ vitess.io/vitess/go/test/endtoend/vtgate/
E0320 16:41:20.982548   71466 vtctl_process.go:67] CreateKeyspace does not support the --sidecar-db-name flag in vtctl version 15; ignoring...
E0320 16:41:21.410968   71466 cluster_process.go:407] unable to start mysql process /opt/homebrew/bin/mysqlctl --log_dir vtroot_7101/tmp_7103 --tablet_uid 6660 --mysql_port 7109 init -- --init_db_sql_file /config/init_db.sql start: exit status 1
FAIL	vitess.io/vitess/go/test/endtoend/vtgate	4.094s
FAIL

Signed-off-by: Oleksiy Kovyrin <oleksiy.kovyrin@shopify.com>
@kovyrin kovyrin requested a review from deepthi as a code owner March 20, 2023 21:15
@kovyrin kovyrin requested review from mattlord and removed request for deepthi, systay and harshit-gangal March 20, 2023 21:16
@kovyrin
Copy link
Contributor Author

kovyrin commented Mar 20, 2023

Modified the customer schema to use the system tablespace. This should cover the change on 8.0 (breaks on main).

Copy link
Member

@harshit-gangal harshit-gangal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how do we get size for those tables?

@mattlord
Copy link
Contributor

mattlord commented Mar 25, 2023

how do we get size for those tables?

@harshit-gangal Good question! We'd need to do more (although I'd argue that recognizing the table w/o a known size is still an improvement over not being aware of it at all).

This is one possibility:

## Current query 
mysql> SELECT t.table_name, t.table_type, UNIX_TIMESTAMP(t.create_time), t.table_comment,  SUM(i.file_size) as storage_size,  SUM(i.allocated_s
ize)  FROM information_schema.tables t LEFT JOIN information_schema.innodb_tablespaces i ON i.name LIKE CONCAT(database(), '/%') AND (i.name =
CONCAT(t.table_schema, '/', t.table_name) OR i.name LIKE CONCAT(t.table_schema, '/', t.table_name, '#p#%'))  WHERE t.table_schema = database()
 GROUP BY t.table_name, t.table_type, t.create_time, t.table_comment\G
*************************** 1. row ***************************
                   TABLE_NAME: corder
                   TABLE_TYPE: BASE TABLE
UNIX_TIMESTAMP(t.create_time): 1679718422
                TABLE_COMMENT:
                 storage_size: 114688
        SUM(i.allocated_size): 114688
*************************** 2. row ***************************
                   TABLE_NAME: customer
                   TABLE_TYPE: BASE TABLE
UNIX_TIMESTAMP(t.create_time): 1679718422
                TABLE_COMMENT:
                 storage_size: 114688
        SUM(i.allocated_size): 114688
*************************** 3. row ***************************
                   TABLE_NAME: test1
                   TABLE_TYPE: BASE TABLE
UNIX_TIMESTAMP(t.create_time): 1679719746
                TABLE_COMMENT:
                 storage_size: NULL
        SUM(i.allocated_size): NULL
*************************** 4. row ***************************
                   TABLE_NAME: ti
                   TABLE_TYPE: BASE TABLE
UNIX_TIMESTAMP(t.create_time): 1679720520
                TABLE_COMMENT:
                 storage_size: 688128
        SUM(i.allocated_size): 688128
4 rows in set (0.01 sec)

## New Query
mysql> SELECT t.table_name, t.table_type, UNIX_TIMESTAMP(t.create_time), t.table_comment, IFNULL(SUM(i.file_size), (select data_length+index_length from information_schema.tables where table_name=t.table_name)) as storage_size,  SUM(i.allocated_size)  FROM information_schema.tables t LEFT JOIN information_schema.innodb_tablespaces i ON i.name LIKE CONCAT(database(), '/%') AND (i.name = CONCAT(t.table_schema, '/', t.table_name) OR i.name LIKE CONCAT(t.table_schema, '/', t.table_name, '#p#%'))  WHERE t.table_schema = database()  GROUP BY t.table_name, t.table_type, t.create_time, t.table_comment\G
*************************** 1. row ***************************
                   TABLE_NAME: corder
                   TABLE_TYPE: BASE TABLE
UNIX_TIMESTAMP(t.create_time): 1679718422
                TABLE_COMMENT:
                 storage_size: 114688
        SUM(i.allocated_size): 114688
*************************** 2. row ***************************
                   TABLE_NAME: customer
                   TABLE_TYPE: BASE TABLE
UNIX_TIMESTAMP(t.create_time): 1679718422
                TABLE_COMMENT:
                 storage_size: 114688
        SUM(i.allocated_size): 114688
*************************** 3. row ***************************
                   TABLE_NAME: test1
                   TABLE_TYPE: BASE TABLE
UNIX_TIMESTAMP(t.create_time): 1679719746
                TABLE_COMMENT:
                 storage_size: 16384
        SUM(i.allocated_size): NULL
*************************** 4. row ***************************
                   TABLE_NAME: ti
                   TABLE_TYPE: BASE TABLE
UNIX_TIMESTAMP(t.create_time): 1679720520
                TABLE_COMMENT:
                 storage_size: 688128
        SUM(i.allocated_size): 688128
4 rows in set (0.00 sec)

This makes the query more complicated and heavier, so it's likely to be problematic for those with many thousands of tables.

As a practical matter, it may be better to accept NULL or "unknown" size in these rare cases. If a non-integral result is the problem, we could use this instead: IFNULL(SUM(i.file_size), 0) as storage_size. That doesn't add any overhead to the query:

mysql> SELECT t.table_name, t.table_type, UNIX_TIMESTAMP(t.create_time), t.table_comment, IFNULL(SUM(i.file_size), 0) as storage_size, IFNULL(SUM(i.allocated_size), 0) as allocated_size  FROM information_schema.tables t LEFT JOIN information_schema.innodb_tablespaces i ON i.name LIKE C
ONCAT(database(), '/%') AND (i.name = CONCAT(t.table_schema, '/', t.table_name) OR i.name LIKE CONCAT(t.table_schema, '/', t.table_name, '#p#%'))
*************************** 1. row ***************************
                   TABLE_NAME: corder
                   TABLE_TYPE: BASE TABLE
UNIX_TIMESTAMP(t.create_time): 1679718422
                TABLE_COMMENT:
                 storage_size: 114688
               allocated_size: 114688
*************************** 2. row ***************************
                   TABLE_NAME: customer
                   TABLE_TYPE: BASE TABLE
UNIX_TIMESTAMP(t.create_time): 1679718422
                TABLE_COMMENT:
                 storage_size: 114688
               allocated_size: 114688
*************************** 3. row ***************************
                   TABLE_NAME: test1
                   TABLE_TYPE: BASE TABLE
UNIX_TIMESTAMP(t.create_time): 1679719746
                TABLE_COMMENT:
                 storage_size: 0
               allocated_size: 0
*************************** 4. row ***************************
                   TABLE_NAME: ti
                   TABLE_TYPE: BASE TABLE
UNIX_TIMESTAMP(t.create_time): 1679720520
                TABLE_COMMENT:
                 storage_size: 688128
               allocated_size: 688128
4 rows in set (0.01 sec)

This was referenced Mar 27, 2023
@frouioui frouioui merged commit 0de7e01 into vitessio:main Mar 27, 2023
frouioui pushed a commit that referenced this pull request Mar 28, 2023
…ce=innodb_system are included in the schema (#12672) (#12734)

* Use a left join to make sure that tables with tablespace=innodb_system are not skipped when loading the schema (fixes #12669)

Signed-off-by: Oleksiy Kovyrin <oleksiy.kovyrin@shopify.com>

* Set tablespace on a test table to see if it breaks vstreamer

Signed-off-by: Oleksiy Kovyrin <oleksiy.kovyrin@shopify.com>

---------

Signed-off-by: Oleksiy Kovyrin <oleksiy.kovyrin@shopify.com>
Co-authored-by: Oleksiy Kovyrin <oleksiy.kovyrin@shopify.com>
frouioui pushed a commit that referenced this pull request Mar 28, 2023
…ce=innodb_system are included in the schema (#12672) (#12735)

* Use a left join to make sure that tables with tablespace=innodb_system are not skipped when loading the schema (fixes #12669)

Signed-off-by: Oleksiy Kovyrin <oleksiy.kovyrin@shopify.com>

* Set tablespace on a test table to see if it breaks vstreamer

Signed-off-by: Oleksiy Kovyrin <oleksiy.kovyrin@shopify.com>

---------

Signed-off-by: Oleksiy Kovyrin <oleksiy.kovyrin@shopify.com>
Co-authored-by: Oleksiy Kovyrin <oleksiy.kovyrin@shopify.com>
@hmaurer hmaurer mentioned this pull request Mar 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug Report: No schema found for table in vstreamer for tables with TABLESPACE innodb_system
4 participants