ExchangeTablePartition may frequently crash when reaching SyncAllSchema #7296

Closed
hongyunyan opened this issue Apr 14, 2023 · 3 comments

@hongyunyan
Contributor

Bug Report

Please answer these questions before submitting your issue. Thanks!

1. Minimal reproduce step (Required)

  1. Set maxNumberOfDiffs = 0 (to simulate many unrelated DDLs occurring in between, so that TiFlash reaches syncAllSchema).
  2. Run this fullstack test (ft), which is modified from tests/fullstack-test2/ddl/alter_exchange_partition.test:
# Copyright 2022 PingCAP, Ltd.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

>> DBGInvoke __enable_schema_sync_service('true')
mysql> drop table if exists test.e;
mysql> drop table if exists test.e2;
mysql> drop table if exists test_new.e2;
mysql> drop database if exists test_new;

mysql> create table test.e(id INT NOT NULL,fname VARCHAR(30),lname VARCHAR(30)) PARTITION BY RANGE (id) ( PARTITION p0 VALUES LESS THAN (50),PARTITION p1 VALUES LESS THAN (100),PARTITION p2 VALUES LESS THAN (150), PARTITION p3 VALUES LESS THAN (MAXVALUE));
mysql> alter table test.e set tiflash replica 1;

mysql> create table test.e2(id int not null, fname varchar(30), lname varchar(30));
mysql> alter table test.e2 set tiflash replica 1;

mysql> create database test_new;
mysql> create table test_new.e2(id int not null, fname varchar(30), lname varchar(30));
mysql> alter table test_new.e2 set tiflash replica 1;

func> wait_table test e
func> wait_table test e2
func> wait_table test_new e2

mysql> insert into test.e values (1, 'a', 'b'),(108, 'a', 'b');
mysql> insert into test.e2 values (2, 'a', 'b');
mysql> insert into test_new.e2 values (3, 'a', 'b');

# disable schema sync service
>> DBGInvoke __enable_schema_sync_service('false')
>> DBGInvoke __refresh_schemas()

# case 2, exchange partition across databases, no error happens
mysql> set @@tidb_enable_exchange_partition=1; alter table test.e exchange partition p0 with table test_new.e2
>> DBGInvoke __refresh_schemas()
mysql> alter table test.e add column c1 int;
>> DBGInvoke __refresh_schemas()
mysql> set session tidb_isolation_read_engines='tiflash'; select * from test.e order by id;
+-----+-------+-------+------+
| id  | fname | lname | c1   |
+-----+-------+-------+------+
|   3 | a     | b     | NULL |
| 108 | a     | b     | NULL |
+-----+-------+-------+------+
mysql> set session tidb_isolation_read_engines='tiflash'; select * from test_new.e2;
+----+-------+-------+
| id | fname | lname |
+----+-------+-------+
|  1 | a     | b     |
+----+-------+-------+
mysql> alter table test.e drop column c1;
>> DBGInvoke __refresh_schemas()

# case 6, exchange partition across databases, error happens after exchange step 1
mysql> set @@tidb_enable_exchange_partition=1; alter table test.e exchange partition p0 with table test_new.e2
>> DBGInvoke __enable_fail_point(exception_after_step_1_in_exchange_partition)
>> DBGInvoke __refresh_schemas()
>> DBGInvoke __refresh_schemas()
mysql> alter table test.e add column c1 int;
>> DBGInvoke __refresh_schemas()
mysql> set session tidb_isolation_read_engines='tiflash'; select * from test.e order by id;
+-----+-------+-------+------+
| id  | fname | lname | c1   |
+-----+-------+-------+------+
|   1 | a     | b     | NULL |
| 108 | a     | b     | NULL |
+-----+-------+-------+------+
mysql> set session tidb_isolation_read_engines='tiflash'; select * from test_new.e2;
+----+-------+-------+
| id | fname | lname |
+----+-------+-------+
|  3 | a     | b     |
+----+-------+-------+
mysql> alter table test.e drop column c1;
>> DBGInvoke __refresh_schemas()

mysql> drop table if exists test.e;
mysql> drop table if exists test.e2;
mysql> drop table if exists test_new.e2;
mysql> drop database if exists test_new;
>> DBGInvoke __enable_schema_sync_service('true')

2. What did you expect to see? (Required)

The test runs successfully.

3. What did you see instead (Required)

[image: screenshot attached to the original issue, not preserved here]

4. What is your TiFlash version? (Required)

master

@JaySon-Huang
Contributor

Workaround:

Find the partition whose partition_id equals the ID reported in the log, then drop it:

> select TABLE_SCHEMA,TABLE_NAME,'' from information_schema.tables where TIDB_TABLE_ID=89 union select TABLE_SCHEMA,TABLE_NAME,PARTITION_NAME from information_schema.partitions where TIDB_PARTITION_ID=89;
+--------------+------------+----+
| TABLE_SCHEMA | TABLE_NAME |    |
+--------------+------------+----+
| test         | e          | p0 |
+--------------+------------+----+
> alter table e drop partition p0;

@JaySon-Huang
Contributor

This issue is caused by:

  1. executing ALTER TABLE ... EXCHANGE PARTITION ... across databases on TiDB
  2. TiFlash enters loadAllSchema rather than tryLoadSchemaDiffs
  3. TiFlash cannot handle ALTER TABLE ... EXCHANGE PARTITION ... across databases correctly
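
The choice between the two paths in step 2 can be pictured with a rough Python sketch. This is not TiFlash code: the function name, the return labels, and the exact comparison are assumptions for illustration. It only reflects what the reproduce steps imply, namely that lowering maxNumberOfDiffs to 0 pushes the syncer onto the full-sync path.

```python
# Illustrative sketch only; names and the exact condition are hypothetical,
# not taken from the TiFlash source.

def choose_sync_path(local_version: int, remote_version: int,
                     max_number_of_diffs: int) -> str:
    """Decide how a schema syncer might catch up to the remote schema version."""
    if remote_version <= local_version:
        # Nothing new to apply.
        return "up_to_date"
    gap = remote_version - local_version
    if gap > max_number_of_diffs:
        # Too many pending diffs: reload every schema
        # (the path the issue calls loadAllSchema / syncAllSchema).
        return "full_sync"
    # Few enough diffs: apply them one by one (tryLoadSchemaDiffs in the issue).
    return "apply_diffs"


if __name__ == "__main__":
    # With the threshold forced to 0, a single pending DDL already triggers a full sync.
    print(choose_sync_path(10, 11, max_number_of_diffs=0))
    # With a larger, made-up threshold the same gap is handled by applying diffs.
    print(choose_sync_path(10, 11, max_number_of_diffs=100))
```

Under these assumptions, a cross-database exchange followed by enough unrelated DDLs would be processed by the full-sync path, which is the path that mishandles it according to step 3.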

Confirmed that this issue does not affect v7.5.x/v8.1.x.

@JaySon-Huang
Contributor

Closing, as this bug has been fixed in 6.5/7.1.
