RFC + Support of BACKUP and RESTORE statements (#15274) #16960

Merged: 2 commits merged on May 6, 2020

@sre-bot sre-bot commented Apr 30, 2020

cherry-pick #15274 to release-4.0


What problem does this PR solve?

Support running BR inside TiDB directly.

What is changed and how it works?

Recognize the new *ast.BRIEStmt introduced in pingcap/parser#746 and forward it to the library functions in BR. When we execute

BACKUP DATABASE `tpcc` TO 'local:///tmp/storage/';

TiDB spawns a new BR manager, which backs up the database tpcc into the provided storage. The query blocks until the backup completes. It returns an empty set on success:

MySQL [tpcc]> backup database tpcc to 'local:///tmp/br_tpcc_32';
Empty set (58.453 sec)

and returns an error on failure:

MySQL [tpcc]> backup table tpcc.stock to 'local:///tmp/br_tpcc_30';
ERROR 8124 (HY000): Backup failed: backup meta exists, may be some backup files in the path already
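
The storage argument in the statements above is a URL-like string. As a rough illustration only (this is not BR's actual parsing code), Go's standard `net/url` package can split such a string into a scheme and a path. The helper name `parseStorage` is hypothetical and does not exist in TiDB or BR.

```go
package main

import (
	"fmt"
	"net/url"
)

// parseStorage is a hypothetical helper, not part of TiDB or BR.
// It splits a storage string such as 'local:///tmp/br_tpcc_32' into
// its scheme and path using only the standard library.
func parseStorage(raw string) (scheme, path string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", err
	}
	if u.Scheme == "" {
		// A bare path like "/tmp/x" carries no storage type.
		return "", "", fmt.Errorf("storage %q has no scheme", raw)
	}
	return u.Scheme, u.Path, nil
}

func main() {
	scheme, path, err := parseStorage("local:///tmp/br_tpcc_32")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(scheme, path) // prints "local /tmp/br_tpcc_32"
}
```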

BRIE tasks must be executed sequentially. Currently, for simplicity, tasks are queued in the local server only. In the future, we plan to make the entire cluster share the same queue.
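
A minimal sketch of what "queued in the local server" could look like, assuming a process-local mutex that lets only one task run at a time. The `task` and `queue` types and the "Wait"/"Run" state names are hypothetical and are not taken from the TiDB source; note also that `sync.Mutex` does not guarantee first-come-first-served ordering, so this shows only the one-at-a-time property.

```go
package main

import (
	"fmt"
	"sync"
)

// task and queue are hypothetical types for illustration only.
type task struct {
	ID    int
	State string // "Wait" until the task owns the queue, then "Run"
}

type queue struct {
	mu     sync.Mutex // guards nextID and tasks
	runMu  sync.Mutex // held by the single task that is executing
	nextID int
	tasks  map[int]*task
}

func newQueue() *queue { return &queue{tasks: make(map[int]*task)} }

// run registers a task in the Wait state, blocks until no other task
// is executing, runs fn, and then removes the task from the registry.
func (q *queue) run(fn func() error) error {
	q.mu.Lock()
	q.nextID++
	t := &task{ID: q.nextID, State: "Wait"}
	q.tasks[t.ID] = t
	q.mu.Unlock()

	q.runMu.Lock() // only one task passes this point at a time
	defer q.runMu.Unlock()

	q.mu.Lock()
	t.State = "Run"
	q.mu.Unlock()

	err := fn()

	q.mu.Lock()
	delete(q.tasks, t.ID)
	q.mu.Unlock()
	return err
}

func main() {
	q := newQueue()
	var wg sync.WaitGroup
	var active, maxActive int // only touched while a task holds runMu
	for i := 0; i < 3; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			_ = q.run(func() error {
				active++
				if active > maxActive {
					maxActive = active
				}
				active--
				return nil
			})
		}()
	}
	wg.Wait()
	fmt.Println("max concurrent tasks:", maxActive) // prints "max concurrent tasks: 1"
}
```

Because the queue lives in one process, two TiDB servers using a scheme like this would not see each other's tasks, which matches the limitation described above.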

Use SHOW BACKUP / SHOW RESTORE in another session to list the tasks:

MySQL [(none)]> show backup;
+-------------------------+---------+-------------------+---------------------+---------------------+------+
| Storage                 | State   | Progress          | Init_time           | Step_start_time     | ID   |
+-------------------------+---------+-------------------+---------------------+---------------------+------+
| local:///tmp/br_tpcc_30 | Backup  | 98.38709677419355 | 2020-04-12 23:09:03 | 2020-04-12 23:09:25 |    3 |
| local:///tmp/br_tpcc_30 | Wait    |                 0 | 2020-04-12 23:09:48 | 2020-04-12 23:09:48 |    4 |
+-------------------------+---------+-------------------+---------------------+---------------------+------+

Use KILL TIDB QUERY n to cancel a task.
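
One common way to make a long-running task cancellable in Go is to give it a `context.Context` and keep the matching cancel function in a registry keyed by task ID. The sketch below assumes that approach; the `registry` type and its `start` and `kill` methods are hypothetical names, and this is not a description of how TiDB actually wires up KILL.

```go
package main

import (
	"context"
	"fmt"
	"sync"
)

// registry is a hypothetical map from task ID to cancel function.
type registry struct {
	mu      sync.Mutex
	cancels map[uint64]context.CancelFunc
}

func newRegistry() *registry {
	return &registry{cancels: make(map[uint64]context.CancelFunc)}
}

// start registers a task and returns a context that is cancelled
// when kill is called with the same ID.
func (r *registry) start(id uint64) context.Context {
	ctx, cancel := context.WithCancel(context.Background())
	r.mu.Lock()
	r.cancels[id] = cancel
	r.mu.Unlock()
	return ctx
}

// kill cancels the task with the given ID and reports whether such a
// task was registered.
func (r *registry) kill(id uint64) bool {
	r.mu.Lock()
	cancel, ok := r.cancels[id]
	delete(r.cancels, id)
	r.mu.Unlock()
	if ok {
		cancel()
	}
	return ok
}

func main() {
	r := newRegistry()
	ctx := r.start(3)
	fmt.Println(r.kill(3), ctx.Err()) // prints "true context canceled"
	fmt.Println(r.kill(3))            // prints "false"
}
```

The task body would then check `ctx.Done()` or `ctx.Err()` between units of work and stop once the context is cancelled.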

Note: Currently, running RESTORE may put the tables into a "non-ACID" state in which the backup archives are only partially ingested. Maybe we need to pessimistically lock the entire database?

Note: No test cases yet. What to do?

Check List

Tests

  • Manual test (add detailed scripts or steps below)
    • Run a backup of a simple W=30 TPC-C database (2 GB), drop the database, and run a restore from the archive.

Code changes

Side effects

Related changes

  • Need to update the documentation

Release note

  • Added the BACKUP statement to create a logical backup archive.
  • Added the RESTORE statement to restore from the backup archive. (Don't include this in the release notes yet; do so after the entire feature is complete.)

@sre-bot sre-bot requested review from a team as code owners April 30, 2020 11:38

sre-bot commented Apr 30, 2020

/run-all-tests

kennytm commented Apr 30, 2020

/run-all-tests

kennytm commented Apr 30, 2020

/rebuild plugin=pr/35

kennytm commented Apr 30, 2020

/run-integration-copr-test

kennytm commented May 6, 2020

/run-integration-copr-test

kennytm commented May 6, 2020

/rebuild plugin=pr/35

kennytm commented May 6, 2020

/run-integration-copr-test

kennytm commented May 6, 2020

https://internal.pingcap.net/idc-jenkins/blue/organizations/jenkins/tidb_ghpr_integration_copr_test/detail/tidb_ghpr_integration_copr_test/4493/pipeline

failure in sql/randgen-topn/3_compare_1.sql
[2020-05-06T10:08:56.690Z] 2020/05/06 18:08:56 2020/05/06 18:08:11 Test fail: Outputs are not matching.
[2020-05-06T10:08:56.690Z] Test case: sql/randgen-topn/3_compare_1.sql
[2020-05-06T10:08:56.690Z] Statement: #28 -  SELECT 'y' LIKE `col_blob` AS field1 FROM `table1000_int_autoinc` WHERE 'ttlhcfjohbcwphmdugtkggyiifpczcvpbaxztbknxknemylwciigurrjoodkglmtgzpztmmpouegchmvqxbdzovqpehvtxaizxauegrkugschnyphmrecailqemxargwogkbeowinlztkskcymazkizxowqiacsnarrxpwxlhchutuduzrgjuhaxetjshqlgs' IN ( '1997-07-09 23:41:06.059230', 'tlhcfjohbcwphmdugtkggyiifpczcvpbaxztbknxknemylwciigurrjoodkglmtgzpztmmpouegchmvqxbdzovqpehvtxaizxauegrkugschnyphmrecailqemxargwogkbeowinlztkskcyma', `col_bit`, `col_int`, NULL ) ORDER BY field1 LIMIT 1 /* QNO 30 CON_ID 112 */ ;
[2020-05-06T10:08:56.690Z] NoPushDown Output: 
[2020-05-06T10:08:56.690Z] field1
[2020-05-06T10:08:56.690Z] NULL
[2020-05-06T10:08:56.690Z] 
[2020-05-06T10:08:56.690Z] 
[2020-05-06T10:08:56.690Z] WithPushDown Output: 
[2020-05-06T10:08:56.690Z] Error 1105: runtime error: index out of range [0] with length 0
[2020-05-06T10:08:56.690Z] 
[2020-05-06T10:08:56.690Z] 
[2020-05-06T10:08:56.690Z] NoPushDown Plan: 
[2020-05-06T10:08:56.690Z] id	estRows	task	access object	operator info
[2020-05-06T10:08:56.690Z] Projection_7	1.00	root		like(y, push_down_test_db.table1000_int_autoinc.col_blob, 92)->Column#62
[2020-05-06T10:08:56.690Z] └─Projection_14	1.00	root		push_down_test_db.table1000_int_autoinc.col_bit, push_down_test_db.table1000_int_autoinc.col_blob, push_down_test_db.table1000_int_autoinc.col_int
[2020-05-06T10:08:56.690Z]   └─TopN_10	1.00	root		Column#63:asc, offset:0, count:1
[2020-05-06T10:08:56.690Z]     └─Projection_15	8000.00	root		push_down_test_db.table1000_int_autoinc.col_bit, push_down_test_db.table1000_int_autoinc.col_blob, push_down_test_db.table1000_int_autoinc.col_int, like(y, push_down_test_db.table1000_int_autoinc.col_blob, 92)->Column#63
[2020-05-06T10:08:56.690Z]       └─Selection_11	8000.00	root		or(or(0, eq(0, push_down_test_db.table1000_int_autoinc.col_bit)), or(eq(0, push_down_test_db.table1000_int_autoinc.col_int), 0))
[2020-05-06T10:08:56.690Z]         └─TableReader_13	10000.00	root		data:TableFullScan_12
[2020-05-06T10:08:56.690Z]           └─TableFullScan_12	10000.00	cop[tikv]	table:table1000_int_autoinc	keep order:false, stats:pseudo
[2020-05-06T10:08:56.690Z] 
[2020-05-06T10:08:56.690Z] 
[2020-05-06T10:08:56.690Z] WithPushDown Plan: 
[2020-05-06T10:08:56.690Z] id	estRows	task	access object	operator info
[2020-05-06T10:08:56.690Z] Projection_7	1.00	root		like(y, push_down_test_db.table1000_int_autoinc.col_blob, 92)->Column#62
[2020-05-06T10:08:56.690Z] └─Projection_14	1.00	root		push_down_test_db.table1000_int_autoinc.col_bit, push_down_test_db.table1000_int_autoinc.col_blob, push_down_test_db.table1000_int_autoinc.col_int
[2020-05-06T10:08:56.690Z]   └─TopN_10	1.00	root		Column#63:asc, offset:0, count:1
[2020-05-06T10:08:56.690Z]     └─Projection_15	8000.00	root		push_down_test_db.table1000_int_autoinc.col_bit, push_down_test_db.table1000_int_autoinc.col_blob, push_down_test_db.table1000_int_autoinc.col_int, like(y, push_down_test_db.table1000_int_autoinc.col_blob, 92)->Column#63
[2020-05-06T10:08:56.690Z]       └─Selection_11	8000.00	root		or(or(0, eq(0, push_down_test_db.table1000_int_autoinc.col_bit)), or(eq(0, push_down_test_db.table1000_int_autoinc.col_int), 0))
[2020-05-06T10:08:56.690Z]         └─TableReader_13	10000.00	root		data:TableFullScan_12
[2020-05-06T10:08:56.690Z]           └─TableFullScan_12	10000.00	cop[tikv]	table:table1000_int_autoinc	keep order:false, stats:pseudo
[2020-05-06T10:08:56.690Z] 
[2020-05-06T10:08:56.690Z] 
[2020-05-06T10:08:56.690Z] 
[2020-05-06T10:08:56.690Z] 2020/05/06 18:08:50 Test fail: Outputs are not matching.
[2020-05-06T10:08:56.690Z] Test case: sql/randgen-topn/3_compare_1.sql
[2020-05-06T10:08:56.690Z] Statement: #819 -  SELECT STRCMP( '23:10:42.007765', `col_bigint_unsigned` ) AS field1, `col_varbinary_32` NOT LIKE `col_set` AS field2, `col_tinyint_unsigned_key` IS FALSE AS field3, ( INTERVAL( `col_bit`, ( INTERVAL( '2018-04-26 10:38:03.031949', `col_char_2`, NULL ) ), `col_varchar_1_key` ) ) NOT BETWEEN NULL AND NULL AS field4, COALESCE( '2018-04-27', -26901, ( `col_float_unsigned_key` <=> ( `col_float_unsigned_key` > 'pcbqbfmwpebbkyfsxemhlybhtahsdfttztbnrqjpdtwjharagadcroqtxlefjrhcokdymxvanwfvayfdbhtoxwppiavhgmizbrrxgafhbcxkosudjiyckygmuatynejcqwwbclkmfhgrrenyxlyawqekkchtjzebuphvkwswxbtqsjokzalmxfaklbeukgslyqnrheytuhbqsbseiojnyxesmnsdfyyisxjoljtdmdxwycmyxfnxnojmst' ) ), `col_binary_8_key`, `col_char_255` ) AS field5 FROM `table1000_int_autoinc` WHERE COALESCE( `col_blob`, '2030-01-04', `col_decimal_key` ) ORDER BY field1, field2, field3, field4, field5 LIMIT 3 /* QNO 824 CON_ID 112 */ ;
[2020-05-06T10:08:56.690Z] NoPushDown Output: 
[2020-05-06T10:08:56.690Z] Error 1105: runtime error: index out of range [0] with length 0
[2020-05-06T10:08:56.690Z] 
[2020-05-06T10:08:56.690Z] WithPushDown Output: 
[2020-05-06T10:08:56.690Z] field1	field2	field3	field4	field5
[2020-05-06T10:08:56.690Z] NULL	NULL	0	NULL	2018-04-27
[2020-05-06T10:08:56.690Z] NULL	NULL	0	NULL	2018-04-27
[2020-05-06T10:08:56.690Z] NULL	NULL	0	NULL	2018-04-27
[2020-05-06T10:08:56.690Z] 
[2020-05-06T10:08:56.690Z] 
[2020-05-06T10:08:56.690Z] 
[2020-05-06T10:08:56.690Z] NoPushDown Plan: 
[2020-05-06T10:08:56.690Z] id	estRows	task	access object	operator info
[2020-05-06T10:08:56.690Z] Projection_7	3.00	root		strcmp(23:10:42.007765, cast(push_down_test_db.table1000_int_autoinc.col_bigint_unsigned, var_string(20)))->Column#62, not(like(push_down_test_db.table1000_int_autoinc.col_varbinary_32, push_down_test_db.table1000_int_autoinc.col_set, 92))->Column#63, isfalse(push_down_test_db.table1000_int_autoinc.col_tinyint_unsigned_key)->Column#64, not(and(ge(interval(cast(push_down_test_db.table1000_int_autoinc.col_bit, double UNSIGNED BINARY), cast(interval(2018, cast(push_down_test_db.table1000_int_autoinc.col_char_2, double BINARY), <nil>), double BINARY), cast(push_down_test_db.table1000_int_autoinc.col_varchar_1_key, double BINARY)), <nil>), le(interval(cast(push_down_test_db.table1000_int_autoinc.col_bit, double UNSIGNED BINARY), cast(interval(2018, cast(push_down_test_db.table1000_int_autoinc.col_char_2, double BINARY), <nil>), double BINARY), cast(push_down_test_db.table1000_int_autoinc.col_varchar_1_key, double BINARY)), <nil>)))->Column#65, coalesce(2018-04-27, -26901, cast(nulleq(push_down_test_db.table1000_int_autoinc.col_float_unsigned_key, cast(gt(push_down_test_db.table1000_int_autoinc.col_float_unsigned_key, 0), double BINARY)), var_string(20)), push_down_test_db.table1000_int_autoinc.col_binary_8_key, push_down_test_db.table1000_int_autoinc.col_char_255)->Column#66
[2020-05-06T10:08:56.690Z] └─Projection_17	3.00	root		push_down_test_db.table1000_int_autoinc.col_varchar_1_key, push_down_test_db.table1000_int_autoinc.col_char_2, push_down_test_db.table1000_int_autoinc.col_bit, push_down_test_db.table1000_int_autoinc.col_bigint_unsigned, push_down_test_db.table1000_int_autoinc.col_decimal_key, push_down_test_db.table1000_int_autoinc.col_blob, push_down_test_db.table1000_int_autoinc.col_binary_8_key, push_down_test_db.table1000_int_autoinc.col_float_unsigned_key, push_down_test_db.table1000_int_autoinc.col_set, push_down_test_db.table1000_int_autoinc.col_char_255, push_down_test_db.table1000_int_autoinc.col_tinyint_unsigned_key, push_down_test_db.table1000_int_autoinc.col_varbinary_32
[2020-05-06T10:08:56.690Z]   └─TopN_8	3.00	root		Column#67:asc, Column#68:asc, Column#69:asc, Column#70:asc, Column#71:asc, offset:0, count:3
[2020-05-06T10:08:56.690Z]     └─Projection_18	8000.00	root		push_down_test_db.table1000_int_autoinc.col_varchar_1_key, push_down_test_db.table1000_int_autoinc.col_char_2, push_down_test_db.table1000_int_autoinc.col_bit, push_down_test_db.table1000_int_autoinc.col_bigint_unsigned, push_down_test_db.table1000_int_autoinc.col_decimal_key, push_down_test_db.table1000_int_autoinc.col_blob, push_down_test_db.table1000_int_autoinc.col_binary_8_key, push_down_test_db.table1000_int_autoinc.col_float_unsigned_key, push_down_test_db.table1000_int_autoinc.col_set, push_down_test_db.table1000_int_autoinc.col_char_255, push_down_test_db.table1000_int_autoinc.col_tinyint_unsigned_key, push_down_test_db.table1000_int_autoinc.col_varbinary_32, strcmp(23:10:42.007765, cast(push_down_test_db.table1000_int_autoinc.col_bigint_unsigned, var_string(20)))->Column#67, not(like(push_down_test_db.table1000_int_autoinc.col_varbinary_32, push_down_test_db.table1000_int_autoinc.col_set, 92))->Column#68, isfalse(push_down_test_db.table1000_int_autoinc.col_tinyint_unsigned_key)->Column#69, not(and(ge(interval(cast(push_down_test_db.table1000_int_autoinc.col_bit, double UNSIGNED BINARY), cast(interval(2018, cast(push_down_test_db.table1000_int_autoinc.col_char_2, double BINARY), <nil>), double BINARY), cast(push_down_test_db.table1000_int_autoinc.col_varchar_1_key, double BINARY)), <nil>), le(interval(cast(push_down_test_db.table1000_int_autoinc.col_bit, double UNSIGNED BINARY), cast(interval(2018, cast(push_down_test_db.table1000_int_autoinc.col_char_2, double BINARY), <nil>), double BINARY), cast(push_down_test_db.table1000_int_autoinc.col_varchar_1_key, double BINARY)), <nil>)))->Column#70, coalesce(2018-04-27, -26901, cast(nulleq(push_down_test_db.table1000_int_autoinc.col_float_unsigned_key, cast(gt(push_down_test_db.table1000_int_autoinc.col_float_unsigned_key, 0), double BINARY)), var_string(20)), push_down_test_db.table1000_int_autoinc.col_binary_8_key, 
push_down_test_db.table1000_int_autoinc.col_char_255)->Column#71
[2020-05-06T10:08:56.691Z]       └─TableReader_13	8000.00	root		data:Selection_12
[2020-05-06T10:08:56.691Z]         └─Selection_12	8000.00	cop[tikv]		coalesce(push_down_test_db.table1000_int_autoinc.col_blob, "2030-01-04", cast(push_down_test_db.table1000_int_autoinc.col_decimal_key))
[2020-05-06T10:08:56.691Z]           └─TableFullScan_11	10000.00	cop[tikv]	table:table1000_int_autoinc	keep order:false, stats:pseudo
[2020-05-06T10:08:56.691Z] 
[2020-05-06T10:08:56.691Z] 
[2020-05-06T10:08:56.691Z] WithPushDown Plan: 
[2020-05-06T10:08:56.691Z] id	estRows	task	access object	operator info
[2020-05-06T10:08:56.691Z] Projection_7	3.00	root		strcmp(23:10:42.007765, cast(push_down_test_db.table1000_int_autoinc.col_bigint_unsigned, var_string(20)))->Column#62, not(like(push_down_test_db.table1000_int_autoinc.col_varbinary_32, push_down_test_db.table1000_int_autoinc.col_set, 92))->Column#63, isfalse(push_down_test_db.table1000_int_autoinc.col_tinyint_unsigned_key)->Column#64, not(and(ge(interval(cast(push_down_test_db.table1000_int_autoinc.col_bit, double UNSIGNED BINARY), cast(interval(2018, cast(push_down_test_db.table1000_int_autoinc.col_char_2, double BINARY), <nil>), double BINARY), cast(push_down_test_db.table1000_int_autoinc.col_varchar_1_key, double BINARY)), <nil>), le(interval(cast(push_down_test_db.table1000_int_autoinc.col_bit, double UNSIGNED BINARY), cast(interval(2018, cast(push_down_test_db.table1000_int_autoinc.col_char_2, double BINARY), <nil>), double BINARY), cast(push_down_test_db.table1000_int_autoinc.col_varchar_1_key, double BINARY)), <nil>)))->Column#65, coalesce(2018-04-27, -26901, cast(nulleq(push_down_test_db.table1000_int_autoinc.col_float_unsigned_key, cast(gt(push_down_test_db.table1000_int_autoinc.col_float_unsigned_key, 0), double BINARY)), var_string(20)), push_down_test_db.table1000_int_autoinc.col_binary_8_key, push_down_test_db.table1000_int_autoinc.col_char_255)->Column#66
[2020-05-06T10:08:56.691Z] └─Projection_17	3.00	root		push_down_test_db.table1000_int_autoinc.col_varchar_1_key, push_down_test_db.table1000_int_autoinc.col_char_2, push_down_test_db.table1000_int_autoinc.col_bit, push_down_test_db.table1000_int_autoinc.col_bigint_unsigned, push_down_test_db.table1000_int_autoinc.col_decimal_key, push_down_test_db.table1000_int_autoinc.col_blob, push_down_test_db.table1000_int_autoinc.col_binary_8_key, push_down_test_db.table1000_int_autoinc.col_float_unsigned_key, push_down_test_db.table1000_int_autoinc.col_set, push_down_test_db.table1000_int_autoinc.col_char_255, push_down_test_db.table1000_int_autoinc.col_tinyint_unsigned_key, push_down_test_db.table1000_int_autoinc.col_varbinary_32
[2020-05-06T10:08:56.691Z]   └─TopN_8	3.00	root		Column#67:asc, Column#68:asc, Column#69:asc, Column#70:asc, Column#71:asc, offset:0, count:3
[2020-05-06T10:08:56.691Z]     └─Projection_18	8000.00	root		push_down_test_db.table1000_int_autoinc.col_varchar_1_key, push_down_test_db.table1000_int_autoinc.col_char_2, push_down_test_db.table1000_int_autoinc.col_bit, push_down_test_db.table1000_int_autoinc.col_bigint_unsigned, push_down_test_db.table1000_int_autoinc.col_decimal_key, push_down_test_db.table1000_int_autoinc.col_blob, push_down_test_db.table1000_int_autoinc.col_binary_8_key, push_down_test_db.table1000_int_autoinc.col_float_unsigned_key, push_down_test_db.table1000_int_autoinc.col_set, push_down_test_db.table1000_int_autoinc.col_char_255, push_down_test_db.table1000_int_autoinc.col_tinyint_unsigned_key, push_down_test_db.table1000_int_autoinc.col_varbinary_32, strcmp(23:10:42.007765, cast(push_down_test_db.table1000_int_autoinc.col_bigint_unsigned, var_string(20)))->Column#67, not(like(push_down_test_db.table1000_int_autoinc.col_varbinary_32, push_down_test_db.table1000_int_autoinc.col_set, 92))->Column#68, isfalse(push_down_test_db.table1000_int_autoinc.col_tinyint_unsigned_key)->Column#69, not(and(ge(interval(cast(push_down_test_db.table1000_int_autoinc.col_bit, double UNSIGNED BINARY), cast(interval(2018, cast(push_down_test_db.table1000_int_autoinc.col_char_2, double BINARY), <nil>), double BINARY), cast(push_down_test_db.table1000_int_autoinc.col_varchar_1_key, double BINARY)), <nil>), le(interval(cast(push_down_test_db.table1000_int_autoinc.col_bit, double UNSIGNED BINARY), cast(interval(2018, cast(push_down_test_db.table1000_int_autoinc.col_char_2, double BINARY), <nil>), double BINARY), cast(push_down_test_db.table1000_int_autoinc.col_varchar_1_key, double BINARY)), <nil>)))->Column#70, coalesce(2018-04-27, -26901, cast(nulleq(push_down_test_db.table1000_int_autoinc.col_float_unsigned_key, cast(gt(push_down_test_db.table1000_int_autoinc.col_float_unsigned_key, 0), double BINARY)), var_string(20)), push_down_test_db.table1000_int_autoinc.col_binary_8_key, 
push_down_test_db.table1000_int_autoinc.col_char_255)->Column#71
[2020-05-06T10:08:56.691Z]       └─TableReader_13	8000.00	root		data:Selection_12
[2020-05-06T10:08:56.691Z]         └─Selection_12	8000.00	cop[tikv]		coalesce(push_down_test_db.table1000_int_autoinc.col_blob, "2030-01-04", cast(push_down_test_db.table1000_int_autoinc.col_decimal_key))
[2020-05-06T10:08:56.691Z]           └─TableFullScan_11	10000.00	cop[tikv]	table:table1000_int_autoinc	keep order:false, stats:pseudo
[2020-05-06T10:08:56.691Z] 
[2020-05-06T10:08:56.691Z] 
[2020-05-06T10:08:56.691Z] 
[2020-05-06T10:08:56.691Z] 2020/05/06 18:08:56 Test summary: non-matching queries: 2, success queries: 657, skipped queries: 340
[2020-05-06T10:08:56.691Z] 2020/05/06 18:08:56 Test summary(sql/randgen-topn/3_compare_1.sql): Test case FAIL
[2020-05-06T10:08:56.691Z] 
[2020-05-06T10:08:56.951Z] + Test finished

@SunRunAway SunRunAway left a comment

LGTM

@zz-jason zz-jason left a comment

LGTM

@zz-jason zz-jason added status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. labels May 6, 2020

sre-bot commented May 6, 2020

/run-all-tests

@sre-bot sre-bot merged commit 1fad460 into pingcap:release-4.0 May 6, 2020
Labels
priority/release-blocker This issue blocks a release. Please solve it ASAP. sig/execution SIG execution status/can-merge Indicates a PR has been approved by a committer. status/LGT2 Indicates that a PR has LGTM 2. type/new-feature type/4.0-cherry-pick

4 participants