design: add a proposal to support a command 'ADMIN RESTORE TABLE table_id' to speed up recover faulty dropped table. #7383
Merged: winkyao merged 6 commits into pingcap:master from winkyao:propose_restore_dropped_table on Aug 30, 2018
6 commits:
- 4cc21ed design: add a proposal to support a command 'ADMIN RESTORE TABLE tabl… (winkyao)
- e0ec0d8 address comment (winkyao)
- 44292dd address comment (winkyao)
- 303ce64 address comment (winkyao)
- 41eb2e7 Merge branch 'master' into propose_restore_dropped_table (shenli)
- 3a8e147 Merge branch 'master' into propose_restore_dropped_table (winkyao)
@@ -0,0 +1,49 @@

# Proposal:

- Author(s): [winkyao](https://github.com/winkyao)
- Last updated: 2018-08-10

## Abstract

This proposal adds support for the `ADMIN RESTORE TABLE table_id` command, which restores a table that was dropped by mistake.

## Background

At present, if we drop a table in a production environment by mistake, we usually realize the error immediately. Without the proposed command, the only remedy is to [read data from history versions](https://pingcap.com/docs/op-guide/history-read/). But that approach has to read all the table data from storage, so it takes too much time just to restore the dropped table.

## Proposal

We can add a new command `ADMIN RESTORE TABLE table_id` that simply makes the dropped table public again. The command works only if the GC worker has not yet deleted the table data, so it is better to extend the GC life time with `update mysql.tidb set variable_value='30h' where variable_name='tikv_gc_life_time';` before executing the statement. The table and its original data can then be restored in a few seconds, which is much faster than reading history versions. The command also simplifies the recovery procedure and reduces the risk of manual operation errors.

## Rationale

Let's take a look at the workflow of the `DROP TABLE` statement. The `DROP TABLE` statement first removes the meta data of the table being dropped from the corresponding database meta data. After the schema change is synced by all the TiDB instances, TiDB (in `worker.deleteRange`) inserts a delete range, from the first row key to the end row key of the dropped table, into the table `mysql.gc_delete_range`. At most `max(gcDefaultRunInterval, gcLifeTimeKey)` later, the GC worker finally deletes the table data.

The meta data of the table is not really deleted. The meta key format is `Table:table_id`. As long as we can find out the ID of the dropped table, we can recover the table information. The `admin show ddl jobs` statement can retrieve the table ID:

```
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+
| JOBS | STATE |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+
| ID:44, Type:drop table, State:synced, SchemaState:none, SchemaID:1, TableID:39, RowCount:0, ArgLen:0, start time: 2018-08-11 11:23:53.308 +0800 CST, Err:<nil>, ErrCount:0, SnapshotVersion:0 | synced |
+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-----------+
```
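
As a rough illustration of how the table ID could be pulled out of such a job line, here is a minimal sketch. The helpers `droppedTableID` and `metaKey` are hypothetical names introduced for this example and are not part of TiDB; the sketch only assumes the `Type:drop table` and `TableID:<n>` fields shown in the output above and the `Table:table_id` key format described in the text.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
	"strings"
)

// tableIDPattern matches the "TableID:<n>" field of a job description.
// The word boundary keeps it from matching inside another field name.
var tableIDPattern = regexp.MustCompile(`\bTableID:(\d+)`)

// droppedTableID is a hypothetical helper: it returns the table ID found in
// one line of `admin show ddl jobs` output, but only for "drop table" jobs.
func droppedTableID(job string) (int64, error) {
	if !strings.Contains(job, "Type:drop table") {
		return 0, fmt.Errorf("not a drop table job: %q", job)
	}
	m := tableIDPattern.FindStringSubmatch(job)
	if m == nil {
		return 0, fmt.Errorf("no TableID field in %q", job)
	}
	return strconv.ParseInt(m[1], 10, 64)
}

// metaKey formats the meta key in the "Table:table_id" form described above.
func metaKey(tableID int64) string {
	return fmt.Sprintf("Table:%d", tableID)
}

func main() {
	job := "ID:44, Type:drop table, State:synced, SchemaState:none, SchemaID:1, TableID:39, RowCount:0"
	id, err := droppedTableID(job)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(id, metaKey(id))
}
```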

Therefore, if we restore the table before the GC worker deletes the table data, we can restore the table completely. If the table data has already been deleted, we can only restore an empty table.

Before running `ADMIN RESTORE TABLE table_id`, you must:
* Ensure that no GC task is running. You can check this in the TiDB logs and metrics.
* Increase `tikv_gc_life_time` to a sufficient value.

## Compatibility

It's a new command and will not lead to compatibility issues.

## Implementation

1. `ADMIN RESTORE TABLE table_id` enqueues a new DDL job to the TiDB general DDL queue.
2. Before running the job, check whether the table name already exists (that is, a new table with the same name was created after the drop). If it exists, return the `ErrTableExists` error.
3. Check whether the delete-range of the dropped table is still in `mysql.gc_delete_range`. If not, the GC worker has already cleaned up the data; in this case the table cannot be restored and an error is returned to the client. If the record is still there, remove it from the `mysql.gc_delete_range` table. If the removal succeeds, continue to Step 4; otherwise, return an error to the client to indicate that the command cannot be executed safely.
4. Use the previous table meta information of the `table_id` to insert the meta data into the schema meta data, like what `Meta.CreateTable` does, and set the table information state to `model.StatePublic`. The restoration finishes after the schema is synced by all the TiDB instances.
5. If the command is canceled or rolled back after the delete-range record has been removed, insert the record into `mysql.gc_delete_range` again, like what `DROP TABLE` does in `worker.finishDDLJob`.
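
The checks in steps 2 to 4 can be sketched as follows. This is a simplified in-memory model, not TiDB code: the `cluster` and `tableInfo` types, the error variables, and `restoreTable` are all hypothetical stand-ins, with a map playing the role of `mysql.gc_delete_range`. Job queueing (step 1) and cancellation handling (step 5) are left out.

```go
package main

import (
	"errors"
	"fmt"
)

// Hypothetical errors standing in for the ones the steps above describe.
var (
	errTableExists = errors.New("table already exists")
	errDataGone    = errors.New("delete-range record not found, data may be cleaned up by GC")
)

// tableInfo is a hypothetical, minimal table meta record.
type tableInfo struct {
	ID     int64
	Name   string
	Public bool
}

// cluster is a hypothetical in-memory model of the state the job inspects.
type cluster struct {
	publicTables map[string]*tableInfo // visible tables, by name
	droppedMeta  map[int64]*tableInfo  // meta data kept after DROP TABLE, by ID
	deleteRanges map[int64]bool        // stand-in for mysql.gc_delete_range
}

// restoreTable walks through steps 2 to 4 for one dropped table.
func (c *cluster) restoreTable(tableID int64) error {
	tbl, ok := c.droppedMeta[tableID]
	if !ok {
		return fmt.Errorf("no meta data for table %d", tableID)
	}
	// Step 2: refuse if a table with the same name was created after the drop.
	if _, exists := c.publicTables[tbl.Name]; exists {
		return errTableExists
	}
	// Step 3: the delete-range record must still exist, then it is removed
	// so that the GC worker no longer deletes the table data.
	if !c.deleteRanges[tableID] {
		return errDataGone
	}
	delete(c.deleteRanges, tableID)
	// Step 4: put the old meta data back and mark the table public.
	tbl.Public = true
	c.publicTables[tbl.Name] = tbl
	delete(c.droppedMeta, tableID)
	return nil
}

func main() {
	c := &cluster{
		publicTables: map[string]*tableInfo{},
		droppedMeta:  map[int64]*tableInfo{39: {ID: 39, Name: "t"}},
		deleteRanges: map[int64]bool{39: true},
	}
	fmt.Println("restore 39:", c.restoreTable(39))
	fmt.Println("restore 39 again:", c.restoreTable(39))
}
```

Checking the name before touching the delete-range record keeps the failure paths free of side effects in this model, so nothing has to be undone when step 2 or step 3 rejects the request.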
Could we put it to the beginning of the queue?
Jobs in the general queue are handled quickly, so treating it as a normal job should be OK.