Using mydumper to back up a database containing many tables is very slow #11733
Bug Report

Please answer these questions before submitting your issue. Thanks!

1. What did you do?

A: The number of tables in the backup target database is very large: more than 70,000 tables, about 1.2 TB of data in total according to monitoring. When I used mydumper, 8 hours of backup covered 40,000 tables but only about 20 GB of data, and the backup had still not finished.

I want to ask:
1) Why is it so slow when backing up many tables? What is mydumper doing during this period?
2) When backing up a database with a particularly large number of tables, is there any suggestion for speeding up the backup?

B: I also encountered TiDB OOM problems. The TiDB version is currently 2.1.4. When I back up a database with a large amount of data, TiDB OOMs. An issue on the official website explains that this is caused by the TiDB version: during backup, all the data is loaded into memory, which causes the OOM. TiDB versions after 2.1.13 fix this.

2. What did you expect to see?

Backing up a database with many tables should be fast.

3. What did you see instead?

It is very slow to back up a database with a large number of tables.

4. What version of TiDB are you using (tidb-server -V or run select tidb_version(); on TiDB)?

TiDB version 2.1.4

Comments
mydumper will first fetch the database & table list, and then read the data from the tables.
How many threads are you using? Can you provide the arguments used for mydumper?
@csuzhangxc thanks for your support!
16 threads are used for mydumper, but it is still very slow. The command is …
Here is another question: what optimizations were made to memory usage for mydumper backups before and after version 2.1.13? Thanks!
What is the size of your largest table? Can you try our latest version of mydumper? It uses TiDB's _row_id to support dumping a single TiDB table concurrently, but you need to use the -r argument to enable it.
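For concreteness, here is a minimal sketch of such an invocation (the hostname, credentials, database name, and output path are placeholders, not the reporter's actual values):

    # Dump one database with 16 worker threads; -r splits each table
    # into chunks of roughly 10,000 rows, so even a single large table
    # can be dumped by several threads concurrently.
    ./bin/mydumper \
      -h 127.0.0.1 -P 4000 -u root \
      -B mydb \
      -o ./dump_data \
      -t 16 \
      -r 10000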
The largest table is about 5 billion rows, about 400 GB in size. OK, I will use the latest mydumper tool and test it with the -r parameter. Thank you for your support.
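As an aside: the concurrent dump of a single table splits it into ranges of TiDB's hidden row ID (spelled _tidb_rowid). Each worker thread then runs queries of roughly the following shape; the table name and bounds are made up for illustration:

    SELECT * FROM sbtest1 WHERE _tidb_rowid >= 1     AND _tidb_rowid < 10001;
    SELECT * FROM sbtest1 WHERE _tidb_rowid >= 10001 AND _tidb_rowid < 20001;
    -- one such range per chunk, so a 5-billion-row table becomes many
    -- independent range scans instead of one huge full-table scan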
@AlexNewTao
It did not solve my problem.
(1) Testing with the -r parameter: when you back up with the -r parameter, is that meant for a database with a lot of tables? If a single table is very large, what configuration parameters should mydumper use for the backup?
(2) Upgrading the TiDB version: the test data was generated by sysbench, 32 tables with 10,000,000 rows per table. During the backup, TiDB's memory usage rose from 200 MB to 35 GB in a short time, so the overhead was still very large.
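For reference, sysbench data of that shape is typically prepared with an invocation along these lines (connection settings are placeholders, not the ones actually used):

    # Create 32 tables with 10,000,000 rows each in database `sbtest`.
    sysbench oltp_read_write \
      --mysql-host=127.0.0.1 --mysql-port=4000 \
      --mysql-user=root --mysql-db=sbtest \
      --tables=32 --table-size=10000000 \
      prepare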
What version of mydumper are you using? When backing up, can you …
The version is …
It's not the latest release; you can download the latest via https://pingcap.com/docs-cn/v3.0/reference/tools/download/#syncer-loader-%E5%92%8C-mydumper.
Thanks, I'll test with the latest mydumper. What is your suggestion for the -r parameter value when backing up a big data set of about 1 TB?
|
Thank you. When I tested with the latest tool and the -r parameter, the memory overhead of TiDB dropped significantly, and TiDB no longer OOMs when backing up big data sets. I still have a question. When using the -r parameter and specifying -r 1000000, each file is cut according to 1,000,000 rows of the table. If I then use loader for recovery, TiDB has the following single-transaction restriction: a single transaction contains no more than 5,000 SQL statements (by default). When recovering as described above, a single transaction reads the dumped data and restores it. How big is the SQL in one transaction? Is there a problem here?
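For context, the 5,000-statement ceiling mentioned above corresponds to TiDB's stmt-count-limit setting, which, as far as I know, is set in the tidb-server TOML config file roughly like this:

    # tidb.toml (excerpt): the maximum number of statements allowed in a
    # single transaction; a transaction exceeding it is rejected.
    [performance]
    stmt-count-limit = 5000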
Looking at the INSERT statements in the cut files: 1,000,000 rows of data correspond to 205 INSERTs, each inserting about 4,890 rows of data, fewer than the 5,000 rows set above. In other words, when mydumper performs a backup, it automatically cuts the files so that each INSERT stays below that limit, regardless of the number specified by the -r parameter.
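A likely explanation for the ~4,890-row INSERTs: mydumper also has an -s / --statement-size option, the target byte size of each generated INSERT statement (1,000,000 bytes by default), so the number of rows per INSERT is bounded by that byte budget rather than by -r. A hypothetical invocation making both knobs explicit:

    # -r controls how many rows go into each chunk/file, while -s caps
    # the byte size of each INSERT statement (default 1000000 bytes).
    ./bin/mydumper \
      -h 127.0.0.1 -P 4000 -u root \
      -B mydb -o ./dump_data \
      -r 1000000 \
      -s 1000000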