br: reduce memory footprint and HWM (high water mark) #51573
Labels
component/br
type/enhancement
Enhancement
BR can consume a large amount of memory, especially in SaaS scenarios, because of the huge number of databases/tables, temporary caches, and so on.
In fact, most of this memory is used only once, so it can be allocated in smaller batches and released sooner.
This issue tracks improvements in this area.
Common
The GC memory limit tuner adjusts the Go GC memory limit to a value based on the memory of the TiDB server environment rather than BR's own. In addition, backup/restore tasks allocate a lot of temporary memory, so GC needs to be triggered frequently. Therefore, PR#51082 disables the GC memory limit tuner in the BR binary.
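As a rough illustration of setting a Go memory limit sized for BR itself rather than for the TiDB server, the sketch below uses the standard `runtime/debug.SetMemoryLimit` call. The helper `softMemoryLimit`, the 4 GiB budget, and the 0.8 ratio are assumptions made for this example and are not taken from BR's code.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// softMemoryLimit returns ratio * totalBytes, intended as a GC soft limit.
// Hypothetical helper: an out-of-range ratio falls back to 0.8.
func softMemoryLimit(totalBytes uint64, ratio float64) int64 {
	if ratio <= 0 || ratio > 1 {
		ratio = 0.8
	}
	return int64(float64(totalBytes) * ratio)
}

func main() {
	// Assumed memory budget of the BR process (4 GiB), not the TiDB server's.
	const brMemory = 4 << 30

	limit := softMemoryLimit(brMemory, 0.8)

	// SetMemoryLimit has the same effect as the GOMEMLIMIT environment
	// variable and returns the previously configured limit.
	prev := debug.SetMemoryLimit(limit)
	fmt.Printf("memory limit changed from %d to %d bytes\n", prev, limit)
}
```

A lower limit makes the Go runtime collect garbage more aggressively as the heap approaches it, which suits workloads dominated by short-lived allocations.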
Make stats export/import under DXF.
Catch possible goroutine leak
Automatically adjust GOMEMLIMIT for br clp
Backup
Before v7.1.0, when the upstream cluster had a large number of wide tables, it was possible for BR to consume a lot of memory during the backup process. During a backup process, BR would keep three copies of the table information in memory:
PR#43003 removes the second of these copies of table/database information. Instead, it traverses the tables one at a time and promptly releases the in-memory information of tables that have already been backed up.
PR#47114 removes the third of these copies of table/database information. It saves the schema information into multiple files, each at most 128 MB in size.
For the first of these copies, we plan to use BRIE via SQL on TiDB in the future, so that TiDB shares its domain with the BR task.
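The size-capped schema files described for PR#47114 can be pictured as a simple batching loop: serialized schemas are appended to the current batch until the next one would exceed the cap, at which point the batch is flushed. The following is a minimal sketch of that idea, not BR's actual code; `splitBySize` and the tiny demo sizes are hypothetical.

```go
package main

import "fmt"

// maxFileSize mirrors the 128 MB cap mentioned above.
const maxFileSize = 128 << 20

// splitBySize groups serialized schemas into batches whose total size does
// not exceed limit. An item larger than limit ends up alone in its own batch.
func splitBySize(items [][]byte, limit int) [][][]byte {
	var batches [][][]byte
	var cur [][]byte
	size := 0
	for _, it := range items {
		// Flush the current batch before it would grow past the limit.
		if len(cur) > 0 && size+len(it) > limit {
			batches = append(batches, cur)
			cur, size = nil, 0
		}
		cur = append(cur, it)
		size += len(it)
	}
	if len(cur) > 0 {
		batches = append(batches, cur)
	}
	return batches
}

func main() {
	// Small sizes stand in for real schema payloads; in practice the
	// limit would be maxFileSize.
	schemas := [][]byte{make([]byte, 60), make([]byte, 50), make([]byte, 40)}
	for i, b := range splitBySize(schemas, 100) {
		fmt.Printf("file %d holds %d schema(s)\n", i, len(b))
	}
	fmt.Println("production cap in bytes:", maxFileSize)
}
```

Because each batch can be written out and dropped before the next one is built, peak memory is bounded by roughly one batch rather than by the whole schema set.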
Restore
A table may have a large amount of statistics (for example, when it has many partitions). BR uses a lot of memory when backing up or restoring such a table.
PR#49973 adds support for dumping/loading statistics per partition.
PR#49628 adds support for BR to persist/restore statistics data per partition.
PR#57192 prevents preallocating too many items and using too much memory.
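One common way to avoid oversized preallocation is to cap the initial capacity and let `append` grow the slice only if the data actually arrives. The sketch below shows that general pattern; it is not claimed to be what PR#57192 does, and `newItemBuffer` and `maxPrealloc` are hypothetical names.

```go
package main

import "fmt"

// maxPrealloc is a hypothetical upper bound on the initial slice capacity.
const maxPrealloc = 1024

// newItemBuffer returns an empty slice whose capacity is the expected item
// count, clamped to [0, maxPrealloc]. Larger inputs still work because
// append grows the slice on demand.
func newItemBuffer(expected int) []string {
	if expected < 0 {
		expected = 0
	}
	if expected > maxPrealloc {
		expected = maxPrealloc
	}
	return make([]string, 0, expected)
}

func main() {
	// A huge estimate no longer translates into a huge upfront allocation.
	buf := newItemBuffer(1 << 30)
	buf = append(buf, "t1", "t2")
	fmt.Println(len(buf), cap(buf))
}
```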
Log Task
There is no need to start a domain for br log operations other than log restore. PR#52127 stops starting the domain and has br create the etcd client itself.
Make sure all connections to TiKV stores are eventually closed
BR in SQL
Put BR in SQL under the memory quota control framework