title | summary | aliases |
---|---|---|
TiDB Binlog Relay Log | Learn how to use relay log to maintain data consistency in extreme cases. | |
When replicating binlogs, Drainer splits transactions from the upstream and replicates the split transactions concurrently to the downstream.
In extreme cases where the upstream clusters are not available and Drainer exits abnormally, the downstream clusters (MySQL or TiDB) might be left in an intermediate state with inconsistent data. In such cases, Drainer can use the relay log to ensure that the downstream clusters are in a consistent state.
The downstream clusters are in a consistent state when their data is the same as a snapshot of the upstream read with `tidb_snapshot = ts`.
Checkpoint consistency means that the Drainer checkpoint records whether replication is in a consistent state, in its `consistent` field. While Drainer is running, `consistent` is `false`. After Drainer exits normally, `consistent` is set to `true`.
You can query the downstream checkpoint table as follows:
{{< copyable "sql" >}}
select * from tidb_binlog.checkpoint;
+---------------------+----------------------------------------------------------------+
| clusterID | checkPoint |
+---------------------+----------------------------------------------------------------+
| 6791641053252586769 | {"consistent":false,"commitTS":414529105591271429,"ts-map":{}} |
+---------------------+----------------------------------------------------------------+
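As a small illustration of reading this value programmatically, the sketch below parses the `checkPoint` JSON from the sample row and extracts the `consistent` flag. The helper name `is_consistent` is hypothetical, and treating a missing flag as "not consistent" is an assumption of this sketch, not documented Drainer behavior.

```python
import json

# The checkPoint column value copied from the sample row above.
checkpoint_raw = '{"consistent":false,"commitTS":414529105591271429,"ts-map":{}}'


def is_consistent(raw: str) -> bool:
    """Return the `consistent` flag stored in a checkpoint JSON string."""
    checkpoint = json.loads(raw)
    # Assumption of this sketch: a missing flag is treated as not consistent.
    return bool(checkpoint.get("consistent", False))


print(is_consistent(checkpoint_raw))  # prints False
```

In practice, the string would come from the `checkPoint` column returned by the query above rather than from a hard-coded literal.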
When the relay log is enabled, Drainer first writes the binlog events to disk and then replicates the events to the downstream clusters.
If the upstream clusters are not available, Drainer can restore the downstream clusters to a consistent state by reading the relay log.
Note:
If the relay log data is lost as well, this method does not work, although this is very unlikely to happen. To further protect the relay log data, you can store the relay log on a Network File System (NFS).
When Drainer starts, if it fails to connect to the Placement Driver (PD) of the upstream clusters and detects `consistent = false` in the checkpoint, it tries to read the relay log and restore the downstream clusters to a consistent state. After that, the Drainer process sets `consistent` in the checkpoint to `true` and then exits.
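The condition for entering this recovery path can be summarized in a short sketch. This is not Drainer's actual code: the function `should_recover_from_relay_log` and its parameter names are hypothetical and only restate the conditions described in this document.

```python
def should_recover_from_relay_log(
    pd_reachable: bool,
    checkpoint_consistent: bool,
    relay_log_enabled: bool,
) -> bool:
    """Decide whether startup should take the relay log recovery path.

    Recovery applies only when the upstream PD cannot be reached, the
    checkpoint records consistent = false, and a relay log is enabled.
    """
    return (not pd_reachable) and (not checkpoint_consistent) and relay_log_enabled


# Upstream is down and the last run did not exit cleanly: recover.
print(should_recover_from_relay_log(False, False, True))  # prints True
# Upstream is reachable: no relay log recovery is involved.
print(should_recover_from_relay_log(True, False, True))   # prints False
```

When the function returns `True`, the described sequence is: read the relay log, restore the downstream, set `consistent` to `true`, and exit.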
Before data is replicated to the downstream, Drainer writes data to the relay log file. If the size of a relay log file reaches 10 MB (by default) and the binlog data of the current transaction is completely written, Drainer starts to write data to the next relay log file. After Drainer successfully replicates data to the downstream, it automatically cleans up the relay log files whose data has been replicated. The relay log into which data is currently being written will not be cleaned up.
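The rotation and cleanup rules above can be sketched as two small predicates. This is an illustrative model only, not Drainer's implementation: the function names and the relay log file names used in the example are hypothetical.

```python
DEFAULT_MAX_FILE_SIZE = 10 * 1024 * 1024  # 10 MB, that is, 10485760 bytes


def should_rotate(
    current_file_size: int,
    transaction_complete: bool,
    max_file_size: int = DEFAULT_MAX_FILE_SIZE,
) -> bool:
    """Switch to the next file only when the size limit is reached
    and the current transaction has been completely written."""
    return current_file_size >= max_file_size and transaction_complete


def removable_files(all_files, replicated_files, current_file):
    """Return the files whose data has been replicated, never including
    the file that is currently being written."""
    return [f for f in all_files if f in replicated_files and f != current_file]


print(should_rotate(10485760, transaction_complete=False))  # prints False
print(should_rotate(10485760, transaction_complete=True))   # prints True
print(removable_files(["relay-1", "relay-2", "relay-3"],
                      {"relay-1", "relay-2"}, "relay-3"))
# prints ['relay-1', 'relay-2']
```

Because rotation waits for the current transaction to finish, a relay log file can grow somewhat beyond the configured size limit.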
To enable the relay log, add the following configuration in Drainer:
{{< copyable "" >}}
[syncer.relay]
# The directory in which the relay log is saved. The relay log is not enabled if the value is empty.
# The configuration only takes effect if the downstream is TiDB or MySQL.
log-dir = "/dir/to/save/log"
# The size limit of a single relay log file (unit: byte).
# When the size of a relay log file reaches this limit, data is written to the next relay log file.
max-file-size = 10485760