diff --git a/_docHome.md b/_docHome.md index d376c34d3e0cc..0e5d5e46b8f74 100644 --- a/_docHome.md +++ b/_docHome.md @@ -9,25 +9,25 @@ hide_leftNav: true -TiDB Cloud is a fully-managed Database-as-a-Service (DBaaS) that brings everything great about TiDB to your cloud, and lets you focus on your applications, not the complexities of your database. +TiDB Cloud is a fully-managed Database-as-a-Service (DBaaS) that brings everything great about TiDB to your cloud, letting you focus on your applications instead of the complexities of your database. -See the documentation of TiDB Cloud +View the documentation for TiDB Cloud. -Guides you through an easy way to get started with TiDB Cloud +Guide for an easy way to get started with TiDB Cloud. -Helps you quickly complete a Proof of Concept (PoC) of TiDB Cloud +Helps you quickly complete a Proof of Concept (PoC) with TiDB Cloud. @@ -55,25 +55,25 @@ TiDB is an open-source distributed SQL database that supports Hybrid Transaction -See the documentation of TiDB +View the documentation for TiDB. -Walks you through the quickest way to get started with TiDB +Walks you through the quickest way to get started with TiDB. -Learn how to deploy TiDB locally in production +Learn how to deploy TiDB locally in a production environment. -The open-source TiDB platform is released under the Apache 2.0 license, and supported by the community. +The open-source TiDB platform is released under the Apache 2.0 license and is supported by the community. Download @@ -85,13 +85,13 @@ The open-source TiDB platform is released under the Apache 2.0 license, and supp -Documentation for TiDB application developers +Documentation for TiDB application developers. -Documentation for TiDB Cloud application developers +Documentation for TiDB Cloud application developers. 
@@ -105,55 +105,55 @@ Documentation for TiDB Cloud application developers -One-stop and interactive experience of TiDB's capabilities WITHOUT registration +Experience the capabilities of TiDB WITHOUT registration. -Learn TiDB and TiDB Cloud through well-designed online courses and instructor-led training +Learn TiDB and TiDB Cloud through well-designed online courses and instructor-led training. -Join us on Slack or become a contributor +Join us on Slack or become a contributor. -Learn great articles about TiDB and TiDB Cloud +Read great articles about TiDB and TiDB Cloud. -See a compilation of short videos describing TiDB and a variety of use cases +Watch a compilation of short videos describing TiDB and various use cases. -Learn events about PingCAP and the community +Learn about events hosted by PingCAP and the community. -Download eBooks and papers +Download eBooks and papers. -A powerful insight tool that analyzes in depth any GitHub repository, powered by TiDB Cloud +A powerful insight tool that analyzes any GitHub repository in depth, powered by TiDB Cloud. -Let’s work together to make the documentation better! +Let's work together to improve the documentation! diff --git a/quick-start-with-tidb.md b/quick-start-with-tidb.md index 52dc7064a6467..d5e3e15ff5c0b 100644 --- a/quick-start-with-tidb.md +++ b/quick-start-with-tidb.md @@ -6,7 +6,7 @@ aliases: ['/docs/dev/quick-start-with-tidb/','/docs/dev/test-deployment-using-do # Quick Start Guide for the TiDB Database Platform -This guide walks you through the quickest way to get started with TiDB. For non-production environments, you can deploy your TiDB database by either of the following methods: +This guide provides the quickest way to get started with TiDB. 
For non-production environments, you can deploy your TiDB database using either of the following methods: - [Deploy a local test cluster](#deploy-a-local-test-cluster) (for macOS and Linux) - [Simulate production deployment on a single machine](#simulate-production-deployment-on-a-single-machine) (for Linux only) @@ -17,7 +17,7 @@ In addition, you can try out TiDB features on [TiDB Playground](https://play.tid > > The deployment method provided in this guide is **ONLY FOR** quick start, **NOT FOR** production. > -> - To deploy a self-hosted production cluster, see [production installation guide](/production-deployment-using-tiup.md). +> - To deploy a self-hosted production cluster, see the [production installation guide](/production-deployment-using-tiup.md). > - To deploy TiDB on Kubernetes, see [Get Started with TiDB on Kubernetes](https://docs.pingcap.com/tidb-in-kubernetes/stable/get-started). > - To manage TiDB in the cloud, see [TiDB Cloud Quick Start](https://docs.pingcap.com/tidbcloud/tidb-cloud-quickstart). @@ -28,7 +28,7 @@ In addition, you can try out TiDB features on [TiDB Playground](https://play.tid
-As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB instances, 3 TiKV instances, 3 PD instances, and optional TiFlash instances. With TiUP Playground, you can quickly build the test cluster by taking the following steps: +As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB instances, 3 TiKV instances, 3 PD instances, and optional TiFlash instances. With TiUP Playground, you can quickly build the test cluster by following these steps: 1. Download and install TiUP: @@ -38,7 +38,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh ``` - If the following message is displayed, you have installed TiUP successfully: + If the following message is displayed, you have successfully installed TiUP: ```log Successfully set mirror to https://tiup-mirrors.pingcap.com @@ -68,7 +68,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in 3. 
Start the cluster in the current session: - - If you want to start a TiDB cluster of the latest version with 1 TiDB instance, 1 TiKV instance, 1 PD instance, and 1 TiFlash instance, run the following command: + - To start a TiDB cluster of the latest version with 1 TiDB instance, 1 TiKV instance, 1 PD instance, and 1 TiFlash instance, run the following command: {{< copyable "shell-regular" >}} @@ -76,7 +76,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in tiup playground ``` - - If you want to specify the TiDB version and the number of the instances of each component, run a command like this: + - To specify the TiDB version and the number of instances of each component, run a command like this: {{< copyable "shell-regular" >}} @@ -102,7 +102,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in > > + Since v5.2.0, TiDB supports running `tiup playground` on the machine that uses the Apple M1 chip. > + For the playground operated in this way, after the test deployment is finished, TiUP will clean up the original cluster data. You will get a new cluster after re-running the command. - > + If you want the data to be persisted on storage, run `tiup --tag playground ...`. For details, refer to [TiUP Reference Guide](/tiup/tiup-reference.md#-t---tag). + > + If you want the data to be persisted on storage, run `tiup --tag playground ...`. For details, refer to the [TiUP Reference](/tiup/tiup-reference.md#-t---tag) guide. 4. Start a new session to access TiDB: @@ -114,7 +114,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in tiup client ``` - + You can also use the MySQL client to connect to TiDB. + + Alternatively, you can use the MySQL client to connect to TiDB. {{< copyable "shell-regular" >}} @@ -124,7 +124,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in 5. Access the Prometheus dashboard of TiDB at . -6. 
Access the [TiDB Dashboard](/dashboard/dashboard-intro.md) at . The default username is `root`, with an empty password. +6. Access the [TiDB Dashboard](/dashboard/dashboard-intro.md) at . The default username is `root`, and the password is empty. 7. Access the Grafana dashboard of TiDB through . Both the default username and password are `admin`. @@ -149,7 +149,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in
-As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB instances, 3 TiKV instances, 3 PD instances, and optional TiFlash instances. With TiUP Playground, you can quickly build the test cluster by taking the following steps: +As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB instances, 3 TiKV instances, 3 PD instances, and optional TiFlash instances. With TiUP Playground, you can quickly build the test cluster by following these steps: 1. Download and install TiUP: @@ -159,7 +159,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in curl --proto '=https' --tlsv1.2 -sSf https://tiup-mirrors.pingcap.com/install.sh | sh ``` - If the following message is displayed, you have installed TiUP successfully: + If the following message is displayed, you have successfully installed TiUP: ```log Successfully set mirror to https://tiup-mirrors.pingcap.com @@ -189,7 +189,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in 3. 
Start the cluster in the current session: - - If you want to start a TiDB cluster of the latest version with 1 TiDB instance, 1 TiKV instance, 1 PD instance, and 1 TiFlash instance, run the following command: + - To start a TiDB cluster of the latest version with 1 TiDB instance, 1 TiKV instance, 1 PD instance, and 1 TiFlash instance, run the following command: {{< copyable "shell-regular" >}} @@ -197,7 +197,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in tiup playground ``` - - If you want to specify the TiDB version and the number of the instances of each component, run a command like this: + - To specify the TiDB version and the number of instances of each component, run a command like this: {{< copyable "shell-regular" >}} @@ -221,7 +221,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in > **Note:** > > For the playground operated in this way, after the test deployment is finished, TiUP will clean up the original cluster data. You will get a new cluster after re-running the command. - > If you want the data to be persisted on storage, run `tiup --tag playground ...`. For details, refer to [TiUP Reference Guide](/tiup/tiup-reference.md#-t---tag). + > If you want the data to be persisted on storage, run `tiup --tag playground ...`. For details, refer to the [TiUP Reference](/tiup/tiup-reference.md#-t---tag) guide. 4. Start a new session to access TiDB: @@ -233,7 +233,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in tiup client ``` - + You can also use the MySQL client to connect to TiDB. + + Alternatively, you can use the MySQL client to connect to TiDB. {{< copyable "shell-regular" >}} @@ -243,7 +243,7 @@ As a distributed system, a basic TiDB test cluster usually consists of 2 TiDB in 5. Access the Prometheus dashboard of TiDB at . -6. Access the [TiDB Dashboard](/dashboard/dashboard-intro.md) at . The default username is `root`, with an empty password. 
+6. Access the [TiDB Dashboard](/dashboard/dashboard-intro.md) at . The default username is `root`, and the password is empty. 7. Access the Grafana dashboard of TiDB through . Both the default username and password are `admin`. @@ -276,16 +276,16 @@ This section describes how to deploy a TiDB cluster using a YAML file of the sma ### Prepare -Prepare a target machine that meets the following requirements: +Before deploying the TiDB cluster, ensure that the target machine meets the following requirements: -- CentOS 7.3 or a later version is installed -- The Linux OS has access to the Internet, which is required to download TiDB and related software installation packages +- CentOS 7.3 or a later version is installed. +- The Linux OS has access to the internet, which is required to download TiDB and related software installation packages. -The smallest TiDB cluster topology is as follows: +The smallest TiDB cluster topology consists of the following instances: > **Note:** > -> The IP address of the following instances only serves as an example IP. In your actual deployment, you need to replace the IP with your actual IP. +> The IP addresses of the instances are given as examples only. In your actual deployment, replace the IP addresses with your actual IP addresses. | Instance | Count | IP | Configuration | |:-- | :-- | :-- | :-- | @@ -295,14 +295,14 @@ The smallest TiDB cluster topology is as follows: | TiFlash | 1 | 10.0.1.1 | The default port
Global directory configuration | | Monitor | 1 | 10.0.1.1 | The default port
Global directory configuration | -Other requirements for the target machine: +Other requirements for the target machine include: -- The `root` user and its password is required +- The `root` user and its password are required - [Stop the firewall service of the target machine](/check-before-deployment.md#check-and-stop-the-firewall-service-of-target-machines), or open the port needed by the TiDB cluster nodes - Currently, the TiUP cluster supports deploying TiDB on the x86_64 (AMD64) and ARM architectures: - - It is recommended to use CentOS 7.3 or later versions on AMD64 - - It is recommended to use CentOS 7.6 1810 on ARM + - It is recommended to use CentOS 7.3 or later versions on AMD64. + - It is recommended to use CentOS 7.6 1810 on ARM. ### Deploy @@ -346,7 +346,7 @@ Other requirements for the target machine: tiup update --self && tiup update cluster ``` -5. Use the root user privilege to increase the connection limit of the `sshd` service. This is because TiUP needs to simulate deployment on multiple machines. +5. Increase the connection limit of the `sshd` service using the root user privilege. This is because TiUP needs to simulate deployment on multiple machines. 1. Modify `/etc/ssh/sshd_config`, and set `MaxSessions` to `20`. 2. Restart the `sshd` service: @@ -499,17 +499,17 @@ Other requirements for the target machine: ## What's next -- If you have just deployed a TiDB cluster for the local test environment: +If you have just deployed a TiDB cluster for the local test environment, here are the next steps: - - Learn [Basic SQL operations in TiDB](/basic-sql-operations.md) - - [Migrate data to TiDB](/migration-overview.md) +- Learn about basic SQL operations in TiDB by referring to [Basic SQL operations in TiDB](/basic-sql-operations.md). +- You can also migrate data to TiDB by referring to [Migrate data to TiDB](/migration-overview.md). 
-- If you are ready to deploy a TiDB cluster for the production environment: +If you are ready to deploy a TiDB cluster for the production environment, here are the next steps: - - [Deploy TiDB using TiUP](/production-deployment-using-tiup.md) - - [Deploy TiDB on Cloud using TiDB Operator](https://docs.pingcap.com/tidb-in-kubernetes/stable) +- [Deploy TiDB using TiUP](/production-deployment-using-tiup.md) +- Alternatively, you can deploy TiDB on Cloud using TiDB Operator by referring to the [TiDB on Kubernetes](https://docs.pingcap.com/tidb-in-kubernetes/stable) documentation. -- If you're looking for analytics solution with TiFlash: +If you are looking for an analytics solution with TiFlash, here are the next steps: - - [Use TiFlash](/tiflash/tiflash-overview.md#use-tiflash) - - [TiFlash Overview](/tiflash/tiflash-overview.md) +- [Use TiFlash](/tiflash/tiflash-overview.md#use-tiflash) +- [TiFlash Overview](/tiflash/tiflash-overview.md) diff --git a/sql-statements/sql-statement-flush-privileges.md b/sql-statements/sql-statement-flush-privileges.md index aa1ece3d1117f..bbe49f51fc507 100644 --- a/sql-statements/sql-statement-flush-privileges.md +++ b/sql-statements/sql-statement-flush-privileges.md @@ -6,7 +6,7 @@ aliases: ['/docs/dev/sql-statements/sql-statement-flush-privileges/','/docs/dev/ # FLUSH PRIVILEGES -This statement triggers TiDB to reload the in-memory copy of privileges from the privilege tables. You should execute `FLUSH PRIVILEGES` after making manual edits to tables such as `mysql.user`. Executing this statement is not required after using privilege statements such as `GRANT` or `REVOKE`. Executing this statement requires the `RELOAD` privilege. +The statement `FLUSH PRIVILEGES` instructs TiDB to reload the in-memory copy of privileges from the privilege tables. You must execute this statement after manually editing tables such as `mysql.user`. 
However, executing this statement is not necessary after using privilege statements like `GRANT` or `REVOKE`. To execute this statement, the `RELOAD` privilege is required. ## Synopsis @@ -35,7 +35,7 @@ Query OK, 0 rows affected (0.01 sec) ## MySQL compatibility -This statement is understood to be fully compatible with MySQL. Any compatibility differences should be [reported via an issue](https://github.com/pingcap/tidb/issues/new/choose) on GitHub. +This statement is fully compatible with MySQL. If you find any compatibility differences, report them via [an issue on GitHub](https://github.com/pingcap/tidb/issues/new/choose). ## See also diff --git a/ticdc/ticdc-overview.md b/ticdc/ticdc-overview.md index 5e426a37fef4d..0f05b1ffd7ec1 100644 --- a/ticdc/ticdc-overview.md +++ b/ticdc/ticdc-overview.md @@ -6,88 +6,92 @@ aliases: ['/docs/dev/ticdc/ticdc-overview/','/docs/dev/reference/tools/ticdc/ove # TiCDC Overview -[TiCDC](https://github.com/pingcap/tiflow/tree/master/cdc) is a tool used for replicating incremental data of TiDB. Specifically, TiCDC pulls TiKV change logs, sorts captured data, and exports row-based incremental data to downstream databases. +[TiCDC](https://github.com/pingcap/tiflow/tree/master/cdc) is a tool used to replicate incremental data from TiDB. Specifically, TiCDC pulls TiKV change logs, sorts captured data, and exports row-based incremental data to downstream databases. ## Usage scenarios -- Provides data high availability and disaster recovery solutions for multiple TiDB clusters, ensuring eventual data consistency between primary and secondary clusters in case of disaster. -- Replicates real-time data changes to homogeneous systems so as to provide data sources for various scenarios such as monitoring, caching, global indexing, data analysis, and primary-secondary replication between heterogeneous databases. 
+TiCDC has multiple usage scenarios, including: + +- Providing high availability and disaster recovery solutions for multiple TiDB clusters. TiCDC ensures eventual data consistency between primary and secondary clusters in case of a disaster. +- Replicating real-time data changes to homogeneous systems. This provides data sources for various scenarios, such as monitoring, caching, global indexing, data analysis, and primary-secondary replication between heterogeneous databases. ## Major features ### Key capabilities -- Replicate incremental data from one TiDB cluster to another TiDB cluster with second-level RPO and minute-level RTO. -- Replicate data bidirectionally between TiDB clusters, based on which you can create a multi-active TiDB solution using TiCDC. -- Replicate incremental data from a TiDB cluster to a MySQL database (or other MySQL-compatible databases) with low latency. -- Replicate incremental data from a TiDB cluster to a Kafka cluster. The recommended data format includes [Canal-JSON](/ticdc/ticdc-canal-json.md) and [Avro](/ticdc/ticdc-avro-protocol.md). -- Replicate tables with the ability to filter databases, tables, DMLs, and DDLs. -- Be highly available with no single point of failure. Supports dynamically adding and deleting TiCDC nodes. -- Support cluster management through [Open API](/ticdc/ticdc-open-api.md), including querying task status, dynamically modifying task configuration, and creating or deleting tasks. +TiCDC has the following key capabilities: + +- Replicating incremental data between TiDB clusters with second-level RPO and minute-level RTO. +- Bidirectional replication between TiDB clusters, allowing the creation of a multi-active TiDB solution using TiCDC. +- Replicating incremental data from a TiDB cluster to a MySQL database or other MySQL-compatible databases with low latency. +- Replicating incremental data from a TiDB cluster to a Kafka cluster. 
The recommended data format includes [Canal-JSON](/ticdc/ticdc-canal-json.md) and [Avro](/ticdc/ticdc-avro-protocol.md). +- Replicating tables with the ability to filter databases, tables, DMLs, and DDLs. +- High availability with no single point of failure, supporting dynamically adding and deleting TiCDC nodes. +- Cluster management through [Open API](/ticdc/ticdc-open-api.md), including querying task status, dynamically modifying task configuration, and creating or deleting tasks. ### Replication order - For all DDL or DML statements, TiCDC outputs them **at least once**. - When the TiKV or TiCDC cluster encounters a failure, TiCDC might send the same DDL/DML statement repeatedly. For duplicated DDL/DML statements: - - MySQL sink can execute DDL statements repeatedly. For DDL statements that can be executed repeatedly in the downstream, such as `truncate table`, the statement is executed successfully. For those that cannot be executed repeatedly, such as `create table`, the execution fails, and TiCDC ignores the error and continues the replication. - - Kafka sink - - Kafka sink provides different strategies for data distribution. You can distribute data to different Kafka partitions based on the table, primary key, or timestamp. This ensures that the updated data of a row is sent to the same partition in order. - - All these distribution strategies send Resolved TS messages to all topics and partitions periodically. This indicates that all messages earlier than the Resolved TS have been sent to the topics and partitions. The Kafka consumer can use the Resolved TS to sort the messages received. - - Kafka sink sends duplicated messages sometimes, but these duplicated messages do not affect the constraints of `Resolved Ts`. For example, if a changefeed is paused and then resumed, Kafka sink might send `msg1`, `msg2`, `msg3`, `msg2`, and `msg3` in order. You can filter the duplicated messages from Kafka consumers. 
+ - The MySQL sink can execute DDL statements repeatedly. For DDL statements that can be executed repeatedly in the downstream, such as `TRUNCATE TABLE`, the statement is executed successfully. For those that cannot be executed repeatedly, such as `CREATE TABLE`, the execution fails, and TiCDC ignores the error and continues with the replication process. + - The Kafka sink provides different strategies for data distribution. + - You can distribute data to different Kafka partitions based on the table, primary key, or timestamp. This ensures that the updated data of a row is sent to the same partition in order. + - All these distribution strategies send `Resolved TS` messages to all topics and partitions periodically. This indicates that all messages earlier than the `Resolved TS` have already been sent to the topics and partitions. The Kafka consumer can use the `Resolved TS` to sort the messages received. + - The Kafka sink sometimes sends duplicated messages, but these duplicated messages do not affect the constraints of `Resolved Ts`. For example, if a changefeed is paused and then resumed, the Kafka sink might send `msg1`, `msg2`, `msg3`, `msg2`, and `msg3` in order. You can filter out the duplicated messages from Kafka consumers. ### Replication consistency - MySQL sink - - TiCDC enables redo log to ensure eventual consistency of data replication. - - TiCDC **ensures** that the order of single-row updates is consistent with that in the upstream. - - TiCDC does **not ensure** that the execution order of downstream transactions is the same as that of upstream transactions. + - TiCDC enables the redo log to ensure eventual consistency of data replication. + - TiCDC ensures that the order of single-row updates is consistent with the upstream. + - TiCDC does not ensure that the downstream transactions are executed in the same order as the upstream transactions. 
> **Note:** > - > Since v6.2, you can use the sink uri parameter [`transaction-atomicity`](/ticdc/ticdc-sink-to-mysql.md#configure-sink-uri-for-mysql-or-tidb) to control whether to split single-table transactions. Splitting single-table transactions can greatly reduce the latency and memory consumption of replicating large transactions. + > Since v6.2, you can use the sink URI parameter [`transaction-atomicity`](/ticdc/ticdc-sink-to-mysql.md#configure-sink-uri-for-mysql-or-tidb) to control whether to split single-table transactions. Splitting single-table transactions can greatly reduce the latency and memory consumption of replicating large transactions. ## TiCDC architecture -As an incremental data replication tool for TiDB, TiCDC is highly available through PD's etcd. The replication process is as follows: +TiCDC is an incremental data replication tool for TiDB, which is highly available through PD's etcd. The replication process consists of the following steps: 1. Multiple TiCDC processes pull data changes from TiKV nodes. -2. Data changes pulled from TiKV are sorted and merged internally. -3. Data changes are replicated to multiple downstream systems through multiple replication tasks (changefeeds). +2. TiCDC sorts and merges the data changes. +3. TiCDC replicates the data changes to multiple downstream systems through multiple replication tasks (changefeeds). -The architecture of TiCDC is shown in the following figure: +The architecture of TiCDC is illustrated in the following figure: ![TiCDC architecture](/media/ticdc/cdc-architecture.png) -The components in the preceding architecture diagram are described as follows: +The components in the architecture diagram are described as follows: -- TiKV Server: TiKV nodes in a TiDB cluster. When data changes, TiKV nodes send the changes as change logs (KV change logs) to TiCDC nodes. If TiCDC nodes find the change logs not continuous, they will actively request the TiKV nodes to provide change logs. 
-- TiCDC: TiCDC nodes where the TiCDC processes run. Each node runs a TiCDC process. Each process pulls data changes from one or more tables in TiKV nodes, and replicates the changes to the downstream system through the sink component. -- PD: The scheduling module in a TiDB cluster. This module is in charge of scheduling cluster data and usually consists of three PD nodes. PD provides high availability through the etcd cluster. In the etcd cluster, TiCDC stores its metadata, such as node status information and changefeed configurations. +- TiKV Server: TiKV nodes in a TiDB cluster. When data changes occur, TiKV nodes send the changes as change logs (KV change logs) to TiCDC nodes. If TiCDC nodes detect that the change logs are not continuous, they will actively request the TiKV nodes to provide change logs. +- TiCDC: TiCDC nodes where TiCDC processes run. Each node runs a TiCDC process. Each process pulls data changes from one or more tables in TiKV nodes and replicates the changes to the downstream system through the sink component. +- PD: The scheduling module in a TiDB cluster. This module is responsible for scheduling cluster data and usually consists of three PD nodes. PD provides high availability through the etcd cluster. In the etcd cluster, TiCDC stores its metadata, such as node status information and changefeed configurations. -As shown in the preceding architecture diagram, TiCDC supports replicating data to TiDB, MySQL, and Kafka databases. +As shown in the architecture diagram, TiCDC supports replicating data to TiDB, MySQL, and Kafka databases. ## Best practices -- When you use TiCDC to replicate data between two TiDB clusters and the network latency between the clusters is higher than 100 ms, it is recommended that you deploy TiCDC in the region (IDC) where the downstream TiDB cluster is located. -- TiCDC only replicates the table that has at least one **valid index**. 
A **valid index** is defined as follows: +- If the network latency between two TiDB clusters is higher than 100 ms, it is recommended to deploy TiCDC in the region (IDC) where the downstream TiDB cluster is located when replicating data between the two clusters. +- TiCDC only replicates tables that have at least one valid index. A valid index is defined as follows: - A primary key (`PRIMARY KEY`) is a valid index. - - A unique index (`UNIQUE INDEX`) is valid if every column of the index is explicitly defined as non-nullable (`NOT NULL`) and the index does not have the virtual generated column (`VIRTUAL GENERATED COLUMNS`). + - A unique index (`UNIQUE INDEX`) is valid if every column of the index is explicitly defined as non-nullable (`NOT NULL`) and the index does not have a virtual generated column (`VIRTUAL GENERATED COLUMNS`). - To use TiCDC in disaster recovery scenarios, you need to configure [redo log](/ticdc/ticdc-sink-to-mysql.md#eventually-consistent-replication-in-disaster-scenarios). -- When you replicate a wide table with a large single row (greater than 1K), it is recommended that you configure [`per-table-memory-quota`](/ticdc/ticdc-server-config.md) so that `per-table-memory-quota` = `ticdcTotalMemory`/(`tableCount` * 2). `ticdcTotalMemory` is the memory of a TiCDC node, and `tableCount` is the number of target tables that a TiCDC node replicates. +- When you replicate a wide table with a large single row (greater than 1K), it is recommended to configure the [`per-table-memory-quota`](/ticdc/ticdc-server-config.md) so that `per-table-memory-quota` = `ticdcTotalMemory`/(`tableCount` * 2). `ticdcTotalMemory` is the memory of a TiCDC node, and `tableCount` is the number of target tables that a TiCDC node replicates. > **Note:** > -> Since v4.0.8, TiCDC supports replicating tables **without a valid index** by modifying the task configuration. However, this compromises the guarantee of data consistency to some extent. 
For more details, see [Replicate tables without a valid index](/ticdc/ticdc-manage-changefeed.md#replicate-tables-without-a-valid-index). +> Since v4.0.8, TiCDC supports replicating tables without a valid index by modifying the task configuration. However, this compromises the guarantee of data consistency to some extent. For more details, see [Replicate tables without a valid index](/ticdc/ticdc-manage-changefeed.md#replicate-tables-without-a-valid-index). -### Unsupported scenarios +## Unsupported scenarios Currently, the following scenarios are not supported: -- The TiKV cluster that uses RawKV alone. -- The [DDL operation `CREATE SEQUENCE`](/sql-statements/sql-statement-create-sequence.md) and the [SEQUENCE function](/sql-statements/sql-statement-create-sequence.md#sequence-function) in TiDB. When the upstream TiDB uses `SEQUENCE`, TiCDC ignores `SEQUENCE` DDL operations/functions performed upstream. However, DML operations using `SEQUENCE` functions can be correctly replicated. +- A TiKV cluster that uses RawKV alone. +- The [`CREATE SEQUENCE` DDL operation](/sql-statements/sql-statement-create-sequence.md) and the [`SEQUENCE` function](/sql-statements/sql-statement-create-sequence.md#sequence-function) in TiDB. When the upstream TiDB uses `SEQUENCE`, TiCDC ignores `SEQUENCE` DDL operations/functions performed upstream. However, DML operations using `SEQUENCE` functions can be correctly replicated. -TiCDC only provides partial support for scenarios of large transactions in the upstream. For details, refer to [Does TiCDC support replicating large transactions? Is there any risk?](/ticdc/ticdc-faq.md#does-ticdc-support-replicating-large-transactions-is-there-any-risk). +TiCDC only partially supports scenarios involving large transactions in the upstream. 
For details on whether TiCDC supports replicating large transactions and the associated risks, refer to the [TiCDC FAQ](/ticdc/ticdc-faq.md#does-ticdc-support-replicating-large-transactions-is-there-any-risk).
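
The sizing rule for wide tables given under best practices in the TiCDC overview, `per-table-memory-quota` = `ticdcTotalMemory`/(`tableCount` * 2), can be worked through with a small calculation. The following is a minimal sketch only: the function name `per_table_memory_quota` and the example figures (a 16 GiB node replicating 64 tables) are hypothetical, and the result is expressed in bytes purely because the input is. The unit that the configuration item itself expects is not stated here, so check the TiCDC server configuration reference before applying a value.

```python
def per_table_memory_quota(ticdc_total_memory: int, table_count: int) -> int:
    """Apply the rule quota = ticdcTotalMemory / (tableCount * 2).

    ticdc_total_memory: memory of one TiCDC node (bytes in this sketch).
    table_count: number of target tables that the node replicates.
    The result is rounded down to a whole number of bytes.
    """
    if table_count <= 0:
        raise ValueError("table_count must be positive")
    return ticdc_total_memory // (table_count * 2)


GIB = 1024 ** 3
MIB = 1024 ** 2

# Hypothetical example: one TiCDC node with 16 GiB of memory and 64 tables.
quota = per_table_memory_quota(16 * GIB, 64)
print(f"{quota} bytes ({quota // MIB} MiB)")
```

The factor of 2 in the denominator comes straight from the rule, so the quotas of all tables on a node add up to about half of the node's memory.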