adjust the recommend value of raft election-timeout in multi dc deployment (#16468) #16636

4 changes: 2 additions & 2 deletions dr-multi-replica.md
@@ -71,8 +71,8 @@ summary: Learn about the multi-replica disaster recovery solution for a single cluster provided by TiDB.
config:
server.labels: { Region: "Region3", AZ: "AZ5" }

- raftstore.raft-min-election-timeout-ticks: 1000
- raftstore.raft-max-election-timeout-ticks: 1200
+ raftstore.raft-min-election-timeout-ticks: 50
+ raftstore.raft-max-election-timeout-ticks: 60

monitoring_servers:
- host: tidb-dr-test2
133 changes: 131 additions & 2 deletions geo-distributed-deployment-topology.md
@@ -18,7 +18,132 @@ summary: Introduces the topology of a TiDB cluster deployed across data centers.

### Topology template

<<<<<<< HEAD
[Configuration template for cross-data-center deployment](https://github.com/pingcap/docs/blob/master/config-templates/geo-redundancy-deployment.yaml)
=======
<details>
<summary>Configuration template for cross-data-center deployment</summary>

```yaml
# Tip: PD priority must be set manually with the pd-ctl client tool, for example: member leader_priority <PD-name> <number>.
# Global variables are applied to all deployments and used as the default value of
# the deployments if a specific deployment value is missing.
#
# Abbreviations used in this example:
# sh: Shanghai Zone
# bj: Beijing Zone
# sha: Shanghai Datacenter A
# bja: Beijing Datacenter A
# bjb: Beijing Datacenter B

global:
  user: "tidb"
  ssh_port: 22
  deploy_dir: "/tidb-deploy"
  data_dir: "/tidb-data"
monitored:
  node_exporter_port: 9100
  blackbox_exporter_port: 9115
  deploy_dir: "/tidb-deploy/monitored-9100"
server_configs:
  tidb:
    log.level: debug
    log.slow-query-file: tidb-slow.log
  tikv:
    server.grpc-compression-type: gzip
    readpool.storage.use-unified-pool: true
    readpool.storage.low-concurrency: 8
  pd:
    replication.location-labels: ["zone","dc","rack","host"]
    replication.max-replicas: 5
    label-property: # By default, TiDB 5.2 and later versions do not support the label-property configuration. To set the replica policy, use Placement Rules.
      reject-leader:
        - key: "dc"
          value: "sha"
pd_servers:
  - host: 10.0.1.6
  - host: 10.0.1.7
  - host: 10.0.1.8
  - host: 10.0.1.9
  - host: 10.0.1.10
tidb_servers:
  - host: 10.0.1.1
  - host: 10.0.1.2
  - host: 10.0.1.3
  - host: 10.0.1.4
  - host: 10.0.1.5
tikv_servers:
  - host: 10.0.1.11
    ssh_port: 22
    port: 20160
    status_port: 20180
    deploy_dir: "/tidb-deploy/tikv-20160"
    data_dir: "/tidb-data/tikv-20160"
    config:
      server.labels:
        zone: bj
        dc: bja
        rack: rack1
        host: host1
  - host: 10.0.1.12
    ssh_port: 22
    port: 20161
    status_port: 20181
    deploy_dir: "/tidb-deploy/tikv-20161"
    data_dir: "/tidb-data/tikv-20161"
    config:
      server.labels:
        zone: bj
        dc: bja
        rack: rack1
        host: host2
  - host: 10.0.1.13
    ssh_port: 22
    port: 20160
    status_port: 20180
    deploy_dir: "/tidb-deploy/tikv-20160"
    data_dir: "/tidb-data/tikv-20160"
    config:
      server.labels:
        zone: bj
        dc: bjb
        rack: rack1
        host: host1
  - host: 10.0.1.14
    ssh_port: 22
    port: 20161
    status_port: 20181
    deploy_dir: "/tidb-deploy/tikv-20161"
    data_dir: "/tidb-data/tikv-20161"
    config:
      server.labels:
        zone: bj
        dc: bjb
        rack: rack1
        host: host2
  - host: 10.0.1.15
    ssh_port: 22
    port: 20160
    deploy_dir: "/tidb-deploy/tikv-20160"
    data_dir: "/tidb-data/tikv-20160"
    config:
      server.labels:
        zone: sh
        dc: sha
        rack: rack1
        host: host1
      readpool.storage.use-unified-pool: true
      readpool.storage.low-concurrency: 10
      raftstore.raft-min-election-timeout-ticks: 50
      raftstore.raft-max-election-timeout-ticks: 60
monitoring_servers:
  - host: 10.0.1.16
grafana_servers:
  - host: 10.0.1.16
```

</details>
>>>>>>> 074bd4bea6 (adjust the recommend value of raft election-timeout in multi dc deployment (#16468))

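For detailed descriptions of the configuration items in the TiDB cluster topology file above, see [Topology Configuration File for TiDB Deployment Using TiUP](/tiup/tiup-cluster-topology-reference.md).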
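
As a rough illustration of why this template keeps the write quorum in Beijing, the following Python sketch counts TiKV instances per zone from the labels in the template. It assumes that, with five TiKV instances and `replication.max-replicas: 5`, each instance holds one replica of every Region. The variable names are illustrative and are not part of TiUP or PD.

```python
from collections import Counter

# (zone, dc) labels of the five TiKV instances in the template.
tikv_labels = [
    ("bj", "bja"), ("bj", "bja"),
    ("bj", "bjb"), ("bj", "bjb"),
    ("sh", "sha"),
]
MAX_REPLICAS = 5          # replication.max-replicas in the template
REJECT_LEADER_DC = "sha"  # label-property.reject-leader in the template

# Assumption: five instances and five replicas, so each instance
# holds exactly one replica of every Region.
replicas_per_zone = Counter(zone for zone, _ in tikv_labels)

# A Raft group needs a strict majority of replicas to commit a write.
quorum = MAX_REPLICAS // 2 + 1

# Instances that are still allowed to hold Leaders.
leader_candidates = [dc for _, dc in tikv_labels if dc != REJECT_LEADER_DC]

# Zones that can form a majority without any other zone.
zones_with_local_quorum = [
    zone for zone, count in replicas_per_zone.items() if count >= quorum
]

print("replicas per zone:", dict(replicas_per_zone))
print("quorum size:", quorum)
print("zones with a local quorum:", zones_with_local_quorum)
```

Under that assumption, four of the five replicas sit in the `bj` zone, which is more than the three needed for a majority, so a write can be committed without waiting for the replica in `sha`.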

@@ -54,10 +179,14 @@ summary: Introduces the topology of a TiDB cluster deployed across data centers.
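- To prevent remote TiKV nodes from starting unnecessary Raft elections, increase both the minimum and the maximum number of ticks that must elapse before a remote TiKV node starts an election. Both parameters default to `0`.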

```yaml
- raftstore.raft-min-election-timeout-ticks: 1000
- raftstore.raft-max-election-timeout-ticks: 1020
+ raftstore.raft-min-election-timeout-ticks: 50
+ raftstore.raft-max-election-timeout-ticks: 60
```

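> **Note:**
>
> Using `raftstore.raft-min-election-timeout-ticks` and `raftstore.raft-max-election-timeout-ticks` to configure larger election timeout ticks for a TiKV node can greatly reduce the probability that Regions on that node become Leaders. However, in a disaster scenario where some TiKV nodes are down and the Raft logs on the other surviving TiKV nodes lag behind, only the Regions on the TiKV node configured with the larger election timeout ticks can become Leaders. Because the Regions on this TiKV node must wait at least the time set by `raftstore.raft-min-election-timeout-ticks` before they can start an election, avoid setting these values too large so that cluster availability is not affected in this scenario.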
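
To make the trade-off in the note concrete, the sketch below converts tick counts into seconds. It is only an approximation under stated assumptions: a base tick interval of 1 second (`raftstore.raft-base-tick-interval`), an election timeout of 10 ticks (`raftstore.raft-election-timeout-ticks`), and a fallback in which `0` means "election timeout" for the minimum and "twice the election timeout" for the maximum. Check these against the TiKV configuration reference for your version. The function name `election_window_seconds` is hypothetical.

```python
from typing import Tuple

BASE_TICK_INTERVAL_S = 1.0   # assumed raftstore.raft-base-tick-interval
ELECTION_TIMEOUT_TICKS = 10  # assumed raftstore.raft-election-timeout-ticks


def election_window_seconds(min_ticks: int, max_ticks: int) -> Tuple[float, float]:
    """Return the (earliest, latest) wait in seconds before a node may start an election."""
    # Assumed fallback: 0 means "derive the value from the election timeout".
    if min_ticks == 0:
        min_ticks = ELECTION_TIMEOUT_TICKS
    if max_ticks == 0:
        max_ticks = 2 * ELECTION_TIMEOUT_TICKS
    # Sanity check of this sketch: the window must not be empty.
    if min_ticks >= max_ticks:
        raise ValueError("min ticks must be smaller than max ticks")
    return (min_ticks * BASE_TICK_INTERVAL_S, max_ticks * BASE_TICK_INTERVAL_S)


local = election_window_seconds(0, 0)             # nodes left at the defaults
remote_old = election_window_seconds(1000, 1020)  # values removed by this change
remote_new = election_window_seconds(50, 60)      # values recommended by this change

for name, window in [("local", local), ("remote (old)", remote_old), ("remote (new)", remote_new)]:
    print(f"{name}: {window[0]:.0f}s to {window[1]:.0f}s")

# The remote node still waits longer than any local node, so it rarely wins an election.
remote_waits_longer = remote_new[0] > local[1]
```

Under these assumptions, the old values force the remote node to wait at least 1000 seconds (almost 17 minutes) before it can start an election in the disaster scenario, while the new values shorten that wait to 50 seconds and still keep the remote window entirely after the local one.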

#### PD parameters

- The PD metadata records the topology of the TiKV cluster, and PD schedules Raft Group replicas according to four dimensions.
12 changes: 8 additions & 4 deletions three-data-centers-in-two-cities-deployment.md
@@ -113,8 +113,8 @@ tikv_servers:
- host: 10.63.10.34
config:
server.labels: { az: "3", replication zone: "5", rack: "5", host: "34" }
- raftstore.raft-min-election-timeout-ticks: 1000
- raftstore.raft-max-election-timeout-ticks: 1200
+ raftstore.raft-min-election-timeout-ticks: 50
+ raftstore.raft-max-election-timeout-ticks: 60

monitoring_servers:
- host: 10.63.10.60
@@ -174,10 +174,14 @@ tikv_servers:
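- Optimize the network of the TiKV node in the cross-region AZ3. Modify the following TiKV parameters to lengthen the time before cross-region replicas take part in elections, so that replicas on the cross-region TiKV node avoid participating in Raft elections.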

```
- raftstore.raft-min-election-timeout-ticks: 1000
- raftstore.raft-max-election-timeout-ticks: 1200
+ raftstore.raft-min-election-timeout-ticks: 50
+ raftstore.raft-max-election-timeout-ticks: 60
```

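> **Note:**
>
> Using `raftstore.raft-min-election-timeout-ticks` and `raftstore.raft-max-election-timeout-ticks` to configure larger election timeout ticks for a TiKV node can greatly reduce the probability that Regions on that node become Leaders. However, in a disaster scenario where some TiKV nodes are down and the Raft logs on the other surviving TiKV nodes lag behind, only the Regions on the TiKV node configured with the larger election timeout ticks can become Leaders. Because the Regions on this TiKV node must wait at least the time set by `raftstore.raft-min-election-timeout-ticks` before they can start an election, avoid setting these values too large so that cluster availability is not affected in this scenario.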

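- Configure scheduling. After the cluster starts, use the `tiup ctl:v<CLUSTER_VERSION> pd` tool to modify the scheduling policy. Set the number of TiKV Raft replicas to the number planned at installation time, which is 5 in this example.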

```