en: revise the tidbcluster scaling doc (#390)
* en: revise the tidbcluster scaling doc

* Apply suggestions from code review

Co-authored-by: TomShawn <41534398+TomShawn@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: TomShawn <41534398+TomShawn@users.noreply.github.com>
ran-huang and TomShawn authored Jun 17, 2020
1 parent 683f244 commit 75b7bbb
Showing 2 changed files with 170 additions and 47 deletions.
135 changes: 118 additions & 17 deletions en/enable-tidb-cluster-auto-scaling.md
@@ -102,30 +102,131 @@ spec:
......
```

## Example

1. Run the following commands to quickly deploy a TiDB cluster with 3 PD instances, 3 TiKV instances, and 2 TiDB instances, with the monitoring and auto-scaling features enabled.

    {{< copyable "shell-regular" >}}

    ```shell
    kubectl apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/master/examples/auto-scale/tidb-cluster.yaml -n ${namespace}
    ```

    {{< copyable "shell-regular" >}}

    ```shell
    kubectl apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/master/examples/auto-scale/tidb-monitor.yaml -n ${namespace}
    ```

    {{< copyable "shell-regular" >}}

    ```shell
    kubectl apply -f https://raw.githubusercontent.com/pingcap/tidb-operator/master/examples/auto-scale/tidb-cluster-auto-scaler.yaml -n ${namespace}
    ```

2. After the TiDB cluster is created, expose the TiDB cluster service to the local machine by running the following command:

    {{< copyable "shell-regular" >}}

    ```shell
    kubectl port-forward svc/auto-scaling-demo-tidb 4000:4000 &
    ```

3. Prepare data and perform the stress test against the auto-scaling feature using [sysbench](https://github.com/akopytov/sysbench).

    Copy the following content and paste it to the local `sysbench.config` file:

    {{< copyable "" >}}

    ```config
    mysql-host=127.0.0.1
    mysql-port=4000
    mysql-user=root
    mysql-password=
    mysql-db=test
    time=120
    threads=20
    report-interval=5
    db-driver=mysql
    ```

    Prepare data by running the following command:

    {{< copyable "shell-regular" >}}

    ```shell
    sysbench --config-file=${path-to-file}/sysbench.config oltp_point_select --tables=1 --table-size=20000 prepare
    ```

    Start the stress test:

    {{< copyable "shell-regular" >}}

    ```shell
    sysbench --config-file=${path-to-file}/sysbench.config oltp_point_select --tables=1 --table-size=20000 run
    ```

    The command above returns output similar to the following:

    ```sh
    Initializing worker threads...
    Threads started!
    [ 5s ] thds: 20 tps: 37686.35 qps: 37686.35 (r/w/o: 37686.35/0.00/0.00) lat (ms,95%): 0.99 err/s: 0.00 reconn/s: 0.00
    [ 10s ] thds: 20 tps: 38487.20 qps: 38487.20 (r/w/o: 38487.20/0.00/0.00) lat (ms,95%): 0.95 err/s: 0.00 reconn/s: 0.00
    ```

4. Open a new terminal session and watch how the Pods of the TiDB cluster change by running the following command:

    {{< copyable "shell-regular" >}}

    ```shell
    watch -n1 "kubectl -n ${namespace} get pod"
    ```

    The output is as follows:

    ```sh
    auto-scaling-demo-discovery-fbd95b679-f4cb9 1/1 Running 0 17m
    auto-scaling-demo-monitor-6857c58564-ftkp4 3/3 Running 0 17m
    auto-scaling-demo-pd-0 1/1 Running 0 17m
    auto-scaling-demo-tidb-0 2/2 Running 0 15m
    auto-scaling-demo-tidb-1 2/2 Running 0 15m
    auto-scaling-demo-tikv-0 1/1 Running 0 15m
    auto-scaling-demo-tikv-1 1/1 Running 0 15m
    auto-scaling-demo-tikv-2 1/1 Running 0 15m
    ```

    Watch the changing status of the Pods and the TPS and QPS reported by sysbench. When new TiKV and TiDB Pods are created, the TPS and QPS of sysbench increase significantly.

    After sysbench finishes the test, the newly created TiKV and TiDB Pods disappear automatically.

5. Destroy the environment by running the following commands:

    {{< copyable "shell-regular" >}}

    ```shell
    kubectl delete tidbcluster auto-scaling-demo -n ${namespace}
    kubectl delete tidbmonitor auto-scaling-demo -n ${namespace}
    kubectl delete tidbclusterautoscaler auto-scaling-demo -n ${namespace}
    ```

## TidbClusterAutoScaler configurations

82 changes: 52 additions & 30 deletions en/scale-a-tidb-cluster.md
@@ -12,13 +12,47 @@ This document introduces how to horizontally and vertically scale a TiDB cluster

Horizontally scaling TiDB means that you scale TiDB out or in by adding or removing nodes in your pool of resources. When you scale a TiDB cluster, PD, TiKV, and TiDB are scaled out or in sequentially according to the values of their replicas. Scaling out operations add nodes based on the node ID in ascending order, while scaling in operations remove nodes based on the node ID in descending order.

Currently, the TiDB cluster supports management by TidbCluster Custom Resource (CR).

### Scale PD, TiDB, and TiKV

Modify `spec.pd.replicas`, `spec.tidb.replicas`, and `spec.tikv.replicas` in the `TidbCluster` object of the cluster to the desired values using kubectl. You can modify the values in a local file or modify them online.
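
The relevant fields are sketched below. This is a minimal, illustrative example only: the cluster name `basic` and the replica counts are assumptions, and other required fields of the `TidbCluster` spec (such as `version` and the component configurations) are omitted.

```yaml
apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: basic          # assumed cluster name
spec:
  pd:
    replicas: 3        # number of PD Pods
  tikv:
    replicas: 5        # for example, scale TiKV out from 3 to 5
  tidb:
    replicas: 3        # number of TiDB Pods
```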

- If a YAML file that describes the TiDB cluster exists on your local machine, modify `spec.pd.replicas`, `spec.tidb.replicas`, and `spec.tikv.replicas` in the local file to your desired values. Then apply the file to the cluster by running the following command:

    {{< copyable "shell-regular" >}}

    ```shell
    kubectl apply -f ${target_file}.yaml -n ${namespace}
    ```

- You can also modify the `TidbCluster` definition online in the Kubernetes cluster by running the following command:

    {{< copyable "shell-regular" >}}

    ```shell
    kubectl edit tidbcluster ${cluster_name} -n ${namespace}
    ```

After modifying the values above, check whether the TiDB cluster in Kubernetes has been updated to your desired definition:

{{< copyable "shell-regular" >}}

```shell
kubectl get tidbcluster ${cluster_name} -n ${namespace} -oyaml
```

In the output of the command above, if the values of `spec.pd.replicas`, `spec.tidb.replicas`, and `spec.tikv.replicas` are consistent with the values you modified, check whether the number of Pods in the TiDB cluster has increased or decreased by running the following command:

{{< copyable "shell-regular" >}}

```shell
watch kubectl -n ${namespace} get pod -o wide
```

For the PD and TiDB components, it might take 10-30 seconds to scale in or out.

For the TiKV component, it might take 3-5 minutes to scale in or out because the process involves data migration.

#### Scale out TiFlash

@@ -76,21 +110,15 @@ If TiCDC is deployed in the cluster, you can scale out TiCDC by modifying `spec.

6. Modify `spec.tiflash.replicas` to scale in TiFlash, as sketched below.
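
    A minimal sketch of the field to change is shown below. The value `1` is an illustrative assumption; the rest of the `TidbCluster` spec is omitted.

    ```yaml
    spec:
      tiflash:
        replicas: 1    # for example, scale TiFlash in to 1 Pod
    ```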

Check whether TiFlash in the TiDB cluster in Kubernetes has been updated to your desired definition. Run the following command and check whether the value of `spec.tiflash.replicas` in the output is as expected:

{{< copyable "shell-regular" >}}

```shell
kubectl get tidbcluster ${cluster_name} -n ${namespace} -oyaml
```

### View the horizontal scaling status

To view the horizontal scaling status of the cluster, run the following command:

@@ -110,35 +138,25 @@ When the number of Pods for all components reaches the preset value and all comp
> - The TiFlash component has the same scale-in logic as TiKV.
> - When the PD, TiKV, and TiFlash components scale in, the PVC of the deleted node is retained during the scaling in process. Because the PV's reclaim policy is changed to `Retain`, the data can still be retrieved even if the PVC is deleted.

### Horizontal scaling failure

During the horizontal scaling operation, Pods might go to the Pending state because of insufficient resources. See [Troubleshoot the Pod in Pending state](troubleshoot.md#the-pod-is-in-the-pending-state) for details.

## Vertical scaling

Vertically scaling TiDB means that you scale TiDB up or down by increasing or decreasing the limit of resources on the node. Vertical scaling is essentially a rolling update of the nodes.

Currently, the TiDB cluster supports management by TidbCluster Custom Resource (CR).

### Vertical scaling operations

Modify `spec.pd.resources`, `spec.tikv.resources`, and `spec.tidb.resources` in the `TidbCluster` object that corresponds to the cluster to the desired values using kubectl.

If TiFlash is deployed in the cluster, you can scale TiFlash up or down by modifying `spec.tiflash.resources`.

If TiCDC is deployed in the cluster, you can scale TiCDC up or down by modifying `spec.ticdc.resources`.
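
For illustration, these fields look roughly like this in a `TidbCluster` manifest. This is a minimal sketch: the cluster name `basic` and all CPU and memory values are assumptions; adjust them to your environment and omit the components you do not need to change.

```yaml
apiVersion: pingcap.com/v1alpha1
kind: TidbCluster
metadata:
  name: basic            # assumed cluster name
spec:
  pd:
    resources:
      requests:
        cpu: "1"
        memory: 2Gi
      limits:
        cpu: "2"
        memory: 4Gi
  tikv:
    resources:
      requests:
        cpu: "2"
        memory: 8Gi
      limits:
        cpu: "4"
        memory: 16Gi
  tidb:
    resources:
      requests:
        cpu: "2"
        memory: 4Gi
      limits:
        cpu: "4"
        memory: 8Gi
  # If deployed, TiFlash and TiCDC are adjusted the same way
  # through spec.tiflash.resources and spec.ticdc.resources.
```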

### View the vertical scaling progress

To view the vertical scaling progress of the cluster, run the following command:
@@ -154,3 +172,7 @@ When all Pods are rebuilt and in the `Running` state, the vertical scaling is co
>
> - If the resource's `requests` field is modified during the vertical scaling process, and if PD, TiKV, and TiFlash use `Local PV`, they will be scheduled back to the original node after the upgrade. At this time, if the original node does not have enough resources, the Pod ends up staying in the `Pending` status and thus impacts the service.
> - TiDB is a horizontally scalable database, so it is recommended to take advantage of it simply by adding more nodes rather than upgrading hardware resources like you do with a traditional database.

### Vertical scaling failure

During the vertical scaling operation, Pods might go to the Pending state because of insufficient resources. See [Troubleshoot the Pod in Pending state](troubleshoot.md#the-pod-is-in-the-pending-state) for details.
