TiUP: Add three docs about tiup dm display, upgrade, and cluster check #5291

Merged
merged 17 commits into from
Apr 16, 2021
Changes from 9 commits
216 changes: 216 additions & 0 deletions tiup/tiup-component-cluster-check.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,216 @@
---
title: tiup cluster check
---

# tiup cluster check

Before a production cluster goes live, you need to perform a series of checks to make sure that the cluster can deliver its best performance. To simplify these manual steps, TiUP Cluster provides the `check` command, which checks whether the hardware and software environments of the target machines of a specified cluster meet the requirements to work normally.

## List of check items

### Operating system version

Check the operating system distribution and version of the deployed machines. Currently, only CentOS 7 is supported for deployment. More system versions may be supported in later releases for compatibility improvement.

### CPU EPOLLEXCLUSIVE

Check whether the CPU of the target machine supports EPOLLEXCLUSIVE.

### numactl

Check whether numactl is installed on the target machine. If core binding is configured for the target machine, you must install numactl.

### System time

Check whether the system time of the target machine is synchronized. The check compares the system time of the target machine with that of the central control machine and reports an error if the deviation exceeds the threshold (500 ms).
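
The comparison described above can be sketched as follows. This is a minimal illustration and not how tiup-cluster implements the check: the helper name `drift_exceeds` and the sample timestamps are hypothetical, and only the 500 ms threshold comes from the text.

```shell
# Hypothetical helper: succeeds (exit status 0) if two timestamps, given in
# milliseconds, differ by more than the threshold (500 ms by default).
# $1 = control machine time, $2 = target machine time, $3 = threshold (optional).
drift_exceeds() {
    control_ms=$1
    target_ms=$2
    threshold_ms=${3:-500}
    diff_ms=$((control_ms - target_ms))
    # Take the absolute value of the deviation.
    if [ "$diff_ms" -lt 0 ]; then
        diff_ms=$((0 - diff_ms))
    fi
    [ "$diff_ms" -gt "$threshold_ms" ]
}

# Assumed sample values: the target machine is 620 ms behind the control machine.
if drift_exceeds 10000 9380; then
    echo "Fail: time deviation exceeds 500 ms"
else
    echo "Pass"
fi
```

The sketch treats a deviation of exactly 500 ms as acceptable, which is one reading of "exceeds".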

### Time synchronization service

Check whether the time synchronization service is configured on the target machine. Specifically, check whether ntpd is running.

### Swap partitioning

Check whether swap partitioning is enabled on the target machine. It is recommended to disable swap partitioning.

### Kernel parameters

Check the values of the following kernel parameters:

- `net.ipv4.tcp_tw_recycle`: 0
- `net.ipv4.tcp_syncookies`: 0
- `net.core.somaxconn`: 32768
- `vm.swappiness`: 0
- `vm.overcommit_memory`: 0 or 1
- `fs.file-max`: 1000000
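
To inspect these parameters by hand, a sketch along the following lines might help. It reads the current values with `sysctl -n` and compares them with the list above. The helper `check_param` is hypothetical, and treating the listed values as exact matches is an assumption, not a statement about how tiup-cluster evaluates them.

```shell
# Hypothetical helper: compare one kernel parameter with its allowed values.
# $1 = parameter name, $2 = allowed values separated by "|", $3 = actual value.
check_param() {
    case "|$2|" in
        *"|$3|"*) echo "Pass: $1 = $3" ;;
        *)        echo "Fail: $1 = $3 (expected $2)" ;;
    esac
}

# Read each parameter from the running kernel. A parameter that the kernel
# does not provide is reported as "unavailable" and therefore fails.
while read -r name allowed; do
    actual=$(sysctl -n "$name" 2>/dev/null) || actual="unavailable"
    check_param "$name" "$allowed" "$actual"
done <<'EOF'
net.ipv4.tcp_tw_recycle 0
net.ipv4.tcp_syncookies 0
net.core.somaxconn 32768
vm.swappiness 0
vm.overcommit_memory 0|1
fs.file-max 1000000
EOF
```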

### Transparent Huge Pages (THP)

Check whether THP is enabled on the target machine. It is recommended to disable THP.
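
On Linux, the current THP setting is exposed in `/sys/kernel/mm/transparent_hugepage/enabled`, where the active value is shown in square brackets. The following sketch reads that file. The helper name `thp_active_value` is hypothetical, and the sketch only illustrates a manual check, not the logic used by tiup-cluster.

```shell
# Hypothetical helper: extract the active value (the one in square brackets)
# from a line such as "always madvise [never]".
thp_active_value() {
    echo "$1" | sed -n 's/.*\[\(.*\)\].*/\1/p'
}

thp_file=/sys/kernel/mm/transparent_hugepage/enabled
if [ -r "$thp_file" ]; then
    state=$(thp_active_value "$(cat "$thp_file")")
    if [ "$state" = "never" ]; then
        echo "Pass: THP is disabled"
    else
        echo "Fail: THP is set to $state"
    fi
fi
```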

### System limits

Check the limit values in the `/etc/security/limits.conf` file:

```
<deploy-user> soft nofile 1000000
<deploy-user> hard nofile 1000000
<deploy-user> soft stack 10240
```

`<deploy-user>` is the user who deploys and runs the TiDB cluster, and the last column is the minimum value required for the system.
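
A manual version of this check can be sketched as below. The helpers `get_limit` and `check_limit` and the user name `tidb` are hypothetical. The sketch only reads the main limits file and does not handle the `unlimited` keyword or files under `/etc/security/limits.d/`.

```shell
# Hypothetical helper: print the configured value of one limit.
# $1 = limits file, $2 = user, $3 = type (soft|hard), $4 = item (nofile|stack).
get_limit() {
    awk -v u="$2" -v t="$3" -v i="$4" \
        '$1 == u && $2 == t && $3 == i { v = $4 } END { if (v != "") print v }' \
        "$1" 2>/dev/null
}

# Hypothetical helper: report whether a limit meets the required minimum ($5).
check_limit() {
    value=$(get_limit "$1" "$2" "$3" "$4")
    if [ -n "$value" ] && [ "$value" -ge "$5" ] 2>/dev/null; then
        echo "Pass: $2 $3 $4 = $value"
    else
        echo "Fail: $2 $3 $4 = ${value:-unset} (minimum $5)"
    fi
}

limits_file=/etc/security/limits.conf
deploy_user=tidb   # assumed name of the deploy user
if [ -r "$limits_file" ]; then
    check_limit "$limits_file" "$deploy_user" soft nofile 1000000
    check_limit "$limits_file" "$deploy_user" hard nofile 1000000
    check_limit "$limits_file" "$deploy_user" soft stack 10240
fi
```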

### SELinux

Check whether SELinux is enabled. It is recommended to disable SELinux.

### Firewall

Check whether the FirewallD service is enabled. It is recommended to either disable the FirewallD service or add permission rules for each service in the TiDB cluster.

### irqbalance

Check whether the irqbalance service is enabled. It is recommended to enable the irqbalance service.

### Disk mount options

Check the mount options of ext4 partitions. Make sure that the mount options include both `nodelalloc` and `noatime`.
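
As a rough illustration of this check, the following sketch lists the ext4 mounts in `/proc/mounts` (Linux only) and reports which of the two options are missing. The helper names `has_option` and `check_mount_options` are hypothetical, and the sketch is not taken from tiup-cluster.

```shell
# Hypothetical helper: succeed if a comma-separated mount option string ($1)
# contains the given option ($2) as a whole entry.
has_option() {
    case ",$1," in
        *",$2,"*) return 0 ;;
        *)        return 1 ;;
    esac
}

# Hypothetical helper: report whether both required options are present.
check_mount_options() {
    missing=""
    for opt in nodelalloc noatime; do
        has_option "$1" "$opt" || missing="$missing $opt"
    done
    if [ -z "$missing" ]; then
        echo "Pass"
    else
        echo "Fail: missing$missing"
    fi
}

# Fields in /proc/mounts: device, mount point, file system type, options.
if [ -r /proc/mounts ]; then
    awk '$3 == "ext4" { print $2, $4 }' /proc/mounts |
    while read -r mount_point options; do
        echo "$mount_point: $(check_mount_options "$options")"
    done
fi
```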

### Port usage

Check whether the ports defined in the topology (including the default ports that are filled in automatically) are already used by processes on the target machine.

> **Note:**
>
> The port usage check assumes that a cluster is not started yet. If a cluster is already deployed and started, the port usage check on the cluster fails because the ports must be in use in this case.

### CPU core number

Check the CPU information of the target machine. For a production cluster, it is recommended that the number of logical CPU cores is greater than or equal to 16.

> **Note:**
>
> CPU core number is not checked by default. To enable the check, you need to add the `--enable-cpu` option to the command.

### Memory size

Check the memory size of the target machine. For a production cluster, it is recommended that the total memory capacity is greater than or equal to 32 GB.

> **Note:**
>
> Memory size is not checked by default. To enable the check, you need to add the `--enable-mem` option to the command.

### Fio disk performance test

Use flexible I/O tester (fio) to test the performance of the disk where `data_dir` is located, including the following three test items:

- fio_randread_write_latency
- fio_randread_write
- fio_randread

> **Note:**
>
> The fio disk performance test is not performed by default. To perform the test, you need to add the `--enable-disk` option to the command.

## Syntax

```shell
tiup cluster check <topology.yml | cluster-name> [flags]
```

- If a cluster is not deployed yet, you need to pass the [topology.yml](/tiup/tiup-cluster-topology-reference.md) file that is used to deploy the cluster. tiup-cluster connects to the machines listed in this file to perform the check.
- If a cluster is already deployed, you can use the `<cluster-name>` as the check object.

> **Note:**
>
> If `<cluster-name>` is used for the check, you need to add the `--cluster` option in the command.
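
The two forms of the command might look like the following sketch. The topology path `./topology.yml`, the user `tidb`, and the cluster name `prod-cluster` are assumed placeholders. The commands are only printed, because `run` is set to `echo`; set `run=` to execute them on a machine where TiUP is installed.

```shell
run=echo   # dry run: print the commands instead of executing them

check_examples() {
    # Before deployment: check the machines listed in the topology file.
    $run tiup cluster check ./topology.yml --user tidb
    # After deployment: check an existing cluster by name (requires --cluster).
    $run tiup cluster check prod-cluster --cluster
    # Also run the checks that are disabled by default.
    $run tiup cluster check ./topology.yml --enable-cpu --enable-mem --enable-disk
}

check_examples
```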

## Options

### --apply

- Attempts to automatically repair the failed check items. Currently, tiup-cluster only attempts to repair the following check items:
    - SELinux
    - Firewall
    - irqbalance
    - Kernel parameters
    - System limits
    - THP (Transparent Huge Pages)
- Data type: `BOOLEAN`
- This option is disabled by default with the `false` value. To enable this option, add this option to the command, and either pass the `true` value or do not pass any value.

### --cluster

- Indicates that the check is for the deployed clusters.
- Data type: `BOOLEAN`
- This option is disabled by default with the `false` value. To enable this option, add this option to the command, and either pass the `true` value or do not pass any value.

> **Note:**
>
> tiup-cluster supports checking both un-deployed clusters and deployed clusters with the following command format:
>
> ```shell
> tiup cluster check <topology.yml | cluster-name> [flags]
> ```
>
> If the `tiup cluster check <cluster-name>` command is used, you must add the `--cluster` option: `tiup cluster check <cluster-name> --cluster`.

### --enable-cpu

- Enables the check of CPU core number.
- Data type: `BOOLEAN`
- This option is disabled by default with the `false` value. To enable this option, add this option to the command, and either pass the `true` value or do not pass any value.

### --enable-disk

- Enables the fio disk performance test.
- Data type: `BOOLEAN`
- This option is disabled by default with the `false` value. To enable this option, add this option to the command, and either pass the `true` value or do not pass any value.

### --enable-mem

- Enables the memory size check.
- Data type: `BOOLEAN`
- This option is disabled by default with the `false` value. To enable this option, add this option to the command, and either pass the `true` value or do not pass any value.

### -u, --user

- Specifies the user name to connect to the target machine. The specified user needs to have passwordless sudo privileges on the target machine.
- Data type: `STRING`
- If this option is not specified in the command, the user who executes the command is used as the default value.

> **Note:**
>
> This option is valid only if the `--cluster` option is false. Otherwise, the value of this option is fixed to the username specified in the topology file for the cluster deployment.

### -i, --identity_file

- Specifies the key file to connect to the target machine.
- Data type: `STRING`
- The option is enabled by default with `~/.ssh/id_rsa` (the default value) passed in.

> **Note:**
>
> This option is valid only if the `--cluster` option is false. Otherwise, the value of this option is fixed to `${TIUP_HOME}/storage/cluster/clusters/<cluster-name>/ssh/id_rsa`.

### -p, --password

- Logs in with a password when connecting to the target machine.
    - If the `--cluster` option is added for a cluster, the password is the password of the user specified in the topology file when the cluster was deployed.
    - If the `--cluster` option is not added for a cluster, the password is the password of the user specified in the `-u/--user` option.
- Data type: `BOOLEAN`
- This option is disabled by default with the `false` value. To enable this option, add this option to the command, and either pass the `true` value or do not pass any value.

### -h, --help

- Prints the help information of the related commands.
- Data type: `BOOLEAN`
- This option is disabled by default with the `false` value. To enable this option, add this option to the command, and either pass the `true` value or do not pass any value.

## Output

A table containing the following fields:

- `Node`: the target node
- `Check`: the check item
- `Result`: the check result (Pass, Warn, or Fail)
- `Message`: the result description
58 changes: 58 additions & 0 deletions tiup/tiup-component-dm-display.md
@@ -0,0 +1,58 @@
---
title: tiup dm display
---

# tiup dm display

If you want to check the operational status of each component in a DM cluster, it is inefficient to log in to each machine one by one. Therefore, tiup-dm provides the `tiup dm display` command to do this job efficiently.

## Syntax

```shell
tiup dm display <cluster-name> [flags]
```

`<cluster-name>` is the name of the cluster to operate on. If you forget the cluster name, check it in [tiup cluster list](/tiup/tiup-component-cluster-list.md).
Suggested change
`<cluster-name>` is the name of the cluster to operate on. If you forget the cluster name, check it in [tiup cluster list](/tiup/tiup-component-cluster-list.md).
`<cluster-name>` is the name of the cluster to operate on. If you forget the cluster name, you can check it using the [`tiup cluster list`](/tiup/tiup-component-cluster-list.md) command.


## Options

### -N, --node

- Specifies the IDs of the nodes to query, separated by commas for multiple nodes. If you are not sure about the ID of a node, you can skip this option in the command to show the IDs and status of all nodes in the output.
- Data type: `STRING`
- This option is enabled by default with `[]` (which means all nodes) passed in.

> **Note:**
>
> If `-R, --role` is also specified, only the services in the intersection of the specified nodes and roles are queried.

### -R, --role

- Specifies the roles to query, separated by commas for multiple roles. If you are not sure about the role deployed on a node, you can skip this option in the command to show the roles and status of all nodes in the output.
- Data type: `STRING`
- This option is enabled by default with `[]` (which means all roles) passed in.

> **Note:**
>
> If `-N, --node` is also specified, only the services in the intersection of the specified nodes and roles are queried.
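
The `-N` and `-R` options can be combined as in the following sketch. The cluster name `dm-test`, the node IDs, and the role name `dm-worker` are assumed placeholders. The commands are only printed, because `run` is set to `echo`; set `run=` to execute them on a machine where TiUP is installed.

```shell
run=echo   # dry run: print the commands instead of executing them

display_examples() {
    # Show all nodes of the cluster.
    $run tiup dm display dm-test
    # Show two specific nodes, separated by a comma.
    $run tiup dm display dm-test -N 10.0.1.11:8261,10.0.1.12:8262
    # Show all nodes with a given role.
    $run tiup dm display dm-test -R dm-worker
    # Show only the services that match both the node and the role.
    $run tiup dm display dm-test -N 10.0.1.12:8262 -R dm-worker
}

display_examples
```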

### -h, --help

- Prints the help information.
- Data type: `BOOLEAN`
- This option is disabled by default with the `false` value. To enable this option, add this option to the command, and either pass the `true` value or do not pass any value.

## Output

- Cluster name
- Cluster version
- SSH client type
- A table containing the following fields:
    - `ID`: the node ID, consisting of IP:PORT.
    - `Role`: the service role deployed on the node (for example, TiDB or TiKV).
    - `Host`: the IP address of the machine corresponding to the node.
    - `Ports`: the port number used by the service.
    - `OS/Arch`: the operating system and machine architecture of the node.
    - `Status`: the current status of the services on the node.
    - `Data Dir`: the data directory of the service. `-` means that there is no data directory.
    - `Deploy Dir`: the deployment directory of the service.
28 changes: 28 additions & 0 deletions tiup/tiup-component-dm-upgrade.md
@@ -0,0 +1,28 @@
---
title: tiup dm upgrade
---

# tiup dm upgrade

The `tiup dm upgrade` command is used to upgrade a specified cluster to a specific version.

## Syntax

```shell
tiup dm upgrade <cluster-name> <version> [flags]
```

- `<cluster-name>` is the name of the cluster to operate on. If you forget the cluster name, check it in [tiup cluster list](/tiup/tiup-component-cluster-list.md).
- `<version>` is the target version to upgrade to. Currently, you can only upgrade to a later version. Downgrading to an earlier version is not allowed, and upgrading to a nightly version is not allowed either.

## Options

### -h, --help

- Prints the help information.
- Data type: `BOOLEAN`
- This option is disabled by default with the `false` value. To enable this option, add this option to the command, and either pass the `true` value or do not pass any value.

## Output

Log of the service upgrade process.