remove zone #1648

Merged
merged 10 commits into from
Apr 15, 2022
2 changes: 1 addition & 1 deletion docs-2.0/20.appendix/0.FAQ.md
@@ -39,7 +39,7 @@ Nebula Graph is under continuous development, and the behavior of features or operations may change

Starting with Nebula Graph 3.0.0, the query statements `LOOKUP`, `GO`, and `FETCH` must use a `YIELD` clause to specify the output. For details, see [YIELD](../3.ngql-guide/8.clauses-and-options/yield.md).

### How to resolve the error message `Zone not enough!`
### How to resolve the error message `Host not enough!`

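Starting with version 3.0.0, Storage nodes added in the configuration file cannot be read from or written to directly; the configuration file only registers the Storage nodes with the Meta service. You must run the `ADD HOSTS` command before the Storage nodes can be read and written normally. For details, see [Manage Storage hosts](../4.deployment-and-installation/manage-storage-host.md).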
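
As a minimal sketch of the step described above, assuming a Storage host at the hypothetical address `192.168.8.101:9779` that is already listed in the configuration file, the host could be added and then checked as follows:

```ngql
# Hypothetical address; replace it with the address of your own Storage host.
nebula> ADD HOSTS 192.168.8.101:9779;
# Check that the host is listed and that its status is ONLINE.
nebula> SHOW HOSTS;
```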

19 changes: 19 additions & 0 deletions docs-2.0/3.ngql-guide/4.job-statements.md
@@ -6,6 +6,25 @@

All job management commands can be executed only after a graph space has been selected.

## SUBMIT JOB BALANCE DATA

!!! enterpriseonly

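    This feature is supported only in the Enterprise Edition.

The `SUBMIT JOB BALANCE DATA` statement starts a job in the current graph space to balance the distribution of partitions. The command returns the job ID.

Example: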

```ngql
nebula> SUBMIT JOB BALANCE DATA;
+------------+
| New Job Id |
+------------+
| 28         |
+------------+
```
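
The returned job ID can then be used to follow the job. A short sketch, assuming the ID `28` from the example output above:

```ngql
# 28 is the job ID from the example above; use the ID returned in your own session.
nebula> SHOW JOB 28;
```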

<!-- balance-3.1
## SUBMIT JOB BALANCE IN ZONE

11 changes: 11 additions & 0 deletions docs-2.0/3.ngql-guide/9.space-statements/4.describe-space.md
@@ -12,6 +12,16 @@ DESC[RIBE] SPACE <graph_space_name>;

## Example

```ngql
nebula> DESCRIBE SPACE basketballplayer;
+----+--------------------+------------------+----------------+---------+------------+--------------------+-------------+---------+
| ID | Name               | Partition Number | Replica Factor | Charset | Collate    | Vid Type           | Atomic Edge | Comment |
+----+--------------------+------------------+----------------+---------+------------+--------------------+-------------+---------+
| 1  | "basketballplayer" | 10               | 1              | "utf8"  | "utf8_bin" | "FIXED_STRING(32)" | false       |         |
+----+--------------------+------------------+----------------+---------+------------+--------------------+-------------+---------+
```

<!--
```ngql
nebula> DESCRIBE SPACE basketballplayer;
+----+--------------------+------------------+----------------+---------+------------+--------------------+-------------+-----------+---------+
@@ -20,3 +30,4 @@ nebula> DESCRIBE SPACE basketballplayer;
| 1  | "basketballplayer" | 10               | 1              | "utf8"  | "utf8_bin" | "FIXED_STRING(32)" | false       | "default" |         |
+----+--------------------+------------------+----------------+---------+------------+--------------------+-------------+-----------+---------+
```
-->
113 changes: 111 additions & 2 deletions docs-2.0/8.service-tuning/load-balance.md
@@ -2,9 +2,118 @@

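You can use the `BALANCE` statements to balance the distribution of partitions and Raft leaders, or to clear some Storage servers for easier maintenance. For details, see [BALANCE](../synchronization-and-migration/2.balance-syntax.md).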

!!! compatibility "Legacy version compatibility"
!!! danger

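    The `BALANCE` commands migrate data and balance the partition distribution by creating and executing a set of subtasks. **Do not** stop any machine in the cluster or change a machine's IP address until all subtasks are finished; otherwise, the subsequent subtasks fail.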

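## Balance partition distribution

!!! enterpriseonly

    Balancing the partition distribution is supported only in the Enterprise Edition.

The `BALANCE DATA` statement starts a job that distributes the partitions of the current graph space evenly across all Storage servers. It migrates data and balances the partition distribution by creating and executing a set of subtasks.

### Example

Take scaling out Nebula Graph as an example: after new Storage hosts are added to the cluster, there are no partitions on the new hosts.

1. Run `SHOW HOSTS` to check the partition distribution.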

```ngql
nebula> SHOW HOSTS;
+-----------------+------+-----------+----------+--------------+-----------------------+------------------------+-------------+
| Host            | Port | HTTP port | Status   | Leader count | Leader distribution   | Partition distribution | Version     |
+-----------------+------+-----------+----------+--------------+-----------------------+------------------------+-------------+
| "192.168.8.101" | 9779 | 19669     | "ONLINE" | 0            | "No valid partition"  | "No valid partition"   | "3.1.0-ent" |
| "192.168.8.100" | 9779 | 19669     | "ONLINE" | 15           | "basketballplayer:15" | "basketballplayer:15"  | "3.1.0-ent" |
+-----------------+------+-----------+----------+--------------+-----------------------+------------------------+-------------+
```

2. Enter the graph space `basketballplayer`, and then run `BALANCE DATA` to balance the distribution of all partitions.

```ngql
nebula> USE basketballplayer;
nebula> BALANCE DATA;
+------------+
| New Job Id |
+------------+
| 2          |
+------------+
```

3. Using the returned job ID, run `SHOW JOB <job_id>` to check the job status.

```ngql
nebula> SHOW JOB 2;
+------------------------+------------------------------------------+-------------+---------------------------------+---------------------------------+-------------+
| Job Id(spaceId:partId) | Command(src->dst)                        | Status      | Start Time                      | Stop Time                       | Error Code  |
+------------------------+------------------------------------------+-------------+---------------------------------+---------------------------------+-------------+
| 2                      | "DATA_BALANCE"                           | "FINISHED"  | "2022-04-12T03:41:43.000000000" | "2022-04-12T03:41:53.000000000" | "SUCCEEDED" |
| "2, 1:1"               | "192.168.8.100:9779->192.168.8.101:9779" | "SUCCEEDED" | 2022-04-12T03:41:43.000000      | 2022-04-12T03:41:53.000000      | "SUCCEEDED" |
| "2, 1:2"               | "192.168.8.100:9779->192.168.8.101:9779" | "SUCCEEDED" | 2022-04-12T03:41:43.000000      | 2022-04-12T03:41:53.000000      | "SUCCEEDED" |
| "2, 1:3"               | "192.168.8.100:9779->192.168.8.101:9779" | "SUCCEEDED" | 2022-04-12T03:41:43.000000      | 2022-04-12T03:41:53.000000      | "SUCCEEDED" |
| "2, 1:4"               | "192.168.8.100:9779->192.168.8.101:9779" | "SUCCEEDED" | 2022-04-12T03:41:43.000000      | 2022-04-12T03:41:53.000000      | "SUCCEEDED" |
| "2, 1:5"               | "192.168.8.100:9779->192.168.8.101:9779" | "SUCCEEDED" | 2022-04-12T03:41:43.000000      | 2022-04-12T03:41:53.000000      | "SUCCEEDED" |
| "2, 1:6"               | "192.168.8.100:9779->192.168.8.101:9779" | "SUCCEEDED" | 2022-04-12T03:41:43.000000      | 2022-04-12T03:41:43.000000      | "SUCCEEDED" |
| "2, 1:7"               | "192.168.8.100:9779->192.168.8.101:9779" | "SUCCEEDED" | 2022-04-12T03:41:43.000000      | 2022-04-12T03:41:53.000000      | "SUCCEEDED" |
| "Total:7"              | "Succeeded:7"                            | "Failed:0"  | "In Progress:0"                 | "Invalid:0"                     | ""          |
+------------------------+------------------------------------------+-------------+---------------------------------+---------------------------------+-------------+
```

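4. Wait until all subtasks are finished and the load balancing process ends, and then run `SHOW HOSTS` to confirm that the partitions are evenly distributed.

!!! note

    `BALANCE DATA` does not balance the distribution of leaders. To balance leaders, see [Balance leader distribution](#leader).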

```ngql
nebula> SHOW HOSTS;
+-----------------+------+-----------+----------+--------------+----------------------+------------------------+-------------+
| Host            | Port | HTTP port | Status   | Leader count | Leader distribution  | Partition distribution | Version     |
+-----------------+------+-----------+----------+--------------+----------------------+------------------------+-------------+
| "192.168.8.101" | 9779 | 19669     | "ONLINE" | 7            | "basketballplayer:7" | "basketballplayer:7"   | "3.1.0-ent" |
| "192.168.8.100" | 9779 | 19669     | "ONLINE" | 8            | "basketballplayer:8" | "basketballplayer:8"   | "3.1.0-ent" |
+-----------------+------+-----------+----------+--------------+----------------------+------------------------+-------------+
```

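If any subtask fails, run `RECOVER JOB <job_id>`. If redoing the load balancing still does not solve the problem, ask for help in the [Nebula Graph community](https://discuss.nebula-graph.com.cn/).

### Stop a load balancing job

To stop a load balancing job, run `STOP JOB <job_id>`.

- If no load balancing job is running, an error is returned.

- If a load balancing job is running, `Job stopped` is returned.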
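
For illustration, assuming the job ID `2` returned by the earlier `BALANCE DATA` example (the ID is only an example value), stopping the job and checking the result might look like this:

```ngql
# Job ID 2 is an assumed example value; use the ID returned when the job was submitted.
nebula> STOP JOB 2;
# Inspect the job and its subtasks after the stop request.
nebula> SHOW JOB 2;
```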

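!!! note

    - `STOP JOB <job_id>` does not stop the subtasks that are currently running. Instead, it cancels all subsequent subtasks and sets their status to `INVALID`, then waits for the running subtasks to finish and sets their status to `SUCCEEDED` or `FAILED` according to the result. You can run `SHOW JOB <job_id>` to check the status of the stopped job.
    - After a crash and restart, the job status changes to `QUEUE`. Subtasks that were `INVALID` or `FAILED` are set to `IN_PROGRESS`; subtasks that were `IN_PROGRESS` or `SUCCEEDED` remain unchanged.

Once all subtasks are finished or stopped, you can run `RECOVER JOB <job_id>` to restart the job; the subtasks continue from their original states.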
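
A brief sketch of restarting a stopped job, again assuming the example job ID `2`:

```ngql
# Job ID 2 is an assumed example value.
nebula> RECOVER JOB 2;
# Check that the remaining subtasks are running again.
nebula> SHOW JOB 2;
```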

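### Migrate partitions

To scale in a cluster by migrating the partitions off the specified Storage hosts, use the command `BALANCE DATA REMOVE <ip>:<port> [,<ip>:<port> ...]`.

For example, to migrate the partitions off `192.168.8.100:9779`, run the following commands: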
cooper-lzy marked this conversation as resolved.

```ngql
nebula> BALANCE DATA REMOVE 192.168.8.100:9779;
nebula> SHOW HOSTS;
+-----------------+------+-----------+----------+--------------+-----------------------+------------------------+-------------+
| Host            | Port | HTTP port | Status   | Leader count | Leader distribution   | Partition distribution | Version     |
+-----------------+------+-----------+----------+--------------+-----------------------+------------------------+-------------+
| "192.168.8.101" | 9779 | 19669     | "ONLINE" | 15           | "basketballplayer:15" | "basketballplayer:15"  | "3.1.0-ent" |
| "192.168.8.100" | 9779 | 19669     | "ONLINE" | 0            | "No valid partition"  | "No valid partition"   | "3.1.0-ent" |
+-----------------+------+-----------+----------+--------------+-----------------------+------------------------+-------------+
```
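
The `REMOVE` clause also accepts a comma-separated list of hosts. A sketch under the assumption that the cluster has a further Storage host at the hypothetical address `192.168.8.102:9779` and enough remaining hosts to hold the partitions:

```ngql
# 192.168.8.102:9779 is a hypothetical second host, used only to show the list form.
nebula> BALANCE DATA REMOVE 192.168.8.100:9779, 192.168.8.102:9779;
# Confirm that the listed hosts no longer hold any partitions.
nebula> SHOW HOSTS;
```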

!!! note

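    The `BALANCE DATA` commands are not supported.
    This command only migrates partitions; it does not remove the Storage host from the cluster. To remove a Storage host, see [Manage Storage hosts](../4.deployment-and-installation/manage-storage-host.md).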

<!-- balance-3.1
!!! danger
8 changes: 3 additions & 5 deletions docs-2.0/synchronization-and-migration/2.balance-syntax.md
@@ -2,15 +2,13 @@

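The `BALANCE` statements let the Storage service of Nebula Graph achieve load balancing. For more `BALANCE` examples and details on Storage load balancing, see [Storage load balancing](../8.service-tuning/load-balance.md).

!!! compatibility "Legacy version compatibility"

    The `BALANCE DATA` commands are not supported.

The `BALANCE` syntax is described below.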

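|Syntax|Description|
|:---|:---|
|`BALANCE LEADER`|Starts a job to balance the distribution of leaders across all Zones in the current graph space. The command returns the job ID.|
|`BALANCE DATA`|Starts a job to balance the distribution of all partitions in the current graph space. The command returns the job ID (`job_id`).|
|`BALANCE DATA REMOVE <ip>:<port> [,<ip>:<port> ...]`|Starts a job to move all partitions off the specified Storage services in the current graph space.|
|`BALANCE LEADER`|Starts a job to balance the distribution of all leaders in the current graph space. The command returns the job ID (`job_id`).|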
<!-- balance-3.1
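|`BALANCE IN ZONE [REMOVE <ip>:<port> [,<ip>:<port> ...]]`|Starts a job to balance the distribution of partitions inside each Zone in the current graph space. The command returns the job ID. Use the `REMOVE` option to specify the Storage nodes to be cleared; their partitions are moved to other nodes for easier maintenance.|
|`BALANCE ACROSS ZONE [REMOVE "zone_name" [,"zone_name" ...]]`|Starts a job to balance the distribution of partitions across all Zones in the current graph space so that the number of partitions in each Zone is balanced. The command returns the job ID. Use the `REMOVE` option to specify the Zones to be cleared; the partitions of those Zones are moved to other nodes for easier maintenance.|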