[Zone] Maintenance reminder for downsizing: Downsizing operation may fail, and manual handling of the "balance data" job task is required. #280

Closed
jinyingsunny opened this issue Sep 18, 2023 · 1 comment
Labels
affects/none PR/issue: this bug affects none version. process/done Process of bug severity/none Severity of bug type/bug Type: something is unexpected wontfix Solution: this will not be worked on

Comments

@jinyingsunny

Please check the FAQ documentation before raising an issue
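When a space has many partitions (for example, 100), a scale-in can be interrupted because a balance task fails. The failed job then has to be handled manually, for example with recover job xxx. A typical balance error looks like this: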

I20230918 02:26:21.756175   255 BalanceTask.cpp:81] 10, 5:48,nebulazone-storaged-15.nebulazone-storaged-headless.nebula.svc.cluster.local:9779->nebulazone-storaged-6.nebulazone-storaged-headless.nebula.svc.cluster.local:9779 Transfer leader failed, status Transfer leader failed: E_RETRY_EXHAUSTED
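When many such lines show up in the storaged or metad logs, it can help to pull out the fields that matter. The following is a minimal sketch, not an official parser. The field layout (job ID, then space ID:partition ID, then source->destination host) is an assumption inferred from the single sample line above: job 10 matches the failed job in show jobs, and 5 matches the space ID in desc space. The names BalanceFailure and parse_balance_failure are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class BalanceFailure(NamedTuple):
    job_id: int
    space_id: int
    part_id: int
    src: str
    dst: str
    error: str


# Assumed layout, inferred from one sample log line:
#   "...] <job>, <space>:<part>,<src>-><dst> <message>: <E_CODE>"
_PATTERN = re.compile(
    r"\]\s*(?P<job>\d+),\s*(?P<space>\d+):(?P<part>\d+),"
    r"(?P<src>\S+)->(?P<dst>\S+)\s.*?(?P<error>E_[A-Z_]+)\s*$"
)


def parse_balance_failure(line: str) -> Optional[BalanceFailure]:
    """Return the parsed fields, or None if the line does not match."""
    m = _PATTERN.search(line)
    if m is None:
        return None
    return BalanceFailure(
        job_id=int(m["job"]),
        space_id=int(m["space"]),
        part_id=int(m["part"]),
        src=m["src"],
        dst=m["dst"],
        error=m["error"],
    )
```

Grouping the results by space_id gives a quick list of the spaces whose balance job needs attention.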

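In addition, because a scale-in has to run balance data in zone remove on each space one by one, the more spaces there are, the more likely the failure above becomes. Users need to find the spaces where balance data failed and recover the job in each of them.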
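A rough way to see why the number of spaces matters: if each space's balance job is assumed to fail independently with the same probability p, the chance that at least one fails is 1 - (1 - p)^n for n spaces. Both the independence assumption and any value of p are illustrative only; nothing here was measured on this cluster, and prob_any_failure is a hypothetical helper.

```python
def prob_any_failure(p_fail_per_space: float, num_spaces: int) -> float:
    """P(at least one space's balance job fails) = 1 - (1 - p)^n.

    Assumes failures are independent and equally likely per space,
    which is a simplification for illustration.
    """
    if not 0.0 <= p_fail_per_space <= 1.0:
        raise ValueError("p_fail_per_space must be in [0, 1]")
    if num_spaces < 0:
        raise ValueError("num_spaces must be non-negative")
    return 1.0 - (1.0 - p_fail_per_space) ** num_spaces
```

With a made-up p of 0.05, one space gives a 5% chance of needing manual recovery, while ten spaces give roughly 40%.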

The cluster status can be checked with kubectl -n <namespace> describe nc <cluster_name>

(root@nebula) [王者荣耀]> show jobs
+--------+----------------+------------+----------------------------+----------------------------+
| Job Id | Command        | Status     | Start Time                 | Stop Time                  |
+--------+----------------+------------+----------------------------+----------------------------+
| 10     | "DATA_BALANCE" | "FAILED"   | 2023-09-18T02:25:51.000000 | 2023-09-18T02:26:37.000000 |
| 6      | "STATS"        | "FINISHED" | 2023-09-18T02:14:30.000000 | 2023-09-18T02:14:30.000000 |
+--------+----------------+------------+----------------------------+----------------------------+
Got 2 rows (time spent 7.862ms/8.791903ms)

Mon, 18 Sep 2023 02:29:52 UTC
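Finding the failed job by eye works for one space but gets tedious with many. Below is a minimal sketch that reads pasted show jobs output in the table format shown above and prints a recover job statement for each failed DATA_BALANCE job. The helper names parse_console_table and recover_statements are hypothetical, and the sketch assumes the console table layout does not change. As far as I know, the statement has to be run in the space that owns the job, so the output would be collected per space.

```python
from typing import Dict, List


def parse_console_table(text: str) -> List[Dict[str, str]]:
    """Parse a nebula-console ASCII table into a list of row dicts."""
    rows: List[Dict[str, str]] = []
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip borders, prompts, and the "Got N rows" footer
        cells = [c.strip().strip('"') for c in line.strip("|").split("|")]
        if header is None:
            header = cells  # first data line is the column header
        else:
            rows.append(dict(zip(header, cells)))
    return rows


def recover_statements(show_jobs_output: str) -> List[str]:
    """Return one RECOVER JOB statement per failed DATA_BALANCE job."""
    return [
        f"RECOVER JOB {row['Job Id']};"
        for row in parse_console_table(show_jobs_output)
        if row.get("Command") == "DATA_BALANCE" and row.get("Status") == "FAILED"
    ]
```

For the table above this yields a single statement for job 10; the finished STATS job is ignored.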

(root@nebula) [王者荣耀]> desc space  王者荣耀
+----+------------+------------------+----------------+---------+------------+--------------------+-----------------------------------------------------+---------+
| ID | Name       | Partition Number | Replica Factor | Charset | Collate    | Vid Type           | Zones                                               | Comment |
+----+------------+------------------+----------------+---------+------------+--------------------+-----------------------------------------------------+---------+
| 5  | "王者荣耀" | 100              | 3              | "utf8"  | "utf8_bin" | "FIXED_STRING(32)" | ["us-central1-a", "us-central1-b", "us-central1-c"] |         |
+----+------------+------------------+----------------+---------+------------+--------------------+-----------------------------------------------------+---------+
Got 1 rows (time spent 6.739ms/13.160116ms)

Mon, 18 Sep 2023 02:43:43 UTC

Your Environments (required)

operator image: 1.6

@jinyingsunny jinyingsunny added the type/bug Type: something is unexpected label Sep 18, 2023
@github-actions github-actions bot added affects/none PR/issue: this bug affects none version. severity/none Severity of bug labels Sep 18, 2023
@QingZ11 QingZ11 changed the title from "[zone]缩容运维提醒,缩容可能失败,需要手动处理下balance data的job任务" to '[Zone] Maintenance reminder for downsizing: Downsizing operation may fail, and manual handling of the "balance data" job task is required.' Sep 18, 2023
@jinyingsunny jinyingsunny added wontfix Solution: this will not be worked on and removed affects/none PR/issue: this bug affects none version. labels Oct 10, 2023
@jinyingsunny

mark as known issue

@github-actions github-actions bot added the affects/none PR/issue: this bug affects none version. label Oct 10, 2023
@jinyingsunny jinyingsunny added the process/done Process of bug label Oct 10, 2023
@github-actions github-actions bot added process/fixed Process of bug and removed process/fixed Process of bug labels Oct 10, 2023