Operator should stop current load balancer job and start a new one if retry job constantly fails #512

kevinliu24 · 2024-07-01T23:30:56Z

Introduction
Currently, when a data balance job fails, the operator keeps retrying it until it succeeds. Sometimes success is not possible (e.g. if the job was manually removed). We should detect the job status and retry only a few times after a failure. If the job still fails, we should stop the current data balance job and start a new one.

Contents
If a load balance job fails after a few retries, stop the current job and start a new one.
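
The proposed behavior can be sketched roughly as follows. This is only an illustration of the bounded-retry idea, not the operator's actual code: `BalanceOps`, its `Retry`/`Stop`/`Start` hooks, `RecoverBalanceJob`, and the `maxRetries` parameter are all hypothetical names. It also assumes that a failure to stop the old job can be ignored, since the job may already have been removed manually.

```go
package main

import (
	"errors"
	"fmt"
)

// BalanceOps groups the operations this sketch needs. All three hooks are
// hypothetical and do not correspond to the operator's real API.
type BalanceOps struct {
	Retry func(jobID int) error // re-run an existing balance job
	Stop  func(jobID int) error // stop an existing balance job
	Start func() (int, error)   // submit a fresh balance job and return its ID
}

// ErrNoOps is returned when a required hook is missing.
var ErrNoOps = errors.New("balance ops not fully configured")

// RecoverBalanceJob retries jobID at most maxRetries times. If every retry
// fails, it stops the job and starts a new one. It returns the ID of the job
// that should be tracked from now on.
func RecoverBalanceJob(ops BalanceOps, jobID, maxRetries int) (int, error) {
	if ops.Retry == nil || ops.Stop == nil || ops.Start == nil {
		return 0, ErrNoOps
	}

	var lastErr error
	for attempt := 1; attempt <= maxRetries; attempt++ {
		if lastErr = ops.Retry(jobID); lastErr == nil {
			return jobID, nil // the existing job recovered
		}
	}

	// Retries exhausted. Assumption: an error from Stop is tolerated because
	// the job may no longer exist (for example, it was removed manually).
	_ = ops.Stop(jobID)

	newID, err := ops.Start()
	if err != nil {
		return 0, fmt.Errorf("start replacement for job %d (last retry error: %v): %w",
			jobID, lastErr, err)
	}
	return newID, nil
}
```

A caller would wire the three hooks to whatever actually drives balance jobs and invoke something like `RecoverBalanceJob(ops, currentJobID, 3)` from the reconcile loop, storing the returned ID as the job to watch.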

Related work

kevinliu24 added the type/enhancement Type: make the code neat or more efficient label Jul 1, 2024

kevinliu24 assigned kevinliu24 and MegaByte875 Jul 1, 2024

wey-gu mentioned this issue Jul 6, 2024

Weekly Report 2024-07-05 vesoft-inc/nebula-community#442

Open
