compaction job won't fail and keep running in the job management when one of the nodes stopped #4451

Closed
henry202108 opened this issue Jul 22, 2022 · 1 comment
Labels
invalid Solution: this issue is invalid and will be closed type/bug Type: something is unexpected

@henry202108

Please check the FAQ documentation before raising an issue

After a compaction job has started, if one of the nodes stops, the job does not fail; it keeps running until it is stopped manually. Any job submitted in the meantime stays in queue status.

Your Environments (required)

  • OS: uname -a
  • Compiler: g++ --version or clang++ --version
  • CPU: lscpu
  • Commit id (e.g. a3ffc7d8)

Linux 192-168-8-191 3.10.0-1160.36.2.el7.x86_64 #1 SMP Wed Jul 21 11:57:15 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
g++ (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 4
On-line CPU(s) list: 0-3
Thread(s) per core: 1

f895632

How To Reproduce (required)

Steps to reproduce the behavior:

  1. Start the compaction job.
  2. Kill one of the storage services.
  3. Observe that the job does not fail and keeps running.

Expected behavior

In the job management, the job should fail instead of continuing to run when one of the nodes stops.

Additional context

@henry202108 henry202108 added the type/bug Type: something is unexpected label Jul 22, 2022
@Sophie-Xie Sophie-Xie added this to the v3.3.0 milestone Jul 22, 2022
@critical27
Contributor

In the job management, the job should fail instead of continuing to run when one of the nodes stops.

I don't agree:

  1. Even if only some of the nodes run the job, the job is still meaningful; the work they do is not useless.
  2. Job Manager only runs a job asynchronously. It is expected that the job keeps running when some nodes fail after it has started; this is by design.
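
To make the behavior discussed in this thread concrete, here is a toy model in Python. It is only a sketch of the symptoms described above, not NebulaGraph's actual Job Manager code: the class names, node names, and status strings are all hypothetical, and it assumes that only one job runs at a time and that a job completes only once every node has reported.

```python
from collections import deque
from enum import Enum


class Status(Enum):
    QUEUE = "QUEUE"
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    STOPPED = "STOPPED"


class Job:
    def __init__(self, job_id, nodes):
        self.job_id = job_id
        self.status = Status.QUEUE
        # One task per storage node; None means "no result reported yet".
        self.results = {node: None for node in nodes}


class ToyJobManager:
    """Hypothetical manager: runs one job at a time, later jobs wait."""

    def __init__(self):
        self.queue = deque()
        self.current = None

    def submit(self, job):
        self.queue.append(job)
        self._schedule()

    def _schedule(self):
        # Asynchronous dispatch: tasks are handed out and the manager just
        # waits for reports. There is no liveness check on the nodes.
        if self.current is None and self.queue:
            self.current = self.queue.popleft()
            self.current.status = Status.RUNNING

    def report(self, job, node, ok):
        job.results[node] = ok
        # The job only completes once every node has reported back.
        if all(result is not None for result in job.results.values()):
            job.status = Status.FINISHED
            self._release(job)

    def stop(self, job):
        # Manual stop, the only way out when a node never reports.
        if job in self.queue:
            self.queue.remove(job)
        job.status = Status.STOPPED
        self._release(job)

    def _release(self, job):
        if self.current is job:
            self.current = None
            self._schedule()


manager = ToyJobManager()
compact = Job(1, ["storaged-0", "storaged-1", "storaged-2"])
other = Job(2, ["storaged-0", "storaged-1"])
manager.submit(compact)
manager.submit(other)

manager.report(compact, "storaged-0", True)
manager.report(compact, "storaged-1", True)
# storaged-2 was killed and never reports, so the job stays RUNNING
# and the second job stays queued behind it.
print(compact.status.value, other.status.value)

manager.stop(compact)
print(compact.status.value, other.status.value)
```

In this model the two surviving nodes still complete their share of the compaction, which matches point 1, while the second job only leaves the queue after the first one is stopped manually, which matches the report.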

@jinyingsunny jinyingsunny added the invalid Solution: this issue is invalid and will be closed label Nov 11, 2022