[DSIP-69] Fix master dispatch task timeout might cause task duplicate running in worker #16481

Open
2 tasks done
Tracked by #14102
ruanwenjun opened this issue Aug 18, 2024 · 0 comments · May be fixed by #16539

ruanwenjun (Member) commented Aug 18, 2024

Search before asking

  • I have searched the existing DSIPs and found no similar DSIP.

Motivation

Currently, there are cases in which a task can be dispatched more than once. For example:

image

The master first dispatches task a to worker A but receives a timeout response, which can happen when the worker's RPC is busy. The master then selects a new worker B and retries the dispatch.

Two situations are then possible:

  1. Worker A has received the task. The task then exists on both worker A and worker B, and both workers execute it. In a worse case, the task may be duplicated on even more workers.
  2. Worker A has not received the task. The task is then not executed more than once.

The first situation is not acceptable.
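The first situation can be reproduced with a small simulation. This is only an illustrative sketch, not DolphinScheduler code: the `Worker` class, the `ack_times_out` flag, `DispatchTimeout`, and `dispatch_with_retry` are all hypothetical names, and the retry loop assumes the master simply moves to the next worker after a timeout, as described above.

```python
from dataclasses import dataclass, field


class DispatchTimeout(Exception):
    """Raised when the master gets no acknowledgement in time."""


@dataclass
class Worker:
    name: str
    # Hypothetical flag: the worker accepts the task, but its ack never
    # reaches the master in time (for example, because its RPC is busy).
    ack_times_out: bool = False
    received: list = field(default_factory=list)

    def dispatch(self, task_id: str) -> None:
        # The task is accepted before the acknowledgement is sent back.
        self.received.append(task_id)
        if self.ack_times_out:
            raise DispatchTimeout(f"no ack from {self.name} for {task_id}")


def dispatch_with_retry(task_id: str, workers: list) -> str:
    """Naive retry: on timeout, select the next worker and dispatch again."""
    for worker in workers:
        try:
            worker.dispatch(task_id)
            return worker.name
        except DispatchTimeout:
            # The master cannot tell whether the worker received the task.
            continue
    raise RuntimeError(f"no worker accepted {task_id}")


worker_a = Worker("A", ack_times_out=True)
worker_b = Worker("B")
acked_by = dispatch_with_retry("task-a", [worker_a, worker_b])

# Workers that now hold the task and would execute it.
holders = [w.name for w in (worker_a, worker_b) if "task-a" in w.received]
print(acked_by, holders)  # B ['A', 'B']
```

The master only sees the acknowledgement from worker B, yet the task is held by both A and B. A timeout alone does not tell the master which of the two situations it is in.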

Design Detail

In order to solve this, we should change the dispatch logic.

image

Compatibility, Deprecation, and Migration Plan

No response

Test Plan

No response

Code of Conduct

ruanwenjun self-assigned this Aug 22, 2024
ruanwenjun changed the title from "[DSIP-66] Fix master dispatch task timeout might cause task duplicate running in worker" to "[DSIP-69] Fix master dispatch task timeout might cause task duplicate running in worker" Aug 28, 2024