ddl: fix runnable ingest job checking #52503
I'll try to use some channels to make the test more stable
@@ -831,6 +831,7 @@ func (d *ddl) Start(ctxPool *pools.ResourcePool) error {
if err != nil {
logutil.BgLogger().Error("error when getting the ddl history count", zap.Error(err))
}
d.runningJobs.clear()
Now we reuse the startDispatchLoop to initialize the runningJobs, relying on the deterministic order from order by processing desc, job_id and adding the jobs one by one.
Consider the case where the former DDL owner wrongly marked jobs as running, for example 100 (running), 101 (pending), 102 (running wrongly). The new DDL owner will now move the states to (running, pending, pending) and then to (finished, pending, running). However, the correct sequence should be (running, pending, pending) and then (finished, running, pending).
I slightly prefer that the new DDL owner re-compute the running jobs instead of reusing the persisted state from the table.
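The ordering concern above can be illustrated with a small standalone sketch. It is not TiDB code: the job struct and the persistedOrder and recomputedOrder helpers are hypothetical names, and it assumes the three jobs conflict with each other, so they run strictly one at a time in the order they are picked.

```go
package main

import (
	"fmt"
	"sort"
)

// job is a hypothetical stand-in for a row in the DDL job table.
type job struct {
	id         int64
	processing bool // flag persisted by the previous DDL owner
}

// ids extracts the job IDs in slice order.
func ids(jobs []job) []int64 {
	out := make([]int64, 0, len(jobs))
	for _, j := range jobs {
		out = append(out, j.id)
	}
	return out
}

// persistedOrder mimics "order by processing desc, job_id":
// jobs flagged as processing come first, then ascending job ID.
func persistedOrder(jobs []job) []int64 {
	sorted := append([]job(nil), jobs...)
	sort.SliceStable(sorted, func(a, b int) bool {
		if sorted[a].processing != sorted[b].processing {
			return sorted[a].processing
		}
		return sorted[a].id < sorted[b].id
	})
	return ids(sorted)
}

// recomputedOrder ignores the persisted flag and orders by job ID only,
// which is what re-computing the running jobs would amount to here.
func recomputedOrder(jobs []job) []int64 {
	sorted := append([]job(nil), jobs...)
	sort.SliceStable(sorted, func(a, b int) bool {
		return sorted[a].id < sorted[b].id
	})
	return ids(sorted)
}

func main() {
	// 102 was wrongly marked as processing by the old owner.
	jobs := []job{{100, true}, {101, false}, {102, true}}
	fmt.Println("persisted flag:", persistedOrder(jobs))
	fmt.Println("recomputed:    ", recomputedOrder(jobs))
}
```

With the persisted flag, 102 is picked ahead of 101 once 100 finishes; ordering by job ID alone picks 101 first, which matches the sequence described as correct.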
“Considering the case that the former DDL owner marked wrong jobs as running, like 100 (running), 101 (pending), 102 (running wrongly).”
How come?
“Considering the case that the former DDL owner marked wrong jobs as running, like 100 (running), 101 (pending), 102 (running wrongly).” How come?
The old owner ran the code from before this PR and marked 102 as running, as in the linked issue.
Codecov Report
Additional details and impacted files
@@ Coverage Diff @@
## master #52503 +/- ##
================================================
+ Coverage 72.1321% 74.5830% +2.4508%
================================================
Files 1467 1493 +26
Lines 426954 439041 +12087
================================================
+ Hits 307971 327450 +19479
+ Misses 99738 91078 -8660
- Partials 19245 20513 +1268
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: lance6716, ywqzzy
/cherry-pick release-8.1
@lance6716: new pull request created to branch In response to this:
/cherry-pick release-7.5
@wjhuang2016: new pull request created to branch In response to this:
Signed-off-by: ti-chi-bot <ti-community-prow-bot@tidb.io>
What problem does this PR solve?
Issue Number: close #52475
Problem Summary: see #52475
What changed and how does it work?
MarkJobProcessing and RetireOwnerHook.
runningJobs.
unfinished jobs when ddl owner is acquired.
Check List
Tests
Side effects
Documentation
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.