Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the workload health controller #17215

Merged
merged 1 commit into from
May 19, 2023
Merged

Conversation

mkeeler
Copy link
Member

@mkeeler mkeeler commented May 3, 2023

Description

This PR builds on #17214. It implements a workload health controller similar to the node health controller that PR introduced.

This one is a little more complex though as workload health depends on health statuses with workload owners but also with the associated nodes. The workload health controller therefore needs an extra dependency that is capable of tracking the node <-> workload associations and to map node watch events into the corresponding workloads that require reconciliation.

Testing & Reproduction steps

  • The unit tests for the controller amount to ~96% code coverage. A future PR will introduce a multi-catalog-controller integration test.

PR Checklist

  • updated test coverage
  • external facing docs updated
  • appropriate backport labels added
  • not a security concern

@mkeeler mkeeler requested a review from ishustava May 3, 2023 13:41
@mkeeler mkeeler force-pushed the catalog-v2/node-health-controller branch from da5a915 to 93c345a Compare May 3, 2023 21:10
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from 509271c to 300b669 Compare May 3, 2023 21:10
@mkeeler mkeeler force-pushed the catalog-v2/node-health-controller branch 6 times, most recently from a66a7e5 to b9ed172 Compare May 5, 2023 16:33
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from 300b669 to 95ae401 Compare May 8, 2023 17:36
@mkeeler mkeeler force-pushed the catalog-v2/node-health-controller branch from b9ed172 to d702cad Compare May 8, 2023 17:39
@mkeeler mkeeler marked this pull request as ready for review May 8, 2023 17:40
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from 95ae401 to 73b732c Compare May 8, 2023 17:41
@mkeeler mkeeler force-pushed the catalog-v2/node-health-controller branch from d702cad to c6bae49 Compare May 8, 2023 20:09
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from 73b732c to 87bc35f Compare May 8, 2023 20:09
@mkeeler mkeeler force-pushed the catalog-v2/node-health-controller branch from c6bae49 to 16fcab1 Compare May 9, 2023 16:44
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from 87bc35f to 5279d27 Compare May 9, 2023 16:44
@mkeeler mkeeler force-pushed the catalog-v2/node-health-controller branch from 16fcab1 to 9594a10 Compare May 10, 2023 11:59
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from 5279d27 to 040875a Compare May 10, 2023 11:59
@mkeeler mkeeler force-pushed the catalog-v2/node-health-controller branch from 9594a10 to 17ee0be Compare May 10, 2023 16:40
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from 040875a to d39340e Compare May 10, 2023 16:41
@mkeeler mkeeler force-pushed the catalog-v2/node-health-controller branch 4 times, most recently from 0d1bae6 to b6f86ee Compare May 10, 2023 18:02
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from d39340e to 2edb268 Compare May 10, 2023 18:05
@mkeeler mkeeler force-pushed the catalog-v2/node-health-controller branch from b6f86ee to 8d05b6c Compare May 10, 2023 19:26
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch 2 times, most recently from a5f7ec5 to fdbf749 Compare May 10, 2023 20:24
@mkeeler mkeeler force-pushed the catalog-v2/node-health-controller branch from 8d05b6c to 7b55777 Compare May 10, 2023 20:50
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from fdbf749 to 49f2f51 Compare May 10, 2023 20:51
@mkeeler mkeeler force-pushed the catalog-v2/node-health-controller branch from 7b55777 to 9f56985 Compare May 11, 2023 21:04
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from 49f2f51 to e590726 Compare May 11, 2023 21:04
@mkeeler mkeeler force-pushed the catalog-v2/node-health-controller branch from 9f56985 to 4a88416 Compare May 11, 2023 21:09
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from e590726 to 6410f76 Compare May 11, 2023 21:09
Copy link
Contributor

@ishustava ishustava left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great Matt!!

I left some comments inline, the biggest one being about the node mapper and whether it's better than the "naive" mapper.

Comment on lines 132 to 159
if resource.EqualStatus(res.Status[StatusKey], newStatus, false) {
return nil
}

_, err = rt.Client.WriteStatus(ctx, &pbresource.WriteStatusRequest{
Id: res.Id,
Key: StatusKey,
Status: newStatus,
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The propagation of this logic and the tests for it makes me feel like it should be in a helper because every controller would need to do the same thing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@boxofrad I agree with Iryna. Where do you think this logic should live. It could be a method on the controller.Runtime or it could live as a helper within the resource package.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically, do you mean calling WriteStatus only if it's changed?

Runtime might work well. I'm imagining the signature will be:

func (r Runtime) WriteStatus(ctx context.Context, res *pbresource.Resource, key string, status *pbresource.Status) error {
    // ...
}

Is that roughly what you have in mind?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that’s what I was thinking.

@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from 6410f76 to 9107f47 Compare May 12, 2023 11:53
@mkeeler mkeeler force-pushed the catalog-v2/node-health-controller branch from 4a88416 to 7eb038c Compare May 12, 2023 12:08
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch 2 times, most recently from 6ed8ced to 6517df1 Compare May 12, 2023 12:16
@mkeeler mkeeler force-pushed the catalog-v2/node-health-controller branch from 7eb038c to 34924a7 Compare May 12, 2023 13:25
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from 6517df1 to a465a8c Compare May 12, 2023 13:26
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from a465a8c to 7e5e873 Compare May 15, 2023 13:49
Base automatically changed from catalog-v2/node-health-controller to main May 15, 2023 13:55
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from 7e5e873 to 5dc5051 Compare May 15, 2023 13:55
Copy link
Contributor

@ishustava ishustava left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I left a few more comments, but don't think they are blocking.

@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from 5dc5051 to de295b1 Compare May 15, 2023 16:24
@mkeeler mkeeler added pr/no-changelog PR does not need a corresponding .changelog entry pr/no-backport labels May 15, 2023
@mkeeler mkeeler force-pushed the catalog-v2/workload-health-controller branch from de295b1 to edb2b71 Compare May 19, 2023 17:36
@mkeeler mkeeler merged commit 1d6a0c8 into main May 19, 2023
@mkeeler mkeeler deleted the catalog-v2/workload-health-controller branch May 19, 2023 17:53
nickethier pushed a commit that referenced this pull request May 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr/no-backport pr/no-changelog PR does not need a corresponding .changelog entry
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants