Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement OperatorPolicy health checks for CSV #196

Merged

Conversation

JeffeyL
Copy link
Contributor

@JeffeyL JeffeyL commented Feb 1, 2024

This change follows the previous pattern of using handle functions in order to handle different resources, in this case for ClusterServiceVersions (CSVs). The handleSubscription function was modified in order to return the unstructured subscription to make accessing CSV metadata easier. Since CSVs are managed by OLM, the controller reports conditions directly from the CSV in order to monitor its health. Changes also include updating relatedObjects accordingly.

Copy link
Member

@JustinKuli JustinKuli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's neat to see how much information is in the CSV status, and how we can bring that into the OperatorPolicy status. This is looking like it's on the right track.

controllers/operatorpolicy_status.go Outdated Show resolved Hide resolved
test/e2e/case40_operator_health_checks_test.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_status.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_status.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_controller.go Show resolved Hide resolved
controllers/operatorpolicy_controller.go Show resolved Hide resolved
controllers/operatorpolicy_status.go Show resolved Hide resolved
controllers/operatorpolicy_status.go Outdated Show resolved Hide resolved
@JeffeyL JeffeyL force-pushed the ACM-6598 branch 7 times, most recently from badaf01 to ceb7b63 Compare February 5, 2024 14:47
Copy link
Member

@JustinKuli JustinKuli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think separate commits for each of these resources (like you have done) is good. This is really shaping up nicely I think!

controllers/operatorpolicy_controller.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_controller.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_controller.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_controller.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_controller.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_status.go Show resolved Hide resolved
controllers/operatorpolicy_status.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_status.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_status.go Outdated Show resolved Hide resolved
@@ -17,7 +17,7 @@ var _ = Describe("Test installing an operator from OperatorPolicy", Ordered, fun
opPolTestNS = "operator-policy-testns"
parentPolicyYAML = "../resources/case38_operator_install/parent-policy.yaml"
parentPolicyName = "parent-policy"
opPolTimeout = 10
opPolTimeout = 60
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👀 what's taking up to 60 seconds? I'd rather we adjust it for that specific test somehow. Maybe a separate Eventually for whatever that test is waiting a while for, before the main check.

It looks like this is only (currently) used in Eventually kinds of checks, so keeping this change wouldn't be the end of the world.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I forget the exact one, but I will check.

Following the 'handle*' function pattern that is
used by the subscription and operatorgroup checks,
similar functionality was added to handle OLM CSVs.
Modifications were made to the original subscription
handle function in order to return the unstructured
subscription object to make checking CSVs easier.

Ref: https://issues.redhat.com/browse/ACM-9284

Signed-off-by: Jeffrey Luo <jeluo@redhat.com>
Copy link
Member

@JustinKuli JustinKuli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! It would be nice to have additional test cases for some of the other possible conditions. I know it's difficult to find an operator that will definitely not just work... In one of my OpenShift experiments I found that a subscription like this:

spec:
  channel: singlenamespace-alpha
  installPlanApproval: Manual
  name: etcd
  source: community-operators
  sourceNamespace: openshift-marketplace
  startingCSV: etcdoperator.v0.9.0

has some problems - have you tried that?

If we really can't find a "broken" situation to test, that's ok for now. Eventually I think we'll need to create our own fake operator that we package so that we can test out odd upgrade situations, including when the deployment is broken, or the CSV gets into a weird status.

controllers/operatorpolicy_controller.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_controller.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_controller.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_controller.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_controller.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_status.go Outdated Show resolved Hide resolved
controllers/operatorpolicy_status.go Outdated Show resolved Hide resolved
@JeffeyL
Copy link
Contributor Author

JeffeyL commented Feb 6, 2024

Still working on the additional test cases, will update here when done.

@JeffeyL JeffeyL force-pushed the ACM-6598 branch 2 times, most recently from c09bf61 to 6fd4d1e Compare February 6, 2024 20:59
@JeffeyL JeffeyL force-pushed the ACM-6598 branch 7 times, most recently from d91bd99 to 7018699 Compare February 7, 2024 06:59
Following the 'handle*' function pattern that is
used by the subscription and operatorgroup checks,
similar functionality was added to handle Deployments.

Ref: https://issues.redhat.com/browse/ACM-6598

Signed-off-by: Jeffrey Luo <jeluo@redhat.com>
Copy link
Member

@JustinKuli JustinKuli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thanks for dealing with all of the comments back and forth :)

@openshift-ci openshift-ci bot added the lgtm label Feb 7, 2024
Copy link

openshift-ci bot commented Feb 7, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JeffeyL, JustinKuli

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants