Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Objects are pruned on templating errors #139

Conversation

yiraeChristineKim
Copy link
Contributor

Description:
Deploying a enforced ConfigurationPolicy with a valid template and then updating the template to be invalid caused the created object to be pruned. (I did it with hub templates, but I'd suspect managed cluster templates would have a similar behavior.)

expected: Not perform any deletion while in this error state.

Ref: https://issues.redhat.com/browse/ACM-5301
Signed-off-by: Yi Rae Kim yikim@redhat.com

@@ -967,7 +969,7 @@ func (r *ConfigurationPolicyReconciler) handleObjectTemplates(plc policyv1.Confi
encryptionConfig, usedKeyCache, err = r.getEncryptionConfig(plc, true)

if err != nil {
addTemplateErrorViolation("", err.Error())
addTemplateErrorViolation("", err.Error(), true)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are more addTemplateErrorViolation I am actually not sure which one should I add "skip removing resources"... Would anyone give me advice?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now the instances I see where you don't set the parameter are:

  • when getEncryptionConfig has an error
  • when templates.NewResolver has an error
  • when ObjectTemplatesRaw is used, but doesn't have any templates, and then yaml.Unmarshal has an error.

I can't think of a situation where I would want things related objects removed when one of these errors occurs. So I wonder if there are really any template errors where we should consider removing resources - I think that when any error occurs, we probably shouldn't clean anything up, just in case.

If addTemplateErrorViolation always passed a deleteDetachedObjs = false parameter (see other comments) to checkRelatedAndUpdate, I think that would be a bit cleaner.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JustinKuli Did I miss anything?!

@@ -967,7 +969,7 @@ func (r *ConfigurationPolicyReconciler) handleObjectTemplates(plc policyv1.Confi
encryptionConfig, usedKeyCache, err = r.getEncryptionConfig(plc, true)

if err != nil {
addTemplateErrorViolation("", err.Error())
addTemplateErrorViolation("", err.Error(), true)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now the instances I see where you don't set the parameter are:

  • when getEncryptionConfig has an error
  • when templates.NewResolver has an error
  • when ObjectTemplatesRaw is used, but doesn't have any templates, and then yaml.Unmarshal has an error.

I can't think of a situation where I would want things related objects removed when one of these errors occurs. So I wonder if there are really any template errors where we should consider removing resources - I think that when any error occurs, we probably shouldn't clean anything up, just in case.

If addTemplateErrorViolation always passed a deleteDetachedObjs = false parameter (see other comments) to checkRelatedAndUpdate, I think that would be a bit cleaner.

@@ -1193,8 +1195,14 @@ func (r *ConfigurationPolicyReconciler) handleObjectTemplates(plc policyv1.Confi
// checkRelatedAndUpdate checks the related objects field and triggers an update on the ConfigurationPolicy
func (r *ConfigurationPolicyReconciler) checkRelatedAndUpdate(
plc policyv1.ConfigurationPolicy, related, oldRelated []policyv1.RelatedObject, sendEvent bool,
isTemplateParseErr ...bool,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would prefer this to not be variadic, because it makes it easier to review that we've considered each time that this function is called. I would also prefer it to be called something like deleteDetachedObjs to describe what it does, rather than where it's coming from, since there could be other cases where we don't want to do this cleanup, but the reason has nothing to do with template errors.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at other times checkRelatedAndUpdate is called, I think there are more error cases where we don't want to delete "detached" objects.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Final note (sorry), I think the function signature is long enough now that each parameter should be on a separate line, like https://github.com/open-cluster-management-io/config-policy-controller/blob/main/controllers/configurationpolicy_controller.go#L1394. But that's just a style thing.

Copy link
Member

@JustinKuli JustinKuli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the update, I think it's looking right overall, just a few last details to iron out! Once we get the tests passing I think this will be good.

controllers/configurationpolicy_controller.go Outdated Show resolved Hide resolved
controllers/configurationpolicy_controller.go Outdated Show resolved Hide resolved
controllers/configurationpolicy_controller.go Outdated Show resolved Hide resolved
Signed-off-by: Yi Rae Kim <yikim@redhat.com>
@openshift-ci openshift-ci bot added the lgtm label Jun 1, 2023
@openshift-ci
Copy link

openshift-ci bot commented Jun 1, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: JustinKuli, yiraeChristineKim

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [JustinKuli,yiraeChristineKim]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants