Handle HSM for HFC #351
Conversation
[APPROVALNOTIFIER] This PR is NOT APPROVED
This pull-request has been approved by: iurygregory
The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
Is this a downstream-only change?
@honza nope, it's just to test downstream first in the setup Jad has (I have a thread in Slack about it)
/hold
Force-pushed from c2448d6 to 0fe2a34
/retest
Force-pushed from 0fe2a34 to 1ed7c6f
hfc.Status.Updates = hfc.Spec.Updates
t := metav1.Now()
hfc.Status.LastUpdated = &t
return nil
Without calling Update() this doesn't have any effect.
You're talking about having the reconciler call Status().Update(), right? i.e. r.Status().Update(info.ctx, info.hfc)
Any ideas on how to access the HostFirmwareComponentsReconciler? Or is there another way to call it?
Since this function is in the baremetalhost controller, you can either define it as a method on the BMH reconciler or call Status().Update() in the caller function.
By the way, at line 1750, are we sure that all the updates have been applied by this point?
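A minimal sketch of the first option, assuming BareMetalHostReconciler embeds controller-runtime's client.Client and that the context is passed in explicitly; this is not the PR's code, just an illustration of making the helper a method so it can persist the status:

import (
	"context"

	metal3api "github.com/metal3-io/baremetal-operator/apis/metal3.io/v1alpha1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

// saveHostFirmwareComponents copies the requested updates into the status and
// persists them; without the Status().Update call the change only exists in memory.
func (r *BareMetalHostReconciler) saveHostFirmwareComponents(ctx context.Context, hfc *metal3api.HostFirmwareComponents) error {
	hfc.Status.Updates = hfc.Spec.Updates
	t := metav1.Now()
	hfc.Status.LastUpdated = &t
	return r.Status().Update(ctx, hfc)
}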
@hroyrh ohhh, that makes some sense!
Regarding L1750, since this is in saveHostFirmwareComponents, I should probably re-think where I'm calling it in actionPreparing (L1174 at the moment); maybe I should move it to L118.. so we would be sure there is nothing else to do and we would do actionComplete right after it.
@@ -1168,6 +1169,10 @@ func (r *BareMetalHostReconciler) actionPreparing(prov provisioner.Provisioner,
	if err != nil {
		return actionError{errors.Wrap(err, "could not save the host provisioning settings")}
	}
	if hfc != nil {
		info.log.Info("saving hostfirmwarecomponents updates into status")
		saveHostFirmwareComponents(hfc, info)
This updates the status when manual cleaning starts, but what if it fails?
I suspect we only want to update this after cleaning has succeeded, but we also have to think about what happens if updating the status fails.
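One possible shape for that, sketched here as an assumption rather than what the PR does: persist the HostFirmwareComponents status only once the provisioner reports the cleaning step finished, and turn a failed status write into an error result so the reconcile is requeued and the write retried:

// Sketch (inside actionPreparing, after the provisioner reports that manual
// cleaning has completed successfully):
if hfc != nil {
	info.log.Info("saving hostfirmwarecomponents updates into status")
	if err := r.saveHostFirmwareComponents(info.ctx, hfc); err != nil {
		// The intent here is to retry only the status write on the next
		// reconcile, since the provisioner already reports cleaning as done.
		return actionError{errors.Wrap(err, "could not save hostfirmwarecomponents status")}
	}
}
return actionComplete{}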
Right! Yeah, I agree we only want to update it if cleaning succeeded.
Do you mean updating the status in case an error happened during cleaning, or something else?
This is how I handled status updates - https://github.com/openshift/baremetal-operator/blob/master/controllers/metal3.io/baremetalhost_controller.go#L1571
Thanks @hroyrh
I mean if the call to Update() fails, do we get the chance to try again? Or will we restart the whole manual cleaning process or something weird like that?
But we don't have a way to remember that the Update failed in the last Reconcile, right? So do we have to check whether the config changes were already applied, by fetching the current Ironic node, and if so simply run Update again rather than the whole manual cleaning process, as you mentioned?
One way could be like @hroyrh mentioned, I think...
We can check whether the information Ironic provides about the Components is different (if there was a firmware update, something would be different in the Ironic DB).
Does that make sense?
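Very roughly, and only as an illustration of the idea (the provisioner call name below is a placeholder, not necessarily the real interface method; reflect and github.com/pkg/errors are assumed to be imported as elsewhere in the controller):

// Compare what Ironic currently reports with what is stored in the status;
// a difference means the firmware update was actually applied.
components, err := prov.GetFirmwareComponents() // placeholder name for the provisioner call
if err != nil {
	return actionError{errors.Wrap(err, "could not get firmware components from ironic")}
}
if !reflect.DeepEqual(hfc.Status.Components, components) {
	hfc.Status.Components = components
	if err := r.Status().Update(info.ctx, hfc); err != nil {
		return actionError{errors.Wrap(err, "could not update hostfirmwarecomponents status")}
	}
}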
@@ -223,9 +223,7 @@ func (r *HostFirmwareComponentsReconciler) updateHostFirmware(info *rhfcInfo, co

	// Update Status if has changed
	if dirty {
		info.log.Info("Status for HostFirmwareComponents changed")
		info.hfc.Status = *newStatus.DeepCopy()
Without this you're no longer writing the components read from ironic.
Ok
Force-pushed from 94b0778 to 84f1818
Force-pushed from 84f1818 to 53919fb
Force-pushed from 57db30c to a903ba9
newStatus.Components = make([]metal3api.FirmwareComponentStatus, len(components))
for i := range info.hfc.Status.Components {
	components[i].DeepCopyInto(&newStatus.Components[i])
}
I've changed to this approach to see if it would help, but maybe it's missing a condition before doing this, checking if !reflect.DeepEqual(info.hfc.Status.Components, components)?
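Something like the following is probably what that guard would look like (a sketch only, not the final code); note it also ranges over components rather than the old status list, so the indices always match the freshly allocated slice:

// Only rewrite the status components (and mark the status dirty) when Ironic
// reports something different from what is already stored.
if !reflect.DeepEqual(info.hfc.Status.Components, components) {
	dirty = true
	newStatus.Components = make([]metal3api.FirmwareComponentStatus, len(components))
	for i := range components {
		components[i].DeepCopyInto(&newStatus.Components[i])
	}
}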
Force-pushed from a903ba9 to 94eb128
Force-pushed from cba0e20 to 6e930a0
Force-pushed from 6e930a0 to e7e1c3a
I've updated this PR to address some of the upstream comments from 1793 and 1821. I'm still struggling to figure out how to properly update the Components information with the newer information Ironic has about the firmware.
Force-pushed from 89a6138 to 934b7a3
@dtantsur @zaneb I'm going to test Scenario 2 - where we have a provisioned BMH and we need to scale-down and scale-up.
- Fixed the comments provided in the upstream review
- Investigation about how to update the newer firmware information in Status after the update
Force-pushed from 934b7a3 to dcb99c2
@iurygregory: The following test failed, say /retest to rerun all failed tests:
Full PR test history. Your PR dashboard. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.
Issues go stale after 90d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle stale. If this issue is safe to close now please do so with /close.
/lifecycle stale
PR needs rebase.
Stale issues rot after 30d of inactivity. Mark the issue as fresh by commenting /remove-lifecycle rotten. If this issue is safe to close now please do so with /close.
/lifecycle rotten
Rotten issues close after 30d of inactivity. Reopen the issue by commenting /reopen.
/close
@openshift-bot: Closed this PR.