Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug 2031049: Fix panic when PlatformStatus VSphere is nil #2865

Merged
merged 1 commit into from
Dec 17, 2021

Conversation

jcpowermac
Copy link
Contributor

This PR is to resolve a panic when
PlatformStatus.VSphere is nil.

This PR is to resolve a panic when
`PlatformStatus.VSphere` is nil.
@openshift-ci openshift-ci bot added the bugzilla/severity-unspecified Referenced Bugzilla bug's severity is unspecified for the PR. label Dec 10, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 10, 2021

@jcpowermac: This pull request references Bugzilla bug 2031049, which is invalid:

  • expected the bug to target the "4.10.0" release, but it targets "---" instead

Comment /bugzilla refresh to re-evaluate validity if changes to the Bugzilla bug are made, or edit the title of this pull request to link to a different bug.

In response to this:

Bug 2031049: Fix panic when PlatformStatus VSphere is nil

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot added the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Dec 10, 2021
@openshift-ci openshift-ci bot requested review from jkyros and yuqi-zhang December 10, 2021 17:56
@jcpowermac
Copy link
Contributor Author

/bugzilla refresh

@openshift-ci openshift-ci bot added bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. and removed bugzilla/severity-unspecified Referenced Bugzilla bug's severity is unspecified for the PR. labels Dec 10, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 10, 2021

@jcpowermac: This pull request references Bugzilla bug 2031049, which is valid. The bug has been moved to the POST state. The bug has been updated to refer to the pull request using the external bug tracker.

3 validation(s) were run on this bug
  • bug is open, matching expected state (open)
  • bug target release (4.10.0) matches configured target release for branch (4.10.0)
  • bug is in the state NEW, which is one of the valid states (NEW, ASSIGNED, ON_DEV, POST, POST)

No GitHub users were found matching the public email listed for the QA contact in Bugzilla (rioliu@redhat.com), skipping review request.

In response to this:

/bugzilla refresh

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci openshift-ci bot removed the bugzilla/invalid-bug Indicates that a referenced Bugzilla bug is invalid for the branch this PR is targeting. label Dec 10, 2021
@jcpowermac
Copy link
Contributor Author

/test e2e-vsphere
/test e2e-vsphere-upi

@jcpowermac
Copy link
Contributor Author

I am still a little concerned about this. This section of code hasn't changed in 18 months.
And I am not entirely sure the scenario in which this occurs because based on CI results its not upi, ipi or platform none.

@jcpowermac
Copy link
Contributor Author

cc: @sinnykumari

@@ -353,8 +353,12 @@ func getIgnitionHost(infraStatus *configv1.InfrastructureStatus) (string, error)
case configv1.OvirtPlatformType:
ignitionHost = net.JoinHostPort(infraStatus.PlatformStatus.Ovirt.APIServerInternalIP, securePortStr)
case configv1.VSpherePlatformType:
if infraStatus.PlatformStatus.VSphere.APIServerInternalIP != "" {
ignitionHost = net.JoinHostPort(infraStatus.PlatformStatus.VSphere.APIServerInternalIP, securePortStr)
if infraStatus.PlatformStatus.VSphere != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed we haven't changed this code in a while, so wondering why we are hitting this bug now. My main concern is, is this something expected on vSphere to have infraStatus.PlatformStatus.VSphere nil or something is missing during vSphere cluster install. Also, can it cause any possible issue later with cluster where we are not adding securePortStr on a vSphere cluster?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I think this was due to the recent PR merge to manage user data secret, as a result of

129d6e4#diff-3e205577bec1a1d1711df8bfeff30e63f044b24cfbdc69bbd7d0052fdc881891

and

59f1381#diff-3e205577bec1a1d1711df8bfeff30e63f044b24cfbdc69bbd7d0052fdc881891

Which we didn't do in sync.go prior to this since we (presumably) didn't need to figure out the ignition port. I may have been a little hasty on merging that PR. Maybe we can revisit that whole section? Otherwise the fix here (although it would get us past the failure) will still cause problems for the managed secret.

Copy link
Contributor

@sinnykumari sinnykumari Dec 14, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like we may hit this bug on other platform handled in getIgnitionHost() right?
Jerry, since you have better context of #2827 , do you think it is better to revert the PR or check with PR author to see if we can fix the issue?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cc @zaneb

I think this is because some of the old ported code may not be updated with updated assumptions on what can be nil on what platform.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no immediate plans to use the managed secret in vSphere, so I think we should go ahead with this fix.

It's not clear to me why infraStatus.PlatformStatus would be non-nil but infraStatus.PlatformStatus.VSphere nil on a vSphere cluster.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so I dug around for this, and we aren't exactly internally consistent on this either. Namely, we have 2 functions that query for this field,

  1. in the operator:
    func onPremPlatformAPIServerInternalIP(cfg mcfgv1.ControllerConfigSpec) (interface{}, error) {
    - this doesn't actually seem to have that exact detection
  2. in the template controller:
    func onPremPlatformAPIServerInternalIP(cfg RenderConfig) (interface{}, error) {
    - This does have the extra detection and the reasoning.

I wonder why the operator one doesn't error. In any case I think we should probably look to de-duplicate the above 2 functions and then use the correct one here. At least we should be safe for the other platforms, so it's ok either way.

We can just merge this PR as is to unblock CI -> properly de-dupe as well.

Copy link
Contributor

@sinnykumari sinnykumari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving based on the discussion on the PR and unblock CI. Let's handle any additional fixes that would be useful in a follow-up PR.

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 15, 2021
@rvanderp3
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Dec 15, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 15, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jcpowermac, rvanderp3, sinnykumari

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

7 similar comments
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

10 similar comments
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@sinnykumari
Copy link
Contributor

/skip

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 16, 2021

@jcpowermac: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/e2e-vsphere 3f0b397 link false /test e2e-vsphere
ci/prow/e2e-aws-single-node 3f0b397 link false /test e2e-aws-single-node
ci/prow/e2e-aws-upgrade-single-node 3f0b397 link false /test e2e-aws-upgrade-single-node
ci/prow/e2e-aws-disruptive 3f0b397 link false /test e2e-aws-disruptive
ci/prow/e2e-agnostic-upgrade 3f0b397 link true /test e2e-agnostic-upgrade

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

3 similar comments
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@sinnykumari
Copy link
Contributor

/test e2e-agnostic-upgrade

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

5 similar comments
@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-bot
Copy link
Contributor

/retest-required

Please review the full test history for this PR and help us cut down flakes.

@openshift-merge-robot openshift-merge-robot merged commit 0476b25 into openshift:master Dec 17, 2021
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Dec 17, 2021

@jcpowermac: All pull requests linked via external trackers have merged:

Bugzilla bug 2031049 has been moved to the MODIFIED state.

In response to this:

Bug 2031049: Fix panic when PlatformStatus VSphere is nil

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. bugzilla/severity-medium Referenced Bugzilla bug's severity is medium for the branch this PR is targeting. bugzilla/valid-bug Indicates that a referenced Bugzilla bug is valid for the branch this PR is targeting. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants