@rioliu-rh
Contributor

@rioliu-rh rioliu-rh commented Nov 12, 2025

Summary

Fix compat_otp.DebugNode() and helper functions to support HyperShift hosted clusters by automatically using guest kubeconfig when configured.

Problem

The DebugNode() function and its helper functions were hardcoded to use AsAdmin(), preventing them from working with HyperShift hosted clusters when a guest kubeconfig is configured via SetGuestKubeconf().

Solution

Introduced a centralized determineExecCLI() function that automatically selects the appropriate CLI config (guest or admin) based on whether a guest kubeconfig is set. All helper functions now use this function instead of hardcoded AsAdmin().
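As an illustration of the selection logic described above, here is a minimal Go sketch. The CLI type is a simplified stand-in written for this example, not the real compat_otp type: its fields and the AsGuestKubeconf() and GetGuestKubeconf() methods are assumptions, and the actual determineExecCLI() in this PR may differ in detail.

```go
package main

import "fmt"

// CLI is a hypothetical, simplified stand-in for the test framework's CLI type.
// Only the pieces needed to show the selection logic are modeled here.
type CLI struct {
	adminConfigPath string // kubeconfig of the management (admin) cluster
	guestConfigPath string // kubeconfig of the HyperShift hosted cluster, if any
	configPath      string // kubeconfig the next command would use
}

// SetGuestKubeconf records the guest kubeconfig path (assumed setter).
func (c *CLI) SetGuestKubeconf(path string) *CLI {
	c.guestConfigPath = path
	return c
}

// GetGuestKubeconf returns the guest kubeconfig path, or "" if none is set.
func (c *CLI) GetGuestKubeconf() string { return c.guestConfigPath }

// AsAdmin returns a copy of the CLI that targets the admin kubeconfig.
func (c *CLI) AsAdmin() *CLI {
	nc := *c
	nc.configPath = c.adminConfigPath
	return &nc
}

// AsGuestKubeconf returns a copy of the CLI that targets the guest kubeconfig.
func (c *CLI) AsGuestKubeconf() *CLI {
	nc := *c
	nc.configPath = c.guestConfigPath
	return &nc
}

// determineExecCLI picks the guest config when one is set, otherwise admin.
func determineExecCLI(oc *CLI) *CLI {
	if oc.GetGuestKubeconf() != "" {
		return oc.AsGuestKubeconf()
	}
	return oc.AsAdmin()
}

func main() {
	oc := &CLI{adminConfigPath: "/tmp/admin.kubeconfig"}
	fmt.Println(determineExecCLI(oc).configPath) // prints /tmp/admin.kubeconfig

	oc.SetGuestKubeconf("/tmp/guest.kubeconfig")
	fmt.Println(determineExecCLI(oc).configPath) // prints /tmp/guest.kubeconfig
}
```

In this sketch the selector returns a copy rather than mutating the caller's CLI, so a caller that never sets a guest kubeconfig sees the same admin behavior as before.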

Changes

  • Add determineExecCLI() to automatically select guest or admin config
  • Update all helper functions to use determineExecCLI():
    • IsNamespacePrivileged()
    • SetNamespacePrivileged()
    • RecoverNamespaceRestricted()
    • GetNodeListByLabel()
    • IsDefaultNodeSelectorEnabled()
    • AddAnnotationsToSpecificResource()
    • RemoveAnnotationFromSpecificResource()
    • GetAnnotationsFromSpecificResource()
  • Update debugNode() final debug command to use determineExecCLI()
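To make the helper changes listed above concrete, the following Go sketch shows one helper routed through the selector instead of a hardcoded AsAdmin(). Everything here is hypothetical and simplified: the CLI stand-in, the commandLine() method, and the setNamespacePrivilegedCmd() helper exist only for this example, and the label argument is illustrative rather than the exact set used by SetNamespacePrivileged().

```go
package main

import (
	"fmt"
	"strings"
)

// CLI is a hypothetical stand-in for the framework's CLI type.
type CLI struct {
	adminConfigPath string
	guestConfigPath string
	configPath      string
}

// AsAdmin returns a copy that targets the admin kubeconfig.
func (c *CLI) AsAdmin() *CLI {
	nc := *c
	nc.configPath = c.adminConfigPath
	return &nc
}

// AsGuestKubeconf returns a copy that targets the guest kubeconfig.
func (c *CLI) AsGuestKubeconf() *CLI {
	nc := *c
	nc.configPath = c.guestConfigPath
	return &nc
}

// commandLine renders the oc invocation instead of executing it,
// so the sketch can run without a cluster.
func (c *CLI) commandLine(verb string, args ...string) string {
	parts := append([]string{"oc", "--kubeconfig=" + c.configPath, verb}, args...)
	return strings.Join(parts, " ")
}

// determineExecCLI picks the guest config when one is set, otherwise admin.
func determineExecCLI(oc *CLI) *CLI {
	if oc.guestConfigPath != "" {
		return oc.AsGuestKubeconf()
	}
	return oc.AsAdmin()
}

// setNamespacePrivilegedCmd is an illustrative helper: before the change it
// would have started from oc.AsAdmin(); now it starts from determineExecCLI(oc).
func setNamespacePrivilegedCmd(oc *CLI, namespace string) string {
	return determineExecCLI(oc).commandLine(
		"label", "ns", namespace,
		"pod-security.kubernetes.io/enforce=privileged", // illustrative label
		"--overwrite",
	)
}

func main() {
	standalone := &CLI{adminConfigPath: "/tmp/admin.kubeconfig"}
	hosted := &CLI{
		adminConfigPath: "/tmp/admin.kubeconfig",
		guestConfigPath: "/tmp/guest.kubeconfig",
	}
	fmt.Println(setNamespacePrivilegedCmd(standalone, "demo"))
	fmt.Println(setNamespacePrivilegedCmd(hosted, "demo"))
}
```

The only difference between the two rendered commands is the kubeconfig path, which is the point of the change: the helper body stays the same and the target cluster is chosen in one place.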

Backward Compatibility

This change is fully backward compatible:

  • Existing callers continue to work as-is (they get admin config as before)
  • HyperShift scenarios now work correctly when guest kubeconfig is set
  • No API changes or breaking changes

Testing

All affected helper functions are used only within debugNode(); verified that no external callers depend on them.

Fixes: OCPERT-222

The DebugNode() function and its helper functions were hardcoded to use
AsAdmin(), preventing them from working with HyperShift hosted clusters
when a guest kubeconfig is configured via SetGuestKubeconf().

This commit introduces a centralized determineExecCLI() function that
automatically selects the appropriate CLI config (guest or admin) based
on whether a guest kubeconfig is set. All helper functions now use this
function instead of hardcoded AsAdmin(), ensuring operations target the
correct cluster.

Changes:
- Add determineExecCLI() to automatically select guest or admin config
- Update IsNamespacePrivileged() to use determineExecCLI()
- Update SetNamespacePrivileged() to use determineExecCLI()
- Update RecoverNamespaceRestricted() to use determineExecCLI()
- Update GetNodeListByLabel() to use determineExecCLI()
- Update IsDefaultNodeSelectorEnabled() to use determineExecCLI()
- Update AddAnnotationsToSpecificResource() to use determineExecCLI()
- Update RemoveAnnotationFromSpecificResource() to use determineExecCLI()
- Update GetAnnotationsFromSpecificResource() to use determineExecCLI()
- Update debugNode() final debug command to use determineExecCLI()

This is backward compatible - existing callers continue to work as-is,
while HyperShift scenarios now work correctly by setting guest kubeconfig.

Fixes: OCPERT-222
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 12, 2025
@rioliu-rh rioliu-rh changed the title from "Fix compat_otp.DebugNode() to support guest kubeconfig for HyperShift" to "NO-JIRA: Fix compat_otp.DebugNode() to support guest kubeconfig for HyperShift" Nov 12, 2025
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Nov 12, 2025
@openshift-ci-robot

@rioliu-rh: This pull request explicitly references no jira issue.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@sergiordlr
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Nov 12, 2025
@rioliu-rh
Contributor Author

/test e2e-vsphere-ovn

@rioliu-rh
Contributor Author

Ran a sanity test with an MCO case:

 ./bin/extended-platform-tests run all --dry-run | grep '42361' | ./bin/extended-platform-tests run --timeout 30m -f -
  I1113 16:01:18.968096 48380 test.go:180] Found authentication type used:
  I1113 16:01:18.969114 48380 test_context.go:567] The --provider flag is not set. Continuing as if --provider=skeleton had been used.
  I1113 16:01:21.329960 48380 api.go:57] EnvIsKubernetesCluster = no, start monitoring ClusterOperators and ClusterVersions
started: (0/1/1) "[sig-mco] MCO Author:rioliu-Longduration-NonPreRelease-Critical-42361-[P2][OnCLayer] add chrony systemd config [Disruptive] [Serial]"

  I1113 16:01:25.344871 48419 openshift-tests.go:203] Is kubernetes cluster: no, is external OIDC cluster: no
  I1113 16:01:25.345213 48419 test_context.go:567] The --provider flag is not set. Continuing as if --provider=skeleton had been used.
  [1763020881] openshift extended e2e - 1/1 specs I1113 16:01:29.466544 48419 client.go:293] configPath is now "/var/folders/pk/ntb29tq142l2dm5899xcc4yh0000gn/T/configfile4011077424"
  I1113 16:01:29.466649 48419 client.go:368] The user is now "e2e-test-mco-b4f4k-user"
  I1113 16:01:29.466663 48419 client.go:370] Creating project "e2e-test-mco-b4f4k"
  I1113 16:01:29.788172 48419 client.go:378] Waiting on permissions in project "e2e-test-mco-b4f4k" ...
  I1113 16:01:30.428849 48419 client.go:407] DeploymentConfig capability is enabled, adding 'deployer' SA to the list of default SAs
  I1113 16:01:30.591741 48419 client.go:422] Waiting for ServiceAccount "default" to be provisioned...
  I1113 16:01:31.013576 48419 client.go:422] Waiting for ServiceAccount "builder" to be provisioned...
  I1113 16:01:31.439206 48419 client.go:422] Waiting for ServiceAccount "deployer" to be provisioned...
  I1113 16:01:31.859060 48419 client.go:432] Waiting for RoleBinding "system:image-pullers" to be provisioned...
  I1113 16:01:32.172719 48419 client.go:432] Waiting for RoleBinding "system:image-builders" to be provisioned...
  I1113 16:01:32.480902 48419 client.go:432] Waiting for RoleBinding "system:deployers" to be provisioned...
  I1113 16:01:33.205654 48419 client.go:465] Project "e2e-test-mco-b4f4k" has been fully provisioned.
  STEP: MCO Preconditions Checks 11/13/25 16:01:33.206
  Nov 13 16:01:33.952: INFO: Check that master pool is ready for testing
  Nov 13 16:01:34.541: INFO: Num nodes: 3, wait time per node 13 minutes
  Nov 13 16:01:34.541: INFO: Increase waiting time because it is master pool
  Nov 13 16:01:34.541: INFO: Waiting 3m54s for MCP master to be completed.
  Nov 13 16:01:35.834: INFO: MCP 'master' is ready for testing
  Nov 13 16:01:35.834: INFO: Check that worker pool is ready for testing
  Nov 13 16:01:36.444: INFO: Num nodes: 3, wait time per node 13 minutes
  Nov 13 16:01:36.444: INFO: Waiting 3m0s for MCP worker to be completed.
  Nov 13 16:01:37.629: INFO: MCP 'worker' is ready for testing
  Nov 13 16:01:37.629: INFO: Wait for MCC to get the leader lease
  Nov 13 16:01:39.720: INFO: End of MCO Preconditions

  STEP: create new mc to apply chrony config on worker nodes 11/13/25 16:01:39.721
  NAME     CONFIG                                             UPDATED   UPDATING   DEGRADED   MACHINECOUNT   READYMACHINECOUNT   UPDATEDMACHINECOUNT   DEGRADEDMACHINECOUNT   AGE
  master   rendered-master-e41588c75bd461db66eaef6bed46c49a   True      False      False      3              3                   3                     0                      39m
  worker   rendered-worker-9e98c66dabbf7dbea83aa1d2918e02d7   True      False      False      3              3                   3                     0                      39m
  Nov 13 16:02:03.818: INFO: mco fixture dir is not initialized, start to create
  Nov 13 16:02:03.835: INFO: mco fixture dir is initialized: /var/folders/pk/ntb29tq142l2dm5899xcc4yh0000gn/T/fixture-testdata-dir1018079994/test/extended/testdata/mco
  I1113 16:02:07.521036 48419 template.go:78] the file of resource is /tmp/e2e-test-mco-b4f4k-qeld7ilxconfig.json.stdout
  machineconfig.machineconfiguration.openshift.io/ztc-42361-change-workers-chrony-configuration-c0twxfjd created
  Nov 13 16:02:13.761: INFO: mc ztc-42361-change-workers-chrony-configuration-c0twxfjd is created successfully
  Nov 13 16:02:15.593: INFO: Num nodes: 3, wait time per node 13 minutes
  Nov 13 16:02:15.593: INFO: Waiting 39m0s for MCP worker to be completed.
  Nov 13 16:13:17.581: INFO: The new MC has been successfully applied to MCP 'worker'
  STEP: get one node to verify the config changes 11/13/25 16:13:17.582
  namespace/e2e-test-mco-b4f4k labeled
  namespace/e2e-test-mco-b4f4k unlabeled
  Nov 13 16:13:32.523: INFO: pool 0.rhel.pool.ntp.org iburst
  driftfile /var/lib/chrony/drift
  makestep 1.0 3
  rtcsync
  logdir /var/log/chrony
  Starting pod/ip-10-0-29-236us-west-1computeinternal-debug-9f6kc ...
  To use host binaries, run `chroot /host`

  Removing debug pod ...
  machineconfig.machineconfiguration.openshift.io "ztc-42361-change-workers-chrony-configuration-c0twxfjd" deleted
  Nov 13 16:13:35.344: INFO: Num nodes: 3, wait time per node 13 minutes
  Nov 13 16:13:35.344: INFO: Waiting 39m0s for MCP worker to be completed.
  Nov 13 16:26:37.543: INFO: The new MC has been successfully applied to MCP 'worker'
  I1113 16:26:38.050572 48419 client.go:681] Deleted {user.openshift.io/v1, Resource=users  e2e-test-mco-b4f4k-user}, err: <nil>
  I1113 16:26:38.219049 48419 client.go:681] Deleted {oauth.openshift.io/v1, Resource=oauthclients  e2e-client-e2e-test-mco-b4f4k}, err: <nil>
  I1113 16:26:38.387205 48419 client.go:681] Deleted {oauth.openshift.io/v1, Resource=oauthaccesstokens  sha256~q6C6CC6fG0Ve1RAwOUjRc32Ggw0UlikcgZbkjgF70pc}, err: <nil>
  • SUCCESS! 25m11.903483375s
passed: (25m17s) 2025-11-13T08:26:38 "[sig-mco] MCO Author:rioliu-Longduration-NonPreRelease-Critical-42361-[P2][OnCLayer] add chrony systemd config [Disruptive] [Serial]"

1 pass, 0 skip (25m17s)

@rioliu-rh
Contributor Author

/verified by "[sig-mco] MCO Author:rioliu-Longduration-NonPreRelease-Critical-42361-[P2][OnCLayer] add chrony systemd config [Disruptive] [Serial]"

@openshift-ci-robot openshift-ci-robot added the verified Signifies that the PR passed pre-merge verification criteria label Nov 13, 2025
@openshift-ci-robot

@rioliu-rh: This PR has been marked as verified by "[sig-mco] MCO Author:rioliu-Longduration-NonPreRelease-Critical-42361-[P2][OnCLayer] add chrony systemd config [Disruptive] [Serial]".


@rioliu-rh
Contributor Author

/test e2e-aws-ovn-fips

@rioliu-rh
Contributor Author

/test e2e-gcp-ovn

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 782ff8b and 2 for PR HEAD 534f0e5 in total

@rioliu-rh
Contributor Author

/test e2e-aws-ovn-fips

@rioliu-rh
Contributor Author

/test e2e-gcp-ovn


@LuboTerifaj LuboTerifaj left a comment

/lgtm

@openshift-ci
Contributor

openshift-ci bot commented Nov 14, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: LuboTerifaj, rioliu-rh, sergiordlr, tomasdavidorg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD b7d2a64 and 1 for PR HEAD 534f0e5 in total

@rioliu-rh
Contributor Author

/test e2e-vsphere-ovn

@openshift-ci
Contributor

openshift-ci bot commented Nov 15, 2025

@rioliu-rh: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 8aaf884 into openshift:main Nov 15, 2025
20 checks passed