@miyadav
Member

@miyadav miyadav commented May 29, 2025

Fixes #64959 for now.

@sunzhaohua2 @huali9 @shellyyang1989 PTAL.

Validation looks good (I ran it against the long-duration cases). We are discussing this in Slack; once the productivity team is happy with the changes, we can add it to other steps/chains as well.

@openshift-ci-robot
Contributor

openshift-ci-robot commented May 29, 2025

@miyadav: This pull request references OCPQE-29735 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.20.0" version, but no target version was set.

In response to this:

fixes - #64959 for now.

@sunzhaohua2 @huali9 @shellyyang1989 PTAL .

Validation looks good , discussing in slack , we can add to others steps/chains as well, once productivity team is good with changes .

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 29, 2025
@openshift-ci openshift-ci bot requested review from Phaow and memodi May 29, 2025 04:29
@miyadav
Member Author

miyadav commented May 29, 2025

/pj-rehearse pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-1of3 pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-2of3 pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-3of3

@openshift-ci-robot
Contributor

@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@miyadav
Member Author

miyadav commented May 29, 2025

/pj-rehearse pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-1of3 pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-2of3 pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-3of3

@openshift-ci-robot
Contributor

@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

Member

@damdo damdo left a comment

/approve
/lgtm

You'll need to add the rehearsal ack when happy

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 29, 2025
@damdo
Member

damdo commented May 29, 2025

@miyadav you'll also need /approve from:

documentation: |-
Indicates if the active cluster is an OpenShift cluster or a derivative (e.g., Hypershift, Microshift).
A value of "true" means the cluster is OpenShift or a derivative, while "false" means it is not (e.g., AKS).
- name: SHARD_ARGS
Contributor

Where is the arg used?

Member Author

@miyadav miyadav May 29, 2025

I hit the failure below in an earlier run, which is why I added it. I had tested sharding in a workflow whose step was openshift-extended-test, and it passed. The chain used here also runs openshift-extended-test, so at first I did not think the arg was needed here, but I added it after the run failed:
SHARD_ARGS="--shard-count 3 --shard-id 1" error: unknown flag: --shard-count
/hold
Holding this again while I review further. Apparently one of the runs passed, so I will figure out what the difference is.
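
The error above is what a command prints when it is handed a flag it does not understand. As a minimal sketch of one way a step script could forward `SHARD_ARGS` only when it is set, consider the snippet below. It is an illustration and not the contents of openshift-extended-test-commands.sh: `build_test_cmd` and the `run-tests` command name are hypothetical stand-ins, and the only values taken from this thread are the `SHARD_ARGS` variable and its `--shard-count 3 --shard-id 1` example.

```shell
# Hypothetical helper: print the command line a step would run.
# "run-tests" is a placeholder, not the real test binary.
build_test_cmd() {
  shard_args="$1"
  cmd="run-tests"
  # Append the shard flags only when the value is non-empty, so a step
  # that leaves SHARD_ARGS unset runs the full, unsharded suite.
  if [ -n "$shard_args" ]; then
    cmd="$cmd $shard_args"
  fi
  printf '%s\n' "$cmd"
}

build_test_cmd ""
build_test_cmd "--shard-count 3 --shard-id 1"
```

The flags still have to be understood by whatever command receives them, so under this sketch the variable only makes sense on a step whose script passes it to a sharding-aware binary.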

Member Author

It needs to go in clusterinfra-qe/regression/openshift-e2e-test-clusterinfra-qe-regression-chain.yaml.

Contributor

The test regression-clusterinfra-azure-ipi-mapi contains the steps below, so which step gets the option depends on which step we want to split based on its running time. Do you mean that it errors out if we don't set the option in the step? We need to figure out why; it doesn't make much sense to declare the option but never use it.

 steps:
  - chain: cucushift-installer-check-cluster-health
  - ref: idp-htpasswd
  - ref: cucushift-pre
  - ref: openshift-extended-test
  - ref: cucushift-e2e
  - ref: openshift-e2e-test-clusterinfra-qe
  - ref: openshift-e2e-test-qe-report

Member Author

Yes @shellyyang1989, you are correct. I reviewed it and also checked with @liangxia: the chain does not need to be aware of the option as long as the step knows it. I have made all the changes and the tests look good so far; I will wait for them to finish and report the status.

Member Author

@miyadav miyadav May 30, 2025

Without sharding, the openshift-tests-private tests took a little over 3 hrs to run:
INFO[2025-05-22T04:56:41Z] Running step regression-clusterinfra-azure-ipi-mapi-openshift-extended-test.
INFO[2025-05-22T08:11:19Z] Step regression-clusterinfra-azure-ipi-mapi-openshift-extended-test succeeded after 3h14m37s.

With sharding, the three shards (1, 2, 3) took about 1 hr 10 min each on average:

INFO[2025-05-30T08:08:14Z] Running step regression-clusterinfra-azure-ipi-mapi-openshift-extended-test.
INFO[2025-05-30T09:24:01Z] Step regression-clusterinfra-azure-ipi-mapi-openshift-extended-test succeeded after 1h15m47s.

INFO[2025-05-30T08:12:19Z] Running step regression-clusterinfra-azure-ipi-mapi-openshift-extended-test.
INFO[2025-05-30T09:09:01Z] Step regression-clusterinfra-azure-ipi-mapi-openshift-extended-test succeeded after 56m41s.

INFO[2025-05-30T08:22:45Z] Running step regression-clusterinfra-azure-ipi-mapi-openshift-extended-test.
INFO[2025-05-30T09:41:40Z] Step regression-clusterinfra-azure-ipi-mapi-openshift-extended-test succeeded after 1h18m54s.
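
As a rough check on the numbers above, the sketch below converts the reported step durations to seconds and compares the unsharded run with the slowest shard. It assumes the three shard jobs run in parallel, so the slowest shard bounds the wall-clock time. The helper name `to_seconds` is hypothetical; the durations are copied from the logs. It compares only this one step and ignores cluster install and the other steps in each job.

```shell
# Convert hours, minutes, seconds to a number of seconds.
to_seconds() {
  echo $(( $1 * 3600 + $2 * 60 + $3 ))
}

unsharded=$(to_seconds 3 14 37)   # 3h14m37s
shard1=$(to_seconds 1 15 47)      # 1h15m47s
shard2=$(to_seconds 0 56 41)      # 56m41s
shard3=$(to_seconds 1 18 54)      # 1h18m54s

# Wall-clock time of the sharded run is bounded by the slowest shard.
slowest=$shard1
for s in "$shard2" "$shard3"; do
  if [ "$s" -gt "$slowest" ]; then slowest=$s; fi
done

# Integer percentage of the unsharded duration (avoids floating point).
percent=$(( slowest * 100 / unsharded ))
echo "unsharded=${unsharded}s slowest_shard=${slowest}s (${percent}% of unsharded)"
```

With these inputs the slowest shard (1h18m54s) comes to about 40% of the unsharded step time.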

Contributor

Thanks Milind. Yes, for the openshift-tests-private tests we have the shard option in ci-operator/step-registry/openshift-extended/test/openshift-extended-test-ref.yaml, which is fine. What I meant is: do we need the shard option to be defined in ci-operator/step-registry/openshift/e2e/test/clusterinfra-qe/openshift-e2e-test-clusterinfra-qe-ref.yaml? It isn't used in openshift-e2e-test-clusterinfra-qe-commands.sh, and we need to think about whether to shard the step that runs the tests in cluster-api-actuator-pkg. Am I misunderstanding anything?

Member Author

No, you didn't misunderstand. We don't need it unless we want to shard the cluster-api-actuator tests as well.

@openshift-ci openshift-ci bot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. and removed lgtm Indicates that a PR is ready to be merged. labels May 29, 2025
@miyadav miyadav force-pushed the enableshardingopenshiftprivatetests branch 2 times, most recently from 3c61379 to de0af77 Compare May 29, 2025 09:42
@miyadav
Member Author

miyadav commented May 29, 2025

/pj-rehearse pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-1of3 pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-2of3 pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-3of3

@openshift-ci-robot
Contributor

@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@miyadav
Member Author

miyadav commented May 29, 2025

/pj-rehearse pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-1of3 pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-2of3 pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-3of3

@openshift-ci-robot
Contributor

@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@miyadav miyadav force-pushed the enableshardingopenshiftprivatetests branch from de0af77 to fa7d363 Compare May 30, 2025 02:25
@miyadav
Member Author

miyadav commented May 30, 2025

/pj-rehearse pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-1of3 pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-2of3 pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-3of3

@openshift-ci-robot
Contributor

@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@miyadav miyadav force-pushed the enableshardingopenshiftprivatetests branch from fa7d363 to 81e6e43 Compare May 30, 2025 05:02
@miyadav
Member Author

miyadav commented May 30, 2025

/pj-rehearse pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-1of3 pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-2of3 pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-3of3

@openshift-ci-robot
Contributor

@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci-robot
Contributor

@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@miyadav miyadav requested a review from Phaow May 30, 2025 11:14
@shellyyang1989
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label May 30, 2025
@miyadav
Member Author

miyadav commented Jun 2, 2025

/pj-rehearse ack

@openshift-ci-robot
Contributor

@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci-robot openshift-ci-robot added the rehearsals-ack Signifies that rehearsal jobs have been acknowledged label Jun 2, 2025
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Jun 3, 2025
@miyadav
Member Author

miyadav commented Jun 3, 2025

/unhold

@openshift-ci openshift-ci bot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Jun 3, 2025
@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD effe97d and 2 for PR HEAD a59e7d7 in total

@openshift-ci-robot
Contributor

/retest-required

Remaining retests: 0 against base HEAD 3875301 and 1 for PR HEAD a59e7d7 in total

@openshift-ci
Contributor

openshift-ci bot commented Jun 3, 2025

@miyadav: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
| --- | --- | --- | --- | --- |
| ci/rehearse/openshift/machine-api-provider-azure/main/regression-clusterinfra-azure-ipi-mapi-2of3 | f084aa6 | link | unknown | /pj-rehearse pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-2of3 |
| ci/rehearse/openshift/machine-api-provider-azure/main/regression-clusterinfra-azure-ipi-mapi-3of3 | f084aa6 | link | unknown | /pj-rehearse pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-3of3 |
| ci/rehearse/openshift/machine-api-provider-azure/main/regression-clusterinfra-azure-ipi-mapi-1of3 | f084aa6 | link | unknown | /pj-rehearse pull-ci-openshift-machine-api-provider-azure-main-regression-clusterinfra-azure-ipi-mapi-1of3 |

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

miyadav and others added 2 commits June 3, 2025 16:32
test the changes

informing main chain as well of the param in the step

Update ci-operator/step-registry/openshift-extended/test/openshift-extended-test-ref.yaml

Update ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh

Update ci-operator/step-registry/openshift/e2e/test/clusterinfra-qe/openshift-e2e-test-clusterinfra-qe-ref.yaml

Co-authored-by: Penghao <pewang@redhat.com>
@miyadav miyadav force-pushed the enableshardingopenshiftprivatetests branch from a59e7d7 to bfc4da8 Compare June 3, 2025 11:10
@openshift-ci openshift-ci bot removed the lgtm Indicates that a PR is ready to be merged. label Jun 3, 2025
@openshift-ci-robot openshift-ci-robot removed the rehearsals-ack Signifies that rehearsal jobs have been acknowledged label Jun 3, 2025
@openshift-ci-robot
Contributor

[REHEARSALNOTIFIER]
@miyadav: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| pull-ci-openshift-cluster-cloud-controller-manager-operator-main-level0-clusterinfra-azure-ipi-proxy-tests | openshift/cluster-cloud-controller-manager-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-cloud-controller-manager-operator-main-regression-vsphere-ipi-ccmo | openshift/cluster-cloud-controller-manager-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-cloud-controller-manager-operator-release-4.21-level0-clusterinfra-azure-ipi-proxy-tests | openshift/cluster-cloud-controller-manager-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-cloud-controller-manager-operator-release-4.21-regression-vsphere-ipi-ccmo | openshift/cluster-cloud-controller-manager-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-cloud-controller-manager-operator-release-4.20-level0-clusterinfra-azure-ipi-proxy-tests | openshift/cluster-cloud-controller-manager-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-cloud-controller-manager-operator-release-4.20-regression-vsphere-ipi-ccmo | openshift/cluster-cloud-controller-manager-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-cloud-controller-manager-operator-release-4.19-level0-clusterinfra-azure-ipi-proxy-tests | openshift/cluster-cloud-controller-manager-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-cloud-controller-manager-operator-release-4.19-regression-vsphere-ipi-ccmo | openshift/cluster-cloud-controller-manager-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-cloud-controller-manager-operator-release-4.18-level0-clusterinfra-azure-ipi-proxy-tests | openshift/cluster-cloud-controller-manager-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-cloud-controller-manager-operator-release-4.18-regression-vsphere-ipi-ccmo | openshift/cluster-cloud-controller-manager-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-cloud-controller-manager-operator-release-4.17-level0-clusterinfra-azure-ipi-proxy-tests | openshift/cluster-cloud-controller-manager-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-cloud-controller-manager-operator-release-4.17-regression-vsphere-ipi-ccmo | openshift/cluster-cloud-controller-manager-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-cloud-controller-manager-operator-release-4.16-level0-clusterinfra-azure-ipi-proxy-tests | openshift/cluster-cloud-controller-manager-operator | presubmit | Registry content changed |
| pull-ci-openshift-cluster-cloud-controller-manager-operator-release-4.16-regression-vsphere-ipi-ccmo | openshift/cluster-cloud-controller-manager-operator | presubmit | Registry content changed |
| pull-ci-openshift-azure-file-csi-driver-master-e2e-azure-file-csi-extended | openshift/azure-file-csi-driver | presubmit | Registry content changed |
| pull-ci-openshift-azure-file-csi-driver-release-4.21-e2e-azure-file-csi-extended | openshift/azure-file-csi-driver | presubmit | Registry content changed |
| pull-ci-openshift-azure-file-csi-driver-release-4.20-e2e-azure-file-csi-extended | openshift/azure-file-csi-driver | presubmit | Registry content changed |
| pull-ci-openshift-azure-file-csi-driver-release-4.19-e2e-azure-file-csi-extended | openshift/azure-file-csi-driver | presubmit | Registry content changed |
| pull-ci-openshift-azure-file-csi-driver-release-4.18-e2e-azure-file-csi-extended | openshift/azure-file-csi-driver | presubmit | Registry content changed |
| pull-ci-openshift-azure-file-csi-driver-release-4.17-e2e-azure-file-csi-extended | openshift/azure-file-csi-driver | presubmit | Registry content changed |
| pull-ci-openshift-azure-file-csi-driver-release-4.16-e2e-azure-file-csi-extended | openshift/azure-file-csi-driver | presubmit | Registry content changed |
| pull-ci-openshift-azure-file-csi-driver-release-4.15-e2e-azure-file-csi-extended | openshift/azure-file-csi-driver | presubmit | Registry content changed |
| pull-ci-openshift-azure-file-csi-driver-release-4.14-e2e-azure-file-csi-extended | openshift/azure-file-csi-driver | presubmit | Registry content changed |
| pull-ci-openshift-cluster-api-provider-gcp-master-regression-clusterinfra-cucushift-rehearse-gcp-ipi-techprev | openshift/cluster-api-provider-gcp | presubmit | Registry content changed |
| pull-ci-openshift-cluster-api-provider-gcp-release-4.21-regression-clusterinfra-cucushift-rehearse-gcp-ipi-techprev | openshift/cluster-api-provider-gcp | presubmit | Registry content changed |

A total of 2740 jobs have been affected by this change. The above listing is non-exhaustive and limited to 25 jobs.

A full list of affected jobs can be found here

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@miyadav
Member Author

miyadav commented Jun 3, 2025

/pj-rehearse ack

@openshift-ci-robot
Contributor

@miyadav: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci-robot openshift-ci-robot added the rehearsals-ack Signifies that rehearsal jobs have been acknowledged label Jun 3, 2025
@shellyyang1989
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 3, 2025
@openshift-ci
Contributor

openshift-ci bot commented Jun 3, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: damdo, liangxia, miyadav, shellyyang1989

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-bot openshift-merge-bot bot merged commit 34435da into openshift:master Jun 3, 2025
17 checks passed
tbuskey pushed a commit to tbuskey/release that referenced this pull request Jun 3, 2025
…hift#65460)

* OCPQE-29735: Adding shard option to actual run command to work

test the changes

informing main chain as well of the param in the step

Update ci-operator/step-registry/openshift-extended/test/openshift-extended-test-ref.yaml

Update ci-operator/step-registry/openshift-extended/test/openshift-extended-test-commands.sh

Update ci-operator/step-registry/openshift/e2e/test/clusterinfra-qe/openshift-e2e-test-clusterinfra-qe-ref.yaml

Co-authored-by: Penghao <pewang@redhat.com>

* removing shard_args from chain as not doing sharding for cluster-api-actuator-pkg

---------

Co-authored-by: Penghao <pewang@redhat.com>