Cluster-wide operators support improvements #2479

greyerof · 2024-10-01T14:15:19Z

This commit enables the discovery of cluster-wide operator pods (controllers), as well as operators installed with the singleNamespace/multiNamespace methods whose targetNamespaces differ from the installation namespace. The code always looks for the controller pod in the installation namespace. Operand pods can still be tested using the normal namespace or label discovery methods; a best-effort search for them in the configured namespaces will be implemented in a follow-up PR.

As a reminder:

- installation namespace: namespace where both the subscription and the operator group were created. This is where the operator's controller (aka operator pod) is deployed by OLM.
- targetNamespaces: namespaces where the controller will watch for CRs. The installation namespace can appear in this list but it's not mandatory.
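
To make the distinction concrete, here is a minimal sketch of the lookup rule described above: controller pods are searched for only in the installation namespace, whatever the targetNamespaces are. The `Operator` and `Pod` types, the `OwnerOperator` field, and `ControllerPods` are hypothetical simplifications for illustration; they are not the certsuite types, and the real code may identify controller pods differently.

```go
package main

import "fmt"

// Operator is a hypothetical, simplified view of an OLM-installed operator.
type Operator struct {
	Name             string
	InstallNamespace string   // where the subscription and operator group were created
	TargetNamespaces []string // where the controller watches for CRs
}

// Pod is a hypothetical, simplified pod record.
type Pod struct {
	Name          string
	Namespace     string
	OwnerOperator string // assumption: owning operator already resolved elsewhere
}

// ControllerPods returns the pods owned by op that run in its installation
// namespace. TargetNamespaces is deliberately ignored.
func ControllerPods(op Operator, pods []Pod) []Pod {
	var found []Pod
	for _, p := range pods {
		if p.Namespace == op.InstallNamespace && p.OwnerOperator == op.Name {
			found = append(found, p)
		}
	}
	return found
}

func main() {
	op := Operator{
		Name:             "my-operator",
		InstallNamespace: "operators",
		TargetNamespaces: []string{"tenant-a", "tenant-b"},
	}
	pods := []Pod{
		{Name: "my-operator-controller-0", Namespace: "operators", OwnerOperator: "my-operator"},
		{Name: "operand-0", Namespace: "tenant-a"},
		{Name: "unrelated", Namespace: "operators", OwnerOperator: "another-operator"},
	}
	for _, p := range ControllerPods(op, pods) {
		fmt.Printf("%s/%s\n", p.Namespace, p.Name)
	}
}
```

In this sketch the operand pod in `tenant-a` is not returned, which mirrors the statement that operands are left to the normal namespace or label discovery.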

Also, the test cases of the operator test suite related to controller pods have been moved to the access-control test suite, as those requirements also apply to any workload. For that reason, the discovered controller pods have also been added to the normal env.testPods so all the checks can be performed on them.

To distinguish between securityContext.runAsUser and securityContext.runAsNonRoot, the original test case access-control-security-context-non-root-user-check has been renamed to access-control-security-context-non-root-user-id-check.

In short, the impact of this PR is twofold:

- All the discovered operator (controller) pods (and their containers) are now added to the normal list of pods/containers under test, meaning that they will be targeted by all the test cases that match the label filter (--label-filter).
- Test cases reorg in the operator and access-control test suites:

Removed, as they already existed in the access-control test suite:
- operator-automount-tokens
- operator-run-as-user-id

Renamed:
- access-control-security-context-non-root-user-check -> access-control-security-context-non-root-user-id-check

Moved from the operator to the access-control test suite:
- operator-read-only-file-system -> access-control-security-context-read-only-file-system
- operator-run-as-non-root -> access-control-security-context-run-as-non-root-user-check
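
For anyone maintaining expected-results lists keyed by test case name, the reorg above can be captured as a small lookup. The following is only a sketch: the names are copied from the lists above, while `MigrateTestID` and the two maps are hypothetical helpers, not part of certsuite.

```go
package main

import "fmt"

// renamed maps old test case names to their new names, as listed in this PR.
var renamed = map[string]string{
	"access-control-security-context-non-root-user-check": "access-control-security-context-non-root-user-id-check",
	"operator-read-only-file-system":                      "access-control-security-context-read-only-file-system",
	"operator-run-as-non-root":                            "access-control-security-context-run-as-non-root-user-check",
}

// removed lists operator test cases dropped because an equivalent already
// existed in the access-control test suite.
var removed = map[string]bool{
	"operator-automount-tokens": true,
	"operator-run-as-user-id":   true,
}

// MigrateTestID returns the name to use after this PR and whether the test
// case still exists. Names not affected by the PR are returned unchanged.
func MigrateTestID(id string) (string, bool) {
	if removed[id] {
		return "", false
	}
	if newID, ok := renamed[id]; ok {
		return newID, true
	}
	return id, true
}

func main() {
	for _, id := range []string{"operator-run-as-non-root", "operator-automount-tokens"} {
		newID, ok := MigrateTestID(id)
		fmt.Printf("%s -> %q (still exists: %v)\n", id, newID, ok)
	}
}
```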

--

build-depends: 32446
build-depends: https://github.com/dci-labs/dallas-pipelines/pull/1248


dcibot · 2024-10-01T14:47:15Z

from change #2479:

FAILURE https://www.distributed-ci.io/jobs/9f62749b-0d42-4c75-b085-9501c67442ae/jobStates

sebrandon1 · 2024-10-01T17:14:44Z

If you make the changes in the QE repo, just temporarily change the ref: from main (in .github/workflows/qe-ocp-arm-416.yaml and .github/workflows/qe-ocp-pre-main.yaml) to whatever branch you have to make the 4.16 OCP tests pass.


The non-root check test case has been renamed in: redhat-best-practices-for-k8s/certsuite#2479 Also, four operator test cases have been moved to access-control test suite. I just removed them for now as they might need more rework in follow-up PRs.

dcibot · 2024-10-09T11:52:53Z

from change #2479:

FAILURE https://www.distributed-ci.io/jobs/3163cba7-1650-450a-9a44-a9c6b82d80f0/jobStates


greyerof · 2024-10-11T08:15:20Z

@ramperher, @manurodriguez this PR impacts the regression check in DCI jobs, as some test cases are now failing on mongodb pods and/or containers. We have a similar issue with the hazelcast operator that we deploy in kind clusters for our GitHub workflows, so we decided (to be done in follow-up PRs) not to install it and to rely exclusively on QE for operator workload tests.

sebrandon1 · 2024-10-11T14:47:43Z

I think we will have to merge this until the test case names for expected output are changed on the DCI side.

manurodriguez · 2024-10-11T19:27:11Z

> @ramperher, @manurodriguez this PR impacts the regression check in DCI jobs, as some test cases are now failing on mongodb pods and/or containers. We have a similar issue with the hazelcast operator that we deploy in kind clusters for our GitHub workflows, so we decided (to be done in follow-up PRs) not to install it and to rely exclusively on QE for operator workload tests.

@greyerof thanks for the information, I see what you mean, several tests failed.

> I think we will have to merge this until the test case names for expected output are changed on the DCI side.

@sebrandon1, is there anything to do on our side? Some test case names?

sebrandon1 · 2024-10-11T19:53:52Z

/dci-rerun

sebrandon1 · 2024-10-11T19:55:23Z

@manurodriguez I suppose it depends on whether any of the expected-to-fail test cases have had their names changed in this PR. If not, then we are good. The renamed test cases are listed in the original comment.

ramperher · 2024-10-15T15:05:30Z

Hi, just wanted to say that I've created this to avoid installing the operator in our workload: https://softwarefactory-project.io/r/c/dci-openshift-app-agent/+/32446, testing now.

dcibot · 2024-10-15T15:10:20Z

from change https://github.com/dci-labs/bos2-ci-config/pull/212:

SUCCESS https://www.distributed-ci.io/jobs/ae83c677-a638-408e-96df-ad4fa45b092b/jobStates

dcibot · 2024-10-15T15:23:43Z

from change https://github.com/dci-labs/dallas-pipelines/pull/1248:

SUCCESS https://www.distributed-ci.io/jobs/0c9adde6-fa89-4090-98ae-8762ede6d249/jobStates

dcibot · 2024-10-15T15:48:01Z

from change https://github.com/dci-labs/bos2-ci-config/pull/212:

SUCCESS https://www.distributed-ci.io/jobs/cbab6300-78e1-4acf-8135-19ff468ff014/jobStates

dcibot · 2024-10-15T16:24:18Z

from change https://github.com/dci-labs/dallas-pipelines/pull/1248:

SUCCESS https://www.distributed-ci.io/jobs/49f81c3f-779c-4df5-88cb-150ea1e7acfe/jobStates


greyerof added the dci-disable and work in progress labels Oct 2, 2024

greyerof added 5 commits October 3, 2024 15:50

Make sure operator pods are appended once.

4012a5e

If the operator pods were already discovered by either label or (more likely) namespace discovery mode, we should not add them again, just flag them. Also, add the operator pods' containers to the containers under test list.
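
The append-once behaviour described in this commit message can be sketched as follows. This is an illustration under assumptions, not the code in pkg/provider/provider.go: the `Pod` and `Container` types, the `IsOperator` flag, and both functions are hypothetical, and the containers list is simply derived from the merged pod list.

```go
package main

import "fmt"

// Container and Pod are hypothetical, simplified stand-ins for the real types.
type Container struct{ Name string }

type Pod struct {
	Namespace  string
	Name       string
	IsOperator bool
	Containers []Container
}

// AddOperatorPods merges operatorPods into testPods. A pod that was already
// discovered (same namespace/name) is only flagged as an operator pod; any
// other operator pod is flagged and appended exactly once.
func AddOperatorPods(testPods, operatorPods []*Pod) []*Pod {
	index := make(map[string]*Pod, len(testPods))
	for _, p := range testPods {
		index[p.Namespace+"/"+p.Name] = p
	}
	for _, op := range operatorPods {
		key := op.Namespace + "/" + op.Name
		if existing, ok := index[key]; ok {
			existing.IsOperator = true
			continue
		}
		op.IsOperator = true
		testPods = append(testPods, op)
		index[key] = op
	}
	return testPods
}

// ContainersUnderTest flattens the containers of all pods under test, so
// newly appended operator pods contribute their containers as well.
func ContainersUnderTest(pods []*Pod) []Container {
	var all []Container
	for _, p := range pods {
		all = append(all, p.Containers...)
	}
	return all
}

func main() {
	workload := &Pod{Namespace: "tnf", Name: "app-0", Containers: []Container{{Name: "app"}}}
	ctrl := &Pod{Namespace: "tnf", Name: "ctrl-0", Containers: []Container{{Name: "manager"}}}

	// ctrl-0 was already found by namespace discovery, and is reported again
	// by operator discovery together with a pod from another namespace.
	testPods := []*Pod{workload, ctrl}
	ctrlAgain := &Pod{Namespace: "tnf", Name: "ctrl-0", Containers: []Container{{Name: "manager"}}}
	clusterWide := &Pod{Namespace: "operators", Name: "cw-0", Containers: []Container{{Name: "manager"}}}

	testPods = AddOperatorPods(testPods, []*Pod{ctrlAgain, clusterWide})
	fmt.Println(len(testPods), len(ContainersUnderTest(testPods)))
}
```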

Show failures after smoke tests in human readable way.

e79f2ea

Output dir fixed for claim show failures step.

cf6ee3e

Fix claim show failures subcommand call

9071b65

Adjust expected results for hazelcast failures.

7d178d1

greyerof added a commit to greyerof/cnfcert-tests-verification that referenced this pull request Oct 8, 2024

access-control tc renaming.

bc1f806

The non-root check test case has been renamed in: redhat-best-practices-for-k8s/certsuite#2479

greyerof mentioned this pull request Oct 8, 2024

access-control and operator test suite updates redhat-best-practices-for-k8s/certsuite-qe#952

Closed

QE reference temporarily changed to point to PR#952.

9106f08

greyerof mentioned this pull request Oct 8, 2024

access-control and operator test suite updates redhat-best-practices-for-k8s/certsuite-qe#953

Merged

greyerof added 2 commits October 8, 2024 16:17

QE reference temporarily changed to point to PR#953

5c5128b

Temporary QE ref update in qe-hosted.yml

35373d9

greyerof removed the work in progress and dci-disable labels Oct 9, 2024

greyerof requested review from sebrandon1 and edcdavid and removed request for sebrandon1 October 9, 2024 13:18

sebrandon1 approved these changes Oct 9, 2024


edcdavid reviewed Oct 10, 2024

pkg/provider/provider.go (outdated, resolved)

edcdavid approved these changes Oct 10, 2024


greyerof added 2 commits October 11, 2024 09:07

Addressed David's comment.

956e53c

Merge branch 'main' into cw_operator_improvements

89cfa06

sebrandon1 mentioned this pull request Oct 15, 2024

operator: add olm.skipRange annotation test #2508

Merged

sebrandon1 merged commit bfc9411 into redhat-best-practices-for-k8s:main Oct 15, 2024
34 checks passed

sebrandon1 mentioned this pull request Oct 15, 2024

Remove ref for operator branch #2517

Merged
