-
Notifications
You must be signed in to change notification settings - Fork 24
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OCPBUGS-33067: Don't fatal error when filter cannot iterate #509
OCPBUGS-33067: Don't fatal error when filter cannot iterate #509
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
I wonder if we should exclude this error, it seems like we should fail here because |
/retest-required |
/hold for test |
Still got the same failure with/without ComplianceAsCode/content#11906.
|
@Vincent056 But the error happens when trying to fetch
@xiaojiey Do you know if on 4.15 or older is there any |
@yuumasato Compliance-operator-v1.4.0 works as expected on 4.15 hypershift hosted cluster
|
9a73e58
to
d12ed7c
Compare
@BhargaviGudi I have updated the patch. Also, could you check if CO 1.4.0 works as expected on OCP4.16 too? Thanks a lot.. |
@yuumasato: This pull request references Jira Issue OCPBUGS-33067, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
A jq filter may expect to iterate over a list of results, but it can happen that no result is returned.
d12ed7c
to
8042b4f
Compare
@@ -554,6 +557,11 @@ func filter(ctx context.Context, rawobj []byte, filter string) ([]byte, error) { | |||
} | |||
if err, ok := v.(error); ok { | |||
DBG("Error while filtering: %s", err) | |||
// gojq may return a diverse set of internal errors caused by null values. | |||
// These errors are happen when a piped filter ends up acting on a null value. | |||
if strings.HasSuffix(err.Error(), ": null") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not fond the approach of checking the error string sufffix.
But gojq
may return private error types and we cannot get more insight about the filter value error.
The returned error is not even gojq.HaltError
, 😕
For this specific bug the error returned is gojq.iteratorError
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Examples of errors I have seen:
Fetching URI: '/api/v1/namespaces/-/pods?labelSelector=app%3Dkube-controller-manager'
debug: Applying filter '[[.items[0].spec.containers[0].args[] | select(. | match("--root-ca-file") )] | length | if . ==1 then true else false end]' to path '/api/v1/namespaces/-/pods?labelSelector=app%3Dkube-controller-manager'
debug: Error while filtering: cannot iterate over: null
debug: Error type:, [*gojq.iteratorError]
debug: Applying filter '[.items[0].spec.containers[0].command | join(" ")]' to path '/api/v1/namespaces/-/pods?labelSelector=app%3Detcd'
debug: Error while filtering: join(" ") cannot be applied to: null
debug: Error type:, [*gojq.func1TypeError]
debug: Persisting warnings to output file
FATAL:Error fetching resources: couldn't filter '{"kind":"PodList","apiVersion":"v1","metadata":{"resourceVersion":"288952"},"items":[]}': join(" ") cannot be applied to: null
@@ -0,0 +1,4 @@ | |||
{ | |||
"metadata": {}, | |||
"items": null |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is weird that on 4.16, hipershift, through CLI I get "items": []
, while the API seems to get "items": null
.
The cli gojq
behaves consistent with the errors we are facing.
$ echo '{"metadata": {}, "items": []}' | gojq '[.items[] | select(.metadata.name | test("^rendered-worker-[0-9a-z]+$|^rendered-master-[0-9a-z]+$"))] | map(.spec.fips == true)'
[]
$ echo '{"metadata": {}, "items": null}' | gojq '[.items[] | select(.metadata.name | test("^rendered-worker-[0-9a-z]+$|^rendered-master-[0-9a-z]+$"))] | map(.spec.fips == true)'
gojq: cannot iterate over: null
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - that seems like an API change in the machine config API. Maybe there was a alpha to beta, or beta to stable jump between those versions.
@yuumasato Issue can be reproducible on with 4.16.0-0.nightly-2024-05-01-111315 + compliance-operator.v1.4.0 on hypershift hosted cluster
However, issue is not observed on normal cluster. |
Verification passed with 4.16.0-0.nightly-2024-05-01-111315 + compliance-operator from PR code #509 + with/without ComplianceAsCode/content#11906 code Create a ssb with upstream-ocp4-pci-dss and upstream-ocp4-pci-dss-node profile(from ComplianceAsCode/content#11906)
Create ssb with ocp4-pci-dss and ocp4-pci-dss-node
|
Thank you for testing @BhargaviGudi For visibility I''m posting the warnings that are added to the scan:
In OCP 4.16 HyperShift no
|
/test e2e-aws-serial |
For comparison, on 4.15 HyperShift: CO debug logs:
|
In conclusion, CO in OCP 4.16 HyperShift obtains a different response for URI While in 4.15 CO got a response that it failed to list the MachineConfigs, in 4.16 CO gets a list with no MachineConfigs. |
/test all |
/test e2e-aws-parallel |
@@ -0,0 +1,4 @@ | |||
{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super nit picky here, but we could elaborate on the name. To me, an empty machine config is items: []
. But we could name it nil_machineconfig_list.json
.
Also, this is certainly something I can bike shed later in a separate PR.
rawmc, readErr = io.ReadAll(nsFile) | ||
Expect(readErr).To(BeNil()) | ||
}) | ||
It("skips filter piping errors", func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: we could include the nil aspect into the test name here, so it's more specific about what case we're testing for.
It("gracefully handles nil item lists", func() {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: rhmdnd, Vincent056, yuumasato The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/jira refresh |
@rhmdnd: This pull request references Jira Issue OCPBUGS-33067, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
@yuumasato: This pull request references Jira Issue OCPBUGS-33067, which is valid. 3 validation(s) were run on this bug
Requesting review from QA contact: In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
adding qe approved since @BhargaviGudi verified the change. |
/label docs-approved |
e9ca64b
into
ComplianceAsCode:master
@yuumasato: Jira Issue OCPBUGS-33067: All pull requests linked via external trackers have merged: Jira Issue OCPBUGS-33067 has been moved to the MODIFIED state. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
A jq filter may expect to iterate over a list of results, but it can happen that no result is returned.
Let's not fatal error when this happens.
In an HyperShift environment, when no MC config exists, the following error occurs:
After creating a dummy
MachineConfig
, the URI fetching succeeds: