Fix for Node Feature Discovery Operator in OCP 4.17 #341

Merged Nov 22, 2024 (1 commit)

@carlmes (Contributor) commented Nov 22, 2024

GitOps projects that use the NFD operator have been failing on OpenShift 4.17. It looks like the base image has changed; we verified this by manually installing the operator and observing the new image name.

Updating the base image in node-feature-discovery.yaml fixes the issue.

We've tested this on OpenShift 4.17 and 4.15 clusters, and NFD is now working great. More details in one of the downstream projects at: redhat-ai-services/ai-accelerator#80
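
For readers who have not seen the file, here is a rough sketch of where the base image sits in a NodeFeatureDiscovery resource. The resource name, namespace, and image reference below are assumptions for illustration and are not copied from this PR; check the image that the operator actually deploys on your cluster.

```yaml
# Sketch only: name, namespace, and image are assumed, not taken from this PR.
apiVersion: nfd.openshift.io/v1
kind: NodeFeatureDiscovery
metadata:
  name: nfd-instance          # assumed instance name
  namespace: openshift-nfd    # assumed namespace
spec:
  operand:
    # The operand (base) image is the field this change updates.
    # Assumed value; verify it against a manual operator install.
    image: registry.redhat.io/openshift4/ose-node-feature-discovery-rhel9:v4.17
```

One way to confirm the value is to install the operator manually, as described above, and read the image name from the instance it creates.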

@strangiato
Contributor

@carlmes is this still backwards compatible with older versions?

I'm wondering if we need to do a separate overlay for 4.17 to maintain backwards compatibility.
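
For context, a version-specific overlay of the kind asked about here could look roughly like the Kustomize sketch below. The base path, the resource name, and the image placeholder are hypothetical and do not come from this repository.

```yaml
# Sketch of a hypothetical 4.17 overlay; paths and names are assumptions.
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization

resources:
  - ../../base                # hypothetical path to the shared base

patches:
  - target:
      group: nfd.openshift.io
      version: v1
      kind: NodeFeatureDiscovery
      name: nfd-instance      # assumed instance name
    patch: |-
      # JSON 6902 patch that swaps only the operand image for this overlay.
      - op: replace
        path: /spec/operand/image
        value: <image for this OpenShift version>
```

An overlay like this only earns its keep if older OpenShift versions cannot run the new image; if a single image works across versions, changing the base alone is simpler.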

@strangiato
Contributor

Something else to be aware of: we have been testing composer on 4.17, and NFD does not like waking up from a cluster shutdown on demo.redhat.com.

We noticed this image was different in 4.17 and we weren't sure if that was an issue related to that specific image, or if it was something specific to 4.17.

That would be something good to test in 4.16 with that same image.

This is the bug Anthony filed for NFD for reference:
https://issues.redhat.com/browse/OCPBUGS-44471

@carlmes
Contributor Author

carlmes commented Nov 22, 2024

> @carlmes is this still backwards compatible with older versions?
>
> I'm wondering if we need to do a separate overlay for 4.17 to maintain backwards compatibility.

We tested it on OpenShift 4.17 and 4.15 clusters to check for backwards compatibility. Screenshots are in the downstream project update: redhat-ai-services/ai-accelerator#80

Do you think we should try an even earlier version of OpenShift? The lowest version we can provision on demo.redhat.com is 4.14.

@carlmes
Contributor Author

carlmes commented Nov 22, 2024

Tested this patched version on my temporary cluster by stopping and then restarting the cluster through the admin UI:

[screenshot]

NFD Operator seems to start up just fine, so this might fix the other issue as well.

[screenshot]

@strangiato
Contributor

Nice!

Should have known you would have tested everything so thoroughly!

@strangiato strangiato merged commit ba9d8c1 into redhat-cop:main Nov 22, 2024
4 checks passed