GPU crashing on 1 node. #1628
Comments
Hi @ryanm101, I found a somewhat similar error here: intel/intel-technology-enabling-for-openshift#113. That issue lists a couple of workarounds that might work. Could you try them out?
I reproduced the issue on a VM. The device plugin seems to work without SELinux but fails with SELinux enabled. The SELinux audit logs contain this entry:
I'll need to study whether this is the same as, or similar to, the issue linked above. EDIT: using
I followed instructions from the audit entry:
That seems to allow the device plugin to access kubelet. I'm not sure where we should file the bug: FC, k3s, or somewhere else.
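For reference, a generic sketch of the usual `audit2allow` flow for turning recent AVC denials into a local policy module. This is not necessarily the exact set of commands used above. The module name `my-local-policy` is a hypothetical placeholder, and the script only prints the commands so that the generated policy can be reviewed before anything is loaded as root.

```shell
#!/bin/sh
# Sketch only: prints the commands instead of running them.
# MODULE is a hypothetical module name, not taken from this thread.
MODULE="${MODULE:-my-local-policy}"

# Emit the two usual steps for a given module name:
#   1. build a policy module from recent AVC denials in the audit log
#   2. load the compiled module
build_policy_cmds() {
    printf '%s\n' \
        "ausearch -m AVC -ts recent --raw | audit2allow -M $1" \
        "semodule -i $1.pp"
}

build_policy_cmds "$MODULE"
```

Reviewing the generated `.te` file before loading the module is worthwhile, since `audit2allow` allows whatever was denied, whether or not it should be.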
The plugins already run with the proper label to have access to kubelet. That policy went into the container-selinux package. Is that package installed on your node?
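A quick way to answer that on the node is sketched below. It assumes an RPM-based host (as on Fedora) and reports "unknown" when `getenforce` or `rpm` is not available. The helper names `selinux_mode` and `pkg_status` are made up for this example.

```shell
#!/bin/sh
# Print the current SELinux mode, or "unknown" if getenforce is not available.
selinux_mode() {
    if command -v getenforce >/dev/null 2>&1; then
        getenforce
    else
        echo "unknown"
    fi
}

# Print "installed", "missing", or "unknown" (no rpm on this host) for a package.
pkg_status() {
    if ! command -v rpm >/dev/null 2>&1; then
        echo "unknown"
    elif rpm -q "$1" >/dev/null 2>&1; then
        echo "installed"
    else
        echo "missing"
    fi
}

echo "SELinux mode: $(selinux_mode)"
echo "container-selinux: $(pkg_status container-selinux)"
```

Running `rpm -q container-selinux` directly also prints the installed version, which is useful when comparing nodes.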
Those get installed alongside k3s. and are installed.
Yes, this seems to solve it.
@mregmi do you happen to know the container-selinux version?
@tkatila Was this SELinux issue already handled?
Running 3 master nodes using k3s
NUC 1 & 3 both deploy fine.
On NUC 2 the container crashes with:
Command used to provision NUC 2:
The only differences between NUC2 and NUC1/3 are:
Any advice appreciated.
I will test re-adding the node without the --selinux flag and, if all else fails, change it to FC38.