This plugin not working when used IB NIC the LINK_TYPE_P1=ETH! #98
Comments
I'm not sure I understand the issue. Are you changing the link type? Of both ports, or of a single port of the NIC?
My Mellanox NIC config
k8s cluster apply rdma-sharerdma-device
rdma-shared-dp-ds Daemonset
Result
Can you provide the device plugin logs and the content of
I want to fix the problem. Can you show me how to modify this plugin?
From the logs, the device plugin behaves as expected: I see that it discovered the resources properly.
Can you provide the output of:
Can you add the YAML used to deploy the device plugin DaemonSet? Is it what we have in the master branch? [1]
The kubelet --root-dir is
OK, did you deploy rdma-shared-device-plugin with the modified mounts as suggested in #96? Please provide some additional information on how to reproduce this (k8s version, OS, NIC hardware and its config).
Here is my environment:
If your kubelet root dir is configured as
can you try it? That is: mount
Apart from that, everything looks OK to me.
Why must I use the default when I used the default
The new log shows
Daemonset

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: rdma-shared-dp-ds
  namespace: cni-plugin
spec:
  selector:
    matchLabels:
      name: rdma-shared-dp-ds
  template:
    metadata:
      labels:
        name: rdma-shared-dp-ds
    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: rdma
                operator: In
                values:
                - sugon
      hostNetwork: true
      priorityClassName: system-node-critical
      containers:
      - image: ghcr.io/mellanox/k8s-rdma-shared-dev-plugin
        name: k8s-rdma-shared-dp-ds
        imagePullPolicy: IfNotPresent
        #securityContext:
        #  privileged: true
        volumeMounts:
          - name: device-plugin
            mountPath: /var/lib/kubelet/device-plugins
          - name: plugins-registry
            mountPath: /var/lib/kubelet/plugins_registry
          - name: config
            mountPath: /k8s-rdma-shared-dev-plugin
          - name: devs
            mountPath: /dev/
      volumes:
        - name: device-plugin
          hostPath:
            path: /data/kubelet/device-plugins
        - name: plugins-registry
          hostPath:
            path: /data/kubelet/plugins_registry
        - name: config
          configMap:
            name: rdma-devices
            items:
            - key: config.json
              path: config.json
        - name: devs
          hostPath:
            path: /dev/
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: rdma-devices
  namespace: cni-plugin
data:
  config.json: |
    {
      "periodicUpdateInterval": 300,
      "configList": [{
          "resourceName": "hca_3",
          "rdmaHcaMax": 1000,
          "selectors": {
            "ifNames": ["ens24np0"]
          }
        }
      ]
    }

start kubelet argument
kubelet version: v1.23.0
kubelet log
shows the kubelet --root-dir=/data/kubelet/plugins_registry
The rdma-shared-dev-plugin does not create the socket file at
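To narrow down where the socket ends up, it may help to list the Unix socket files in each candidate directory on the node. The sketch below is only an illustration in Python, not part of the plugin: the helper name `find_sockets` is hypothetical, and the two directories are the host paths that appear in the DaemonSet above (the default kubelet location and the `/data/kubelet` one).

```python
import os
import stat
from typing import List


def find_sockets(directory: str) -> List[str]:
    """Return the sorted names of Unix socket files directly inside `directory`.

    A missing or unreadable directory yields an empty list.
    """
    try:
        entries = os.listdir(directory)
    except OSError:
        return []
    sockets = []
    for name in entries:
        path = os.path.join(directory, name)
        try:
            mode = os.lstat(path).st_mode
        except OSError:
            continue  # entry vanished or is not accessible
        if stat.S_ISSOCK(mode):
            sockets.append(name)
    return sorted(sockets)


if __name__ == "__main__":
    # Directories taken from this thread; adjust them to your kubelet setup.
    for d in ("/var/lib/kubelet/device-plugins", "/data/kubelet/device-plugins"):
        print(d, "->", find_sockets(d) or "no sockets found")
```

Run it on the affected node (as a user that can read the kubelet directories). If a socket shows up in one directory but the kubelet is watching the other, the mounts in the DaemonSet are the first thing to check.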
The IB NIC was changed to ETH mode. When running the rdma-shared-dev-plugin in the k8s cluster, the node's Capacity and Allocatable values for the resourceName are 0. NIC version: Mellanox ConnectX 6.
However, a Mellanox ConnectX 6 Dx can share RDMA resources in the k8s cluster.
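For reference, the reported symptom can be checked by reading `status.capacity` and `status.allocatable` from the node object, for example from the output of `kubectl get node <node-name> -o json`. The Python sketch below is a minimal illustration; the helper name `resource_counts` and the sample node are hypothetical, and the full resource name `rdma/hca_3` assumes the plugin's default `rdma` resource prefix combined with the `resourceName` from the ConfigMap in this thread.

```python
import json
from typing import Dict


def resource_counts(node: dict, resource: str) -> Dict[str, str]:
    """Extract capacity and allocatable for one extended resource from a node object.

    Returns the string "absent" when the node does not advertise the resource at all.
    """
    status = node.get("status", {})
    return {
        "capacity": status.get("capacity", {}).get(resource, "absent"),
        "allocatable": status.get("allocatable", {}).get(resource, "absent"),
    }


if __name__ == "__main__":
    # Hypothetical sample mirroring the symptom described in this issue.
    sample = json.loads(
        '{"status": {"capacity": {"cpu": "64", "rdma/hca_3": "0"},'
        ' "allocatable": {"cpu": "64", "rdma/hca_3": "0"}}}'
    )
    print(resource_counts(sample, "rdma/hca_3"))
```

A value of 0 means the resource is registered but no device is currently reported for it, while "absent" means the resource never appeared on the node.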