Commit

change operator/plugin check to avoid failing tests
supertetelman committed Feb 12, 2022
1 parent 02e3e4d commit d5415d1
Showing 1 changed file with 3 additions and 2 deletions.
5 changes: 3 additions & 2 deletions scripts/k8s/deploy_monitoring.sh
@@ -276,8 +276,9 @@ install_dependencies
setup_prom_monitoring

# Install DCGM-Exporter and setup custom metrics, if needed
-kubectl get daemonsets -A | grep nvidia-dcgm-exporter | grep nvidia.com/gpu.deploy.dcgm-exporter=true
-if [ $? -ne 0 ] ; then
+# # GPU Device Plugin is installed into kube-system, GPU Operator installs it into gpu-operator-resources
+plugin_namespace=$( kubectl get pods -A -l app.kubernetes.io/instance=nvidia-device-plugin --no-headers --no-headers -o custom-columns=NAMESPACE:.metadata.namespace)
+if [ ${plugin_namespace} -e "kube-system" ] ; then
# No GPU Operator DCGM-Exporter Stack
setup_gpu_monitoring
fi
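
Review note: inside `[ ]`, `-e` is a unary file-existence test, not a string comparison, so the new condition does not compare `plugin_namespace` with `kube-system` as the comment suggests it should. The variable is also unquoted and may hold several lines when more than one device-plugin pod is running. A minimal sketch of a stricter check follows. It assumes the intent is "run `setup_gpu_monitoring` when the device plugin lives in `kube-system`" and that the query prints one namespace per line. `plugin_in_namespace` is a hypothetical helper, not something in `deploy_monitoring.sh`.

```shell
#!/usr/bin/env bash
# Sketch only: plugin_in_namespace is a hypothetical helper, not part of the script.

# Return 0 if any line of the namespace list ($1) is exactly the namespace in $2.
plugin_in_namespace() {
    local namespaces="$1" wanted="$2"
    # -x matches whole lines, -F treats the pattern as a fixed string, -q is silent.
    printf '%s\n' "${namespaces}" | grep -qxF -- "${wanted}"
}

# Intended use against a live cluster (same query as the commit, one --no-headers):
#   plugin_namespace=$(kubectl get pods -A \
#       -l app.kubernetes.io/instance=nvidia-device-plugin \
#       --no-headers -o custom-columns=NAMESPACE:.metadata.namespace)
#   if plugin_in_namespace "${plugin_namespace}" "kube-system"; then
#       setup_gpu_monitoring
#   fi
```

Taking the namespace list as an argument keeps the comparison testable without a cluster. An empty result (no device-plugin pods found) returns non-zero, so `setup_gpu_monitoring` would be skipped in that case. Whether that is the desired behavior is a decision for the script's maintainers.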