[CI][DOCKER] Fix cuda11 nvidia-docker support for non-Tesla gpus #8163

Merged
merged 1 commit into from
May 31, 2021

@tqchen tqchen commented May 29, 2021

Starting with CUDA 11, libcuda can be resolved to the copy of libcuda in
/usr/local/cuda/compat. That particular library
does not work on non-Tesla GPUs, causing "no CUDA capable devices found"
even though nvidia-smi shows available GPUs.

This PR makes sure we always prioritize linking
/usr/lib/x86_64-linux-gnu/libcuda.so.1

so the nvidia-docker cuda11 images work in non-Tesla GPU environments.
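
For illustration only, here is a minimal sketch of what "prioritizing" a library directory can look like, assuming the loader's search order is driven by an `LD_LIBRARY_PATH`-style colon-separated string. This is not the diff in this PR; the helper name `prioritize_library_dir` is hypothetical, and only the directory `/usr/lib/x86_64-linux-gnu` comes from the description above.

```python
import os

# Directory named in the PR description; everything else here is illustrative.
PREFERRED_DIR = "/usr/lib/x86_64-linux-gnu"


def prioritize_library_dir(search_path: str, preferred: str = PREFERRED_DIR) -> str:
    """Return a colon-separated search path with `preferred` moved to the front.

    Assumption: the loader consults directories in the order they appear,
    as it does for LD_LIBRARY_PATH on Linux.
    """
    # Drop empty entries and any existing copy of the preferred directory
    # so that it appears exactly once, in first position.
    entries = [
        p for p in search_path.split(":")
        if p and p.rstrip("/") != preferred
    ]
    return ":".join([preferred] + entries)


if __name__ == "__main__":
    current = os.environ.get("LD_LIBRARY_PATH", "")
    # Print the reordered value; exporting it is left to the caller.
    print(prioritize_library_dir(current))
```

With the preferred directory first, a lookup for `libcuda.so.1` would find the driver's copy before any copy that sits in a directory later in the list.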

tqchen commented May 29, 2021

cc @areusch @tkonolige @junrushao1994

This is likely the root cause of the "no CUDA capable devices found" problem you hit earlier when updating the CUDA image.

@tqchen tqchen force-pushed the ci branch 2 times, most recently from d8336ec to 9912bc4 on May 30, 2021 13:23
@masahi masahi merged commit 713de0c into apache:main May 31, 2021
mehrdadh pushed a commit to mehrdadh/tvm that referenced this pull request Jun 3, 2021
…che#8163)

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Jun 17, 2021
…che#8163)

trevor-m pushed a commit to neo-ai/tvm that referenced this pull request Jun 17, 2021
…che#8163)
