Priority: P3-Medium
OS type: Ubuntu
Hardware type: GPU-Nvidia
Running nodes: Single Node
Description:
Because of the limited performance of my local Intel processor, I deployed ChatQnA on an RTX 3090 instead. Following the documentation at https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/docker_compose/nvidia/gpu/README.md, the deployment itself went smoothly.
However, after the deployment completed, I found that the large language model was still running on the CPU rather than the GPU. After some troubleshooting, I discovered that the CUDA version on the Ubuntu host was incompatible with the torch version inside the TGI Docker image, which prevented the GPU from being used. The documentation includes a very helpful table listing which Docker image versions to use for each GPU series. I suggest also recommending which CUDA versions should be installed on Ubuntu hosts, so that users can complete their setup more smoothly and move on quickly to developing higher-level applications.
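To illustrate the mismatch described above: the CUDA version the host's NVIDIA driver supports (the "CUDA Version" reported by `nvidia-smi`) must generally be at least as new as the CUDA runtime the container's torch build expects (`torch.version.cuda`); otherwise torch silently falls back to the CPU. The helper below is a hypothetical sketch, not part of OPEA or TGI, that compares the two version strings under that assumption:

```python
# Hypothetical compatibility check (not an OPEA/TGI utility).
# Assumption: the host driver can run a containerized CUDA runtime
# only if its supported CUDA version >= the runtime's version.

def parse_version(v: str) -> tuple[int, int]:
    """Parse 'major.minor' from a CUDA version string like '12.1'."""
    major, minor = v.split(".")[:2]
    return int(major), int(minor)

def driver_supports_runtime(driver_cuda: str, torch_cuda: str) -> bool:
    """True if the host driver's CUDA version can serve the
    CUDA runtime that the container's torch build was compiled for."""
    return parse_version(driver_cuda) >= parse_version(torch_cuda)

if __name__ == "__main__":
    # Host driver reports CUDA 12.2; container torch built against 12.1.
    print(driver_supports_runtime("12.2", "12.1"))  # True -> GPU should load
    # Older host driver (11.8) vs a 12.1 torch build.
    print(driver_supports_runtime("11.8", "12.1"))  # False -> CPU fallback
```

In practice, one would take the first value from `nvidia-smi` on the host and the second from `python -c "import torch; print(torch.version.cuda)"` inside the container.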
Overall, OPEA is fantastic.