
[Feature] Configuration Document Optimization Suggestions - ChatQnA Setup and Deployment #1251

Open
zhangyuting opened this issue Dec 14, 2024 · 0 comments
Labels
feature New feature or request

Comments

@zhangyuting

Priority

P3-Medium

OS type

Ubuntu

Hardware type

GPU-Nvidia

Running nodes

Single Node

Description

Because of the limited performance of my local Intel processor, I deployed ChatQnA on an RTX 3090 GPU instead. Following the documentation at https://github.com/opea-project/GenAIExamples/blob/main/ChatQnA/docker_compose/nvidia/gpu/README.md, the deployment itself went smoothly.

However, after the deployment completed, I found that the large language model was still running on the CPU rather than the GPU. After some troubleshooting, I discovered that the CUDA driver version on the Ubuntu host was incompatible with the torch version inside the TGI Docker image, which prevented the GPU from being used. The documentation already has a helpful table listing which Docker image versions to use for different GPU series. I suggest also adding recommendations for which CUDA versions should be installed on the Ubuntu host, so that users can complete their setup smoothly and move on to developing upper-level applications more quickly.
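To illustrate the kind of host-side check such documentation could recommend, here is a minimal sketch that compares the host's NVIDIA driver version (as reported by `nvidia-smi`) against the minimum driver required by a given CUDA runtime. The minimum-driver values below are taken from NVIDIA's CUDA Toolkit release notes to the best of my knowledge and should be verified against the current docs; the function name and table are illustrative, not part of OPEA.

```python
# Sketch: check whether the host NVIDIA driver is new enough for the
# CUDA runtime bundled in a container image (e.g. the torch build in TGI).
# Minimum Linux driver versions per CUDA release, from NVIDIA's CUDA
# Toolkit release notes -- confirm against current NVIDIA documentation.
MIN_DRIVER = {
    "11.8": (520, 61, 5),
    "12.1": (530, 30, 2),
    "12.2": (535, 54, 3),
    "12.4": (550, 54, 14),
}

def _parse(version: str) -> tuple:
    """Turn a dotted version string like '535.104.05' into an int tuple."""
    return tuple(int(part) for part in version.split("."))

def driver_supports_cuda(driver_version: str, cuda_version: str) -> bool:
    """Return True if the host driver meets the minimum for this CUDA release."""
    minimum = MIN_DRIVER[cuda_version]
    driver = _parse(driver_version)
    # Pad the shorter tuple with zeros so comparison is component-wise.
    width = max(len(driver), len(minimum))
    driver = driver + (0,) * (width - len(driver))
    minimum = minimum + (0,) * (width - len(minimum))
    return driver >= minimum

# Example: a host on driver 535.104.05 can run a CUDA 12.1 torch build,
# but a host stuck on 510.x cannot run a CUDA 11.8 build.
print(driver_supports_cuda("535.104.05", "12.1"))
print(driver_supports_cuda("510.47.03", "11.8"))
```

In practice the host driver version comes from `nvidia-smi`, and the container's CUDA runtime can be read inside the container with `python -c "import torch; print(torch.version.cuda)"`; wiring those two values into a check like this would catch the mismatch I hit before any model is loaded.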
[Screenshots of the error output were attached to the original issue.]

Overall, OPEA is fantastic.

@zhangyuting zhangyuting added the feature New feature or request label Dec 14, 2024