diff --git a/ChatQnA/README.md b/ChatQnA/README.md
index 0d4798810..5d3f93e8f 100644
--- a/ChatQnA/README.md
+++ b/ChatQnA/README.md
@@ -224,6 +224,19 @@ Refer to the [Intel Technology enabling for Openshift readme](https://github.com
 
 ## Consume ChatQnA Service
 
+Before consuming ChatQnA Service, make sure the TGI/vLLM service is ready (which takes up to 2 minutes to start).
+
+```bash
+# TGI example
+docker logs tgi-service | grep Connected
+```
+
+Consume ChatQnA service until you get the TGI response like below.
+
+```log
+2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
 Two ways of consuming ChatQnA Service:
 
 1. Use cURL command on terminal
diff --git a/ChatQnA/docker/gaudi/README.md b/ChatQnA/docker/gaudi/README.md
index 053484f77..717988c6b 100644
--- a/ChatQnA/docker/gaudi/README.md
+++ b/ChatQnA/docker/gaudi/README.md
@@ -306,6 +306,22 @@ curl http://${host_ip}:8000/v1/reranking \
 
 6. LLM backend Service
 
+In first startup, this service will take more time to download the model files. After it's finished, the service will be ready.
+
+Try the command below to check whether the LLM serving is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, you will get the response like below.
+
+```log
+2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then try the `cURL` command below to validate services.
+
 ```bash
 #TGI Service
 curl http://${host_ip}:8005/generate \
diff --git a/ChatQnA/docker/gpu/README.md b/ChatQnA/docker/gpu/README.md
index 48c287fb5..f559230b6 100644
--- a/ChatQnA/docker/gpu/README.md
+++ b/ChatQnA/docker/gpu/README.md
@@ -192,6 +192,22 @@ curl http://${host_ip}:8000/v1/reranking \
 
 6. TGI Service
 
+In first startup, this service will take more time to download the model files. After it's finished, the service will be ready.
+
+Try the command below to check whether the TGI service is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, you will get the response like below.
+
+```log
+2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then try the `cURL` command below to validate TGI.
+
 ```bash
 curl http://${host_ip}:8008/generate \
   -X POST \
diff --git a/ChatQnA/docker/xeon/README.md b/ChatQnA/docker/xeon/README.md
index dc8735928..675e74cea 100644
--- a/ChatQnA/docker/xeon/README.md
+++ b/ChatQnA/docker/xeon/README.md
@@ -303,9 +303,21 @@ curl http://${host_ip}:8000/v1/reranking\
 
 6. LLM backend Service
 
-In first startup, this service will take more time to download the LLM file. After it's finished, the service will be ready.
+In first startup, this service will take more time to download the model files. After it's finished, the service will be ready.
 
-Use `docker logs CONTAINER_ID` to check if the download is finished.
+Try the command below to check whether the LLM serving is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, you will get the response like below.
+
+```log
+2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then try the `cURL` command below to validate services.
 
 ```bash
 # TGI service
diff --git a/ChatQnA/docker/xeon/README_qdrant.md b/ChatQnA/docker/xeon/README_qdrant.md
index f103d5a73..a03b563b2 100644
--- a/ChatQnA/docker/xeon/README_qdrant.md
+++ b/ChatQnA/docker/xeon/README_qdrant.md
@@ -276,6 +276,22 @@ curl http://${host_ip}:6046/v1/reranking\
 
 6. TGI Service
 
+In first startup, this service will take more time to download the model files. After it's finished, the service will be ready.
+
+Try the command below to check whether the TGI service is ready.
+
+```bash
+docker logs ${CONTAINER_ID} | grep Connected
+```
+
+If the service is ready, you will get the response like below.
+
+```log
+2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
+```
+
+Then try the `cURL` command below to validate TGI.
+
 ```bash
 curl http://${host_ip}:6042/generate \
   -X POST \