Refine ChatQnA README for TGI #715

Merged · 3 commits · Sep 3, 2024
13 changes: 13 additions & 0 deletions ChatQnA/README.md
@@ -224,6 +224,19 @@ Refer to the [Intel Technology enabling for Openshift readme](https://github.com

## Consume ChatQnA Service

Before consuming the ChatQnA service, make sure the TGI/vLLM service is ready; it can take up to 2 minutes to start.

```bash
# TGI example
docker logs tgi-service | grep Connected
```

Consume the ChatQnA service only after the TGI log shows a `Connected` entry like the one below.

```log
2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
```
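
If you prefer not to re-run the check by hand, here is a minimal sketch of a wait loop; it assumes the TGI container is named `tgi-service` as in the example above and gives up after roughly two minutes.

```bash
# Poll the TGI logs every 5 seconds for up to ~2 minutes (24 * 5s),
# assuming the container name used above (tgi-service).
for _ in $(seq 1 24); do
  if docker logs tgi-service 2>&1 | grep -q Connected; then
    echo "TGI service is ready."
    break
  fi
  echo "Waiting for TGI service..."
  sleep 5
done
```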

There are two ways to consume the ChatQnA service:

1. Use a cURL command in the terminal
16 changes: 16 additions & 0 deletions ChatQnA/docker/gaudi/README.md
@@ -306,6 +306,22 @@ curl http://${host_ip}:8000/v1/reranking \

6. LLM backend Service

On its first startup, this service takes extra time to download the model files. Once the download is finished, the service is ready.

Run the command below to check whether the LLM service is ready.

```bash
docker logs ${CONTAINER_ID} | grep Connected
```

If the service is ready, you will see a log entry like the one below.

```log
2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
```
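
The commands above and below reference `${CONTAINER_ID}`; a hedged way to look it up (assuming the TGI container's name contains `tgi`, which may differ in your compose file) is:

```bash
# Resolve the TGI container ID by name; adjust the filter to match
# the service name defined in your compose file.
CONTAINER_ID=$(docker ps -q --filter "name=tgi")
echo "${CONTAINER_ID}"
```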

Then run the `cURL` command below to validate the service.

```bash
# TGI Service
curl http://${host_ip}:8005/generate \
16 changes: 16 additions & 0 deletions ChatQnA/docker/gpu/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -192,6 +192,22 @@ curl http://${host_ip}:8000/v1/reranking \

6. TGI Service

On its first startup, this service takes extra time to download the model files. Once the download is finished, the service is ready.

Run the command below to check whether the TGI service is ready.

```bash
docker logs ${CONTAINER_ID} | grep Connected
```

If the service is ready, you will see a log entry like the one below.

```log
2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
```
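
As an alternative to scanning the logs, you can probe the router over HTTP. This sketch assumes TGI's standard `/health` route is exposed on the same port used by the `/generate` command below (8008):

```bash
# Returns HTTP 200 once the model is loaded and the router is ready;
# assumes the /health endpoint is reachable on port 8008.
curl -sf http://${host_ip}:8008/health && echo "TGI is ready" || echo "TGI is not ready yet"
```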

Then run the `cURL` command below to validate the TGI service.

```bash
curl http://${host_ip}:8008/generate \
-X POST \
16 changes: 14 additions & 2 deletions ChatQnA/docker/xeon/README.md
@@ -303,9 +303,21 @@ curl http://${host_ip}:8000/v1/reranking\

6. LLM backend Service

In first startup, this service will take more time to download the LLM file. After it's finished, the service will be ready.
On its first startup, this service takes extra time to download the model files. Once the download is finished, the service is ready.

Use `docker logs CONTAINER_ID` to check if the download is finished.
Run the command below to check whether the LLM service is ready.

```bash
docker logs ${CONTAINER_ID} | grep Connected
```

If the service is ready, you will see a log entry like the one below.

```log
2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
```
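
If you prefer to block until the service comes up instead of re-running the check, a minimal sketch using log streaming is:

```bash
# Follow the container logs and return as soon as the first "Connected"
# line appears; press Ctrl+C if the service never becomes ready.
docker logs -f ${CONTAINER_ID} 2>&1 | grep -m1 Connected
```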

Then run the `cURL` command below to validate the service.

```bash
# TGI service
16 changes: 16 additions & 0 deletions ChatQnA/docker/xeon/README_qdrant.md
@@ -276,6 +276,22 @@ curl http://${host_ip}:6046/v1/reranking\

6. TGI Service

On its first startup, this service takes extra time to download the model files. Once the download is finished, the service is ready.

Run the command below to check whether the TGI service is ready.

```bash
docker logs ${CONTAINER_ID} | grep Connected
```

If the service is ready, you will see a log entry like the one below.

```log
2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
```
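
You can also confirm which model the service has loaded. This sketch assumes TGI's standard `/info` route is available on the same port used by the `/generate` command below (6042):

```bash
# Query TGI's /info endpoint (assumed to be exposed) and show the loaded model.
curl -s http://${host_ip}:6042/info | grep model_id
```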

Then run the `cURL` command below to validate the TGI service.

```bash
curl http://${host_ip}:6042/generate \
-X POST \