[Doc] Refine READMEs (#841)
Signed-off-by: letonghan <letong.han@intel.com>
letonghan authored Sep 19, 2024
1 parent 933c3d3 commit 372d78c
Showing 4 changed files with 25 additions and 6 deletions.
21 changes: 20 additions & 1 deletion ChatQnA/README.md
@@ -245,7 +245,9 @@ Refer to the [AI PC Guide](./docker_compose/intel/cpu/aipc/README.md) for instru

Refer to the [Intel Technology enabling for Openshift readme](https://github.com/intel/intel-technology-enabling-for-openshift/blob/main/workloads/opea/chatqna/README.md) for instructions to deploy ChatQnA prototype on RHOCP with [Red Hat OpenShift AI (RHOAI)](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai).

-## Consume ChatQnA Service
+## Consume ChatQnA Service with RAG

### Check Service Status

Before consuming the ChatQnA service, make sure the TGI/vLLM service is ready; startup can take up to two minutes.

@@ -260,6 +262,23 @@ Consume ChatQnA service until you get the TGI response like below.

```
2024-09-03T02:47:53.402023Z INFO text_generation_router::server: router/src/server.rs:2311: Connected
```
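If the stack is running under Docker Compose, one way to watch for this line is to follow the TGI container logs. A minimal sketch, assuming the container is named `tgi-service`; substitute the name used in your deployment:

```bash
# Follow the TGI logs and stop at the first "Connected" line
docker logs -f tgi-service 2>&1 | grep -m1 "Connected"
```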

### Upload RAG Files (Optional)

To chat over retrieved information, first upload a file using the `Dataprep` service.

Here is an example using the `Nike 2023` PDF.

```bash
# download pdf file
wget https://raw.githubusercontent.com/opea-project/GenAIComps/main/comps/retrievers/redis/data/nke-10k-2023.pdf
# upload pdf file with dataprep
curl -X POST "http://${host_ip}:6007/v1/dataprep" \
  -H "Content-Type: multipart/form-data" \
  -F "files=@./nke-10k-2023.pdf"
```
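The dataprep service can also ingest content from web pages. Below is a sketch using the `link_list` form field; this field is an assumption based on the GenAIComps redis dataprep service of the same release, so verify it against the service you deployed:

```bash
# Index the contents of a web page instead of a local file
curl -X POST "http://${host_ip}:6007/v1/dataprep" \
  -H "Content-Type: multipart/form-data" \
  -F 'link_list=["https://opea.dev"]'
```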

### Consume Chat Service

There are two ways to consume the ChatQnA service:

1. Use a cURL command in the terminal, as in the sample request below
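A typical request looks like the following; a sketch, assuming the ChatQnA megaservice is exposed on its default port `8888` (adjust the port and question to your deployment):

```bash
# Ask a question answerable from the uploaded Nike 2023 report
curl http://${host_ip}:8888/v1/chatqna \
  -H "Content-Type: application/json" \
  -d '{"messages": "What is the revenue of Nike in 2023?"}'
```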
2 changes: 1 addition & 1 deletion DocSum/README.md
@@ -147,7 +147,7 @@ Two ways of consuming Document Summarization Service:

```bash
http_proxy=""
-curl http://${your_ip}:8008/generate \
+curl http://${host_ip}:8008/generate \
  -X POST \
  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
  -H 'Content-Type: application/json'
```
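A healthy TGI endpoint replies with a JSON object whose `generated_text` field holds the completion. To print only that field, here is a sketch assuming `jq` is installed:

```bash
# Send the same request and extract only the generated text
curl -s http://${host_ip}:8008/generate \
  -X POST \
  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
  -H 'Content-Type: application/json' | jq -r '.generated_text'
```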
4 changes: 2 additions & 2 deletions DocSum/docker_compose/intel/cpu/xeon/README.md
@@ -105,7 +105,7 @@ docker compose up -d
1. TGI Service

```bash
-curl http://${your_ip}:8008/generate \
+curl http://${host_ip}:8008/generate \
  -X POST \
  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":17, "do_sample": true}}' \
  -H 'Content-Type: application/json'
```
@@ -114,7 +114,7 @@ docker compose up -d
2. LLM Microservice

```bash
-curl http://${your_ip}:9000/v1/chat/docsum \
+curl http://${host_ip}:9000/v1/chat/docsum \
  -X POST \
  -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \
  -H 'Content-Type: application/json'
```
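To summarize a longer document, you can build the JSON body from a local file instead of an inline string. A minimal sketch, assuming `jq` 1.6+ is installed; `./my_doc.txt` is a hypothetical input file:

```bash
# Read the document, wrap it in a {"query": ...} JSON body, and post it
jq -n --rawfile doc ./my_doc.txt '{query: $doc}' |
  curl http://${host_ip}:9000/v1/chat/docsum \
    -X POST \
    -d @- \
    -H 'Content-Type: application/json'
```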
4 changes: 2 additions & 2 deletions DocSum/docker_compose/intel/hpu/gaudi/README.md
@@ -96,7 +96,7 @@ docker compose up -d
1. TGI Service

```bash
-curl http://${your_ip}:8008/generate \
+curl http://${host_ip}:8008/generate \
  -X POST \
  -d '{"inputs":"What is Deep Learning?","parameters":{"max_new_tokens":64, "do_sample": true}}' \
  -H 'Content-Type: application/json'
```
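On Gaudi, model download and warm-up can dominate startup time. Before sending the request above, you can poll the service until it is ready; a sketch, assuming TGI's default `/health` route:

```bash
# Block until the TGI server reports healthy
until curl -sf http://${host_ip}:8008/health > /dev/null; do
  echo "waiting for TGI..."
  sleep 5
done
```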
@@ -105,7 +105,7 @@ docker compose up -d
2. LLM Microservice

```bash
-curl http://${your_ip}:9000/v1/chat/docsum \
+curl http://${host_ip}:9000/v1/chat/docsum \
  -X POST \
  -d '{"query":"Text Embeddings Inference (TEI) is a toolkit for deploying and serving open source text embeddings and sequence classification models. TEI enables high-performance extraction for the most popular models, including FlagEmbedding, Ember, GTE and E5."}' \
  -H 'Content-Type: application/json'
```
