Updated configuration logic in READMEs and YAMLs
arueth committed Feb 24, 2025
1 parent a6867f0 commit 3af0fe1
Showing 9 changed files with 184 additions and 188 deletions.
13 changes: 5 additions & 8 deletions platforms/gke-aiml/playground/README.md
@@ -21,13 +21,9 @@ document.

### Project

In this guide you can choose to bring your own project (BYOP) or have Terraform
create a new project for you. The requirements are different based on the option
that you choose.

#### Option 1: Bring your own project (BYOP)

@@ -339,7 +335,8 @@ Management API are enabled.
gcloud endpoints services undelete gradio.ml-team.mlp-${MLP_ENVIRONMENT_NAME}.endpoints.${MLP_PROJECT_ID}.cloud.goog --quiet 2>/dev/null
gcloud endpoints services undelete locust.ml-team.mlp-${MLP_ENVIRONMENT_NAME}.endpoints.${MLP_PROJECT_ID}.cloud.goog --quiet 2>/dev/null
gcloud endpoints services undelete mlflow-tracking.ml-team.mlp-${MLP_ENVIRONMENT_NAME}.endpoints.${MLP_PROJECT_ID}.cloud.goog --quiet 2>/dev/null
gcloud endpoints services undelete rag-frontend.ml-team.mlp-${MLP_ENVIRONMENT_NAME}.endpoints.${MLP_PROJECT_ID}.cloud.goog --quiet 2>/dev/null
gcloud endpoints services undelete ray-dashboard.ml-team.mlp-${MLP_ENVIRONMENT_NAME}.endpoints.${MLP_PROJECT_ID}.cloud.goog --quiet 2>/dev/null
```

- Create the resources
105 changes: 36 additions & 69 deletions use-cases/rag-pipeline/README.md
@@ -9,7 +9,7 @@ the catalog. This approach not only improves the customer experience by
providing relevant alternatives but also helps reduce lost sales and potentially
increases average order value.

## How it works

Understanding Customer Intent: The system analyzes the customer's search query
(e.g., "blue cotton t-shirt") to understand the key attributes and features they
@@ -24,7 +24,7 @@ even if they don't share the exact same words.
system generates a list of relevant product recommendations. These might
include:

**Cosine Similarity:** Cosine similarity measures how similar two items are by
looking at the angle between their vector representations.
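
As a minimal illustration (not part of the pipeline code), cosine similarity between two embedding vectors can be computed like this:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Vectors pointing the same way score ≈ 1.0; orthogonal vectors score 0.0.
print(cosine_similarity([1.0, 2.0], [2.0, 4.0]))  # ≈ 1.0 (parallel)
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # → 0.0 (orthogonal)
```

In the real pipeline the vectors are high-dimensional embeddings produced by the embedding model, but the comparison works the same way.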

Cosine: Cosine is a mathematical function that tells you how much two vectors
@@ -55,14 +55,6 @@ sweaters.
relevance and presented to the customer in a clear and appealing way,
encouraging them to continue shopping.

## Architecture

![RAG Architecture](./docs/arch-rag-architecture-flow.png)
@@ -83,8 +75,8 @@ Let's break down the flow step-by-step:

2. **Get Embeddings:**

The user's query is sent to the "backend application". Within the backend
application, the query is processed by an "embedding model endpoint".

3. **Embedding Model Endpoint:**

@@ -96,8 +88,8 @@ Let's break down the flow step-by-step:

4. **Get Embeddings (Continued):**

The backend application receives the embedding vector from the embedding
model endpoint.

5. **ScaNN Index:**

@@ -121,12 +113,12 @@ Let's break down the flow step-by-step:
7. **Instruction Tuned Model Endpoint:**

This retrieved product information is then sent to an "Instruction Tuned
Model Endpoint". This endpoint hosts a specific version of the Gemma 2B
instruction tuned model,
[gemma-2b-it](https://huggingface.co/google/gemma-2b-it), that has been
trained with a focus on understanding and responding to instructions
effectively. The instruction tuned model is given specific instructions as a
prompt to re-rank the search results.

8. **Re-Rank Search Results with LLM:**

@@ -136,7 +128,7 @@ Let's break down the flow step-by-step:

9. **Suggest Products:**

The backend application receives the re-ranked product list from the
Instruction Tuned Model Endpoint. This list is sent as Product Recommendations
to the Gradio Chat Interface.
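
The flow above can be sketched end-to-end. Every function here is a hypothetical stand-in for the real services (the embedding model endpoint, the ScaNN lookup, and the gemma-2b-it re-ranking), not the actual backend API:

```python
# Illustrative stand-ins only; the real system calls deployed model endpoints.
def get_embedding(query: str) -> list[float]:
    # Steps 2-4: the embedding model endpoint turns the query into a vector.
    return [float(len(word)) for word in query.split()]

def scann_search(embedding: list[float], k: int = 3) -> list[str]:
    # Steps 5-6: nearest-neighbor search over the product embeddings.
    catalog = ["blue cotton t-shirt", "navy polo shirt",
               "cotton hoodie", "wool sweater"]
    return catalog[:k]  # a real index would rank by vector similarity

def rerank_with_llm(query: str, candidates: list[str]) -> list[str]:
    # Steps 7-8: the instruction tuned model re-ranks the candidates;
    # here a toy keyword-overlap score stands in for the LLM.
    score = lambda p: sum(word in p for word in query.split())
    return sorted(candidates, key=score, reverse=True)

def recommend(query: str) -> list[str]:
    # Step 9: the re-ranked list is returned as product recommendations.
    embedding = get_embedding(query)
    candidates = scann_search(embedding)
    return rerank_with_llm(query, candidates)

print(recommend("blue cotton t-shirt"))
```

The point of the sketch is the ordering of the calls, which matches the numbered steps; each stub would be replaced by a request to the corresponding deployed service.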

@@ -156,99 +148,74 @@ the ScaNN Index to reflect catalog changes
This section outlines the steps to set up the Retrieval Augmented Generation
(RAG) pipeline for product recommendations.

1. **Deploy the Embedding Model:** Deploy the
[Blip2 multimodal](https://github.com/salesforce/LAVIS/blob/main/examples/blip_feature_extraction.ipynb)
embedding model. This model generates text, image, and multimodal embeddings
for each product.

1. **Create the Vector Store:** Create the `product_catalog` database in
[AlloyDB](https://cloud.google.com/alloydb/docs/introduction). This database
will house the `clothes` table, which stores product catalog information.

1. **Generate and Store Embeddings:** Use an ETL pipeline to generate embeddings
(text, image, and multimodal) using the deployed Blip2 model. Store these
embeddings in separate columns within the `clothes` table in AlloyDB.

1. **Deploy the Instruction-Tuned Model:** Deploy the
[gemma-2b-it model](https://huggingface.co/google/gemma-2b-it). This model
generates natural language responses and product recommendations based on
user queries and retrieved product information.

1. **Deploy the Backend API:** Deploy the FastAPI backend. This API serves as
the interface between the user interface, embedding model, instruction-tuned
model, and the AlloyDB vector store. It processes user prompts and generates
product recommendations.

1. **Deploy the Frontend UI:** Deploy the [gradio](https://gradio.app/) based
frontend UI. This UI provides a chatbot interface for end-users to interact
with the RAG pipeline and receive product recommendations.
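
The "Generate and Store Embeddings" step above can be sketched with an in-memory table standing in for the AlloyDB `clothes` table; the embedding functions and column names below are hypothetical stand-ins for the BLIP2 model and the real schema:

```python
# Toy embedding functions; the real pipeline calls the deployed Blip2 model.
def embed_text(text: str) -> list[float]:
    return [float(len(text)), float(text.count(" "))]

def embed_image(uri: str) -> list[float]:
    return [float(len(uri))]

def embed_multimodal(text: str, uri: str) -> list[float]:
    return embed_text(text) + embed_image(uri)

# A list of dicts stands in for rows of the `clothes` table in AlloyDB.
clothes = [{"name": "blue cotton t-shirt",
            "image_uri": "gs://bucket/tshirt.png"}]

# ETL step: generate each embedding type and store it in its own column.
for row in clothes:
    row["text_embedding"] = embed_text(row["name"])
    row["image_embedding"] = embed_image(row["image_uri"])
    row["multimodal_embedding"] = embed_multimodal(row["name"],
                                                   row["image_uri"])

print(clothes[0]["multimodal_embedding"])  # → [19.0, 2.0, 22.0]
```

In the real pipeline each `row[...] = ...` assignment corresponds to an `UPDATE` of a vector column in AlloyDB, and the embeddings come from the deployed model rather than these toy functions.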

## Deploy RAG Components

This section outlines the steps to deploy the Retrieval Augmented Generation
(RAG) pipeline to the playground cluster.

### Prerequisites

- Use the existing
[playground AI/ML platform](/platforms/gke-aiml/playground/README.md). If you
are using a different environment, the scripts and manifests will need to be
modified for that environment.
- Run [data preprocessing for RAG](#data-preprocessing-for-rag)

#### Set variables for the ML playground environment

- Clone the repository

```sh
git clone https://github.com/GoogleCloudPlatform/accelerated-platforms && \
cd accelerated-platforms
```

- Change directory to the guide directory

```sh
cd use-cases/rag-pipeline
```

- Ensure that your `MLP_ENVIRONMENT_FILE` is configured

```sh
cat ${MLP_ENVIRONMENT_FILE} && \
source ${MLP_ENVIRONMENT_FILE}
```

> You should see the various variables populated with the information specific
> to your environment.

- Get credentials for the GKE cluster

```sh
gcloud container fleet memberships get-credentials ${MLP_CLUSTER_NAME} --project ${MLP_PROJECT_ID}
```

### Data Preprocessing for RAG
The RAG pipeline needs an input dataset. We take a raw dataset and filter and
clean it to prepare it for the pipeline. Perform the data preprocessing steps
as described in the
[README](/use-cases/rag-pipeline/data-preprocessing/README.md) to prepare the
input dataset.

### Deploy the Multimodal Model on the playground cluster

To deploy the multimodal model on the ML playground cluster, follow the
[README](/use-cases/rag-pipeline/embedding-models/multimodal-embedding/README.md).

### Deploy instruction tuned model on the playground cluster

To deploy the instruction tuned model on the ML playground cluster, follow the
[README](/use-cases/rag-pipeline/instruction-tuned-model/README.md).

### Create database `product_catalog` in AlloyDB to import the Product Catalog

To deploy the database setup Kubernetes job on the ML playground cluster,
follow the [README](/use-cases/rag-pipeline/alloy-db-setup/README.md).

### Deploy the backend on the playground cluster

To deploy the backend application on the ML playground cluster, follow the
[README](/use-cases/rag-pipeline/backend/README.md).

### Deploy the frontend on the playground cluster

To deploy the frontend application on the ML playground cluster, follow the
[README](/use-cases/rag-pipeline/frontend/README.md).
31 changes: 12 additions & 19 deletions use-cases/rag-pipeline/alloy-db-setup/README.md
@@ -1,4 +1,4 @@
# RAG: Database Initialization

This Kubernetes job helps you load the Flipkart product catalog into the AlloyDB
database named `product_catalog`. It also creates separate columns to store the
@@ -7,39 +7,29 @@ embeddings (text, image and multimodal) in a table named `clothes` in the

## Prerequisites

- This guide was developed to be run on the
[playground AI/ML platform](/platforms/gke-aiml/playground/README.md). If you
are using a different environment, the scripts and manifests will need to be
modified for that environment.
- [RAG: Multimodal BLIP2 model](/use-cases/rag-pipeline/embedding-models/multimodal-embedding/README.md)
has been deployed as per instructions.

## Preparation

- Clone the repository.

```sh
git clone https://github.com/GoogleCloudPlatform/accelerated-platforms && \
cd accelerated-platforms
```

- Change directory to the guide directory.

```sh
cd use-cases/rag-pipeline/alloy-db-setup
```

- Ensure that your `MLP_ENVIRONMENT_FILE` is configured.

```sh
cat ${MLP_ENVIRONMENT_FILE} && \
Expand All @@ -51,10 +41,13 @@ Cloud Storage buckets.
> You should see the various variables populated with the information specific
> to your environment.
- Get credentials for the GKE cluster.

```shell
gcloud container clusters get-credentials ${MLP_CLUSTER_NAME} \
--dns-endpoint \
--project=${MLP_PROJECT_ID} \
--region=${MLP_REGION}
```

## Build the container image
