diff --git a/platforms/gke-aiml/playground/README.md b/platforms/gke-aiml/playground/README.md index 893750e5..34251dfb 100644 --- a/platforms/gke-aiml/playground/README.md +++ b/platforms/gke-aiml/playground/README.md @@ -21,13 +21,9 @@ document. ### Project -<<<<<<< HEAD In this guide you can choose to bring your project (BYOP) or have -Terraform create a new project for you. The requirements are difference based on -the option that you choose. ======= In this guide you can choose to bring your -project (BYOP) or have Terraform create a new project for you. The requirements -are different based on the option that you choose. - -> > > > > > > 04b228a (Added AlloyDB with PSC to the playground (#36)) +In this guide you can choose to bring your project (BYOP) or have Terraform +create a new project for you. The requirements are different based on the option +that you choose. #### Option 1: Bring your own project (BYOP) @@ -339,7 +335,8 @@ Management API are enabled. gcloud endpoints services undelete gradio.ml-team.mlp-${MLP_ENVIRONMENT_NAME}.endpoints.${MLP_PROJECT_ID}.cloud.goog --quiet 2>/dev/null gcloud endpoints services undelete locust.ml-team.mlp-${MLP_ENVIRONMENT_NAME}.endpoints.${MLP_PROJECT_ID}.cloud.goog --quiet 2>/dev/null gcloud endpoints services undelete mlflow-tracking.ml-team.mlp-${MLP_ENVIRONMENT_NAME}.endpoints.${MLP_PROJECT_ID}.cloud.goog --quiet 2>/dev/null - gcloud endpoints services undelete ray-dashboard.ml-team.mlp-${MLP_ENVIRONMENT_NAME}.endpoints.${MLP_PROJECT_ID}.cloud.goog --quiet 2>/dev/null + gcloud endpoints services undelete rag-frontend.ml-team.mlp-${MLP_ENVIRONMENT_NAME}.endpoints.${MLP_PROJECT_ID}.cloud.goog --quiet 2>/dev/null + gcloud endpoints services undelete ray-dashboard.ml-team.mlp-${MLP_ENVIRONMENT_NAME}.endpoints.${MLP_PROJECT_ID}.cloud.goog --quiet 2>/dev/null ``` - Create the resources diff --git a/use-cases/rag-pipeline/README.md b/use-cases/rag-pipeline/README.md index b35651fa..24c74c1d 100644 --- 
a/use-cases/rag-pipeline/README.md +++ b/use-cases/rag-pipeline/README.md @@ -9,7 +9,7 @@ the catalog. This approach not only improves the customer experience by providing relevant alternatives but also helps reduce lost sales and potentially increases average order value. -## Here's how it works: +## How it works Understanding Customer Intent: The system analyzes the customer's search query (e.g., "blue cotton t-shirt") to understand the key attributes and features they @@ -24,7 +24,7 @@ even if they don't share the exact same words. system generates a list of relevant product recommendations. These might include: -**Cosine Simlarity:** Cosine similarity measures how similar two items are by +**Cosine Similarity:** Cosine similarity measures how similar two items are by looking at the angle between their vector representations. Cosine: Cosine is a mathematical function that tells you how much two vectors @@ -55,14 +55,6 @@ sweaters. relevance and presented to the customer in a clear and appealing way, encouraging them to continue shopping. -## Data Preprocessing for RAG - -We need a input dataset to feed to our RAG pipeline. We take a raw dataset and -filter and clean it up to prepare it for our RAG pipeline. Perform the data -preprocessing steps as described in -[README](/use-cases/rag-pipeline/data-preprocessing/README.md) to prepare input -dataset. - ## Architecture ![RAG Architecture](./docs/arch-rag-architecture-flow.png) @@ -83,8 +75,8 @@ Let's break down the flow step-by-step: 2. **Get Embeddings:** - The user's query is sent to the "Backend Application". Within the Backend - Application, the query is processed by an "Embedding Model Endpoint". + The user's query is sent to the "backend application". Within the backend + application, the query is processed by an "embedding model endpoint". 3. **Embedding Model Endpoint:** @@ -96,8 +88,8 @@ Let's break down the flow step-by-step: 4. 
**Get Embeddings (Continued):** - The Backend Application receives the embedding vector from the Embedding - Model Endpoint. + The backend application receives the embedding vector from the embedding + model endpoint. 5. **Scann Index:** @@ -121,12 +113,12 @@ Let's break down the flow step-by-step: 7. **Instruction Tuned Model Endpoint:** These retrieved similar product information is then sent to an "Instruction - Tuned Model Endpoint". This endpoint hosts a specific version of Gemini + Tuned Model Endpoint". This endpoint hosts a specific version of the Gemma 2B instruction tuned model(gemma-2b-it)[https://huggingface.co/google/gemma-2b-it] that's been trained with a focus on understanding and responding to instructions - effectively. Instruction tuned model is provided with a specific instructions - as a prompt to re-rank the search results. + effectively. The instruction tuned model is provided with specific + instructions as a prompt to re-rank the search results. 8. **Re-Rank Search Results with LLM:** @@ -136,7 +128,7 @@ Let's break down the flow step-by-step: 9. **Suggest Products:** - The Backend Application receives the re-ranked product list from the + The backend application receives the re-ranked product list from the Instruction Tuned Model Endpoint.This list is sent as Product Recommendations to the Gradio Chat Interface. @@ -156,99 +148,74 @@ the Scann Index to reflect catalog changes This section outlines the steps to set up the Retrieval Augmented Generation (RAG) pipeline for product recommendations. -1. **Create the Vector Store:** Create the `product_catalog` database in - [AlloyDB](https://cloud.google.com/alloydb/docs/introduction). This database - will house the `clothes` table, which stores product catalog information. - -2. **Deploy the Embedding Model:** Deploy the +1. **Deploy the Embedding Model:** Deploy the [Blip2 multimodal](https://github.com/salesforce/LAVIS/blob/main/examples/blip_feature_extraction.ipynb)embedding model. 
This model generates text, image, and multimodal embeddings for each product. -3. **Generate and Store Embeddings:** Use an ETL pipeline to generate embeddings +1. **Create the Vector Store:** Create the `product_catalog` database in + [AlloyDB](https://cloud.google.com/alloydb/docs/introduction). This database + will house the `clothes` table, which stores product catalog information. + +1. **Generate and Store Embeddings:** Use an ETL pipeline to generate embeddings (text, image, and multimodal) using the deployed Blip2 model. Store these embeddings in separate columns within the `clothes` table in AlloyDB. -4. **Deploy the Instruction-Tuned Model:** Deploy the +1. **Deploy the Instruction-Tuned Model:** Deploy the [gemma-2b-it model](https://huggingface.co/google/gemma-2b-it). This model generates natural language responses and product recommendations based on user queries and retrieved product information. -5. **Deploy the Backend API:** Deploy the FastAPI backend. This API serves as +1. **Deploy the Backend API:** Deploy the FastAPI backend. This API serves as the interface between the user interface, embedding model, instruction-tuned model, and the AlloyDB vector store. It processes user prompts and generates product recommendations. -6. **Deploy the Frontend UI:** Deploy the [gradio](https://gradio.app/) based +1. **Deploy the Frontend UI:** Deploy the [gradio](https://gradio.app/) based frontend UI. This UI provides a chatbot interface for end-users to interact with the RAG pipeline and receive product recommendations. -## Prerequisites +## Deploy RAG Components + +This section outlines the steps to deploy the Retrieval Augmented Generation +(RAG) pipeline to the playground cluster. + +### Prerequisites - Use the existing [playground AI/ML platform](/platforms/gke-aiml/playground/README.md). If you are using a different environment the scripts and manifest will need to be modified for that environment. 
-- Run [data preprocessing for RAG](#data-preprocessing-for-rag) - -#### Set variable for the ML playground environment - -- Clone the repository - - ```sh - git clone https://github.com/GoogleCloudPlatform/accelerated-platforms && \ - cd accelerated-platforms - ``` - -- Change directory to the guide directory - - ```sh - cd use-cases/rag-pipeline - ``` - -- Ensure that your `MLP_ENVIRONMENT_FILE` is configured - - ```sh - cat ${MLP_ENVIRONMENT_FILE} && \ - source ${MLP_ENVIRONMENT_FILE} - ``` - - > You should see the various variables populated with the information specific - > to your environment. -- Get credentials for the GKE cluster +### Data Preprocessing for RAG - ```sh - gcloud container fleet memberships get-credentials ${MLP_CLUSTER_NAME} --project ${MLP_PROJECT_ID} - ``` - -# Deploy RAG Application Components - -This section outlines the steps to deploy the Retrieval Augmented Generation -(RAG) pipeline to the playground cluster. The components should be deployed in -the following order: +We need an input dataset to feed to our RAG pipeline. We take a raw dataset and +filter and clean it up to prepare it for our RAG pipeline. Perform the data +preprocessing steps as described in +[README](/use-cases/rag-pipeline/data-preprocessing/README.md) to prepare the input +dataset. 
-## Deploy the Multimodal Model on the playground cluster +### Deploy the Multimodal Model on the playground cluster Deploy multimodal model on ML playground, follow the [README](/use-cases/rag-pipeline/embedding-models/multimodal-embedding/README.md) -## Deploy instruction tuned model on the playground cluster +### Deploy instruction tuned model on the playground cluster Deploy instruction tuned model on ML playground, follow the [README](/use-cases/rag-pipeline/instruction-tuned-model/README.md) -## Create database `product_catalog` in alloyDB to import Product Catalog +### Create database `product_catalog` in alloyDB to import Product Catalog Deploy database setup kubernetes job on the ML playground cluster, follow the [README](/use-cases/rag-pipeline/alloy-db-setup/README.md) -## Deploy the backend on the playground cluster +### Deploy the backend on the playground cluster Deploy backend application on the ML playground cluster, follow the [README](/use-cases/rag-pipeline/backend/README.md) -## Deploy the frontend on the playground cluster +### Deploy the frontend on the playground cluster Deploy frontend application on the MLP playground cluster, follow the [README](/use-cases/rag-pipeline/frontend/README.md) diff --git a/use-cases/rag-pipeline/alloy-db-setup/README.md b/use-cases/rag-pipeline/alloy-db-setup/README.md index 57d68931..5c050ef3 100644 --- a/use-cases/rag-pipeline/alloy-db-setup/README.md +++ b/use-cases/rag-pipeline/alloy-db-setup/README.md @@ -1,4 +1,4 @@ -# Process to set up AlloyDB +# RAG: Database setup and initialization This kubernetes job helps you load the flipkart product catalog to the alloyDB database named `product_catalog`.Also it creates separate columns to store the @@ -7,41 +7,31 @@ embeddings(text, image and multimodal) in a table named `clothes` in the ## Prerequisites - Write few lines about alloydb set up various users for IAM, workload -identity , different users in ML_ENV_FILE to use . 
- - Decide what how end users should get access to these buckets of the -image_uris and the associated datasets to load the product catalog? - -MLP accounts MLP_DB_ADMIN_IAM and MLP_DB_USER_IAM need Storage object -permissions to retrieve, process and generate embeddings for image_uri stores in -Cloud Storage buckets. - - This guide was developed to be run on the [playground AI/ML platform](/platforms/gke-aiml/playground/README.md). If you are using a different environment the scripts and manifest will need to be modified for that environment. -- Multimodal embedding model has been deployed as per instructions in the - embedding models folder (../embedding-models/README.md) +- [RAG: Multimodal embedding model](/use-cases/rag-pipeline/embedding-models/multimodal-embedding/README.md) + has been deployed as per the instructions. ## Preparation -- Clone the repository +- Clone the repository. - ```sh + ```shell git clone https://github.com/GoogleCloudPlatform/accelerated-platforms && \ cd accelerated-platforms ``` -- Change directory to the guide directory +- Change directory to the guide directory. - ```sh + ```shell cd use-cases/rag-pipeline/alloy-db-setup ``` -- Ensure that your `MLP_ENVIRONMENT_FILE` is configured +- Ensure that your `MLP_ENVIRONMENT_FILE` is configured. - ```sh + ```shell cat ${MLP_ENVIRONMENT_FILE} && \ set -o allexport && \ source ${MLP_ENVIRONMENT_FILE} && \ @@ -51,10 +41,13 @@ Cloud Storage buckets. > You should see the various variables populated with the information specific > to your environment. -- Get credentials for the GKE cluster +- Get credentials for the GKE cluster. - ```sh - gcloud container fleet memberships get-credentials ${MLP_CLUSTER_NAME} --project ${MLP_PROJECT_ID} + ```shell + gcloud container clusters get-credentials ${MLP_CLUSTER_NAME} \ + --dns-endpoint \ + --project=${MLP_PROJECT_ID} \ + --region=${MLP_REGION} ``` ## Build the container image @@ -62,7 +55,7 @@ Cloud Storage buckets. 
- Build the container image using Cloud Build and push the image to Artifact Registry - ```sh + ```shell cd src git restore cloudbuild.yaml sed -i -e "s|^serviceAccount:.*|serviceAccount: projects/${MLP_PROJECT_ID}/serviceAccounts/${MLP_BUILD_GSA}|" cloudbuild.yaml @@ -79,22 +72,9 @@ Cloud Storage buckets. ## Run the job -**Steps to produce the `MASTER_CATALOG_FILE_NAME` need to be included -somewhere** - -- Temporary steps to populate required data - - ``` - gcloud storage cp gs://temporary-rag-data/master_product_catalog.csv . && \ - sed -i s"//${MLP_DATA_BUCKET}/g" master_product_catalog.csv && \ - gcloud storage cp master_product_catalog.csv gs://${MLP_DATA_BUCKET}/ && \ - rm -f master_product_catalog.csv && \ - gcloud storage rsync gs://temporary-rag-data/flipkart_images gs://${MLP_DATA_BUCKET}/flipkart_images/ - ``` - - Configure the job - ```sh + ```shell set -o nounset export CATALOG_DB_NAME="product_catalog" export CATALOG_TABLE_NAME="clothes" @@ -107,12 +87,15 @@ somewhere** export EMBEDDING_ENDPOINT_IMAGE="http://multimodal-embedding-model.ml-team:80/image_embeddings" export EMBEDDING_ENDPOINT_MULTIMODAL="http://multimodal-embedding-model.ml-team:80/multimodal_embeddings" export EMBEDDING_ENDPOINT_TEXT="http://multimodal-embedding-model.ml-team:80/text_embeddings" - export MASTER_CATALOG_FILE_NAME="master_product_catalog.csv" + export MASTER_CATALOG_FILE_NAME="RAG/master_product_catalog.csv" export NUM_LEAVES_VALUE="300" set +o nounset ``` - ```sh + > Ensure there are no `bash: unbound variable` error + > messages. + + ```shell git restore manifests/job-initialize-database.yaml manifests/job-populate-table.yaml envsubst < manifests/job-initialize-database.yaml | sponge manifests/job-initialize-database.yaml envsubst < manifests/job-populate-table.yaml | sponge manifests/job-populate-table.yaml @@ -120,7 +103,7 @@ somewhere** - Create the initialize database job. 
- ``` + ```shell kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} apply -f manifests/job-initialize-database.yaml ``` @@ -128,22 +111,27 @@ somewhere** - Watch the job until it is complete. - ``` + ```shell watch --color --interval 5 --no-title \ "kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} get job/initialize-database | GREP_COLORS='mt=01;92' egrep --color=always -e '^' -e 'Complete' echo '\nLogs(last 10 lines):' kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} logs job/initialize-database --tail 10" ``` + ``` + NAME STATUS COMPLETIONS DURATION AGE + initialize-database Complete 1/1 XXXXX XXXXX + ``` + - Check logs for any errors. - ``` + ```shell kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} logs job/initialize-database ``` - Create the populate table job. - ``` + ```shell kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} apply -f manifests/job-populate-table.yaml ``` @@ -151,15 +139,20 @@ somewhere** - Watch the job until it is complete. - ``` + ```shell watch --color --interval 5 --no-title \ "kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} get job/populate-table | GREP_COLORS='mt=01;92' egrep --color=always -e '^' -e 'Complete' echo '\nLogs(last 10 lines):' kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} logs job/populate-table --tail 10" ``` + ``` + NAME STATUS COMPLETIONS DURATION AGE + populate-table Complete 1/1 XXXXX XXXXX + ``` + - Check logs for any errors. - ``` + ```shell kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} logs job/populate-table ``` diff --git a/use-cases/rag-pipeline/backend/README.md b/use-cases/rag-pipeline/backend/README.md index e03b4eff..d7af432a 100644 --- a/use-cases/rag-pipeline/backend/README.md +++ b/use-cases/rag-pipeline/backend/README.md @@ -1,4 +1,4 @@ -# Backend application deployment +# RAG: Backend deployment ## Prerequisites @@ -9,41 +9,46 @@ ## Preparation -- Clone the repository +- Clone the repository. 
- ```sh + ```shell git clone https://github.com/GoogleCloudPlatform/accelerated-platforms && \ cd accelerated-platforms ``` -- Change directory to the guide directory +- Change directory to the guide directory. - ```sh + ```shell cd use-cases/rag-pipeline/backend ``` -- Ensure that your `MLP_ENVIRONMENT_FILE` is configured +- Ensure that your `MLP_ENVIRONMENT_FILE` is configured. - ```sh + ```shell cat ${MLP_ENVIRONMENT_FILE} && \ - source ${MLP_ENVIRONMENT_FILE} + set -o allexport && \ + source ${MLP_ENVIRONMENT_FILE} && \ + set +o allexport ``` > You should see the various variables populated with the information specific > to your environment. -- Get credentials for the GKE cluster +- Get credentials for the GKE cluster. - ```sh - gcloud container fleet memberships get-credentials ${MLP_CLUSTER_NAME} --project ${MLP_PROJECT_ID} + ```shell + gcloud container clusters get-credentials ${MLP_CLUSTER_NAME} \ + --dns-endpoint \ + --project=${MLP_PROJECT_ID} \ + --region=${MLP_REGION} ``` ## Build the container image - Build the container image using Cloud Build and push the image to Artifact - Registry + Registry. - ```sh + ```shell cd src git restore cloudbuild.yaml sed -i -e "s|^serviceAccount:.*|serviceAccount: projects/${MLP_PROJECT_ID}/serviceAccounts/${MLP_BUILD_GSA}|" cloudbuild.yaml @@ -55,67 +60,67 @@ cd - ``` -## Deploy the backend application +## Deploy the backend - Configure the deployment. 
- ```sh + ```shell + set -o nounset export CATALOG_DB="product_catalog" export CATALOG_TABLE_NAME="clothes" - export TEXT_EMBEDDING_ENDPOINT="http://multimodal-embedding-model.ml-team:80/text_embeddings" - export IMAGE_EMBEDDING_ENDPOINT="http://multimodal-embedding-model.ml-team:80/image_embeddings" - export MULTIMODAL_EMBEDDING_ENDPOINT="http://multimodal-embedding-model.ml-team:80/multimodal_embeddings" - export GEMMA_IT_ENDPOINT="http://rag-it-model.ml-team:8000/v1/chat/completions" - export EMBEDDING_COLUMN_TEXT="text_embeddings" + export CONTAINER_IMAGE_URL="${MLP_RAG_BACKEND_IMAGE}" + export DB_INSTANCE_URI="${MLP_DB_INSTANCE_URI}" export EMBEDDING_COLUMN_IMAGE="image_embeddings" export EMBEDDING_COLUMN_MULTIMODAL="multimodal_embeddings" - export ROW_COUNT="\"5\"" + export EMBEDDING_COLUMN_TEXT="text_embeddings" + export EMBEDDING_ENDPOINT_IMAGE="http://multimodal-embedding-model.ml-team:80/image_embeddings" + export EMBEDDING_ENDPOINT_MULTIMODAL="http://multimodal-embedding-model.ml-team:80/multimodal_embeddings" + export EMBEDDING_ENDPOINT_TEXT="http://multimodal-embedding-model.ml-team:80/text_embeddings" + export GEMMA_IT_ENDPOINT="http://rag-it-model.ml-team:8000/v1/chat/completions" + export KUBERNETES_SERVICE_ACCOUNT="${MLP_DB_USER_KSA}" + + export ROW_COUNT="5" + set +o nounset ``` - ```sh + > Ensure there are no `bash: unbound variable` error + > messages. 
+ + ```shell git restore manifests/deployment.yaml - sed \ - -i -e "s|V_CATALOG_DB|${CATALOG_DB}|" \ - -i -e "s|V_CATALOG_TABLE_NAME|${CATALOG_TABLE_NAME}|" \ - -i -e "s|V_EMBEDDING_COLUMN_IMAGE|${EMBEDDING_COLUMN_IMAGE}|" \ - -i -e "s|V_EMBEDDING_COLUMN_MULTIMODAL|${EMBEDDING_COLUMN_MULTIMODAL}|" \ - -i -e "s|V_EMBEDDING_COLUMN_TEXT|${EMBEDDING_COLUMN_TEXT}|" \ - -i -e "s|V_EMBEDDING_ENDPOINT_IMAGE|${IMAGE_EMBEDDING_ENDPOINT}|" \ - -i -e "s|V_EMBEDDING_ENDPOINT_MULTIMODAL|${MULTIMODAL_EMBEDDING_ENDPOINT}|" \ - -i -e "s|V_EMBEDDING_ENDPOINT_TEXT|${TEXT_EMBEDDING_ENDPOINT}|" \ - -i -e "s|V_GEMMA_IT_ENDPOINT|${GEMMA_IT_ENDPOINT}|" \ - -i -e "s|V_IMAGE|${MLP_RAG_BACKEND_IMAGE}|" \ - -i -e "s|V_MLP_DB_ADMIN_IAM|${MLP_DB_ADMIN_IAM}|" \ - -i -e "s|V_MLP_DB_USER_KSA|${MLP_DB_USER_KSA}|" \ - -i -e "s|V_MLP_DB_INSTANCE_URI|${MLP_DB_INSTANCE_URI}|" \ - -i -e "s|V_MLP_KUBERNETES_NAMESPACE|${MLP_KUBERNETES_NAMESPACE}|" \ - -i -e "s|V_PROJECT_ID|${MLP_PROJECT_ID}|" \ - -i -e "s|V_ROW_COUNT|${ROW_COUNT}|" \ - manifests/deployment.yaml + envsubst < manifests/deployment.yaml | sponge manifests/deployment.yaml ``` - Create the deployment. - ```sh + ```shell kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} apply -f manifests/deployment.yaml ``` -## Test the backend application +- Watch the deployment until it is ready and available. -Validations: + ```shell + watch --color --interval 5 --no-title \ + "kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} get deployment/rag-backend | GREP_COLORS='mt=01;92' egrep --color=always -e '^' -e '1/1 1 1' + echo '\nLogs(last 10 lines):' + kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} logs deployment/rag-backend --tail 10" + ``` -```sh -kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} get pods -l app=rag-backend -``` + ``` + NAME READY UP-TO-DATE AVAILABLE AGE + rag-backend 1/1 1 1 XXXXX + ``` + +## Verify the backend -```sh -kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} get service/rag-backend -``` +- Create the curl job. 
-```sh -kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} apply -f manifests/curl.yaml -``` + ```shell + kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} apply -f manifests/curl.yaml + ``` + +- Get the logs for the curl job. -```sh -kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} logs job/rag-backend-curl -``` + ```shell + kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} logs --follow job/rag-backend-curl + ``` diff --git a/use-cases/rag-pipeline/backend/manifests/deployment.yaml b/use-cases/rag-pipeline/backend/manifests/deployment.yaml index 3aaa633d..0a183335 100644 --- a/use-cases/rag-pipeline/backend/manifests/deployment.yaml +++ b/use-cases/rag-pipeline/backend/manifests/deployment.yaml @@ -26,38 +26,36 @@ spec: labels: app: rag-backend spec: - serviceAccountName: V_MLP_DB_USER_KSA + serviceAccountName: ${KUBERNETES_SERVICE_ACCOUNT} containers: - name: rag - image: V_IMAGE + image: ${CONTAINER_IMAGE_URL} imagePullPolicy: Always env: - name: CATALOG_DB - value: V_CATALOG_DB + value: "${CATALOG_DB}" - name: CATALOG_TABLE_NAME - value: V_CATALOG_TABLE_NAME - - name: MLP_DB_ADMIN_IAM - value: V_MLP_DB_ADMIN_IAM + value: "${CATALOG_TABLE_NAME}" - name: MLP_DB_INSTANCE_URI - value: V_MLP_DB_INSTANCE_URI + value: "${DB_INSTANCE_URI}" - name: GEMMA_IT_ENDPOINT - value: V_GEMMA_IT_ENDPOINT + value: "${GEMMA_IT_ENDPOINT}" - name: MLP_KUBERNETES_NAMESPACE - value: V_MLP_KUBERNETES_NAMESPACE + value: "${MLP_KUBERNETES_NAMESPACE}" - name: TEXT_EMBEDDING_ENDPOINT - value: V_EMBEDDING_ENDPOINT_TEXT + value: "${EMBEDDING_ENDPOINT_TEXT}" - name: IMAGE_EMBEDDING_ENDPOINT - value: V_EMBEDDING_ENDPOINT_IMAGE + value: "${EMBEDDING_ENDPOINT_IMAGE}" - name: MULTIMODAL_EMBEDDING_ENDPOINT - value: V_EMBEDDING_ENDPOINT_MULTIMODAL + value: "${EMBEDDING_ENDPOINT_MULTIMODAL}" - name: EMBEDDING_COLUMN_TEXT - value: V_EMBEDDING_COLUMN_TEXT + value: "${EMBEDDING_COLUMN_TEXT}" - name: EMBEDDING_COLUMN_IMAGE - value: V_EMBEDDING_COLUMN_IMAGE + value: "${EMBEDDING_COLUMN_IMAGE}" - name: 
EMBEDDING_COLUMN_MULTIMODAL - value: V_EMBEDDING_COLUMN_MULTIMODAL + value: "${EMBEDDING_COLUMN_MULTIMODAL}" - name: ROW_COUNT - value: V_ROW_COUNT + value: "${ROW_COUNT}" resources: requests: cpu: "2" diff --git a/use-cases/rag-pipeline/data-preprocessing/README.md b/use-cases/rag-pipeline/data-preprocessing/README.md index 451938a4..6832e526 100644 --- a/use-cases/rag-pipeline/data-preprocessing/README.md +++ b/use-cases/rag-pipeline/data-preprocessing/README.md @@ -1,4 +1,4 @@ -# Data Preprocessing for RAG +# RAG: Data preprocessing ## Dataset @@ -36,66 +36,81 @@ The data preprocessing step takes approximately 18-20 minutes. ## Preparation -- Clone the repository +- Clone the repository. ```shell git clone https://github.com/GoogleCloudPlatform/accelerated-platforms && \ cd accelerated-platforms ``` -- Change directory to the guide directory +- Change directory to the guide directory. ```shell cd use-cases/rag-pipeline/data-preprocessing ``` -- Ensure that your `MLP_ENVIRONMENT_FILE` is configured +- Ensure that your `MLP_ENVIRONMENT_FILE` is configured. ```shell cat ${MLP_ENVIRONMENT_FILE} && \ - source ${MLP_ENVIRONMENT_FILE} + set -o allexport && \ + source ${MLP_ENVIRONMENT_FILE} && \ + set +o allexport ``` > You should see the various variables populated with the information specific > to your environment. +- Get credentials for the GKE cluster. + + ```shell + gcloud container clusters get-credentials ${MLP_CLUSTER_NAME} \ + --dns-endpoint \ + --project=${MLP_PROJECT_ID} \ + --region=${MLP_REGION} + ``` + ## Build the container image - Build container image using Cloud Build and push the image to Artifact - Registry + Registry. ```shell cd src + rm -rf datapreprocessing cp -r ${MLP_BASE_DIR}/modules/python/src/datapreprocessing . 
+ git restore cloudbuild.yaml sed -i -e "s|^serviceAccount:.*|serviceAccount: projects/${MLP_PROJECT_ID}/serviceAccounts/${MLP_BUILD_GSA}|" cloudbuild.yaml gcloud beta builds submit \ --config cloudbuild.yaml \ --gcs-source-staging-dir gs://${MLP_CLOUDBUILD_BUCKET}/source \ --project ${MLP_PROJECT_ID} \ --substitutions _DESTINATION=${MLP_RAG_DATA_PROCESSING_IMAGE} - rm -rf datapreprocessing cd .. ``` ## Run the job -- Get credentials for the GKE cluster +- Configure the job. ```shell - gcloud container fleet memberships get-credentials ${MLP_CLUSTER_NAME} --project ${MLP_PROJECT_ID} + set -o nounset + export CONTAINER_IMAGE_URL="${MLP_RAG_DATA_PROCESSING_IMAGE}" + export DATA_BUCKET="${MLP_DATA_BUCKET}" + export KUBERNETES_SERVICE_ACCOUNT="${MLP_RAG_DATA_PROCESSING_KSA}" + export RAY_CLUSTER_HOST="ray-cluster-kuberay-head-svc.ml-team:10001" + set +o nounset ``` -- Configure the job + > Ensure there are no `bash: unbound variable` error + > messages. ```shell - sed \ - -i -e "s|V_DATA_BUCKET|${MLP_DATA_BUCKET}|" \ - -i -e "s|V_IMAGE_URL|${MLP_RAG_DATA_PROCESSING_IMAGE}|" \ - -i -e "s|V_KSA|${MLP_RAG_DATA_PROCESSING_KSA}|" \ - manifests/job.yaml + git restore manifests/job.yaml + envsubst < manifests/job.yaml | sponge manifests/job.yaml ``` -- Create the job +- Create the job. 
```shell kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} apply -f manifests/job.yaml diff --git a/use-cases/rag-pipeline/data-preprocessing/manifests/job.yaml b/use-cases/rag-pipeline/data-preprocessing/manifests/job.yaml index 7982bc1a..d6ecc234 100644 --- a/use-cases/rag-pipeline/data-preprocessing/manifests/job.yaml +++ b/use-cases/rag-pipeline/data-preprocessing/manifests/job.yaml @@ -26,13 +26,13 @@ spec: spec: containers: - name: job - image: V_IMAGE_URL + image: ${CONTAINER_IMAGE_URL} imagePullPolicy: Always env: - name: "PROCESSING_BUCKET" - value: V_DATA_BUCKET + value: "${DATA_BUCKET}" - name: "RAY_CLUSTER_HOST" - value: ray-cluster-kuberay-head-svc.ml-team:10001 + value: "${RAY_CLUSTER_HOST}" resources: requests: cpu: 100m @@ -43,7 +43,7 @@ spec: nodeSelector: resource-type: cpu restartPolicy: Never - serviceAccountName: V_KSA + serviceAccountName: ${KUBERNETES_SERVICE_ACCOUNT} tolerations: - effect: NoSchedule key: on-demand diff --git a/use-cases/rag-pipeline/embedding-models/multimodal-embedding/README.md b/use-cases/rag-pipeline/embedding-models/multimodal-embedding/README.md index 99c747b1..8284a6ac 100644 --- a/use-cases/rag-pipeline/embedding-models/multimodal-embedding/README.md +++ b/use-cases/rag-pipeline/embedding-models/multimodal-embedding/README.md @@ -1,4 +1,4 @@ -# Multimodal blip2 model +# RAG: Multimodal embedding model To know more about the embedding model see original [blog](https://blog.salesforceairesearch.com/blip-2/) and @@ -13,41 +13,46 @@ To know more about the embedding model see original ## Preparation -- Clone the repository +- Clone the repository. - ```sh + ```shell git clone https://github.com/GoogleCloudPlatform/accelerated-platforms && \ cd accelerated-platforms ``` -- Change directory to the guide directory +- Change directory to the guide directory. 
- ```sh + ```shell cd use-cases/rag-pipeline/embedding-models/multimodal-embedding ``` -- Ensure that your `MLP_ENVIRONMENT_FILE` is configured +- Ensure that your `MLP_ENVIRONMENT_FILE` is configured. - ```sh + ```shell cat ${MLP_ENVIRONMENT_FILE} && \ - source ${MLP_ENVIRONMENT_FILE} + set -o allexport && \ + source ${MLP_ENVIRONMENT_FILE} && \ + set +o allexport ``` > You should see the various variables populated with the information specific > to your environment. -- Get credentials for the GKE cluster +- Get credentials for the GKE cluster. - ```sh - gcloud container fleet memberships get-credentials ${MLP_CLUSTER_NAME} --project ${MLP_PROJECT_ID} + ```shell + gcloud container clusters get-credentials ${MLP_CLUSTER_NAME} \ + --dns-endpoint \ + --project=${MLP_PROJECT_ID} \ + --region=${MLP_REGION} ``` ## Build the container image - Build the container image using Cloud Build and push the image to Artifact - Registry + Registry. - ```sh + ```shell cd src git restore cloudbuild.yaml sed -i -e "s|^serviceAccount:.*|serviceAccount: projects/${MLP_PROJECT_ID}/serviceAccounts/${MLP_BUILD_GSA}|" cloudbuild.yaml @@ -59,67 +64,73 @@ To know more about the embedding model see original cd - ``` -## Deploy the embedding model +## Deploy the model -- Configure the deployment +- Configure the deployment. - ```sh - git restore manifests/deployment.yaml - sed \ - -i -e "s|V_IMAGE|${MLP_MULTIMODAL_EMBEDDING_IMAGE}|" \ - -i -e "s|V_KSA|${MLP_DB_USER_KSA}|" \ - manifests/deployment.yaml + ```shell + set -o nounset + export CONTAINER_IMAGE_URL="${MLP_MULTIMODAL_EMBEDDING_IMAGE}" + export KUBERNETES_SERVICE_ACCOUNT="${MLP_DB_USER_KSA}" + set +o nounset ``` -- Create the deployment + > Ensure there are no `bash: unbound variable` error + > messages. 
- ```sh - kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} apply -f manifests/deployment.yaml + ```shell + git restore manifests/deployment.yaml + envsubst < manifests/deployment.yaml | sponge manifests/deployment.yaml ``` -## Validate the embedding model deployment - -```sh -kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} get pods -l app=multimodal-embedding-model -``` +- Create the deployment. -```sh -kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} get service/multimodal-embedding-model -``` - -## Verify the embedding model + ```shell + kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} apply -f manifests/deployment.yaml + ``` -**This assumes that the `flipkart_images` are already present in the -`MLP_DATA_BUCKET`. If they are not, run the following command** +- Watch the deployment until it is ready and available. -- Temporary steps to populate required data + ```shell + watch --color --interval 5 --no-title \ + "kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} get deployment/multimodal-embedding-model | GREP_COLORS='mt=01;92' egrep --color=always -e '^' -e '1/1 1 1' + echo '\nLogs(last 10 lines):' + kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} logs deployment/multimodal-embedding-model --tail 10" + ``` ``` - gcloud storage rsync gs://temporary-rag-data/flipkart_images gs://${MLP_DATA_BUCKET}/flipkart_images/ + NAME READY UP-TO-DATE AVAILABLE AGE + multimodal-embedding-model 1/1 1 1 XXXXX ``` +## Verify the model + - Configure the curl job. - ``` + ```shell export IMAGE_URI="$(gcloud storage ls gs://${MLP_DATA_BUCKET}/flipkart_images | head -1)" echo ${IMAGE_URI} ``` - ``` + ```shell git restore manifests/curl.yaml - sed \ - -i -e "s|V_IMAGE_URI|${IMAGE_URI}|" \ - manifests/curl.yaml + envsubst < manifests/curl.yaml | sponge manifests/curl.yaml ``` - Create the curl job. - ```sh + ```shell kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} apply -f manifests/curl.yaml ``` - Get the logs for the curl job. 
+ ```shell + kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} logs --follow job/multimodal-curl + ``` + + The output should be similar to: + ``` - kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} logs job/multimodal-curl + {"multimodal_embeds":[0.5924106240272522,-0.20345857739448547,-0.14882200956344604,0.5351353883743286,0.07293718308210373,-0.15884630382061005,-0.8434014320373535,0.23294861614704132,0.1901591569185257,-0.2626086473464966,0.1809360533952713,0.6944392919540405,-0.6756398677825928,-0.4964878559112549,0.1505148708820343,0.38322848081588745,-0.16832123696804047,-1.578013300895691,-0.4120802581310272,0.7676566243171692,-0.043128449469804764,-1.2867095470428467,0.1959574967622757,0.19568923115730286,-0.3762733042240143,-0.2227010875940323,0.0216318741440773,1.1506505012512207,-0.3325354754924774,0.996836245059967,0.008820452727377415,-1.2283766269683838,1.116440773010254,0.5558998584747314,-1.233245849609375,-0.03321460634469986,-0.08853068202733994,0.36936497688293457,0.30411338806152344,-1.0682016611099243,0.685946524143219,0.46431964635849,-0.06503061205148697,-0.6995241045951843,-0.8500242829322815,-0.008173193782567978,0.04908387362957001,-0.17223168909549713,0.6433345675468445,-0.7146121263504028,-0.847214937210083,-0.46910935640335083,-0.057915765792131424,0.9733234643936157,-0.5221759676933289,0.20472489297389984,0.28791186213493347,0.5766440033912659,0.1494569033384323,0.075665183365345,-0.8721647262573242,0.3098238706588745,0.9528746604919434,-0.6286458373069763,-0.6070472598075867,-0.31975263357162476,0.5499343276023865,-0.690697193145752,-0.6751032471656799,0.017426932230591774,0.47378459572792053,1.2284945249557495,-0.10540176182985306,0.33025944232940674,0.03466665372252464,-0.2689518630504608,-0.08906934410333633,-0.39901867508888245,0.784071147441864,1.9048402309417725,0.22897273302078247,-1.060357689857483,-0.4512109160423279,0.16696316003799438,-0.2122073620557785,-0.6392346620559692,0.5564654469490051,0.2948237657546997,0.34229978919029
236,-0.3559795916080475,0.17897234857082367,-0.3155965209007263,0.06322792917490005,0.6553076505661011,-0.4236254394054413,0.2814951539039612,-0.10399337857961655,-1.0849717855453491,-0.46524766087532043,0.5265820622444153,-0.039711661636829376,-0.3791617453098297,-0.8323513865470886,0.44320356845855713,-0.11838524043560028,-0.9316837191581726,0.24924667179584503,0.41757479310035706,0.038102470338344574,-1.9340052604675293,-0.025717858225107193,0.3303055167198181,1.2454990148544312,-0.1246604323387146,0.038160841912031174,-0.051868800073862076,0.8113735318183899,0.6895425319671631,0.574571430683136,0.6685034036636353,0.05573762208223343,0.9255302548408508,-0.04630526527762413,-0.6265912055969238,0.09264900535345078,1.1382992267608643,-0.12079610675573349,0.6352153420448303,-0.6007900834083557,-2.0102756023406982,-0.3941310942173004,-0.29737618565559387,0.38305985927581787,1.2179173231124878,0.3158678114414215,0.3176381289958954,-0.5689866542816162,0.415942907333374,0.6107836365699768,-0.8466510772705078,0.6475733518600464,0.2757197618484497,-1.7423759698867798,0.3180757462978363,-0.6129754185676575,0.18928393721580505,-0.28512609004974365,0.9512983560562134,0.48656705021858215,0.0262975562363863,-0.3403189182281494,0.9278243780136108,0.364020437002182,0.22098484635353088,0.9116793274879456,1.1627230644226074,0.31623464822769165,-0.8246462345123291,-0.6413137316703796,-1.249909520149231,-0.8452469110488892,-0.7048303484916687,-0.20750950276851654,-0.15846426784992218,-0.313335657119751,0.40668410062789917,1.014188528060913,0.39454370737075806,0.24124768376350403,0.08503568172454834,-0.12810391187667847,-1.1661150455474854,-0.057069167494773865,0.6312134861946106,-0.05011759698390961,0.07944561541080475,0.25088274478912354,-0.7156392931938171,0.5230376124382019,0.2992391884326935,0.5024787783622742,-0.5910110473632812,-0.6746929883956909,0.1383083611726761,1.1426094770431519,0.9035974144935608,0.2560117542743683,-0.19548475742340088,0.11429132521152496,-0.074248127639
29367,0.5549275279045105,0.7408034801483154,-0.15848000347614288,-0.4925006031990051,0.3749609589576721,-0.7810171246528625,-0.8198821544647217,0.9027780294418335,-0.051281485706567764,-1.3418570756912231,-0.7416917681694031,-0.011724784038960934,0.9643922448158264,-0.9879753589630127,0.2065039575099945,1.679306983947754,-0.994052529335022,0.5123266577720642,0.6030668020248413,0.2596941292285919,-0.2549845576286316,0.25600358843803406,1.0323054790496826,-0.4192933142185211,0.006751406472176313,0.7416000366210938,0.063038170337677,0.28278666734695435,0.8160985112190247,-0.2981548309326172,0.09062497317790985,0.2493465691804886,-0.4097219705581665,1.0461561679840088,-0.48408523201942444,-0.030861463397741318,1.2159347534179688,0.4290057122707367,0.36419039964675903,-0.14557215571403503,0.2958556115627289,-1.8545191287994385,0.24612410366535187,-0.08295079320669174,-0.09110811352729797,0.604489266872406,-1.1589456796646118,0.7125701308250427,-0.02696867287158966,-0.20462752878665924,-0.3534247875213623,0.24426694214344025,-0.0363934189081192,0.29090404510498047,0.34047749638557434,-1.1260850429534912,-0.29075828194618225,-0.6621608734130859,0.16159023344516754,-0.19804151356220245,-1.2392868995666504,0.10724444687366486,0.030820529907941818,0.5301802754402161,-0.2038278430700302,-1.6347323656082153,0.4680101275444031,-0.3246447443962097,0.46105077862739563,0.2697536051273346,0.7527429461479187,1.300972580909729,-0.5727341771125793,-1.8289685249328613,2.159158229827881,-0.09365013986825943,-0.38880637288093567,0.024096481502056122,0.06783846020698547,0.8702865839004517,0.6069105267524719,-0.5236564874649048,-0.6403051018714905,-2.291926622390747,-0.9303463697433472,0.7924670577049255,0.23445598781108856,0.3862837851047516,1.5976613759994507,0.054507821798324585,-0.7957853078842163,0.037503622472286224,0.24895651638507843,0.2053094357252121,-0.3416963815689087,0.23919537663459778,0.7615337371826172,-0.3064285218715668,-0.11935160309076309,0.22598037123680115,-0.802272677
4215698,0.9771227240562439,0.19433902204036713,-0.06242872029542923,-0.10598665475845337,-0.05884666368365288,-0.8490187525749207,0.31686854362487793,-0.7071172595024109,1.248034954071045,0.38836604356765747,0.09760317206382751,0.03638481721282005,-0.5906966328620911,0.5589482188224792,-0.38652920722961426,0.0038456094916909933,-0.8829982280731201,0.40402284264564514,-0.07063112407922745,0.48047932982444763,-0.505984365940094,-0.8374018669128418,-0.3932972252368927,-0.40560054779052734,-0.4289301633834839,-0.34754669666290283,-0.8173854351043701,-0.4326341152191162,-0.35555294156074524,-0.22077728807926178,-1.195460557937622,-0.12446943670511246,-0.7700096964836121,-0.05019836500287056,-0.10403405129909515,-0.8732231259346008,0.6984255909919739,0.5540904998779297,0.17234764993190765,-0.20329773426055908,-0.763291597366333,-0.7350960373878479,-0.5024373531341553,-0.07996673136949539,-0.561328113079071,0.3697052001953125,-0.7664240598678589,-0.5422042012214661,0.3661231994628906,-1.5195907354354858,0.30998358130455017,0.6927115321159363,0.9210965633392334,1.337175965309143,0.32856330275535583,-0.13148677349090576,-0.4731431007385254,1.7093837261199951,-0.9689330458641052,-0.17429831624031067,0.24648000299930573,-0.6872245073318481,0.6300744414329529,1.0549181699752808,-0.1501505821943283,0.07626151293516159,0.014565291814506054,0.6200234889984131,-0.7497552633285522,-0.13769806921482086,0.7262248992919922,-0.5686532258987427,0.3675328493118286,-0.6681644320487976,0.34052222967147827,-0.36161768436431885,0.39699244499206543,0.12775655090808868,0.8234163522720337,-0.6766327619552612,-0.37533244490623474,-0.5156548023223877,0.5635435581207275,-0.6431009769439697,-0.9025450348854065,1.0344597101211548,0.24740907549858093,-0.16893160343170166,-0.2802108824253082,-0.21651746332645416,-0.20085358619689941,-0.29250916838645935,0.9410440921783447,0.7237409949302673,0.1627369225025177,-0.2515818476676941,-0.3576256036758423,-1.1060004234313965,-0.702904999256134,-0.331062883138
6566,0.713584840297699,1.4857008457183838,0.5749641060829163,0.6156712770462036,0.8447413444519043,0.2557753920555115,-0.06209149956703186,0.03618166968226433,-0.5274552702903748,0.9609156250953674,-0.5495775938034058,0.4336977005004883,-0.7704826593399048,-0.4095536470413208,-1.5273873805999756,1.0192725658416748,-0.2866295874118805,-0.18666572868824005,1.5948429107666016,-0.8420526385307312,-0.1537669450044632,-0.5174681544303894,0.5479308366775513,0.14274103939533234,0.49179983139038086,0.09275566786527634,-0.7460026144981384,0.8067554235458374,-0.19404305517673492,-1.097361445426941,1.5971705913543701,0.8954728841781616,-0.8676612973213196,-0.5234267115592957,-0.24115203320980072,0.14012376964092255,-0.9870575666427612,-0.7624554634094238,0.1638406664133072,0.12743917107582092,0.2848738729953766,0.27793166041374207,0.7528935074806213,-0.2460508793592453,-0.9296041131019592,0.24949029088020325,0.5737308859825134,0.36392930150032043,-0.4181497395038605,0.3214835524559021,-0.8477562069892883,0.5790620446205139,-0.07225629687309265,0.4959008991718292,0.20663633942604065,0.5882816314697266,0.6829842925071716,-0.03943353146314621,-1.014716386795044,0.4541794955730438,-0.793552577495575,0.7542141079902649,-0.6772245168685913,-1.0327435731887817,0.21520353853702545,0.3880807161331177,-0.44152843952178955,0.022519083693623543,0.28862690925598145,0.27966463565826416,-0.709011435508728,0.11298210918903351,-0.4660022556781769,-0.15788684785366058,0.5097206234931946,-0.4395550489425659,-0.1344972401857376,0.8646097779273987,-0.8858988881111145,-0.40364518761634827,-0.4181089401245117,-1.3044989109039307,0.6003195643424988,0.26611262559890747,0.01483498327434063,0.03671019524335861,-0.1783098578453064,-0.48309326171875,-0.021046839654445648,-0.24778850376605988,0.38099199533462524,0.46690458059310913,-0.7548868060112,0.2729513943195343,-0.22656816244125366,0.738362729549408,-0.4309113025665283,0.27661192417144775,0.3316297233104706,-0.10924235731363297,0.47210559248924255,-0.
1425093412399292,-0.3397757411003113,-0.5119386315345764,-0.12244696170091629,-0.5983662605285645,0.883116602897644,0.5852846503257751,0.464468389749527,-0.5343906283378601,0.44732487201690674,0.5265068411827087,-0.8963383436203003,-0.5728183388710022,0.1321166455745697,-1.2303310632705688,-0.3244307041168213,-0.6088860034942627,-0.7326467633247375,-2.2441797256469727,-1.022038459777832,-1.1503384113311768,0.5305519104003906,0.8546370267868042,0.03200090676546097,0.6918140649795532,-0.4565896689891815,0.22333194315433502,0.4838320314884186,-0.15908588469028473,-0.42833369970321655,-1.110032558441162,-0.3613256812095642,-0.26740843057632446,-0.21148055791854858,-0.1429458111524582,2.030487537384033,-0.6748891472816467,0.32189834117889404,-0.07879403233528137,0.7610182762145996,-0.05038297548890114,-0.7954574823379517,-0.4545722007751465,-0.08980017155408859,-0.08362536132335663,-0.3386051058769226,0.7452195286750793,-0.8737925887107849,-1.2980282306671143,0.5953806042671204,0.170277401804924,0.31749382615089417,0.39990922808647156,-0.10998007655143738,-0.5869539380073547,-1.1650261878967285,1.1979150772094727,0.337417870759964,0.5308238863945007,0.2836620807647705,-0.1433805376291275,0.5892527103424072,0.6923660635948181,-1.340793490409851,0.3280276358127594,0.6116904020309448,0.7533330917358398,0.6744691133499146,0.29284119606018066,0.15764272212982178,-0.5961185693740845,-0.1882801651954651,0.42425429821014404,-1.1767736673355103,-0.0671968013048172,0.2050277590751648,0.7109279632568359,-0.12895965576171875,0.3474752604961395,-0.37441325187683105,0.055882904678583145,-0.6677221059799194,-0.24436338245868683,0.2832522988319397,-0.8741235733032227,0.4245906472206116,-0.5043145418167114,-0.008918159641325474,-0.4592393636703491,-0.14498953521251678,0.9676015377044678,-0.12521640956401825,0.05031321197748184,-0.048453543335199356,0.2883683443069458,-0.0019415427232161164,1.4985891580581665,-0.9938784241676331,-0.05295291170477867,0.8332653045654297,-0.01813856884837150
6,-0.20116651058197021,0.6984258890151978,0.22105801105499268,1.267309546470642,-0.3941604197025299,-0.6251749992370605,0.37921375036239624,0.1411542296409607,-0.2910098731517792,-0.8752447962760925,0.12156853079795837,-0.45161187648773193,-0.6973322033882141,-0.2819877564907074,0.26192042231559753,0.5357677936553955,-0.8252045512199402,0.2685393989086151,0.0403514988720417,-0.9001789093017578,0.5404233336448669,0.8685083985328674,-0.37962010502815247,-0.12918895483016968,-0.80665123462677,0.8366697430610657,-0.9948225617408752,0.2716467082500458,-1.4940584897994995,1.0096755027770996,0.7936453223228455,-0.37303614616394043,0.25767695903778076,0.6856227517127991,-0.9516211152076721,1.2921777963638306,0.1479228287935257,0.2804521918296814,0.7513853907585144,0.7206048965454102,-0.6532924175262451,1.1256612539291382,0.48770585656166077,-0.6284250020980835,0.18604932725429535,0.5757274627685547,0.031435687094926834,1.29680597782135,0.8489174842834473,-0.23384420573711395,-0.2413959950208664,0.4542517066001892,-0.02811950072646141,0.9807258248329163,-0.39759013056755066,0.05556461960077286,0.06736364215612411,0.4593423306941986,0.03177834302186966,0.39649659395217896,-0.5091902017593384,0.25192639231681824,-0.8313735127449036,0.4047306478023529,0.5473387837409973,-0.0973253846168518,-0.9670475721359253,-0.6689167618751526,0.4568917155265808,-0.8754829168319702,0.7289531230926514,-0.22822381556034088,0.05914885923266411,-0.7385119795799255,-2.0555343627929688,-0.20771734416484833,-0.02948600798845291,0.4359463155269623,0.5658693909645081,-0.37710848450660706,0.1891058087348938,0.5102226734161377,0.9142122268676758,-0.1439926028251648,-0.7270087003707886,0.370591938495636,0.016804920509457588,-0.8963223099708557,-0.11248458921909332,0.8208262324333191,-0.34005871415138245,0.09465346485376358,-0.29617178440093994,0.17824341356754303,0.2790500223636627,0.06746654212474823,0.21420317888259888,-0.934598445892334,-0.3456944227218628,-0.2549537420272827,0.7724777460098267,-0.879
3022036552429,0.03459929674863815,0.4236677587032318,-0.2653326690196991,0.04149880260229111,-0.1393849104642868,-0.08927671611309052,0.9172636866569519,0.3412185311317444,-0.3941933214664459,0.9213281869888306,0.34835806488990784,0.005079601425677538,1.260278582572937,-0.3396773934364319,0.8979130983352661,-0.5272532105445862,0.4656248092651367,0.32662105560302734,-0.7254701256752014,0.4663284718990326,0.4497484564781189,0.021822882816195488,-1.0754454135894775,-0.1687462329864502,-0.6356937885284424,0.2550338804721832,0.3024436831474304,-0.5981853604316711,-0.18913383781909943,0.1669837385416031,-0.3048868775367737,0.831307590007782,-0.5515435338020325,0.038372695446014404,0.09054838120937347,0.10913554579019547,0.2590837776660919,-0.6824265718460083,0.2753410339355469,-0.32280510663986206,0.8028574585914612,0.3356193006038666,0.20385333895683289,-1.229520320892334,-1.5652217864990234,-0.13384152948856354,0.38042017817497253,-0.020127426832914352,-0.08777672052383423,0.6866685152053833,-0.6626774668693542,-0.13147133588790894,0.171450674533844,0.35092267394065857,0.6960228681564331,-0.26311707496643066,-0.5996111631393433,-3.0251731872558594,-0.5291815996170044,-0.5137613415718079,0.36210379004478455,-0.4394923448562622,-0.25105997920036316,-0.6869190335273743,0.23686374723911285,1.123238444328308,-1.5054799318313599,1.4507451057434082,0.049005743116140366,0.41741254925727844,0.0512530691921711,-0.3210649788379669,0.05774897336959839]} ``` diff --git a/use-cases/rag-pipeline/embedding-models/multimodal-embedding/manifests/curl.yaml b/use-cases/rag-pipeline/embedding-models/multimodal-embedding/manifests/curl.yaml index 25d4d3c4..540a73d9 100644 --- a/use-cases/rag-pipeline/embedding-models/multimodal-embedding/manifests/curl.yaml +++ b/use-cases/rag-pipeline/embedding-models/multimodal-embedding/manifests/curl.yaml @@ -45,4 +45,4 @@ spec: imagePullPolicy: IfNotPresent env: - name: IMAGE_URI - value: V_IMAGE_URI + value: "${IMAGE_URI}" diff --git 
a/use-cases/rag-pipeline/embedding-models/multimodal-embedding/manifests/deployment.yaml b/use-cases/rag-pipeline/embedding-models/multimodal-embedding/manifests/deployment.yaml index cbcd6dc3..a6bd61c8 100644 --- a/use-cases/rag-pipeline/embedding-models/multimodal-embedding/manifests/deployment.yaml +++ b/use-cases/rag-pipeline/embedding-models/multimodal-embedding/manifests/deployment.yaml @@ -26,12 +26,12 @@ spec: labels: app: multimodal-embedding-model spec: - serviceAccountName: V_KSA + serviceAccountName: ${KUBERNETES_SERVICE_ACCOUNT} containers: - env: - name: "PORT" value: "5000" - image: V_IMAGE + image: ${CONTAINER_IMAGE_URL} imagePullPolicy: Always name: multimodal-embedding-model resources: diff --git a/use-cases/rag-pipeline/instruction-tuned-model/README.md b/use-cases/rag-pipeline/instruction-tuned-model/README.md index 82e2ff2f..a091ac28 100644 --- a/use-cases/rag-pipeline/instruction-tuned-model/README.md +++ b/use-cases/rag-pipeline/instruction-tuned-model/README.md @@ -1,4 +1,4 @@ -# Steps to deploy instruction tuned model +# RAG: Instruction tuned model ## Prerequisites @@ -9,54 +9,59 @@ ## Preparation -- Clone the repository +- Clone the repository. - ```sh + ```shell git clone https://github.com/GoogleCloudPlatform/accelerated-platforms && \ cd accelerated-platforms ``` -- Change directory to the guide directory +- Change directory to the guide directory. - ```sh + ```shell cd use-cases/rag-pipeline/instruction-tuned-model ``` -- Ensure that your `MLP_ENVIRONMENT_FILE` is configured +- Ensure that your `MLP_ENVIRONMENT_FILE` is configured. - ```sh + ```shell cat ${MLP_ENVIRONMENT_FILE} && \ - source ${MLP_ENVIRONMENT_FILE} + set -o allexport && \ + source ${MLP_ENVIRONMENT_FILE} && \ + set +o allexport ``` > You should see the various variables populated with the information specific > to your environment. -- Get credentials for the GKE cluster +- Get credentials for the GKE cluster. 
- ```sh - gcloud container fleet memberships get-credentials ${MLP_CLUSTER_NAME} --project ${MLP_PROJECT_ID} + ```shell + gcloud container clusters get-credentials ${MLP_CLUSTER_NAME} \ + --dns-endpoint \ + --project=${MLP_PROJECT_ID} \ + --region=${MLP_REGION} ``` ### HuggingFace access token - Set `HF_TOKEN` to your HuggingFace access token. Go to - https://huggingface.co/settings/tokens , click `Create new token` , provide a + https://huggingface.co/settings/tokens , click `Create new token`, provide a token name, select `Read` in token type and click `Create token`. - ``` + ```shell HF_TOKEN= ``` - Create a Kubernetes secret with your HuggingFace token. - ```sh + ```shell kubectl create secret generic hf-secret \ --from-literal=hf_api_token=${HF_TOKEN} \ --dry-run=client -o yaml | kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} apply -f - ``` -## Deploy model +## Deploy the model - Create the deployment. @@ -64,24 +69,36 @@ kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} apply -f manifests/it-model-deployment.yaml ``` -- Wait for the deployment to be ready. +- Watch the deployment until it is ready and available. + + ```shell + watch --color --interval 5 --no-title \ + "kubectl --namespace ${MLP_MODEL_OPS_NAMESPACE} get deployment/rag-it-model | GREP_COLORS='mt=01;92' egrep --color=always -e '^' -e '1/1 1 1' + echo '\nLogs(last 10 lines):' + kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} logs deployment/rag-it-model --tail 10" + ``` ``` - kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} wait --for=condition=ready --timeout=900s pod -l app=rag-it-model + NAME READY UP-TO-DATE AVAILABLE AGE + rag-it-model 1/1 1 1 XXXXX ``` - When they deployment is ready you should see output similar to: +## Verify the model + +- Create the curl job. - ```output - pod/rag-it-model-XXXXXXXXX-XXXXX condition met + ```shell + kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} apply -f manifests/curl.yaml ``` -- Verify the deployment with the curl job. +- Get the logs for the curl job. 
+ ```shell + kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} logs --follow job/it-curl ``` - kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} apply -f manifests/curl.yaml - ``` + + The output should be similar to: ``` - kubectl --namespace ${MLP_KUBERNETES_NAMESPACE} logs job/it-curl + {"id":"chat-XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX","object":"chat.completion","created":##########,"model":"google/gemma-2-2b-it","choices":[{"index":0,"message":{"role":"assistant","content":"You're in luck! There are tons of great cycling shorts for women out there. To help you find the perfect pair, I need a little more information about your needs and preferences. \n\n**Tell me about:**\n\n* **Your budget:** Are you looking for something affordable, mid-range, or high-end?\n* **Your riding style:** What type of cycling do you do? Road, mountain, gravel, triathlon, etc.?\n* **Your comfort priorities:** Do you prioritize chamois comfort, breathability, moisture-wicking, or something else?\n* **Your fit preferences:** Do you prefer a snug fit, a looser fit, or something in between?\n* **Your style preferences:** Do you want a specific color, pattern, or design?\n\n**Here are some popular brands known for their women's cycling shorts:**\n\n* **Specialized:** Known for their high-performance, technical shorts with excellent chamois.\n* **Pearl Izumi:** Offers a wide range of styles and features, including their popular \"Pro\" line.\n* **Castelli:** Italian brand known for their stylish and comfortable shorts.\n* **Rapha:** High-end brand with a focus on performance and style.\n* **","tool_calls":[]},"logprobs":null,"finish_reason":"length","stop_reason":null}],"usage":{"prompt_tokens":25,"total_tokens":281,"completion_tokens":256},"prompt_logprobs":null} ```
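The preparation steps in these guides source `${MLP_ENVIRONMENT_FILE}` between `set -o allexport` and `set +o allexport`. A likely reason is that `envsubst` and other child processes only see exported variables, while a plain `source` sets them in the current shell only. The following is a minimal sketch of that difference, not part of the guide itself: `DEMO_ENV_FILE` and `DEMO_NAMESPACE` are hypothetical names used for illustration, and `.` is used as the portable equivalent of `source`.

```shell
# Hypothetical throwaway file standing in for ${MLP_ENVIRONMENT_FILE}.
DEMO_ENV_FILE="$(mktemp)"
echo 'DEMO_NAMESPACE=ml-team' > "${DEMO_ENV_FILE}"

# Start from a clean state so the comparison is not affected by the caller's environment.
unset DEMO_NAMESPACE

# Plain source: the variable is set in this shell but is not exported,
# so a child process does not see it.
. "${DEMO_ENV_FILE}"
PLAIN_RESULT="$(sh -c 'echo "${DEMO_NAMESPACE:-unset}"')"

# With allexport: every assignment made while sourcing is exported,
# so child processes inherit the value.
set -o allexport
. "${DEMO_ENV_FILE}"
set +o allexport
EXPORTED_RESULT="$(sh -c 'echo "${DEMO_NAMESPACE:-unset}"')"

echo "plain source:     ${PLAIN_RESULT}"
echo "allexport source: ${EXPORTED_RESULT}"

rm -f "${DEMO_ENV_FILE}"
```

If the second line still prints `unset` in your own environment, the file was probably sourced without `allexport`, and templating the manifests with `envsubst` would likely leave the placeholders empty.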