Structured Data RAG

Example Features

This example demonstrates how to use RAG with structured CSV data.

This example uses models from the NVIDIA API Catalog. This approach does not require embedding models or vector database solutions. Instead, the example uses PandasAI to manage the workflow.

For ingestion, the query server loads the structured data from a CSV file into a Pandas dataframe. The query server can ingest multiple CSV files, provided the files have identical columns. Ingestion of CSV files with differing columns is not supported and results in an exception.
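The ingestion rule above can be sketched in plain pandas. This is an illustration only, not the query server's actual code: the `load_csvs` helper, the in-memory sample rows, and the choice to compare column names in order are all assumptions made for the example.

```python
import io

import pandas as pd


def load_csvs(sources):
    """Load CSV sources into one dataframe; raise if their columns differ.

    Hypothetical helper: the real server may detect the mismatch differently.
    """
    frames = [pd.read_csv(src) for src in sources]
    if not frames:
        raise ValueError("no CSV sources provided")
    expected = list(frames[0].columns)
    for position, frame in enumerate(frames[1:], start=2):
        found = list(frame.columns)
        if found != expected:
            raise ValueError(
                f"CSV #{position} has columns {found}, expected {expected}"
            )
    # All files share one schema, so the rows can simply be stacked.
    return pd.concat(frames, ignore_index=True)


# Made-up rows standing in for two files with identical columns.
machines_a = io.StringIO("machineID,model,age\n1,model3,18\n2,model4,7\n")
machines_b = io.StringIO("machineID,model,age\n3,model3,8\n")

combined = load_csvs([machines_a, machines_b])
print(combined.shape)  # (3, 3)
```

Passing a second source whose header differs from the first raises `ValueError` in this sketch, which mirrors the documented behavior of rejecting CSV files with differing columns.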

The core functionality uses a PandasAI agent to extract information from the dataframe. This agent combines the query with the structure of the dataframe into an LLM prompt. The LLM then generates Python code to extract the required information from the dataframe. Subsequently, this generated code is executed on the dataframe and yields an output dataframe.
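The prompt-then-execute flow described above can be sketched as follows. This is not the PandasAI API: `build_prompt`, `fake_llm`, and `run_generated_code` are hypothetical names, the sample dataframe is made up, and the LLM call is replaced by a canned response so the sketch runs offline.

```python
import pandas as pd


def build_prompt(df, query):
    """Combine the query with the dataframe's structure (names and dtypes only)."""
    schema = ", ".join(f"{name} ({dtype})" for name, dtype in df.dtypes.items())
    return (
        f"You are given a pandas dataframe named `df` with columns: {schema}.\n"
        "Write Python code that assigns the answer to a variable named `result`.\n"
        f"Question: {query}"
    )


def fake_llm(prompt):
    """Hypothetical stand-in for the LLM; always returns the same code string."""
    return "result = df[df['age'] > 10][['machineID', 'age']]"


def run_generated_code(df, code):
    """Execute generated code against the dataframe and return `result`."""
    scope = {"df": df, "pd": pd}
    # Executing model output is unsafe without sandboxing; done here only
    # because the code string is canned.
    exec(code, scope)
    return scope["result"]


df = pd.DataFrame(
    {
        "machineID": [1, 2, 3],
        "model": ["model3", "model4", "model3"],
        "age": [18, 7, 8],
    }
)
prompt = build_prompt(df, "Which machines are older than 10 years?")
answer = run_generated_code(df, fake_llm(prompt))
print(answer)
```

Only the column names and types go into the prompt, not the rows, which is why the approach needs neither an embedding model nor a vector database.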

To demonstrate the example, sample CSV files are available. These are part of the structured data example Helm chart and represent a subset of the Microsoft Azure Predictive Maintenance dataset from Kaggle. The CSV data retrieval prompt is specifically tuned for three CSV files from this dataset: PdM_machines.csv, PdM_errors.csv, and PdM_failures.csv. To select the CSV file to use, set the CSV_NAME environment variable in the docker-compose.yaml file. The default value is PdM_machines; you can change it to PdM_errors or PdM_failures.
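As a rough sketch of how a process might resolve the CSV_NAME setting, the snippet below reads the variable, falls back to the documented default, and rejects names outside the three tuned files. The `resolve_csv_name` helper and the strict validation are assumptions for illustration; the actual server may handle other values differently.

```python
import os

# The three names the retrieval prompt is tuned for, per the text above.
SUPPORTED_CSV_NAMES = ("PdM_machines", "PdM_errors", "PdM_failures")


def resolve_csv_name(env=None):
    """Return the CSV name from CSV_NAME, defaulting to PdM_machines.

    Hypothetical helper; `env` can be any mapping, which keeps it testable.
    """
    env = os.environ if env is None else env
    name = env.get("CSV_NAME", "PdM_machines")
    if name not in SUPPORTED_CSV_NAMES:
        raise ValueError(
            f"CSV_NAME={name!r} is not one of {', '.join(SUPPORTED_CSV_NAMES)}"
        )
    return name


print(resolve_csv_name({}))                          # PdM_machines
print(resolve_csv_name({"CSV_NAME": "PdM_errors"}))  # PdM_errors
```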

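| Model                    | Embedding | Framework | Vector Database | File Types |
| ------------------------ | --------- | --------- | --------------- | ---------- |
| meta/llama3-70b-instruct | None      | Custom    | None            | CSV        |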

Diagram

Prerequisites

Complete the common prerequisites.

Build and Start the Containers

  1. Export your NVIDIA API key as an environment variable:

    export NVIDIA_API_KEY="nvapi-<...>"
    
  2. Start the containers:

    cd RAG/examples/advanced_rag/structured_data_rag/
    docker compose up -d --build

    Example Output

     ✔ Network nvidia-rag           Created
     ✔ Container rag-playground     Started
     ✔ Container milvus-minio       Started
     ✔ Container chain-server       Started
     ✔ Container milvus-etcd        Started
     ✔ Container milvus-standalone  Started
    
  3. Confirm the containers are running:

    docker ps --format "table {{.ID}}\t{{.Names}}\t{{.Status}}"

    Example Output

    CONTAINER ID   NAMES               STATUS
    39a8524829da   rag-playground      Up 2 minutes
    bfbd0193dbd2   chain-server        Up 2 minutes
    ec02ff3cc58b   milvus-standalone   Up 3 minutes
    6969cf5b4342   milvus-minio        Up 3 minutes (healthy)
    57a068d62fbb   milvus-etcd         Up 3 minutes (healthy)
    
  4. Open a web browser and access http://localhost:8090 to use the RAG Playground.

    Refer to Using the Sample Web Application for information about uploading documents and using the web interface.
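Before opening the browser, a short script can confirm that something is answering on the RAG Playground port. This is a minimal sketch using only the Python standard library; the `is_reachable` helper is hypothetical, and it checks only that an HTTP server responds, not that the application is healthy.

```python
import urllib.error
import urllib.request


def is_reachable(url, timeout=5.0):
    """Return True if an HTTP server answers at `url`, False otherwise."""
    # Bypass any configured proxy, because the target is a local container.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, even if with an error status code.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or name resolution failure.
        return False


if __name__ == "__main__":
    print(is_reachable("http://localhost:8090"))
```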

Next Steps