Agentic RAG
Agentic RAG is particularly useful for tasks that require multiple steps, such as those in autonomous research agents. This approach enables a system to not only retrieve relevant information but also make decisions, retain memory, and execute actions autonomously.
- Routing: In an agentic setup, routing involves decision-making mechanisms where the system decides which tools or processes to use based on the nature of the task. The router can dynamically delegate subtasks to different tools or engines depending on their suitability.
- Tool Use: Autonomous agents often need interfaces for selecting the appropriate tools (e.g., databases, APIs, or external services) for specific tasks. Along with selecting a tool, they must generate and pass the right arguments to invoke it effectively.
- Memory Retention: A key feature of an agentic RAG system is its ability to retain memory across interactions, allowing it to keep track of context and previous actions and use that information to inform future steps.
- Intervention: While the agent operates autonomously, there is still room for human intervention or guidance. By nudging the agent between actions, you can adjust its behavior or refocus its attention on key areas before it proceeds further.
To build the foundational components of an agentic RAG system, start by designing a Router Engine. This is the component responsible for intelligently routing queries and tasks to the appropriate sub-engines or tools.
- Load the Data: Load the relevant datasets the agent will interact with. This could include structured data (e.g., databases) or unstructured data (e.g., documents, websites).
- Define the LLM (Large Language Model) and Embedding Model: Use a pre-trained LLM for generating responses and understanding user inputs. An embedding model transforms textual data into vector representations for semantic understanding and search.
- Define the Summary Index and Vector Index:
  - Summary Index: a concise representation of the data, useful for quick lookups and high-level overviews.
  - Vector Index: supports fast, relevant similarity-based queries using the embedding model.
- Define Query Engines and Set Metadata: Define query engines that can handle different types of queries; each engine can specialize in answering certain kinds of questions or retrieving specific datasets. Assign metadata so that the right engine is triggered for the right query.
- Define the Router Query Engine: The router query engine integrates all of the above. It decides which query engine to delegate a particular task to, based on the nature of the query, the metadata, and the available tools.
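The steps above can be sketched without any external dependencies. In LlamaIndex this role is played by classes such as `RouterQueryEngine` and `LLMSingleSelector`; here a simple keyword check stands in for the LLM selector so the example runs on its own, and all names are illustrative rather than the library's API.

```python
# Library-free sketch of a router query engine: tools carry metadata
# (descriptions), and a selector routes each query to the best engine.
from dataclasses import dataclass
from typing import Callable

@dataclass
class QueryEngineTool:
    name: str
    description: str                    # metadata the selector routes on
    engine: Callable[[str], str]

def summary_engine(query: str) -> str:
    # Would query a summary index for a high-level overview.
    return "high-level summary of the corpus"

def vector_engine(query: str) -> str:
    # Would run a similarity search over a vector index.
    return "specific passages retrieved by similarity"

TOOLS = [
    QueryEngineTool("summary", "useful for summarization questions", summary_engine),
    QueryEngineTool("vector", "useful for specific-fact lookups", vector_engine),
]

def route(query: str) -> str:
    """Delegate the query to one engine. A keyword check stands in for
    an LLM selector, which would choose based on the tools' descriptions."""
    tool = TOOLS[0] if "summar" in query.lower() else TOOLS[1]
    return tool.engine(query)
```

A summarization request is sent to the summary engine; anything else falls through to similarity search over the vector index.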
- Tool calling enables LLMs to interact with external environments through a dynamic interface: the model not only chooses the appropriate tool but also infers the arguments needed to execute it.
- In standard RAG, LLMs are used mainly to synthesize retrieved information.
- Tool calling adds a layer of query understanding on top of a RAG pipeline, enabling users to ask complex queries and get back more precise results.
Routing is a simplified version of tool calling: the model selects a tool but does not need to infer its arguments.
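The distinction can be made concrete with a small, self-contained sketch in which a rule-based "mock LLM" stands in for real function calling (in LlamaIndex this would be `FunctionTool` plus `llm.predict_and_call`; the names below are illustrative):

```python
# Tool calling = choosing a tool AND inferring its arguments from the query.
# Routing alone would stop after the tool choice.
import re
from typing import Callable, Dict

def add(x: int, y: int) -> int:
    """Add two integers."""
    return x + y

def multiply(x: int, y: int) -> int:
    """Multiply two integers."""
    return x * y

TOOLS: Dict[str, Callable[[int, int], int]] = {"add": add, "multiply": multiply}

def call_tool(query: str) -> int:
    """Mock of LLM tool calling: pick the tool from the verb in the query,
    then infer the integer arguments from the query text."""
    name = "multiply" if "multiply" in query.lower() else "add"
    x, y = map(int, re.findall(r"-?\d+", query))   # argument inference
    return TOOLS[name](x, y)
```

A real function-calling LLM does both steps from the tools' signatures and docstrings rather than regexes, but the division of labor is the same.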
We can also restrict a search to specific nodes of the vector index by using the metadata properties assigned to each node.
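A minimal sketch of metadata-filtered retrieval, assuming nodes carry a metadata dict (LlamaIndex expresses this with `MetadataFilters` on a vector index; the `page_label` key here is just an example):

```python
# Filter nodes by metadata before (or instead of) similarity scoring.
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Node:
    text: str
    metadata: Dict[str, str]

NODES = [
    Node("intro text", {"page_label": "1"}),
    Node("results text", {"page_label": "7"}),
]

def filtered_search(nodes: List[Node], filters: Dict[str, str]) -> List[Node]:
    """Keep only nodes whose metadata matches every filter; a vector
    similarity ranking would then run over this reduced candidate set."""
    return [n for n in nodes
            if all(n.metadata.get(k) == v for k, v in filters.items())]
```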
An agent in LlamaIndex consists of an "agent worker" and an "agent runner":
- AgentRunner: the task orchestrator. It stores agent state (a mapping from task_id to TaskState, where each TaskState holds the task, its completed steps, and its step queue) and the conversation memory.
- AgentWorker: handles task reasoning and execution. It holds the tools (e.g., a vector tool and a summary tool) and the LLM to use.
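The worker/runner split can be sketched as follows. This is a library-free toy mirroring the structure described above, not LlamaIndex's actual `AgentRunner`/`AgentWorker` classes, and the step strings stand in for real LLM reasoning:

```python
# Toy version of the LlamaIndex agent split: the runner orchestrates and
# stores state/memory; the worker reasons about and executes single steps.
from dataclasses import dataclass, field
from collections import deque
from typing import Deque, Dict, List

@dataclass
class TaskState:
    task: str
    completed_steps: List[str] = field(default_factory=list)
    step_queue: Deque[str] = field(default_factory=deque)

class AgentWorker:
    """Task reasoning and execution; would hold the tools and the LLM."""
    def run_step(self, step: str) -> str:
        return f"result of {step}"

class AgentRunner:
    """Task orchestrator: maps task_id -> TaskState, keeps conversation memory."""
    def __init__(self, worker: AgentWorker):
        self.worker = worker
        self.state: Dict[str, TaskState] = {}
        self.memory: List[str] = []

    def create_task(self, task_id: str, task: str, steps: List[str]) -> None:
        self.state[task_id] = TaskState(task, step_queue=deque(steps))

    def run_step(self, task_id: str) -> str:
        ts = self.state[task_id]
        step = ts.step_queue.popleft()
        result = self.worker.run_step(step)
        ts.completed_steps.append(result)
        self.memory.append(result)        # conversation memory
        return result
```

Note that `create_task` only enqueues steps; nothing executes until `run_step` is called, which is exactly the decoupling the next section highlights.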
Agent Control: Key Benefits
- Decoupling of Task Creation and Execution: Users gain the flexibility to schedule task execution according to their needs.
- Enhanced Debuggability: Offers deeper insights into each step of the execution process, improving troubleshooting capabilities.
- Steerability: Allows users to directly modify intermediate steps and incorporate human feedback for refined control.
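Steerability can be illustrated with a tiny stepwise executor: because execution pauses between steps, a human can inject a correction into the queue before the agent proceeds. This is a standalone sketch of the idea, not LlamaIndex's task API:

```python
# Stepwise execution with human feedback injected between steps.
from collections import deque
from typing import Dict, List, Optional

def run_with_feedback(steps: List[str],
                      feedback: Optional[Dict[str, str]] = None) -> List[str]:
    """Execute steps one at a time; after a step named in `feedback`,
    push the human's extra step to the front of the queue."""
    queue = deque(steps)
    feedback = dict(feedback or {})
    log = []
    while queue:
        step = queue.popleft()
        log.append(f"did:{step}")            # inspectable per-step trace
        if step in feedback:                 # human nudges the agent here
            queue.appendleft(feedback.pop(step))
    return log
```

The per-step log is also what gives the debuggability benefit: every intermediate action is visible, not just the final answer.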