Feat/llama index #1160

Merged · 16 commits · Nov 19, 2024
2 changes: 2 additions & 0 deletions docs/.gitignore
@@ -1,2 +1,4 @@
/.quarto/
lib64
integrations/data
integrations/storage
5 changes: 4 additions & 1 deletion docs/examples/llamaindex-output-parsing.ipynb
@@ -14,7 +14,10 @@
"id": "9c48213d-6e6a-4c10-838a-2a7c710c3a05",
"metadata": {},
"source": [
"# Guardrails Output Parsing\n"
"# Guardrails Output Parsing (Deprecated)\n",
"\n",
"## DEPRECATION NOTE\n",
"This integration between LlamaIndex and Guardrails is only valid for llama-index ~0.9.x and guardrails-ai < 0.5.x. and thus has been deprecated. For an updated example of using Guardrails with LlamaIndex with their latest versions, see the [GuardrailsEngine](/docs/integrations/llama_index)\n"
]
},
{
286 changes: 286 additions & 0 deletions docs/integrations/llama_index.ipynb
@@ -0,0 +1,286 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# LlamaIndex\n",
"\n",
"## Overview\n",
"\n",
"This is a Quick Start guide that shows how to use Guardrails alongside LlamaIndex. As you'll see, the LlamaIndex portion comes directly from their starter examples [here](https://docs.llamaindex.ai/en/stable/getting_started/starter_example/). Our approach to intergration for LlamaIndex, similar to our LangChain integration, is the make the interaction feel as native to the tool as possible."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Installation\n",
"Install LlamaIndex and a version of Guardrails with LlamaIndex support."
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Found existing installation: guardrails-ai 0.6.0\n",
"Uninstalling guardrails-ai-0.6.0:\n",
" Successfully uninstalled guardrails-ai-0.6.0\n"
]
}
],
"source": [
"! pip uninstall guardrails-ai -y"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"! pip install llama-index -q\n",
"# ! pip install \"guardrails-ai>=0.6.1\"\n",
"! pip install /Users/calebcourier/Projects/guardrails -q"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Install a couple validators from the Guardrails Hub that we'll use to guard the query outputs."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Installing hub:\u001b[35m/\u001b[0m\u001b[35m/guardrails/\u001b[0m\u001b[95mdetect_pii...\u001b[0m\n",
"✅Successfully installed guardrails/detect_pii version \u001b[1;36m0.0\u001b[0m.\u001b[1;36m5\u001b[0m!\n",
"\n",
"\n",
"Installing hub:\u001b[35m/\u001b[0m\u001b[35m/guardrails/\u001b[0m\u001b[95mcompetitor_check...\u001b[0m\n",
"✅Successfully installed guardrails/competitor_check version \u001b[1;36m0.0\u001b[0m.\u001b[1;36m1\u001b[0m!\n",
"\n",
"\n"
]
}
],
"source": [
"! guardrails hub install hub://guardrails/detect_pii --no-install-local-models -q\n",
"! guardrails hub install hub://guardrails/competitor_check --no-install-local-models -q"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Download some sample data from the LlamaIndex docs."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
" % Total % Received % Xferd Average Speed Time Time Time Current\n",
" Dload Upload Total Spent Left Speed\n",
"100 75042 100 75042 0 0 959k 0 --:--:-- --:--:-- --:--:-- 964k\n"
]
}
],
"source": [
"! mkdir -p ./data\n",
"! curl https://raw.githubusercontent.com/run-llama/llama_index/main/docs/docs/examples/data/paul_graham/paul_graham_essay.txt > ./data/paul_graham_essay.txt"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Index Setup\n",
"\n",
"First we'll load some data and build an index as shown in the [starter tutorial](https://docs.llamaindex.ai/en/stable/getting_started/starter_example/)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"import os.path\n",
"from llama_index.core import (\n",
" VectorStoreIndex,\n",
" SimpleDirectoryReader,\n",
" StorageContext,\n",
" load_index_from_storage,\n",
")\n",
"\n",
"# check if storage already exists\n",
"PERSIST_DIR = \"./storage\"\n",
"if not os.path.exists(PERSIST_DIR):\n",
" # load the documents and create the index\n",
" documents = SimpleDirectoryReader(\"data\").load_data()\n",
" index = VectorStoreIndex.from_documents(documents)\n",
" # store it for later\n",
" index.storage_context.persist(persist_dir=PERSIST_DIR)\n",
"else:\n",
" # load the existing index\n",
" storage_context = StorageContext.from_defaults(persist_dir=PERSIST_DIR)\n",
" index = load_index_from_storage(storage_context)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Guard Setup\n",
"\n",
"Next we'll create our Guard and assign some validators to check the output from our queries."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"from guardrails import Guard\n",
"from guardrails.hub import CompetitorCheck, DetectPII\n",
"\n",
"guard = Guard().use(\n",
" CompetitorCheck(\n",
" competitors=[\"Fortran\", \"Ada\", \"Pascal\"],\n",
" on_fail=\"fix\"\n",
" )\n",
").use(DetectPII(pii_entities=\"pii\", on_fail=\"fix\"))"
]
},
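{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before wiring the Guard into LlamaIndex, we can sanity-check it on a plain string. This is a minimal sketch: the sample sentence is hypothetical, and it assumes the `guard.validate()` API and its `validated_output` field from recent guardrails-ai releases."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical sample text: the competitor name and the person's name\n",
"# should both be fixed by the validators assigned above\n",
"outcome = guard.validate(\"Fortran was the first language John Smith learned.\")\n",
"print(outcome.validated_output)"
]
},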
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Querying The Index\n",
"\n",
"To demonstrate it's plug-and-play capabilities, first we'll query the index un-guarded."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The author worked on writing short stories and programming, starting with early attempts on an IBM 1401 using Fortran in 9th grade, and later transitioning to microcomputers like the TRS-80 and Apple II to write games, rocket prediction programs, and a word processor.\n"
]
}
],
"source": [
"# Use index on it's own\n",
"query_engine = index.as_query_engine()\n",
"response = query_engine.query(\"What did the author do growing up?\")\n",
"print(response)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now we'll set up a guarded engine, and re-query the index to see how Guardrails applies the fixes we specified when assigning our validators to the Guard."
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The author worked on writing short stories and programming, starting with early attempts on an IBM 1401 using [COMPETITOR] in 9th <URL>er, the author transitioned to microcomputers, building a Heathkit kit and eventually getting a TRS-80 to write simple games and <URL>spite enjoying programming, the author initially planned to study philosophy in college but eventually switched to AI due to a lack of interest in philosophy courses.\n"
]
}
],
"source": [
"# Use index with Guardrails\n",
"from guardrails.integrations.llama_index import GuardrailsQueryEngine\n",
"\n",
"guardrails_query_engine = GuardrailsQueryEngine(engine=query_engine, guard=guard)\n",
"\n",
"response = guardrails_query_engine.query(\"What did the author do growing up?\")\n",
"print(response)\n",
" "
]
},
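{
"cell_type": "markdown",
"metadata": {},
"source": [
"To see which validators ran and what they changed, we can inspect the guard's call history. A brief sketch, assuming the `guard.history.last.tree` logging API available in recent guardrails-ai releases."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Print a tree of the most recent guarded call, including raw vs. validated output\n",
"print(guard.history.last.tree)"
]
},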
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The GuardrailsEngine can also be used with LlamaIndex's chat engine, not just the query engine."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"The author worked on writing short stories and programming while growing <URL>ey started with early attempts on an IBM 1401 using [COMPETITOR] in 9th <URL>er, they transitioned to microcomputers, building simple games and a word processor on a TRS-80 in <DATE_TIME>.\n"
]
}
],
"source": [
"# For chat engine\n",
"from guardrails.integrations.llama_index import GuardrailsChatEngine\n",
"chat_engine = index.as_chat_engine()\n",
"guardrails_chat_engine = GuardrailsChatEngine(engine=chat_engine, guard=guard)\n",
"\n",
"response = guardrails_chat_engine.chat(\"Tell me what the author did growing up.\")\n",
"print(response)"
]
}
],
"metadata": {
"kernelspec": {
"display_name": ".venv",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.4"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
8 changes: 8 additions & 0 deletions guardrails/integrations/llama_index/__init__.py
@@ -0,0 +1,8 @@
from guardrails.integrations.llama_index.guardrails_query_engine import (
GuardrailsQueryEngine,
)
from guardrails.integrations.llama_index.guardrails_chat_engine import (
GuardrailsChatEngine,
)

__all__ = ["GuardrailsQueryEngine", "GuardrailsChatEngine"]
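With this `__init__.py`, both engines are importable directly from the integration package. A minimal usage sketch, assuming an `index` and `guard` already built as in the notebook above:

```python
from guardrails.integrations.llama_index import (
    GuardrailsChatEngine,
    GuardrailsQueryEngine,
)

# Wrap existing LlamaIndex engines; outputs are validated by the guard
query_engine = GuardrailsQueryEngine(engine=index.as_query_engine(), guard=guard)
chat_engine = GuardrailsChatEngine(engine=index.as_chat_engine(), guard=guard)
```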