Commit b3078ef

LlamaIndex: Add example using MCP
1 parent 240c718 commit b3078ef

6 files changed: +168 -15 lines changed

.github/workflows/ml-llamaindex.yml

Lines changed: 16 additions & 2 deletions
@@ -40,10 +40,15 @@ jobs:
           'ubuntu-latest',
         ]
         python-version: [
-          '3.8',
+          '3.10',
           '3.13',
         ]
-        cratedb-version: [ 'nightly' ]
+        cratedb-version: [
+          'nightly',
+        ]
+        cratedb-mcp-version: [
+          'pr-50',
+        ]
 
     services:
       cratedb:
@@ -53,6 +58,15 @@ jobs:
           - 5432:5432
         env:
           CRATE_HEAP_SIZE: 4g
+      cratedb-mcp:
+        image: ghcr.io/crate/cratedb-mcp:${{ matrix.cratedb-mcp-version }}
+        ports:
+          - 8000:8000
+        env:
+          CRATEDB_MCP_TRANSPORT: streamable-http
+          CRATEDB_MCP_HOST: 0.0.0.0
+          CRATEDB_MCP_PORT: 8000
+          CRATEDB_CLUSTER_URL: http://crate:crate@cratedb:4200/
 
     env:
       OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
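
Note on `CRATEDB_CLUSTER_URL`: the value configured for the `cratedb-mcp` service uses the host name `cratedb`, which matches the name of the sibling service, whereas the README example in this commit uses `localhost`. As a minimal sketch, using only the Python standard library, the URL can be decomposed to check its parts. The helper name `describe_cluster_url` is hypothetical and not part of the commit.

```python
from urllib.parse import urlsplit


def describe_cluster_url(url: str) -> dict:
    """Split a CrateDB HTTP URL into its connection components."""
    parts = urlsplit(url)
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
    }


# Value of CRATEDB_CLUSTER_URL as configured for the `cratedb-mcp` service.
info = describe_cluster_url("http://crate:crate@cratedb:4200/")
print(info)
```

A check like this can help spot a wrong host or port before the MCP server tries to connect.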

topic/machine-learning/llama-index/README.md

Lines changed: 45 additions & 12 deletions
@@ -1,12 +1,15 @@
-# Connecting CrateDB Data to an LLM with LlamaIndex and Azure OpenAI
+# NL2SQL with LlamaIndex: Querying CrateDB using natural language
 
-This folder contains the codebase for [this tutorial](https://community.cratedb.com/t/how-to-connect-your-cratedb-data-to-llm-with-llamaindex-and-azure-openai/1612) on the CrateDB community forum. You should read the tutorial for instructions on how to set up the components that you need on Azure, and use this README for setting up CrateDB and the Python code.
+Connecting CrateDB to an LLM with LlamaIndex and Azure OpenAI,
+optionally using MCP. See also the [LlamaIndex Text-to-SQL Guide].
 
-This has been tested using:
+This folder contains the codebase for the tutorial
+[How to connect your CrateDB data to LLM with LlamaIndex and Azure OpenAI]
+on the CrateDB community forum.
 
-* Python 3.12
-* macOS
-* CrateDB 5.8 and higher
+You should read the tutorial for instructions on how to set up the components
+that you need on Azure, and use this README for setting up CrateDB and the
+Python code.
 
 ## Database Setup
 
@@ -57,7 +60,7 @@ VALUES
 
 Create and activate a virtual environment:
 
-```
+```shell
 python3 -m venv .venv
 source .venv/bin/activate
 ```
@@ -81,23 +84,25 @@ OPENAI_AZURE_ENDPOINT=https://<Your endpoint from Azure e.g. myendpoint.openai.a
 OPENAI_AZURE_API_VERSION=2024-08-01-preview
 LLM_INSTANCE=<The name of your Chat GPT 3.5 turbo instance from Azure>
 EMBEDDING_MODEL_INSTANCE=<The name of your Text Embedding Ada 2.0 instance from Azure>
-CRATEDB_SQLALCHEMY_URL="crate://<Database user name>:<Database password>@<Database host>:4200/?ssl=true"
+CRATEDB_SQLALCHEMY_URL=crate://<Database user name>:<Database password>@<Database host>:4200/?ssl=true
 CRATEDB_TABLE_NAME=time_series_data
 ```
 
 Save your changes.
 
 ## Run the Code
 
-Run the code like so:
+### NLSQL
 
+[LlamaIndex's NLSQLTableQueryEngine] is a natural language SQL table query engine.
+
+Run the code like so:
 ```bash
 python demo_nlsql.py
 ```
 
 Here's the expected output:
-
-```
+```text
 Creating SQLAlchemy engine...
 Connecting to CrateDB...
 Creating SQLDatabase instance...
@@ -124,4 +129,32 @@ Answer was: The average value for sensor 1 is 17.033333333333335.
 'avg(value)'
 ]
 }
-```
+```
+
+### MCP
+
+Spin up the [CrateDB MCP server], connecting it to CrateDB on localhost.
+```bash
+export CRATEDB_CLUSTER_URL=http://crate:crate@localhost:4200/
+export CRATEDB_MCP_TRANSPORT=streamable-http
+uvx cratedb-mcp serve
+```
+
+Run the code using OpenAI API:
+```bash
+export OPENAI_API_KEY=<YOUR_OPENAI_API_KEY>
+python demo_mcp.py
+```
+Expected output:
+```text
+Running query
+Inquiring MCP server
+Query was: What is the average value for sensor 1?
+Answer was: The average value for sensor 1 is approximately 17.03.
+```
+
+
+[CrateDB MCP server]: https://cratedb.com/docs/guide/integrate/mcp/cratedb-mcp.html
+[How to connect your CrateDB data to LLM with LlamaIndex and Azure OpenAI]: https://community.cratedb.com/t/how-to-connect-your-cratedb-data-to-llm-with-llamaindex-and-azure-openai/1612
+[LlamaIndex's NLSQLTableQueryEngine]: https://docs.llamaindex.ai/en/stable/api_reference/query_engine/NL_SQL_table/
+[LlamaIndex Text-to-SQL Guide]: https://docs.llamaindex.ai/en/stable/examples/index_structs/struct_indices/SQLIndexDemo/
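
Note on the `.env` settings: the README hunk above lists several variables that the demo programs presumably read at startup. A minimal sketch of a pre-flight check is shown below. The helper `missing_settings` is hypothetical, and the list of names is an assumption drawn only from the lines visible in this diff; the full `.env` file may define more.

```python
import os

# Names visible in the README hunk; the complete file may contain others.
REQUIRED_SETTINGS = [
    "OPENAI_AZURE_ENDPOINT",
    "OPENAI_AZURE_API_VERSION",
    "LLM_INSTANCE",
    "EMBEDDING_MODEL_INSTANCE",
    "CRATEDB_SQLALCHEMY_URL",
    "CRATEDB_TABLE_NAME",
]


def missing_settings(env=None):
    """Return the names of required settings that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name)]


# Example with a partially filled configuration.
example = {
    "OPENAI_AZURE_API_VERSION": "2024-08-01-preview",
    "CRATEDB_TABLE_NAME": "time_series_data",
}
print(missing_settings(example))
```

Calling `missing_settings()` without arguments inspects the process environment, for example after `load_dotenv()` has run.
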
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
+"""
+Use an LLM to query a database in human language via MCP.
+Example code using LlamaIndex with vanilla Open AI and Azure Open AI.
+
+https://github.com/run-llama/llama_index/tree/main/llama-index-integrations/tools/llama-index-tools-mcp
+
+## Start CrateDB MCP Server
+```
+export CRATEDB_CLUSTER_URL="http://localhost:4200/"
+cratedb-mcp serve --transport=streamable-http
+```
+
+## Usage
+```
+source env.standalone
+export OPENAI_API_KEY=sk-XJZ7pfog5Gp8Kus8D--invalid--0CJ5lyAKSefZLaV1Y9S1
+python demo_mcp.py
+```
+"""
+import asyncio
+import os
+
+from cratedb_about.instruction import Instructions
+
+from dotenv import load_dotenv
+from llama_index.core.agent.workflow import FunctionAgent
+from llama_index.llms.openai import OpenAI
+from llama_index.tools.mcp import BasicMCPClient, McpToolSpec
+
+from boot import configure_llm
+
+
+class Agent:
+
+    async def get_tools(self):
+        # Connect to the CrateDB MCP server using `streamable-http` transport.
+        mcp_url = os.getenv("CRATEDB_MCP_URL", "http://127.0.0.1:8000/mcp/")
+        mcp_client = BasicMCPClient(mcp_url)
+        mcp_tool_spec = McpToolSpec(
+            client=mcp_client,
+            # Optional: Filter the tools by name
+            # allowed_tools=["tool1", "tool2"],
+            # Optional: Include resources in the tool list
+            # include_resources=True,
+        )
+        return await mcp_tool_spec.to_tool_list_async()
+
+    async def get_agent(self):
+        return FunctionAgent(
+            name="Agent",
+            description="CrateDB text-to-SQL agent",
+            llm=OpenAI(model="gpt-4o"),
+            tools=await self.get_tools(),
+            system_prompt=Instructions.full(),
+        )
+
+    async def aquery(self, query):
+        return await (await self.get_agent()).run(query)
+
+    def query(self, query):
+        print("Inquiring MCP server")
+        return asyncio.run(self.aquery(query))
+
+
+def main():
+    """
+    Use an LLM to query a database in human language.
+    """
+
+    # Configure application.
+    load_dotenv()
+    configure_llm()
+
+    # Use an agent that uses the CrateDB MCP server.
+    agent = Agent()
+
+    # Invoke an inquiry.
+    print("Running query")
+    QUERY_STR = "What is the average value for sensor 1?"
+    answer = agent.query(QUERY_STR)
+    print("Query was:", QUERY_STR)
+    print("Answer was:", answer)
+
+
+if __name__ == "__main__":
+    main()
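
Note on the `query` / `aquery` pair: the class exposes a blocking `query` method that drives the asynchronous `aquery` coroutine through `asyncio.run`. The pattern can be sketched in isolation as follows. `StubAgent` is a hypothetical stand-in that contacts neither an LLM nor an MCP server; it only echoes its input.

```python
import asyncio


class StubAgent:
    """Hypothetical stand-in for `Agent`; performs no network calls."""

    async def aquery(self, query: str) -> str:
        # Yield control once to simulate awaiting a remote call.
        await asyncio.sleep(0)
        return f"echo: {query}"

    def query(self, query: str) -> str:
        # Blocking entry point: run the coroutine on a fresh event loop.
        return asyncio.run(self.aquery(query))


answer = StubAgent().query("What is the average value for sensor 1?")
print(answer)
```

Because `asyncio.run` raises a `RuntimeError` when an event loop is already running in the current thread, callers that are themselves asynchronous would await `aquery` directly instead of calling `query`.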

topic/machine-learning/llama-index/demo_nlsql.py

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 """
-Use an LLM to query a database in human language.
+Use an LLM to query a database in human language via NLSQLTableQueryEngine.
 Example code using LlamaIndex with vanilla Open AI and Azure Open AI.
 """

Lines changed: 2 additions & 0 deletions
@@ -1,7 +1,9 @@
+cratedb-about @ git+https://github.com/crate/about.git@instructions
 langchain-openai<0.4
 llama-index-embeddings-langchain<0.4
 llama-index-embeddings-openai<0.4
 llama-index-llms-azure-openai<0.4
 llama-index-llms-openai<0.5
+llama-index-tools-mcp<0.3
 python-dotenv
 sqlalchemy-cratedb

topic/machine-learning/llama-index/test.py

Lines changed: 18 additions & 0 deletions
@@ -38,3 +38,21 @@ def test_nlsql(cratedb, capsys):
     # Verify the outcome.
     out = capsys.readouterr().out
     assert "Answer was: The average value for sensor 1 is approximately 17.03." in out
+
+
+def test_mcp(cratedb, capsys):
+    """
+    Execute `demo_mcp.py` and verify outcome.
+    """
+
+    # Load the standalone configuration also for software testing.
+    # On CI, `OPENAI_API_KEY` will need to be supplied externally.
+    load_dotenv("env.standalone")
+
+    # Invoke the workload, in-process.
+    from demo_mcp import main
+    main()
+
+    # Verify the outcome.
+    out = capsys.readouterr().out
+    assert "Answer was: The average value for sensor 1 is approximately 17.03." in out
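
Note on the assertion: both tests compare against one exact sentence, while the README in this commit shows two differently worded answers for the same question (`17.033333333333335` and `approximately 17.03`). If the wording of the LLM response turns out to vary between runs, a numeric comparison may be more robust. The following is a sketch only; the helper `last_number` is hypothetical, and the tolerance is an assumption.

```python
import math
import re


def last_number(text: str) -> float:
    """Return the last decimal number found in `text`."""
    matches = re.findall(r"-?\d+(?:\.\d+)?", text)
    if not matches:
        raise ValueError("no number found in text")
    return float(matches[-1])


# Both phrasings shown in the README resolve to nearly the same value.
short = "Answer was: The average value for sensor 1 is approximately 17.03."
exact = "Answer was: The average value for sensor 1 is 17.033333333333335."

assert math.isclose(last_number(short), 17.03, abs_tol=0.01)
assert math.isclose(last_number(exact), 17.03, abs_tol=0.01)
```

Taking the last number skips the sensor id that appears earlier in the sentence.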
