LlamaIndex: Add example using MCP #1032
Conversation
## Walkthrough
The changes add two new demonstration scripts for querying CrateDB using LlamaIndex with natural language SQL and MCP interfaces, update environment files to export variables, introduce new dependencies, enhance the README with setup and usage details, revise tests to run the new demos, and extend the CI workflow to include MCP service testing.
## Changes
| File(s) | Change Summary |
|------------------------------------|------------------------------------------------------------------------------------------------------------------------------|
| README.md | Updated title and intro; added MCP usage instructions; changed example run command from `python main.py` to `python demo_nlsql.py`; improved formatting and added external links. |
| boot.py | Added new `configure_llm()` function to initialize LLM and embedding models based on environment variables for OpenAI/Azure. |
| demo_nlsql.py, demo_mcp.py | Added new demo scripts illustrating querying CrateDB with LlamaIndex via natural language SQL and MCP protocol respectively. |
| env.azure, env.standalone | Added `export` keyword to all environment variable declarations for proper shell export. |
| requirements.txt | Added `cratedb-about` (version 0.0.6) and `llama-index-tools-mcp` (<0.3) dependencies. |
| test.py | Renamed `test_main` to `test_nlsql` to test `demo_nlsql.py`; added `test_mcp` to test `demo_mcp.py`, both asserting expected output. |
| .github/workflows/ml-llamaindex.yml | Updated Python version matrix; added `cratedb-mcp` service and matrix dimension for MCP version in CI workflow. |
| main.py | Deleted legacy script replaced by new demos. |
## Sequence Diagram(s)
```mermaid
sequenceDiagram
    participant User
    participant DemoScript as demo_nlsql.py / demo_mcp.py
    participant LLM as LlamaIndex LLM
    participant CrateDB
    User->>DemoScript: Run main()
    DemoScript->>LLM: Send natural language query
    LLM->>CrateDB: Generate and execute SQL query
    CrateDB-->>LLM: Return query result
    LLM-->>DemoScript: Return answer
    DemoScript-->>User: Print answer and metadata
```

Estimated code review effort: 3 (~45 minutes)
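As a rough illustration of the NL-to-SQL leg of this flow, a minimal LlamaIndex sketch might look as follows. This is not the actual `demo_nlsql.py`; the connection URL and table name are assumptions, while the question string is taken from the review excerpts below.

```python
# Minimal sketch of the natural-language-to-SQL flow from the diagram above.
# Assumptions: CrateDB reachable on localhost, a table named "time_series_data",
# and OPENAI_API_KEY already present in the environment.
from sqlalchemy import create_engine
from llama_index.core import SQLDatabase
from llama_index.core.query_engine import NLSQLTableQueryEngine

engine = create_engine("crate://localhost:4200")  # via the sqlalchemy-cratedb dialect
sql_database = SQLDatabase(engine, include_tables=["time_series_data"])

query_engine = NLSQLTableQueryEngine(sql_database=sql_database, tables=["time_series_data"])
response = query_engine.query("What is the average value for sensor 1?")

print(f"Answer was: {response}")  # natural-language answer from the LLM
print(response.metadata)          # includes the SQL statement that was generated
```

The MCP variant replaces the query engine with an agent that obtains its tools from the CrateDB MCP Server; a corresponding sketch appears further down alongside the `FunctionAgent` excerpt.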
Actionable comments posted: 7
🔭 Outside diff range comments (1)
topic/machine-learning/llama-index/test.py (1)
43-58: Add connection error handling for MCP test. The test fails due to MCP server connection issues as indicated in the pipeline failures. The test should handle connection failures gracefully or skip when the MCP server is unavailable.

```diff
 def test_mcp(cratedb, capsys):
     """
     Execute `demo_mcp.py` and verify outcome.
     """
     # Load the standalone configuration also for software testing.
     # On CI, `OPENAI_API_KEY` will need to be supplied externally.
     load_dotenv("env.standalone")

-    # Invoke the workload, in-process.
-    from demo_mcp import main
-    main()
+    # Invoke the workload, in-process.
+    try:
+        from demo_mcp import main
+        main()
+    except Exception as e:
+        # Skip test if MCP server is not available
+        if "connection" in str(e).lower() or "connect" in str(e).lower():
+            pytest.skip(f"MCP server not available: {e}")
+        raise

     # Verify the outcome.
     out = capsys.readouterr().out
     assert "Answer was: The average value for sensor 1 is approximately 17.03." in out
```
🧹 Nitpick comments (3)
topic/machine-learning/llama-index/README.md (1)
95-95: Update reflects new demo structure correctly. The command update from `python main.py` to `python demo_nlsql.py` correctly reflects the new demo structure. Consider documenting both available demos since the PR introduces multiple examples:

```diff
 python demo_nlsql.py
+
+# Alternative: Run the MCP server demo
+python demo_mcp.py
```

topic/machine-learning/llama-index/demo_nlsql.py (1)
41-41: Consider making the query configurable. The query string is hardcoded, which limits the demo's flexibility for different use cases.

```diff
-QUERY_STR = "What is the average value for sensor 1?"
+QUERY_STR = os.getenv("DEMO_QUERY", "What is the average value for sensor 1?")
```

topic/machine-learning/llama-index/demo_mcp.py (1)
49-49: Consider making the LLM model configurable. The GPT-4o model is hardcoded, which may not be suitable for all environments or cost requirements.

```diff
+import os
+
 async def get_agent(self):
     return FunctionAgent(
         name="Agent",
         description="CrateDB text-to-SQL agent",
-        llm=OpenAI(model="gpt-4o"),
+        llm=OpenAI(model=os.getenv("OPENAI_MODEL", "gpt-4o")),
         tools=await self.get_tools(),
         system_prompt=Instructions.full(),
     )
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
- topic/machine-learning/llama-index/README.md (1 hunks)
- topic/machine-learning/llama-index/boot.py (1 hunks)
- topic/machine-learning/llama-index/demo_mcp.py (1 hunks)
- topic/machine-learning/llama-index/demo_nlsql.py (1 hunks)
- topic/machine-learning/llama-index/env.azure (1 hunks)
- topic/machine-learning/llama-index/env.standalone (1 hunks)
- topic/machine-learning/llama-index/requirements.txt (1 hunks)
- topic/machine-learning/llama-index/test.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
topic/machine-learning/llama-index/requirements.txt (1)
Learnt from: amotl
PR: crate/cratedb-examples#937
File: topic/machine-learning/llm-langchain/requirements-dev.txt:2-2
Timestamp: 2025-05-12T20:10:38.614Z
Learning: The cratedb-toolkit package supports various extras including "io", "datasets", "influxdb", "mongodb", "testing", and many others.
🧬 Code Graph Analysis (2)
topic/machine-learning/llama-index/test.py (2)
- topic/machine-learning/llama-index/demo_nlsql.py (1): `main` (17-46)
- topic/machine-learning/llama-index/demo_mcp.py (1): `main` (61-78)

topic/machine-learning/llama-index/demo_nlsql.py (1)
- topic/machine-learning/llama-index/boot.py (1): `configure_llm` (12-47)
🪛 GitHub Actions: LlamaIndex
topic/machine-learning/llama-index/requirements.txt
[error] 1-1: Failed to get requirements to build wheel for cratedb-about package due to invalid pyproject.toml license configuration.
topic/machine-learning/llama-index/demo_mcp.py
[error] 43-76: Runtime error in async tool fetching and query execution: HTTP connection attempts failed, resulting in unhandled exception group and test failure.
topic/machine-learning/llama-index/test.py
[error] 54-54: Test failure in test_mcp: ConnectError 'All connection attempts failed' during async HTTP request in mcp client, causing test to fail.
🪛 Gitleaks (8.27.2)
topic/machine-learning/llama-index/demo_mcp.py
16-16: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.
(generic-api-key)
🔇 Additional comments (5)
topic/machine-learning/llama-index/requirements.txt (1)
7-7: LGTM: Appropriate version constraint for MCP tools. The version constraint `<0.3` for `llama-index-tools-mcp` is appropriate for ensuring compatibility with the demo while allowing patch updates.

topic/machine-learning/llama-index/boot.py (1)
1-48: LGTM: Clean refactoring preserves essential LLM configuration. The refactoring successfully extracts the main application logic while preserving the essential `configure_llm` function and its required imports. This modular approach allows the new demo scripts (`demo_nlsql.py` and `demo_mcp.py`) to reuse the LLM configuration logic.

The remaining code is well-structured and maintains support for both OpenAI and Azure OpenAI configurations.
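For readers who haven't opened `boot.py`, the pattern described here could look roughly like the sketch below; every environment variable name, deployment name, and model identifier in it is an assumption for illustration, not copied from the actual file.

```python
# Illustrative sketch only -- not the actual boot.py. It demonstrates the idea of
# selecting OpenAI vs. Azure OpenAI based on environment variables and wiring the
# result into LlamaIndex's global Settings, so the demo scripts can simply call it.
import os

from llama_index.core import Settings
from llama_index.embeddings.azure_openai import AzureOpenAIEmbedding
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.azure_openai import AzureOpenAI
from llama_index.llms.openai import OpenAI


def configure_llm() -> None:
    """Configure LLM and embedding models from environment variables."""
    azure_endpoint = os.getenv("OPENAI_AZURE_ENDPOINT")  # hypothetical variable name
    if azure_endpoint:
        Settings.llm = AzureOpenAI(
            engine=os.environ["OPENAI_AZURE_DEPLOYMENT"],        # hypothetical
            azure_endpoint=azure_endpoint,
            api_key=os.environ["OPENAI_AZURE_API_KEY"],          # hypothetical
            api_version=os.environ["OPENAI_AZURE_API_VERSION"],  # hypothetical
        )
        Settings.embed_model = AzureOpenAIEmbedding(
            model="text-embedding-ada-002",
            deployment_name=os.environ["OPENAI_AZURE_EMBEDDING_DEPLOYMENT"],  # hypothetical
            azure_endpoint=azure_endpoint,
            api_key=os.environ["OPENAI_AZURE_API_KEY"],
            api_version=os.environ["OPENAI_AZURE_API_VERSION"],
        )
    else:
        Settings.llm = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
        Settings.embed_model = OpenAIEmbedding(api_key=os.environ["OPENAI_API_KEY"])
```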
topic/machine-learning/llama-index/env.azure (1)
1-8: Excellent improvement: Export keywords ensure proper environment variable availability. Adding `export` keywords to all environment variable declarations ensures they are properly available to subprocesses and the Python demo scripts. This is a best practice for shell environment configuration files.
2-4: Consistent improvement: Export keywords properly applied. The addition of `export` keywords is consistent with the `env.azure` file changes and correctly ensures environment variables are available to the demo scripts. The commented `OPENAI_API_KEY` line appropriately remains unchanged.
25-40: LGTM! Test structure follows good practices. The test properly loads configuration, executes the demo in-process, and verifies expected output.
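As an aside for readers unfamiliar with this pattern, an in-process demo test of this shape might look roughly like the following; the fixture names and the asserted prefix mirror the excerpts quoted in this review, everything else is illustrative.

```python
# Minimal sketch of the in-process test pattern described above.
# Assumes a `cratedb` fixture that provisions the database and that
# OPENAI_API_KEY is supplied externally on CI.
from dotenv import load_dotenv


def test_nlsql(cratedb, capsys):
    """Execute `demo_nlsql.py` and verify the outcome."""
    # Load the standalone configuration also for software testing.
    load_dotenv("env.standalone")

    # Invoke the workload, in-process.
    from demo_nlsql import main
    main()

    # Verify the outcome.
    out = capsys.readouterr().out
    assert "Answer was:" in out
```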
Actionable comments posted: 2
🧹 Nitpick comments (1)
.github/workflows/ml-llamaindex.yml (1)
49-51: Hard-coding a PR tag (`pr-50`) makes the workflow fragile. Building against a transient image tag tied to another PR means nightly and manual runs will suddenly break once that PR is rebased or the tag is removed. Consider defaulting to a released version (e.g. `main` or a semver tag) and overriding it only in the rare jobs that explicitly need the PR build.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
- .github/workflows/ml-llamaindex.yml (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
.github/workflows/ml-llamaindex.yml (1)
Learnt from: amotl
PR: crate/cratedb-examples#937
File: topic/machine-learning/llm-langchain/requirements-dev.txt:2-2
Timestamp: 2025-05-12T20:10:38.614Z
Learning: The cratedb-toolkit package supports various extras including "io", "datasets", "influxdb", "mongodb", "testing", and many others.
b51777e to 26648cb
```yaml
cratedb-mcp-version: [
  'pr-50',
]
```
This needs to be adjusted after the next release of cratedb-mcp.
Resolved with a43ee4e.
Actionable comments posted: 0
🧹 Nitpick comments (5)
topic/machine-learning/llama-index/README.md (5)
63-66: Add language identifier to the venv code block. Markdown-lint flags this fence (MD040). Adding the language (`shell`) improves syntax highlighting and keeps the doc lint-clean.

````diff
-```
+```shell
 python3 -m venv .venv
 source .venv/bin/activate
````

105-105: Specify a language for the "expected output" block. Same MD040 issue here. Use `text` (or `console`) to satisfy the linter and give readers proper formatting.

````diff
-```
+```text
````

149-149: Missing language identifier in second output block. Repeat of the previous comment for the MCP expected output section.

````diff
-```
+```text
````
81-89: Remove superfluous quotes around the connection string. `.env` parsers typically treat surrounding quotes as literal characters, breaking the URL. Safer to drop them so copy-pasting "just works".

```diff
-CRATEDB_SQLALCHEMY_URL="crate://<Database user name>:<Database password>@<Database host>:4200/?ssl=true"
+CRATEDB_SQLALCHEMY_URL=crate://<Database user name>:<Database password>@<Database host>:4200/?ssl=true
```
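For context, the demo scripts presumably consume this URL through SQLAlchemy. A minimal sketch of that usage follows; the variable handling and the test query are illustrative, not taken from the examples themselves.

```python
# Sketch of how CRATEDB_SQLALCHEMY_URL is typically consumed (illustrative).
# If the value carries literal surrounding quotes, create_engine() cannot parse
# the URL, which is why the review suggests dropping them in the .env file.
import os
import sqlalchemy as sa

url = os.environ["CRATEDB_SQLALCHEMY_URL"]   # e.g. crate://user:password@host:4200/?ssl=true
engine = sa.create_engine(url)               # requires the sqlalchemy-cratedb dialect

with engine.connect() as connection:
    result = connection.execute(sa.text("SELECT 42"))
    print(result.scalar_one())
```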
95-97: Minor grammar / consistency tweak for the link label. "Apostrophe-s" reads better and matches earlier "NL2SQL" terminology.

```diff
-### NLSQL
-[LlamaIndex' NLSQLTableQueryEngine] is a natural language SQL table query engine.
+### NL2SQL
+[LlamaIndex's NLSQLTableQueryEngine] is a natural-language SQL table query engine.
```
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- .github/workflows/ml-llamaindex.yml (2 hunks)
- topic/machine-learning/llama-index/README.md (3 hunks)
- topic/machine-learning/llama-index/demo_mcp.py (1 hunks)
- topic/machine-learning/llama-index/demo_nlsql.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
- topic/machine-learning/llama-index/demo_nlsql.py
- .github/workflows/ml-llamaindex.yml
- topic/machine-learning/llama-index/demo_mcp.py
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: amotl
PR: crate/cratedb-examples#1032
File: topic/machine-learning/llama-index/demo_nlsql.py:28-29
Timestamp: 2025-07-20T00:14:38.691Z
Learning: In demonstration and example code within the cratedb-examples repository, prefer simpler code without extensive error handling to maintain clarity and readability of the examples.
🪛 markdownlint-cli2 (0.17.2)
topic/machine-learning/llama-index/README.md
149-149: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (1)
topic/machine-learning/llama-index/README.md (1)
136-142: Clarify `uvx` prerequisite. `uvx` isn't a standard tool; newcomers may hit "command not found". Please mention how to install it (e.g. `pipx install uvx` or `pip install cratedb-mcp[cli]`) or offer an alternative launcher.
```python
async def get_agent(self):
    return FunctionAgent(
        name="Agent",
        description="CrateDB text-to-SQL agent",
        llm=OpenAI(model="gpt-4o"),
        tools=await self.get_tools(),
        system_prompt=Instructions.full(),
    )
```
Better to use the full instructions (general + MCP), including how to use the provided tools, after this patch has been merged.
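For context, the agent quoted above presumably gets its tools from the CrateDB MCP Server via `llama-index-tools-mcp` (`BasicMCPClient` / `McpToolSpec`). The sketch below shows one way that wiring could look; the endpoint URL and query string are assumptions, and the system prompt is a plain placeholder rather than `Instructions.full()` from `cratedb-about`.

```python
# Illustrative sketch of connecting a FunctionAgent to an MCP server using
# llama-index-tools-mcp; endpoint URL and query string are assumptions.
import asyncio

from llama_index.core.agent.workflow import FunctionAgent
from llama_index.llms.openai import OpenAI
from llama_index.tools.mcp import BasicMCPClient, McpToolSpec


async def main():
    # The CrateDB MCP Server is assumed to listen on this SSE endpoint.
    mcp_client = BasicMCPClient("http://localhost:8000/sse")
    tools = await McpToolSpec(client=mcp_client).to_tool_list_async()

    agent = FunctionAgent(
        name="Agent",
        description="CrateDB text-to-SQL agent",
        llm=OpenAI(model="gpt-4o"),
        tools=tools,
        system_prompt="Answer questions by querying CrateDB through the provided tools.",
    )
    response = await agent.run("What is the average value for sensor 1?")
    print(f"Answer was: {response}")


if __name__ == "__main__":
    asyncio.run(main())
```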
kneth left a comment
LGTM
About
For Text-to-SQL purposes, exercise the llama-index-tools-mcp package together with the CrateDB MCP Server, specifically to validate its OCI standard image published to GHCR.
References
- cratedb-mcp
- OCI: cratedb-mcp#50

Backlog