Conversation

@amotl
Member

@amotl amotl commented Jul 20, 2025

About

For Text-to-SQL purposes, exercise the llama-index-tools-mcp package together with the CrateDB MCP Server, specifically to validate its OCI standard image published to GHCR.

References

Backlog

  • Software tests
  • Documentation

@coderabbitai
coderabbitai bot commented Jul 20, 2025

## Walkthrough

The changes add two new demonstration scripts for querying CrateDB using LlamaIndex with natural language SQL and MCP interfaces, update environment files to export variables, introduce new dependencies, enhance the README with setup and usage details, revise tests to run the new demos, and extend the CI workflow to include MCP service testing.

## Changes

| File(s)                              | Change Summary                                                                                                               |
|------------------------------------|------------------------------------------------------------------------------------------------------------------------------|
| README.md                          | Updated title and intro; added MCP usage instructions; changed example run command from `python main.py` to `python demo_nlsql.py`; improved formatting and added external links. |
| boot.py                            | Added new `configure_llm()` function to initialize LLM and embedding models based on environment variables for OpenAI/Azure. |
| demo_nlsql.py, demo_mcp.py         | Added new demo scripts illustrating querying CrateDB with LlamaIndex via natural language SQL and MCP protocol respectively.   |
| env.azure, env.standalone           | Added `export` keyword to all environment variable declarations for proper shell export.                                       |
| requirements.txt                   | Added `cratedb-about` (version 0.0.6) and `llama-index-tools-mcp` (<0.3) dependencies.                                          |
| test.py                           | Renamed `test_main` to `test_nlsql` to test `demo_nlsql.py`; added `test_mcp` to test `demo_mcp.py`, both asserting expected output. |
| .github/workflows/ml-llamaindex.yml | Updated Python version matrix; added `cratedb-mcp` service and matrix dimension for MCP version in CI workflow.                 |
| main.py                           | Deleted legacy script replaced by new demos.                                                                                  |

## Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant User
    participant DemoScript as DemoScript (demo_nlsql.py/demo_mcp.py)
    participant LLM as LLM (LlamaIndex)
    participant CrateDB

    User->>DemoScript: Run main()
    DemoScript->>LLM: Send natural language query
    LLM->>CrateDB: Generate and execute SQL query
    CrateDB-->>LLM: Return query result
    LLM-->>DemoScript: Return answer
    DemoScript-->>User: Print answer and metadata
```

Estimated code review effort

3 (~45 minutes)

Suggested reviewers

  • hlcianfagna
  • kneth

Poem

🐇 A bunny hops with joy anew,
Two demos fresh, just made for you!
Exported vars and tests in line,
CrateDB queries, simply divine.
LlamaIndex guides the way,
Query with ease, come run and play! 🌟


<!-- walkthrough_end -->


---

<details>
<summary>📜 Recent review details</summary>

**Configuration used: CodeRabbit UI**
**Review profile: CHILL**
**Plan: Pro**


<details>
<summary>📥 Commits</summary>

Reviewing files that changed from the base of the PR and between 6f7cd572842f630da519ba05ef662ec283610785 and a43ee4eae8f3619b8752185a015a9cfd8fb50ffe.

</details>

<details>
<summary>📒 Files selected for processing (6)</summary>

* `.github/workflows/ml-llamaindex.yml` (2 hunks)
* `topic/machine-learning/llama-index/README.md` (4 hunks)
* `topic/machine-learning/llama-index/boot.py` (1 hunks)
* `topic/machine-learning/llama-index/demo_mcp.py` (1 hunks)
* `topic/machine-learning/llama-index/demo_nlsql.py` (1 hunks)
* `topic/machine-learning/llama-index/requirements.txt` (1 hunks)

</details>

<details>
<summary>🚧 Files skipped from review as they are similar to previous changes (5)</summary>

* topic/machine-learning/llama-index/requirements.txt
* topic/machine-learning/llama-index/demo_nlsql.py
* topic/machine-learning/llama-index/README.md
* .github/workflows/ml-llamaindex.yml
* topic/machine-learning/llama-index/boot.py

</details>

<details>
<summary>🧰 Additional context used</summary>

<details>
<summary>🧠 Learnings (2)</summary>

<details>
<summary>📓 Common learnings</summary>

Learnt from: amotl
PR: #1032
File: topic/machine-learning/llama-index/demo_nlsql.py:28-29
Timestamp: 2025-07-20T00:14:38.710Z
Learning: In demonstration and example code within the cratedb-examples repository, prefer simpler code without extensive error handling to maintain clarity and readability of the examples.


Learnt from: amotl
PR: #1033
File: topic/machine-learning/llm-langchain/README.md:138-156
Timestamp: 2025-07-21T18:46:07.502Z
Learning: In CrateDB MCP server configuration, the correct environment variable name is CRATEDB_MCP_TRANSPORT, not CRATEDB_MCP_ADAPTER_TRANSPORT. This variable is used throughout the CrateDB MCP codebase to specify the transport protocol.


</details>
<details>
<summary>topic/machine-learning/llama-index/demo_mcp.py (3)</summary>

Learnt from: amotl
PR: crate/cratedb-examples#1032
File: topic/machine-learning/llama-index/demo_mcp.py:16-16
Timestamp: 2025-07-20T00:15:45.273Z
Learning: When providing API key examples in documentation, use clear placeholders like "your_api_key_here" instead of realistic-looking fake keys, even if they contain "--invalid--" markers, to avoid triggering security scanners and provide clearer guidance to users.

Learnt from: amotl
PR: crate/cratedb-examples#1032
File: topic/machine-learning/llama-index/demo_nlsql.py:28-29
Timestamp: 2025-07-20T00:14:38.710Z
Learning: In demonstration and example code within the cratedb-examples repository, prefer simpler code without extensive error handling to maintain clarity and readability of the examples.

Learnt from: amotl
PR: crate/cratedb-examples#1033
File: topic/machine-learning/llm-langchain/README.md:138-156
Timestamp: 2025-07-21T18:46:07.502Z
Learning: In CrateDB MCP server configuration, the correct environment variable name is `CRATEDB_MCP_TRANSPORT`, not `CRATEDB_MCP_ADAPTER_TRANSPORT`. This variable is used throughout the CrateDB MCP codebase to specify the transport protocol.

</details>

</details><details>
<summary>🪛 Gitleaks (8.27.2)</summary>

<details>
<summary>topic/machine-learning/llama-index/demo_mcp.py</summary>

16-16: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

</details>

</details>

</details>

<details>
<summary>🔇 Additional comments (3)</summary><blockquote>

<details>
<summary>topic/machine-learning/llama-index/demo_mcp.py (3)</summary>

`20-31`: **LGTM! Clean and well-organized imports.**

The imports are properly structured following Python conventions and all appear to be necessary for the demo functionality.

---

`33-66`: **Excellent Agent implementation for MCP integration demo.**

The Agent class provides a clean abstraction with:
- Configurable MCP server URL via environment variable
- Proper async/sync pattern with `aquery` and `query` methods
- Clear separation of concerns between tool retrieval and agent creation
- Appropriate simplicity for demonstration purposes
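
The async/sync pairing listed above can be sketched with the standard library alone. This is a minimal illustration under stated assumptions, not the code from `demo_mcp.py`: `EchoAgent` and its canned reply are hypothetical stand-ins for the real agent and its LLM and tool calls.

```python
import asyncio


class EchoAgent:
    """Hypothetical stand-in for the demo's Agent class."""

    async def aquery(self, question: str) -> str:
        # A real agent would await the LLM and MCP tool calls here.
        await asyncio.sleep(0)
        return f"Answer to: {question}"

    def query(self, question: str) -> str:
        # Synchronous convenience wrapper that drives the coroutine to completion.
        return asyncio.run(self.aquery(question))


answer = EchoAgent().query("What is the average value for sensor 1?")
print(answer)
```

Note that `asyncio.run()` cannot be called from inside an already running event loop, so callers that are themselves async would await `aquery` directly.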

---

`68-89`: **Well-structured main function for MCP demo.**

The main function effectively demonstrates the MCP integration with:
- Proper environment setup and LLM configuration
- Configurable demo query via `DEMO_QUERY` environment variable
- Clear output showing both the query and answer for educational purposes
- Appropriate simplicity for demonstration code

</details>

</blockquote></details>

</details>

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 7

🔭 Outside diff range comments (1)
topic/machine-learning/llama-index/test.py (1)

43-58: Add connection error handling for MCP test.

The test fails due to MCP server connection issues as indicated in the pipeline failures. The test should handle connection failures gracefully or skip when the MCP server is unavailable.

 def test_mcp(cratedb, capsys):
     """
     Execute `demo_mcp.py` and verify outcome.
     """

     # Load the standalone configuration also for software testing.
     # On CI, `OPENAI_API_KEY` will need to be supplied externally.
     load_dotenv("env.standalone")

-    # Invoke the workload, in-process.
-    from demo_mcp import main
-    main()
+    # Invoke the workload, in-process.
+    try:
+        from demo_mcp import main
+        main()
+    except Exception as e:
+        # Skip test if MCP server is not available
+        if "connection" in str(e).lower() or "connect" in str(e).lower():
+            pytest.skip(f"MCP server not available: {e}")
+        raise

     # Verify the outcome.
     out = capsys.readouterr().out
     assert "Answer was: The average value for sensor 1 is approximately 17.03." in out
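
An alternative to inspecting exception text is to probe the server port before running the test. The following is a rough sketch, assuming the MCP server listens on a plain TCP port; `server_reachable` is a hypothetical helper, not part of the PR, and the host and port to probe would need to match the actual service.

```python
import socket


def server_reachable(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution errors.
        return False


# Demonstration against a throwaway local listener on an ephemeral port.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))
listener.listen(1)
open_port = listener.getsockname()[1]
reachable_while_listening = server_reachable("127.0.0.1", open_port)
listener.close()
reachable_after_close = server_reachable("127.0.0.1", open_port)
print(reachable_while_listening, reachable_after_close)
```

Such a helper could feed a `pytest.mark.skipif` condition, which keeps the test body itself free of error handling.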
🧹 Nitpick comments (3)
topic/machine-learning/llama-index/README.md (1)

95-95: Update reflects new demo structure correctly.

The command update from python main.py to python demo_nlsql.py correctly reflects the new demo structure.

Consider documenting both available demos since the PR introduces multiple examples:

 python demo_nlsql.py
+
+# Alternative: Run the MCP server demo
+python demo_mcp.py
topic/machine-learning/llama-index/demo_nlsql.py (1)

41-41: Consider making the query configurable.

The query string is hardcoded, which limits the demo's flexibility for different use cases.

-QUERY_STR = "What is the average value for sensor 1?"
+QUERY_STR = os.getenv("DEMO_QUERY", "What is the average value for sensor 1?")
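
The fallback behaviour of the suggested change can be sketched in isolation. `DEMO_QUERY` and the default question come from the suggestion above; the `get_query` helper and its `env` parameter are hypothetical and exist only to make the lookup easy to exercise.

```python
import os

DEFAULT_QUERY = "What is the average value for sensor 1?"


def get_query(env=None) -> str:
    """Return the demo question, preferring the DEMO_QUERY variable when set."""
    env = os.environ if env is None else env
    return env.get("DEMO_QUERY", DEFAULT_QUERY)


print(get_query({}))
print(get_query({"DEMO_QUERY": "How many sensors are there?"}))
```

A variable that is set but empty yields an empty string rather than the default, which may or may not be the desired behaviour.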
topic/machine-learning/llama-index/demo_mcp.py (1)

49-49: Consider making the LLM model configurable.

The GPT-4o model is hardcoded, which may not be suitable for all environments or cost requirements.

+import os
+
 async def get_agent(self):
     return FunctionAgent(
         name="Agent",
         description="CrateDB text-to-SQL agent",
-        llm=OpenAI(model="gpt-4o"),
+        llm=OpenAI(model=os.getenv("OPENAI_MODEL", "gpt-4o")),
         tools=await self.get_tools(),
         system_prompt=Instructions.full(),
     )
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 30e6ec6 and 5240007.

📒 Files selected for processing (8)
  • topic/machine-learning/llama-index/README.md (1 hunks)
  • topic/machine-learning/llama-index/boot.py (1 hunks)
  • topic/machine-learning/llama-index/demo_mcp.py (1 hunks)
  • topic/machine-learning/llama-index/demo_nlsql.py (1 hunks)
  • topic/machine-learning/llama-index/env.azure (1 hunks)
  • topic/machine-learning/llama-index/env.standalone (1 hunks)
  • topic/machine-learning/llama-index/requirements.txt (1 hunks)
  • topic/machine-learning/llama-index/test.py (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
topic/machine-learning/llama-index/requirements.txt (1)
Learnt from: amotl
PR: crate/cratedb-examples#937
File: topic/machine-learning/llm-langchain/requirements-dev.txt:2-2
Timestamp: 2025-05-12T20:10:38.614Z
Learning: The cratedb-toolkit package supports various extras including "io", "datasets", "influxdb", "mongodb", "testing", and many others.
🧬 Code Graph Analysis (2)
topic/machine-learning/llama-index/test.py (2)
topic/machine-learning/llama-index/demo_nlsql.py (1)
  • main (17-46)
topic/machine-learning/llama-index/demo_mcp.py (1)
  • main (61-78)
topic/machine-learning/llama-index/demo_nlsql.py (1)
topic/machine-learning/llama-index/boot.py (1)
  • configure_llm (12-47)
🪛 GitHub Actions: LlamaIndex
topic/machine-learning/llama-index/requirements.txt

[error] 1-1: Failed to get requirements to build wheel for cratedb-about package due to invalid pyproject.toml license configuration.

topic/machine-learning/llama-index/demo_mcp.py

[error] 43-76: Runtime error in async tool fetching and query execution: HTTP connection attempts failed, resulting in unhandled exception group and test failure.

topic/machine-learning/llama-index/test.py

[error] 54-54: Test failure in test_mcp: ConnectError 'All connection attempts failed' during async HTTP request in mcp client, causing test to fail.

🪛 Gitleaks (8.27.2)
topic/machine-learning/llama-index/demo_mcp.py

16-16: Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

(generic-api-key)

🔇 Additional comments (5)
topic/machine-learning/llama-index/requirements.txt (1)

7-7: LGTM: Appropriate version constraint for MCP tools.

The version constraint <0.3 for llama-index-tools-mcp keeps the demo on a known-compatible release line while still allowing any release below 0.3, including patch updates.

topic/machine-learning/llama-index/boot.py (1)

1-48: LGTM: Clean refactoring preserves essential LLM configuration.

The refactoring successfully extracts the main application logic while preserving the essential configure_llm function and its required imports. This modular approach allows the new demo scripts (demo_nlsql.py and demo_mcp.py) to reuse the LLM configuration logic.

The remaining code is well-structured and maintains support for both OpenAI and Azure OpenAI configurations.
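
As a rough illustration of environment-driven provider selection, the sketch below dispatches on a single variable. It is not the actual `configure_llm` implementation: the variable names `OPENAI_API_TYPE` and `OPENAI_AZURE_ENDPOINT`, the `choose_backend` helper, and the returned dictionary shape are all assumptions made for this example.

```python
import os


def choose_backend(env=None) -> dict:
    """Pick LLM settings from environment variables (illustrative only)."""
    env = os.environ if env is None else env
    # Assumption: a variable named OPENAI_API_TYPE selects the provider.
    api_type = env.get("OPENAI_API_TYPE", "openai").lower()
    if api_type == "openai":
        return {"provider": "openai", "api_key": env.get("OPENAI_API_KEY")}
    if api_type == "azure":
        # Assumption: the Azure variant additionally needs an endpoint variable.
        return {
            "provider": "azure",
            "api_key": env.get("OPENAI_API_KEY"),
            "endpoint": env.get("OPENAI_AZURE_ENDPOINT"),
        }
    raise ValueError(f"Unsupported OPENAI_API_TYPE: {api_type!r}")


print(choose_backend({"OPENAI_API_KEY": "your_api_key_here"}))
```

Returning plain settings keeps the selection logic testable without network access; a real implementation would construct the LLM and embedding objects from them.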

topic/machine-learning/llama-index/env.azure (1)

1-8: Excellent improvement: Export keywords ensure proper environment variable availability.

Adding export keywords to all environment variable declarations ensures they are properly available to subprocesses and the Python demo scripts. This is a best practice for shell environment configuration files.
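
The difference `export` makes can be observed with a small experiment. This sketch assumes a POSIX `sh` is available on `PATH`; `DEMO_VAR` and the `child_value` helper are hypothetical names used only for the demonstration.

```python
import os
import subprocess


def child_value(assignment: str) -> str:
    """Run `assignment` in a POSIX shell, then ask a child process for DEMO_VAR."""
    script = assignment + "; sh -c 'echo ${DEMO_VAR:-unset}'"
    result = subprocess.run(
        ["sh", "-c", script],
        capture_output=True,
        text=True,
        check=True,
        # Minimal environment so a pre-existing DEMO_VAR cannot leak in.
        env={"PATH": os.environ.get("PATH", "/usr/bin:/bin")},
    )
    return result.stdout.strip()


plain = child_value("DEMO_VAR=hello")
exported = child_value("export DEMO_VAR=hello")
print(plain, exported)
```

Without `export`, the assignment stays local to the shell that made it, so the child process reports the variable as unset; with `export`, the child inherits it.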

topic/machine-learning/llama-index/env.standalone (1)

2-4: Consistent improvement: Export keywords properly applied.

The addition of export keywords is consistent with the env.azure file changes and correctly ensures environment variables are available to the demo scripts. The commented OPENAI_API_KEY line appropriately remains unchanged.

topic/machine-learning/llama-index/test.py (1)

25-40: LGTM! Test structure follows good practices.

The test properly loads configuration, executes the demo in-process, and verifies expected output.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (1)
.github/workflows/ml-llamaindex.yml (1)

49-51: Hard-coding a PR tag (pr-50) makes the workflow fragile

Building against a transient image tag tied to another PR means nightly and manual runs will suddenly break once that PR is rebased or the tag is removed. Consider defaulting to a released version (e.g. main or a semver tag) and overriding it only in the rare jobs that explicitly need the PR build.

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5240007 and efc2ed8.

📒 Files selected for processing (1)
  • .github/workflows/ml-llamaindex.yml (2 hunks)
🧰 Additional context used
🧠 Learnings (1)
.github/workflows/ml-llamaindex.yml (1)
Learnt from: amotl
PR: crate/cratedb-examples#937
File: topic/machine-learning/llm-langchain/requirements-dev.txt:2-2
Timestamp: 2025-05-12T20:10:38.614Z
Learning: The cratedb-toolkit package supports various extras including "io", "datasets", "influxdb", "mongodb", "testing", and many others.

@amotl force-pushed the llamaindex-mcp branch 3 times, most recently from b51777e to 26648cb, July 20, 2025 00:51
Comment on lines 49 to 51
cratedb-mcp-version: [
'pr-50',
]
Member Author

@amotl amotl Jul 20, 2025

This needs to be adjusted after the next release of cratedb-mcp.

Member Author

Resolved with a43ee4e.

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (5)
topic/machine-learning/llama-index/README.md (5)

63-66: Add language identifier to the venv code block

Markdown-lint flags this fence (MD040).
Adding the language (shell) improves syntax highlighting and keeps the doc lint-clean.

-```
+```shell
 python3 -m venv .venv
 source .venv/bin/activate

105-105: Specify a language for the “expected output” block

Same MD040 issue here. Use text (or console) to satisfy the linter and give readers proper formatting.

-```
+```text

149-149: Missing language identifier in second output block

Repeat of the previous comment for the MCP expected output section.

-```
+```text

81-89: Remove superfluous quotes around the connection string

.env parsers typically treat surrounding quotes as literal characters, breaking the URL.
Safer to drop them so copy-pasting “just works”.

-CRATEDB_SQLALCHEMY_URL="crate://<Database user name>:<Database password>@<Database host>:4200/?ssl=true"
+CRATEDB_SQLALCHEMY_URL=crate://<Database user name>:<Database password>@<Database host>:4200/?ssl=true

95-97: Minor grammar / consistency tweak for the link label

“Apostrophe-s” reads better and matches earlier “NL2SQL” terminology.

-### NLSQL
-[LlamaIndex' NLSQLTableQueryEngine] is a natural language SQL table query engine.
+### NL2SQL
+[LlamaIndex's NLSQLTableQueryEngine] is a natural-language SQL table query engine.
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b51777e and 195fd2c.

📒 Files selected for processing (4)
  • .github/workflows/ml-llamaindex.yml (2 hunks)
  • topic/machine-learning/llama-index/README.md (3 hunks)
  • topic/machine-learning/llama-index/demo_mcp.py (1 hunks)
  • topic/machine-learning/llama-index/demo_nlsql.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (3)
  • topic/machine-learning/llama-index/demo_nlsql.py
  • .github/workflows/ml-llamaindex.yml
  • topic/machine-learning/llama-index/demo_mcp.py
🧰 Additional context used
🧠 Learnings (1)
📓 Common learnings
Learnt from: amotl
PR: crate/cratedb-examples#1032
File: topic/machine-learning/llama-index/demo_nlsql.py:28-29
Timestamp: 2025-07-20T00:14:38.691Z
Learning: In demonstration and example code within the cratedb-examples repository, prefer simpler code without extensive error handling to maintain clarity and readability of the examples.
🪛 markdownlint-cli2 (0.17.2)
topic/machine-learning/llama-index/README.md

149-149: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (1)
topic/machine-learning/llama-index/README.md (1)

136-142: Clarify uvx prerequisite

uvx isn’t a standard tool; newcomers may hit “command not found”.
Please mention how to install it (it ships with the uv package manager, e.g. pipx install uv) or offer an alternative launcher.

Comment on lines 48 to 58
async def get_agent(self):
    return FunctionAgent(
        name="Agent",
        description="CrateDB text-to-SQL agent",
        llm=OpenAI(model="gpt-4o"),
        tools=await self.get_tools(),
        system_prompt=Instructions.full(),
    )
Member Author

@amotl amotl Jul 20, 2025

Better use the full instructions (general+mcp) including how to use the provided tools after this patch has been merged.

@amotl changed the title from "LlamaIndex: Add MCP server example" to "LlamaIndex: Add example using MCP server" Jul 20, 2025
@amotl marked this pull request as ready for review July 20, 2025 02:21
@amotl changed the title from "LlamaIndex: Add example using MCP server" to "LlamaIndex: Add example using MCP" Jul 20, 2025
@amotl amotl mentioned this pull request Jul 20, 2025
2 tasks
Member

@kneth kneth left a comment

LGTM

@amotl amotl merged commit a0212ae into main Jul 21, 2025
3 checks passed
@amotl amotl deleted the llamaindex-mcp branch July 21, 2025 19:13