🧠 QueryCraft-AI

Natural Language → SQL for MySQL (Streamlit + LangChain + Gemini)

Ask business questions in plain English and get instant answers from your MySQL database — no SQL required.
Built with LangChain’s SQLDatabaseChain, Google Gemini 2.5 Flash, semantic few-shot prompting (Chroma + HuggingFace), and a clean Streamlit UI.

✨ Key Features

🧠 LLM → SQL Conversion
Converts your natural-language questions into safe, schema-aware MySQL queries — avoids SELECT *, uses backticks, and respects table structure.
🗃️ Real Database Connectivity
Connects directly to MySQL using SQLAlchemy with full CRUD guardrails.
mysql+pymysql://username:password@localhost:3306/database_name
🧩 Few-Shot Semantic Guidance
Embeddings (HuggingFace + Chroma) select the most relevant Q/A examples to guide SQL generation.
📅 Date-Aware Reasoning
Automatically interprets “today,” “this week,” or “last month” using CURDATE() and SQL functions.
⚡ Streamlit UI + Caching
Smooth, stable interaction via @st.cache_resource — optimized for repeated queries.

🖥️ Screenshots

Add your images to the repo under ./screenshots/ and update file names below.

🧰 Tech Stack

Layer	Tools
🖥️ Frontend / App	Streamlit
🧠 LLM Orchestration	LangChain (`SQLDatabaseChain`)
💬 Model	Google Gemini 2.5 Flash (`langchain_google_genai`)
🔍 Semantic Selector	HuggingFace Embeddings + Chroma
🗃️ Database	MySQL (SQLAlchemy + PyMySQL)

🗂️ Project Structure

├── main.py # Streamlit UI (input, execution, display) ├── data_base.py # MySQL connection, embeddings, few-shot selector, SQL chain ├── few_shot.py # Example Q/A/SQL triplets for semantic prompting ├── .env # API keys and DB credentials (excluded from repo) └── screenshots/ # UI screenshots for README

🧪 Usage

Ask questions like:

“How many t-shirts do we have left for Nike, size XS, color white?”
“Total sales if we sell all L size Nike today after discounts — return only the number.”
“Top 5 brands by inventory count.”

Behind the scenes:

Your question is embedded → compared with few-shots → best example selected.
Gemini generates schema-aware SQL with backticks and joins.
SQL executes via MySQL connection → results shown instantly in Streamlit.

🧠 Prompting Strategy

The base prompt instructs the model to:

Avoid SELECT *; select only required columns

Wrap column names in backticks (column)

Use CURDATE() for “today” filters

Prefer LEFT JOIN when applying discounts and treat missing discounts as 0 via COALESCE

Follow a strict output format: Question: ... SQLQuery: ... SQLResult: ... Answer: ...

⚙️ Troubleshooting

Issue	Fix
Streamlit app collapses after one run	Don’t assign functions (e.g., `st.header = "..."`), always call them: `st.header("...")`
Re-runs too slow	Use `@st.cache_resource` for chain caching
Ambiguous queries	Add custom few-shots to `few_shot.py`
MySQL connection error	Check credentials, driver (`pymysql`), and database URL
No Google API access	Ensure `GOOGLE_API_KEY` is set and `langchain_google_genai` installed

🗺️ Roadmap

✅ Query checker + explainability with return_intermediate_steps=True

✅ “Show SQL” toggle + CSV export

🧩 Demo dataset for quick testing

🐳 Dockerfile for easy deployment

🔭 Future Scope: MCP, UTCP & AgentKit Integration

⚡ 1. Model Context Protocol (MCP)

Goal: Make QueryCraft-AI portable across LLM ecosystems by exposing standardized tools like:

get_schema() → returns metadata

run_sql(query) → executes read-only SQL safely

discounted_revenue(brand, size, date) → predefined helper

✅ Deliverables

/mcp/ package

Auth, audit logging, and rate-limit docs

⚡ 2. Universal Tool Calling Protocol (UTCP)

Goal: Direct low-latency calls to analytics endpoints (/sql/run, /metrics/revenue_today) without wrapper servers.

✅ Deliverables

/utcp/manifest.json

Benchmarks for MCP vs UTCP performance

⚡ 3. OpenAI Apps + AgentKit Integration

Goal: Bring QueryCraft-AI inside ChatGPT with Apps SDK and AgentKit — no local install needed.

Guided flow: “Question → SQL Plan → Execution → Answer”

Built-in guardrails, UI panel, and CSV export

AgentKit for step-by-step validation and evals

✅ Deliverables

/apps/ folder

App logs, evals, and “Show SQL” toggle

⚡ 4. Security & Compliance

Read-only DB roles & parameterized queries

Query timeouts and rate limiting

Log sanitization & PII scrubbing

Key rotation & explicit consent for data retention

Name		Name	Last commit message	Last commit date
Latest commit History 13 Commits
README.md		README.md
atliq_tshirts_discounts.sql		atliq_tshirts_discounts.sql
atliq_tshirts_t_shirts.sql		atliq_tshirts_t_shirts.sql
data		data
data_base.py		data_base.py
few_shot.py		few_shot.py
main.py		main.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

🧠 QueryCraft-AI

Natural Language → SQL for MySQL (Streamlit + LangChain + Gemini)

✨ Key Features

🖥️ Screenshots

🧰 Tech Stack

🗂️ Project Structure

🧪 Usage

🧠 Prompting Strategy

Follow a strict output format: Question: ... SQLQuery: ... SQLResult: ... Answer: ...

🗺️ Roadmap

🔭 Future Scope: MCP, UTCP & AgentKit Integration

⚡ 4. Security & Compliance

About

Uh oh!

Releases

Packages

Languages

karthikab5/QueryCraft-AI-

Folders and files

Latest commit

History

Repository files navigation

🧠 QueryCraft-AI

Natural Language → SQL for MySQL (Streamlit + LangChain + Gemini)

✨ Key Features

🖥️ Screenshots

🧰 Tech Stack

🗂️ Project Structure

🧪 Usage

🧠 Prompting Strategy

Follow a strict output format: Question: ... SQLQuery: ... SQLResult: ... Answer: ...

🗺️ Roadmap

🔭 Future Scope: MCP, UTCP & AgentKit Integration

⚡ 4. Security & Compliance

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages