A powerful PostgreSQL extension that generates SQL queries from natural language using state-of-the-art AI models from OpenAI and Anthropic.
- Natural Language to SQL: Convert plain English descriptions into valid PostgreSQL queries
- Available OpenAI Models:
You can use any valid OpenAI model name. Here is a comparison of common models:
| Model Name | Type | Cost | Pros | Cons |
|---|---|---|---|---|
gpt-5 |
Next Gen | High | Ultimate reasoning and capabilities | High cost, availability limited |
gpt-4o |
Flagship | Moderate | Best overall performance, fast | Higher cost than mini |
gpt-4o-mini |
Economy | Low | Very cheap, fast, good for simple tasks | Less capable reasoning than 4o |
- Automatic Schema Discovery: Analyzes your database schema to understand table structures and relationships
- Intelligent Query Generation: Creates optimized queries with appropriate JOINs, WHERE clauses, and LIMIT constraints
- Query Performance Analysis: Run EXPLAIN ANALYZE on queries and get AI-powered performance insights and optimization suggestions
- Configurable Response Formatting: Choose between plain SQL, enhanced text with explanations, or structured JSON responses
- Safety First: Built-in protections against dangerous operations and unauthorized system table access
- Flexible Configuration: File-based configuration with support for API keys, model selection, and response formatting
-
Prerequisites:
- PostgreSQL 12+ with development headers
- CMake 3.16+
- C++20 compatible compiler
- API key from OpenAI or Anthropic
-
Build and Install:
git clone --recurse-submodules https://github.com/benodiwal/pg_ai_query.git cd pg_ai_query mkdir build && cd build cmake .. make && sudo make install
-
Enable Extension:
CREATE EXTENSION pg_ai_query;
Create ~/.pg_ai.config:
[general]
log_level = "INFO"
enable_logging = false
[query]
enforce_limit = true
default_limit = 1000
[response]
show_explanation = true
show_warnings = true
show_suggested_visualization = false
use_formatted_response = false
[openai]
api_key = "your-openai-api-key-here"
default_model = "gpt-4o"
[anthropic]
api_key = "your-anthropic-api-key-here"
default_model = "claude-3-5-sonnet-20241022"You can also configure API keys using environment variables. These will override values in the configuration file:
OPENAI_API_KEY: API key for OpenAIANTHROPIC_API_KEY: API key for Anthropic
Example:
export OPENAI_API_KEY="sk-..."
export ANTHROPIC_API_KEY="sk-ant-..."Available Anthropic Models:
You can use any valid Anthropic model name. Here is a comparison of common models:
| Model Name | Type | Cost | Pros | Cons |
|---|---|---|---|---|
claude-4.5-opus |
Next Gen | High | Ultimate reasoning and capabilities | High cost, availability limited |
claude-sonnet-4-5-20250929 |
Flagship | Moderate | Top-tier reasoning and coding | - |
claude-3-haiku-20240307 |
Economy | Low | Extremely fast and cheap | Lower reasoning capability |
-- Generate simple queries
SELECT generate_query('show all customers');
-- Generate complex analytical queries
SELECT generate_query('monthly sales trend for the last year by category');
-- Generate queries with business logic
SELECT generate_query('customers who have not placed orders in the last 6 months');
-- Schema discovery functions
SELECT get_database_tables();
SELECT get_table_details('orders');
-- Explain and analyze query performance
SELECT explain_query('SELECT * FROM users WHERE created_at > NOW() - INTERVAL ''7 days''');
SELECT explain_query('SELECT u.name, COUNT(o.id) FROM users u LEFT JOIN orders o ON u.id = o.user_id GROUP BY u.id');Plain SQL (default):
SELECT * FROM customers WHERE created_at >= NOW() - INTERVAL '7 days' LIMIT 1000;Enhanced with explanations and warnings:
SELECT * FROM customers WHERE created_at >= NOW() - INTERVAL '7 days' LIMIT 1000;
-- Explanation:
-- Retrieves all customers who were created within the last 7 days
-- Warnings:
-- 1. Large dataset: Consider adding specific filters for better performanceJSON format (set use_formatted_response = true):
{
"query": "SELECT * FROM customers WHERE created_at >= NOW() - INTERVAL '7 days' LIMIT 1000;",
"success": true,
"explanation": "Retrieves all customers who were created within the last 7 days",
"warnings": ["Large dataset: Consider adding specific filters for better performance"],
"suggested_visualization": "table",
"row_limit_applied": true
}The explain_query function runs EXPLAIN ANALYZE on your queries and provides AI-generated performance insights:
-- Analyze a simple query
SELECT explain_query('SELECT * FROM users WHERE created_at > NOW() - INTERVAL ''7 days''');
-- Analyze complex queries with joins
SELECT explain_query('
SELECT u.username, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
GROUP BY u.id, u.username
ORDER BY order_count DESC
');
-- Use with specific API key and provider
SELECT explain_query(
'SELECT * FROM products WHERE price > 100 ORDER BY price DESC LIMIT 10',
'your-api-key-here',
'anthropic'
);Example Output:
Query Overview:
This query retrieves all users created within the last 7 days, selecting key user information including ID, username, email, creation date, last login, and account status.
Performance Summary:
- Overall Execution Time: 0.043 milliseconds
- Total Cost: 12.10 (PostgreSQL's relative cost estimate)
- Rows Processed: 4 rows returned from 4 rows examined
Execution Plan Analysis:
- Limit Node: Restricts output to 1000 rows (though only 4 rows exist)
- Sequential Scan: Reads through the entire users table
- Filter Applied: created_at > (NOW() - '7 days'::interval)
Performance Issues:
No significant performance issues detected for this small dataset.
Optimization Suggestions:
- For larger datasets, consider adding an index on created_at column
- If this query runs frequently, a partial index on created_at WHERE created_at > NOW() - INTERVAL '30 days' could be beneficial
Index Recommendations:
CREATE INDEX idx_users_created_at ON users(created_at);
Complete documentation is available at: https://benodiwal.github.io/pg_ai_query/
- Installation Guide
- Configuration Reference
- Response Formatting
- Usage Examples
- API Reference
- Troubleshooting
- System Table Protection: Blocks access to
information_schemaandpg_catalog - Model Availability: Accepts any valid model name string (verify availability with provider)
- Limited Scope: Only operates on user tables
- Configurable Limits: Built-in row limit enforcement
- API Key Security: Secure handling of API credentials
We welcome contributions! Please see CONTRIBUTING.md for guidelines on how to contribute to this project.
This project is licensed under the terms specified in the LICENSE file.
- Documentation: https://benodiwal.github.io/pg_ai_query/
- Issues: GitHub Issues
- Discussions: GitHub Discussions