# pyhdb-rs

Unlock your SAP HANA data for modern analytics and AI workflows.

Pure Rust toolkit — no SAP client installation required. Connect to HANA from anywhere, stream data directly into Polars, pandas, or DuckDB via Apache Arrow. Let AI assistants explore your schemas through a secure MCP interface.


## Why pyhdb-rs?

| Pain Point | Solution |
|---|---|
| SAP client installation & licensing | Pure Rust — no SAP dependencies, just `pip install` |
| Memory explodes with big datasets | Zero-copy Arrow streams data directly to DataFrames |
| Manual schema discovery for AI tools | MCP server gives Claude/Cline native HANA access |
| Complex ETL for analytics | One-liner to a Polars LazyFrame or pandas |

## Quick Start

### For Data Engineers & Scientists

```bash
pip install pyhdb_rs
```

```python
from pyhdb_rs import ConnectionBuilder
import polars as pl

# Connect and extract — data flows directly to Polars without copies
conn = ConnectionBuilder.from_url("hdbsql://user:pass@hana:30015").build()
df = pl.from_arrow(conn.execute_arrow("""
    SELECT material, plant, SUM(quantity) AS total
    FROM sapabap1.mard
    GROUP BY material, plant
"""))

# Lazy evaluation for memory-efficient analytics
result = df.lazy().filter(pl.col("total") > 1000).collect()
```

> [!TIP]
> Data exports via Apache Arrow — the universal columnar format. It integrates with the entire Arrow ecosystem out of the box.

## Arrow Ecosystem Compatibility

Stream HANA data directly into any Arrow-compatible tool:

| Category | Tools |
|---|---|
| DataFrames | Polars, pandas, Vaex, Dask |
| Query Engines | DuckDB, DataFusion, ClickHouse |
| ETL / Streaming | Apache Spark, Apache Flink, Kafka + Arrow |
| ML / AI | Ray, Hugging Face Datasets, PyTorch |
| Data Lakes | Delta Lake, Apache Iceberg, Lance |
| Serialization | Parquet, Arrow IPC/Feather |

Zero-copy data transfer means no serialization overhead between HANA and your analytics stack.

### For AI-Assisted Development

```bash
cargo install hdbconnect-mcp
```

Add to your Claude Desktop config:

```json
{
  "mcpServers": {
    "hana": {
      "command": "hdbconnect-mcp",
      "args": ["--url", "hdbsql://user:pass@hana:30015"]
    }
  }
}
```

Now ask Claude: "Show me the top 10 customers by revenue from VBAK/VBAP" — it queries HANA directly.

## Components

| Package | Use Case | Install |
|---|---|---|
| `pyhdb_rs` | Python analytics, ETL pipelines, ML feature extraction | `pip install pyhdb_rs` |
| `hdbconnect-mcp` | AI assistants, natural language queries, schema exploration | `cargo install hdbconnect-mcp` |
| `hdbconnect-arrow` | Rust applications, custom Arrow integrations | `cargo add hdbconnect-arrow` |

## Python Driver

Full DB-API 2.0 compliance with native Arrow integration. No SAP client required — works anywhere Python runs.

### Async ETL with connection pooling

```python
import asyncio

import polars as pl

from pyhdb_rs.aio import ConnectionPoolBuilder

pool = ConnectionPoolBuilder().url("hdbsql://user:pass@hana:30015").max_size(10).build()

async def extract_sales(region: str) -> pl.DataFrame:
    async with pool.acquire() as conn:
        # Interpolating region here assumes trusted input; prefer bind
        # parameters for anything user-supplied.
        return pl.from_arrow(await conn.execute_arrow(f"""
            SELECT vbeln, erdat, netwr FROM sapabap1.vbak
            WHERE region = '{region}' AND erdat >= '20240101'
        """))

# Parallel extraction across regions
results = await asyncio.gather(*[extract_sales(r) for r in ["US", "EU", "APAC"]])
```
### pandas integration

```python
import pyarrow as pa

reader = conn.execute_arrow("SELECT * FROM sapabap1.mara WHERE mtart = 'FERT'")
df = pa.RecordBatchReader.from_stream(reader).read_all().to_pandas()
```
### Streaming large datasets

```python
from pyhdb_rs import ArrowConfig

# Process 100M rows with constant memory
config = ArrowConfig(batch_size=50_000)
reader = conn.execute_arrow("SELECT * FROM sapabap1.mseg", config=config)

for batch in reader:
    process_batch(batch)  # Each batch: 50K rows as an Arrow RecordBatch
```

Full documentation: Python package README — TLS configuration, HA clusters, transaction control, error handling.

## MCP Server for AI Agents

Production-ready server that exposes SAP HANA to Claude, Cline, Cursor, and any MCP-compatible assistant.

Why it matters: Instead of copy-pasting schemas or writing boilerplate queries, let AI discover and query your data directly — with guardrails.

**Security by design:**

- Read-only mode blocks DML/DDL by default
- Row limits prevent data exfiltration
- OIDC/JWT authentication for enterprise deployments
- Per-user cache isolation in multi-tenant setups

Available tools:

| Tool | What AI can do |
|---|---|
| `list_tables` | Explore schemas: "What tables exist in SAPABAP1?" |
| `describe_table` | Understand structure: "Show me VBAK columns" |
| `execute_sql` | Query data: "Get top customers by revenue" |
| `ping` | Verify connectivity |

Full documentation: MCP server README — HTTP transport, Kubernetes deployment, Prometheus metrics.

## Architecture

```mermaid
flowchart TB
    subgraph apps["Your Applications"]
        direction LR
        subgraph python["Python Analytics"]
            p1["ETL Pipelines"]
            p2["ML Features"]
            p3["BI Dashboards"]
        end
        subgraph ai["AI Assistants"]
            m1["Claude Desktop"]
            m2["Cline / Cursor"]
            m3["Custom Agents"]
        end
    end

    subgraph core["Rust Core"]
        arrow["hdbconnect-arrow\nZero-copy HANA → Arrow"]
    end

    subgraph hana["SAP HANA"]
        db[("BW/4HANA\nS/4HANA\nHANA Cloud")]
    end

    apps --> core
    core --> hana

    style apps fill:#e3f2fd
    style core fill:#fff8e1
    style hana fill:#e8f5e9
```

## Requirements

- Python 3.12+ (for `pyhdb_rs`)
- Rust 1.88+ (for building from source or the MCP server)

## Resources

| Resource | Link |
|---|---|
| Python API | `python/README.md` |
| MCP Server | `crates/hdbconnect-mcp/README.md` |
| Arrow Integration | `crates/hdbconnect-arrow/README.md` |
| Changelog | `CHANGELOG.md` |
| Contributing | `CONTRIBUTING.md` |

## License

Dual-licensed under Apache-2.0 or MIT at your option.
