@joe-clickhouse

Summary

Adds support to the SQLAlchemy core API for:

  • [LEFT] ARRAY JOIN clauses
  • The FINAL modifier, for use with Replacing* tables

Examples

Assume this is our table:

| id | name | tags |
|----|------|------|
| 1 | Alice | ['python', 'sql', 'clickhouse'] |
| 2 | Bob | ['java', 'sql'] |
| 3 | Joe | ['python', 'javascript'] |
| 4 | Charlie | [] |

ARRAY JOIN

You can now build a query like this:

query = (
    select(test_table.c.id, test_table.c.name, test_table.c.tags)
    .select_from(array_join(test_table, test_table.c.tags))
    .order_by(test_table.c.id)
    .order_by(test_table.c.tags)
)

This will result in:

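| id | name | tags |
|----|------|------|
| 1 | Alice | "clickhouse" |
| 1 | Alice | "python" |
| 1 | Alice | "sql" |
| 2 | Bob | "java" |
| 2 | Bob | "sql" |
| 3 | Joe | "javascript" |
| 3 | Joe | "python" |

Charlie (id 4) does not appear because a plain ARRAY JOIN drops rows whose array is empty.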
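To make the row expansion concrete, here is a minimal plain-Python sketch of what ARRAY JOIN does to the example data. It is only an illustration of the semantics, not the code added by this PR, and the helper name `array_join_rows` is made up for the sketch.

```python
from typing import Any, Dict, List


def array_join_rows(rows: List[Dict[str, Any]], array_col: str) -> List[Dict[str, Any]]:
    """Emulate ARRAY JOIN: emit one row per array element.

    Rows whose array is empty produce no output at all.
    """
    result = []
    for row in rows:
        for element in row[array_col]:
            new_row = dict(row)           # copy the other columns unchanged
            new_row[array_col] = element  # the array column now holds one element
            result.append(new_row)
    return result


rows = [
    {"id": 1, "name": "Alice", "tags": ["python", "sql", "clickhouse"]},
    {"id": 2, "name": "Bob", "tags": ["java", "sql"]},
    {"id": 3, "name": "Joe", "tags": ["python", "javascript"]},
    {"id": 4, "name": "Charlie", "tags": []},
]

# Mirror the ORDER BY id, tags from the query above.
expanded = sorted(array_join_rows(rows, "tags"), key=lambda r: (r["id"], r["tags"]))

for r in expanded:
    print(r["id"], r["name"], r["tags"])
```

The inner loop never runs for an empty array, which is why Charlie is absent from the output.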

LEFT ARRAY JOIN

Similarly, a LEFT ARRAY JOIN:

query = (
    select(
        test_table.c.id,
        test_table.c.name,
        literal_column("tag"),  # Needed when using alias
    )
    .select_from(array_join(test_table, test_table.c.tags, alias="tag", is_left=True))
    .order_by(test_table.c.id)
    .order_by(literal_column("tag"))
)

This will result in:

| id | name | tag |
|----|------|-----|
| 1 | Alice | "clickhouse" |
| 1 | Alice | "python" |
| 1 | Alice | "sql" |
| 2 | Bob | "java" |
| 2 | Bob | "sql" |
| 3 | Joe | "javascript" |
| 3 | Joe | "python" |
| 4 | Charlie | "" |
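The difference from a plain ARRAY JOIN is the handling of empty arrays: LEFT ARRAY JOIN keeps the row and fills the joined column with the default value of the element type (an empty string for String). A minimal plain-Python sketch of that behavior follows. The helper `left_array_join_rows` and its `default` parameter are invented for illustration and are not part of this PR.

```python
from typing import Any, Dict, List


def left_array_join_rows(
    rows: List[Dict[str, Any]], array_col: str, alias: str, default: Any = ""
) -> List[Dict[str, Any]]:
    """Emulate LEFT ARRAY JOIN with an alias.

    Each array element becomes its own row under `alias`. A row with an
    empty array is kept once, with `alias` set to `default`.
    """
    result = []
    for row in rows:
        # Fall back to a single default element when the array is empty.
        elements = row[array_col] or [default]
        for element in elements:
            new_row = dict(row)
            new_row[alias] = element
            result.append(new_row)
    return result


rows = [
    {"id": 1, "name": "Alice", "tags": ["python", "sql", "clickhouse"]},
    {"id": 2, "name": "Bob", "tags": ["java", "sql"]},
    {"id": 3, "name": "Joe", "tags": ["python", "javascript"]},
    {"id": 4, "name": "Charlie", "tags": []},
]

# Mirror the ORDER BY id, tag from the query above.
left_expanded = sorted(
    left_array_join_rows(rows, "tags", alias="tag"),
    key=lambda r: (r["id"], r["tag"]),
)

for r in left_expanded:
    print(r["id"], r["name"], repr(r["tag"]))
```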

FINAL modifier

Assume we have this data in a ReplacingMergeTree table:

| id | name | value |
|----|------|-------|
| 1 | Alice | 100 |
| 1 | Alice | 200 |
| 2 | Bob | 300 |

Now you can use the .final() method to ensure FINAL is included in the generated SQL, forcing ClickHouse to fully merge the data at query time. Use it like this:

query = select(test_table).final().order_by(test_table.c.id)

This will generate SQL like:

SELECT id, name, value FROM test_table FINAL ORDER BY id
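As a rough illustration of the effect FINAL has on the example data, the sketch below collapses rows that share a key and keeps the last one. It assumes `id` is the table's sorting key and that the engine has no version column, so the most recently inserted row wins. Neither detail is stated above, and the helper `apply_final` is hypothetical.

```python
from typing import Any, Dict, List


def apply_final(rows: List[Dict[str, Any]], key: str) -> List[Dict[str, Any]]:
    """Emulate FINAL on a ReplacingMergeTree without a version column.

    Rows are assumed to be in insertion order. For each key value, only the
    last inserted row survives.
    """
    latest: Dict[Any, Dict[str, Any]] = {}
    for row in rows:
        latest[row[key]] = row  # later rows overwrite earlier ones
    return sorted(latest.values(), key=lambda r: r[key])


stored = [
    {"id": 1, "name": "Alice", "value": 100},
    {"id": 1, "name": "Alice", "value": 200},
    {"id": 2, "name": "Bob", "value": 300},
]

merged = apply_final(stored, key="id")

for r in merged:
    print(r["id"], r["name"], r["value"])
```

Without FINAL, a query may return both rows for id 1 until ClickHouse merges the underlying parts in the background.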

Checklist

Delete items not relevant to your PR:

  • Unit and integration tests covering the common scenarios were added
  • A human-readable description of the changes was provided to include in CHANGELOG

Closes #579

@joe-clickhouse joe-clickhouse linked an issue Oct 31, 2025 that may be closed by this pull request

Copilot AI left a comment

Pull Request Overview

Adds SQLAlchemy core API support for ClickHouse-specific features: ARRAY JOIN clauses and FINAL modifier for ReplacingMergeTree tables.

  • Introduces array_join() function and ArrayJoin class for handling ClickHouse ARRAY JOIN operations
  • Adds final() function and Select.final() method to apply FINAL modifier for ReplacingMergeTree tables
  • Extends SQL compiler to properly render these ClickHouse-specific SQL constructs

Reviewed Changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 1 comment.

| File | Description |
|------|-------------|
| clickhouse_connect/cc_sqlalchemy/sql/clauses.py | Implements ArrayJoin class and array_join() function for ARRAY JOIN support |
| clickhouse_connect/cc_sqlalchemy/sql/__init__.py | Adds final() function and monkey-patches Select class with .final() method |
| clickhouse_connect/cc_sqlalchemy/sql/compiler.py | Extends compiler to handle ArrayJoin and FINAL hint rendering |
| clickhouse_connect/cc_sqlalchemy/__init__.py | Exports new array_join, ArrayJoin, and final symbols |
| tests/integration_tests/test_sqlalchemy/test_array_join.py | Integration tests for ARRAY JOIN functionality |
| tests/integration_tests/test_sqlalchemy/test_ddl.py | Tests for FINAL modifier with error case validation |
| tests/integration_tests/test_sqlalchemy/test_select.py | Additional test for argMax aggregate function |
| tests/integration_tests/test_sqlalchemy/conftest.py | Helper functions for table creation in tests |
| README.md | Updates feature list to mention ARRAY JOIN and FINAL support |
| CHANGELOG.md | Documents the new feature addition |

assert rows[1].latest_name == "Bob_v2"
assert rows[1].latest_value == 250

test_table.drop(conn)
Copilot AI Oct 31, 2025

The explicit table drop is redundant because the test uses table_context, which already handles cleanup. Remove this line to avoid dropping the table twice.

Suggested change
test_table.drop(conn)

Development

Successfully merging this pull request may close these issues.

Support for array join and final in sqlalchemy
