Description
Problem
ClickHouse data types are case-sensitive and require PascalCase (e.g., String, Int32, Nullable). However, the sqlparser-rs library's Display implementation for DataType converts certain types to uppercase, causing UNKNOWN_TYPE errors when round-tripping SQL through ClickHouse.
Example
use sqlparser::dialect::ClickHouseDialect;
use sqlparser::parser::Parser;
let sql = "CREATE TABLE t (col Nullable(String))";
let dialect = ClickHouseDialect {};
let ast = Parser::parse_sql(&dialect, sql).unwrap();
// Round-trip: parse and convert back to string
let regenerated = ast[0].to_string();
// Result: "CREATE TABLE t (col Nullable(STRING))"
//                                       ^^^^^^ uppercase!

When this regenerated SQL is executed against ClickHouse, it fails with:
Code: 47. DB::Exception: Unknown type STRING. (UNKNOWN_TYPE)
Affected Types
| Type | Current Output | ClickHouse Requires |
|---|---|---|
| DataType::Int8 | INT8 | Int8 |
| DataType::Int64 | INT64 | Int64 |
| DataType::Float64 | FLOAT64 | Float64 |
| DataType::String | STRING | String |
| DataType::Bool | BOOL | Bool |
| DataType::Date | DATE | Date |
| DataType::Datetime | DATETIME | DateTime |
Types already correct (PascalCase):
- Int16, Int32, Int128, Int256
- UInt8, UInt16, UInt32, UInt64, UInt128, UInt256
- Float32
- Nullable, LowCardinality, Array, Map, Tuple, Nested
Root Cause
The Display trait implementation for DataType uses uppercase for type names (e.g., write!(f, "STRING")), which is standard for most SQL dialects but incorrect for ClickHouse.
The challenge is that Display doesn't have access to dialect context, so it can't conditionally format based on the active dialect.
The Problem with Display
Most users serialize SQL by calling Display on top-level AST types:
let ast = Parser::parse_sql(&dialect, sql).unwrap();
let regenerated = ast[0].to_string(); // Uses Display on Statement
// or
let regenerated = format!("{}", query); // Uses Display on Query

The Display trait signature doesn't allow passing context:
fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result

Proposed Solution
To solve this, we need dialect-aware serialization throughout the AST hierarchy:
Add to_sql(&dyn Dialect) to all AST types
- Add a to_sql(&dyn Dialect) -> String method to Statement, Query, Expr, ColumnDef, and other AST types
- Each type's implementation calls to_sql() on its children, propagating the dialect
- Keep existing Display implementations unchanged for backwards compatibility
// Example usage after fix:
let ast = Parser::parse_sql(&dialect, sql).unwrap();
let regenerated = ast[0].to_sql(&dialect); // Correct PascalCase for ClickHouse

Implementation Scope
The affected types include (non-exhaustive):
- Statement (top-level)
- Query, SetExpr, Select
- Expr (especially Cast, TryCast, SafeCast)
- ColumnDef, ColumnOption
- TableConstraint
- AlterTableOperation
- FunctionArg, FunctionArgExpr
This is a significant change but provides the cleanest API and maintains backwards compatibility.
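The proposal above can be illustrated with a minimal, self-contained sketch. It does not use sqlparser-rs. The `Dialect` trait, its `pascal_case_types` hook, the `Generic` and `ClickHouse` structs, and the tiny `DataType`/`ColumnDef` types are all hypothetical stand-ins invented for this example, so the real change would likely look different. The sketch only shows how a dialect argument could be threaded from a parent node to its children while `Display` keeps its current uppercase output.

```rust
use std::fmt;

// Hypothetical stand-in for a dialect trait; the hook below is an assumption,
// not an existing sqlparser-rs method.
trait Dialect {
    /// Whether type names must be emitted in ClickHouse-style PascalCase.
    fn pascal_case_types(&self) -> bool {
        false
    }
}

struct Generic;
struct ClickHouse;

impl Dialect for Generic {}
impl Dialect for ClickHouse {
    fn pascal_case_types(&self) -> bool {
        true
    }
}

// Toy subset of a data type enum, just enough to show nesting.
enum DataType {
    String,
    Int64,
    Nullable(Box<DataType>),
}

// Existing behaviour: Display has no dialect context and prints uppercase.
impl fmt::Display for DataType {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            DataType::String => write!(f, "STRING"),
            DataType::Int64 => write!(f, "INT64"),
            DataType::Nullable(inner) => write!(f, "Nullable({})", inner),
        }
    }
}

impl DataType {
    // Proposed behaviour: the dialect is passed down explicitly.
    fn to_sql(&self, dialect: &dyn Dialect) -> String {
        if !dialect.pascal_case_types() {
            return self.to_string(); // fall back to the unchanged Display output
        }
        match self {
            DataType::String => "String".to_string(),
            DataType::Int64 => "Int64".to_string(),
            DataType::Nullable(inner) => format!("Nullable({})", inner.to_sql(dialect)),
        }
    }
}

struct ColumnDef {
    name: String,
    data_type: DataType,
}

impl ColumnDef {
    // A parent node forwards the same dialect to its children.
    fn to_sql(&self, dialect: &dyn Dialect) -> String {
        format!("{} {}", self.name, self.data_type.to_sql(dialect))
    }
}

fn main() {
    let nullable = ColumnDef {
        name: "col".to_string(),
        data_type: DataType::Nullable(Box::new(DataType::String)),
    };
    let id = ColumnDef {
        name: "id".to_string(),
        data_type: DataType::Int64,
    };
    for col in [&nullable, &id] {
        println!("generic:    {}", col.to_sql(&Generic));
        println!("clickhouse: {}", col.to_sql(&ClickHouse));
    }
}
```

Because `to_sql` falls back to `to_string()` for dialects that do not opt in, existing callers of `Display` would see no change in this sketch.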
Workarounds
Currently, users must post-process the SQL string output to fix casing. See 514-labs/moosestack#3152 for an example regex-based workaround.
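As a rough illustration of what such post-processing can look like, here is a standard-library-only sketch. It is not the code from the linked workaround (which is regex-based); the function names `fix_type_casing`, `clickhouse_casing`, and `flush` are made up for this example. It rewrites only whole words listed in the Affected Types table and skips text inside single-quoted string literals.

```rust
/// Map an uppercase type name to ClickHouse casing (entries from the Affected Types table).
fn clickhouse_casing(word: &str) -> Option<&'static str> {
    match word {
        "INT8" => Some("Int8"),
        "INT64" => Some("Int64"),
        "FLOAT64" => Some("Float64"),
        "STRING" => Some("String"),
        "BOOL" => Some("Bool"),
        "DATE" => Some("Date"),
        "DATETIME" => Some("DateTime"),
        _ => None,
    }
}

/// Write the buffered word to `out`, replacing it if it is a known type name.
fn flush(word: &mut String, out: &mut String) {
    match clickhouse_casing(word.as_str()) {
        Some(fixed) => out.push_str(fixed),
        None => out.push_str(word.as_str()),
    }
    word.clear();
}

/// Hypothetical helper: fix type-name casing in serialized SQL.
fn fix_type_casing(sql: &str) -> String {
    let mut out = String::with_capacity(sql.len());
    let mut word = String::new();
    let mut in_literal = false;

    for ch in sql.chars() {
        if in_literal {
            // Copy string-literal contents verbatim until the closing quote.
            out.push(ch);
            if ch == '\'' {
                in_literal = false;
            }
            continue;
        }
        if ch.is_ascii_alphanumeric() || ch == '_' {
            // Accumulate a whole word so that e.g. STRINGS is not partially matched.
            word.push(ch);
            continue;
        }
        flush(&mut word, &mut out);
        if ch == '\'' {
            in_literal = true;
        }
        out.push(ch);
    }
    flush(&mut word, &mut out);
    out
}

fn main() {
    let regenerated = "CREATE TABLE t (col Nullable(STRING))";
    println!("{}", fix_type_casing(regenerated));
}
```

This approach is fragile, which is part of the motivation for a dialect-aware API: it also rewrites any unquoted identifier that happens to be spelled `DATE` or `STRING`, and it does not understand backslash-escaped quotes or quoted identifiers.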
References
- ClickHouse Data Types Documentation
- Related workaround: 514-labs/moosestack#3152