ClickHouse: Display implementation converts types to uppercase, causing UNKNOWN_TYPE errors #2153

@callicles

Problem

ClickHouse data types are case-sensitive and require PascalCase (e.g., String, Int32, Nullable). However, the sqlparser-rs library's Display implementation for DataType converts certain types to uppercase, causing UNKNOWN_TYPE errors when round-tripping SQL through ClickHouse.

Example

use sqlparser::dialect::ClickHouseDialect;
use sqlparser::parser::Parser;

fn main() {
    let sql = "CREATE TABLE t (col Nullable(String))";
    let dialect = ClickHouseDialect {};
    let ast = Parser::parse_sql(&dialect, sql).unwrap();

    // Round-trip: parse and convert back to string
    let regenerated = ast[0].to_string();
    // Result: "CREATE TABLE t (col Nullable(STRING))"
    //                                       ^^^^^^ uppercase!
    println!("{}", regenerated);
}

When this regenerated SQL is executed against ClickHouse, it fails with:

Code: 47. DB::Exception: Unknown type STRING. (UNKNOWN_TYPE)

Affected Types

Type                 Current Output   ClickHouse Requires
DataType::Int8       INT8             Int8
DataType::Int64      INT64            Int64
DataType::Float64    FLOAT64          Float64
DataType::String     STRING           String
DataType::Bool       BOOL             Bool
DataType::Date       DATE             Date
DataType::Datetime   DATETIME         DateTime

Types already correct (PascalCase):

  • Int16, Int32, Int128, Int256
  • UInt8, UInt16, UInt32, UInt64, UInt128, UInt256
  • Float32
  • Nullable, LowCardinality, Array, Map, Tuple, Nested

Root Cause

The Display trait implementation for DataType writes the type names listed above in uppercase (e.g., write!(f, "STRING")), which is standard for most SQL dialects but incorrect for ClickHouse.

The challenge is that Display doesn't have access to dialect context, so it can't conditionally format based on the active dialect.
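The root cause can be reproduced in miniature without sqlparser. The sketch below uses a hypothetical, simplified DataType enum (an assumption for illustration, not the library's real definition) whose Display impl hardcodes uppercase names the way the issue describes. Because fmt only receives a Formatter, the impl has no way to ask which dialect is active.

```rust
use std::fmt;

// Hypothetical, simplified stand-in for sqlparser's DataType.
// The real enum has many more variants and different payloads.
enum DataType {
    String,
    Int64,
    Nullable(Box<DataType>),
}

impl fmt::Display for DataType {
    // Only `self` and the formatter are available here: no dialect.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Hardcoded uppercase, fine for most dialects.
            DataType::String => write!(f, "STRING"),
            DataType::Int64 => write!(f, "INT64"),
            // The wrapper keeps its casing, but the inner type
            // is formatted through the same uppercase path.
            DataType::Nullable(inner) => write!(f, "Nullable({})", inner),
        }
    }
}

fn main() {
    let ty = DataType::Nullable(Box::new(DataType::String));
    println!("{}", ty); // prints "Nullable(STRING)"
    println!("{}", DataType::Int64); // prints "INT64"
}
```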

The Problem with Display

Most users serialize SQL by calling Display on top-level AST types:

let ast = Parser::parse_sql(&dialect, sql).unwrap();
let regenerated = ast[0].to_string();  // Uses Display on Statement
// or
let regenerated = format!("{}", query);  // Uses Display on Query

The Display trait signature doesn't allow passing context:

fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result

Proposed Solution

To solve this, we need dialect-aware serialization throughout the AST hierarchy:

Add to_sql(&dyn Dialect) to all AST types

  • Add to_sql(&dyn Dialect) -> String method to Statement, Query, Expr, ColumnDef, and other AST types
  • Each type's implementation calls to_sql() on its children, propagating the dialect
  • Keep existing Display implementations unchanged for backwards compatibility

// Example usage after fix:
let ast = Parser::parse_sql(&dialect, sql).unwrap();
let regenerated = ast[0].to_sql(&dialect);  // Correct PascalCase for ClickHouse
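To make the propagation idea concrete, here is a minimal self-contained sketch of how to_sql could thread the dialect from a statement down to its data types. Everything in it is an assumption for illustration: the Dialect trait, its type_name hook, and the tiny Statement, ColumnDef, and DataType types are hypothetical stand-ins, not sqlparser's real API.

```rust
// Hypothetical stand-ins for illustration; these are NOT sqlparser's real types or traits.
trait Dialect {
    // Assumed hook: by default, type names are spelled in uppercase.
    fn type_name(&self, pascal: &str) -> String {
        pascal.to_uppercase()
    }
}

struct GenericDialect;
struct ClickHouseDialect;

impl Dialect for GenericDialect {}

impl Dialect for ClickHouseDialect {
    // ClickHouse keeps the PascalCase spelling as-is.
    fn type_name(&self, pascal: &str) -> String {
        pascal.to_string()
    }
}

enum DataType {
    String,
    Int64,
    Nullable(Box<DataType>),
}

struct ColumnDef {
    name: String,
    data_type: DataType,
}

enum Statement {
    CreateTable { name: String, columns: Vec<ColumnDef> },
}

impl DataType {
    fn to_sql(&self, dialect: &dyn Dialect) -> String {
        match self {
            DataType::String => dialect.type_name("String"),
            DataType::Int64 => dialect.type_name("Int64"),
            // Recurse so the inner type sees the same dialect.
            DataType::Nullable(inner) => format!("Nullable({})", inner.to_sql(dialect)),
        }
    }
}

impl ColumnDef {
    fn to_sql(&self, dialect: &dyn Dialect) -> String {
        format!("{} {}", self.name, self.data_type.to_sql(dialect))
    }
}

impl Statement {
    // Each level passes the dialect on to its children.
    fn to_sql(&self, dialect: &dyn Dialect) -> String {
        match self {
            Statement::CreateTable { name, columns } => {
                let cols: Vec<String> = columns.iter().map(|c| c.to_sql(dialect)).collect();
                format!("CREATE TABLE {} ({})", name, cols.join(", "))
            }
        }
    }
}

fn sample() -> Statement {
    Statement::CreateTable {
        name: "t".to_string(),
        columns: vec![
            ColumnDef {
                name: "col".to_string(),
                data_type: DataType::Nullable(Box::new(DataType::String)),
            },
            ColumnDef { name: "n".to_string(), data_type: DataType::Int64 },
        ],
    }
}

fn main() {
    let stmt = sample();
    println!("{}", stmt.to_sql(&ClickHouseDialect));
    println!("{}", stmt.to_sql(&GenericDialect));
}
```

In this sketch the casing decision lives in one dialect method, so the AST types only need to forward the dialect reference. A real implementation might prefer writing into a formatter over allocating a String at every level.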

Implementation Scope

The affected types include (non-exhaustive):

  • Statement (top-level)
  • Query, SetExpr, Select
  • Expr (especially Cast, TryCast, SafeCast)
  • ColumnDef, ColumnOption
  • TableConstraint
  • AlterTableOperation
  • FunctionArg, FunctionArgExpr

This is a significant change but provides the cleanest API and maintains backwards compatibility.

Workarounds

Currently, users must post-process the SQL string output to fix casing. See 514-labs/moosestack#3152 for an example regex-based workaround.
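For readers who want a sense of what such post-processing involves, below is a rough std-only sketch. It is not the code from the linked workaround; the function names fix_type_casing and clickhouse_type_name are hypothetical, and the mapping covers only the types listed under Affected Types. It is deliberately naive: it rewrites any whole token that matches, so it would also alter string literals or identifiers that happen to be spelled like an uppercase type name.

```rust
// Hypothetical helper: maps the uppercase output listed under
// "Affected Types" to the casing ClickHouse expects.
fn clickhouse_type_name(token: &str) -> Option<&'static str> {
    match token {
        "INT8" => Some("Int8"),
        "INT64" => Some("Int64"),
        "FLOAT64" => Some("Float64"),
        "STRING" => Some("String"),
        "BOOL" => Some("Bool"),
        "DATE" => Some("Date"),
        "DATETIME" => Some("DateTime"),
        _ => None,
    }
}

// Writes the pending token to `out`, fixing its casing if it is a known type.
fn flush(token: &mut String, out: &mut String) {
    if token.is_empty() {
        return;
    }
    match clickhouse_type_name(token.as_str()) {
        Some(fixed) => out.push_str(fixed),
        None => out.push_str(token.as_str()),
    }
    token.clear();
}

// Naive post-processor: splits on non-identifier characters and rewrites
// whole tokens only, so "STRINGS" or "MY_DATE" are left untouched.
// Caveat: it does not understand quoting, so literals are not protected.
fn fix_type_casing(sql: &str) -> String {
    let mut out = String::with_capacity(sql.len());
    let mut token = String::new();
    for ch in sql.chars() {
        if ch.is_ascii_alphanumeric() || ch == '_' {
            token.push(ch);
        } else {
            flush(&mut token, &mut out);
            out.push(ch);
        }
    }
    flush(&mut token, &mut out);
    out
}

fn main() {
    let regenerated = "CREATE TABLE t (col Nullable(STRING))";
    println!("{}", fix_type_casing(regenerated));
}
```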
