Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support LowCardinality series datatype in getMetadata().getColumns() function #442

Closed
geniusjoe opened this issue Aug 10, 2023 · 1 comment

Comments

@geniusjoe
Copy link

Environment

  • OS version:
~ # uname -a                                                          
Linux VM-152-173-centos 5.4.203-1-tlinux4-0011.1 #1 SMP Tue Apr 18 16:47:40 CST 2023 x86_64 x86_64 x86_64 GNU/Linux
  • JDK version:
~ # java -version                                                
openjdk version "1.8.0_292"
OpenJDK Runtime Environment (Tencent Kona 8.0.6-internal) (build 1.8.0_292-b2)
OpenJDK 64-Bit Server VM (Tencent Kona 8.0.6-internal) (build 25.292-b2, mixed mode, sharing)
  • ClickHouse Server version:
~ # clickhouse-client --host <host-addr> --port <port>             
ClickHouse client version 22.2.2.1.
Connecting to <host-addr> as user default.
Connected to ClickHouse server version 22.1.3 revision 54455.
  • ClickHouse Native JDBC version:
    c119dd5
  • (Optional) Spark version: N/A
  • (Optional) Other components' version: N/A

Error logs

I encountered below exception when my code execute connection.getMetaData().getColumns jdbc function.

Exception in thread "main" java.sql.SQLException: Unknown data type: LowCardinality
	at com.github.housepower.misc.Validate.isTrue(Validate.java:41)
	at com.github.housepower.data.DataTypeFactory.get(DataTypeFactory.java:104)
	at com.github.housepower.data.DataTypeFactory.get(DataTypeFactory.java:66)
	at com.github.housepower.jdbc.ClickHouseDatabaseMetadata.getColumns(ClickHouseDatabaseMetadata.java:912)
	at examples.SimpleQuery.main(SimpleQuery.java:30)

Steps to reproduce

Create clickhouse table

Execute below ddl statement in default database to create a table with a LowCardinality(String) datatype.

CREATE TABLE test1 
( 
    user_id UInt32, 
    message LowCardinality(String)
) 
ENGINE = MergeTree() 
order by tuple()

Insert data

Insert some rows into this table

INSERT INTO test1 VALUES (1, 'a') ;
INSERT INTO test1 VALUES (2, 'b') ;
INSERT INTO test1 VALUES (3, 'c') ;

Create java demo

Create a java demo to test getColumn() function, and above exception occurs when running the demo.

public class SimpleQuery {

    public static void main(String[] args) throws Exception {
        try (Connection connection = DriverManager.getConnection("jdbc:clickhouse://<ip>:<port>?client_name=ck-example")) {
            ResultSet columnsMetadata = connection.getMetaData().getColumns(null, "default", "test1", null);
            while (columnsMetadata.next()) {
                System.out.println("DATA_TYPE" + "\t" + JDBCType.valueOf(columnsMetadata.getInt("DATA_TYPE")).getName());
                System.out.println("COLUMN_NAME" + "\t" + columnsMetadata.getString("COLUMN_NAME"));
                System.out.println("DECIMAL_DIGITS" + "\t" + columnsMetadata.getString("DECIMAL_DIGITS"));
                System.out.println("IS_NULLABLE" + "\t" + columnsMetadata.getString("IS_NULLABLE"));
                System.out.println("IS_AUTOINCREMENT" + "\t" + columnsMetadata.getString("IS_AUTOINCREMENT"));
                System.out.println("COLUMN_DEF" + "\t" + columnsMetadata.getString("COLUMN_DEF"));
                System.out.println("COLUMN_SIZE" + "\t" + columnsMetadata.getString("COLUMN_SIZE"));
                System.out.println();
            }
        }
    }
}

Other descriptions

I think this exception is because SQLLexer.bareWord() function will parse first alphabetic word as datatype. When given a data type LowCardinality(String), then for-loop iteration in the bareWord function will encounter first left bracket and return. So that LowCardinality will be regarded as dataTypeName in IDataType<?, ?> get function, and cannot be parsed as known data type.

214e194d9020 :) SELECT database, table, name, type, default_kind as default_type, default_expression FROM system.columns where database LIKE '%default%' and table like '%test1%'; 
                

SELECT
    database,
    table,
    name,
    type,
    default_kind AS default_type,
    default_expression
FROM system.columns
WHERE (database LIKE '%default%') AND (table LIKE '%test1%')

Query id: 38741dd5-06d0-42aa-8199-970d498a0f92

┌─database─┬─table─┬─name────┬─type───────────────────┬─default_type─┬─default_expression─┐
│ default  │ test1 │ user_id │ UInt32                 │              │                    │
│ default  │ test1 │ message │ LowCardinality(String) │              │                    │
└──────────┴───────┴─────────┴────────────────────────┴──────────────┴────────────────────┘

2 rows in set. Elapsed: 0.002 sec. 

I think we can add a LowCardinality param check in public static IDataType<?, ?> get function. We can check if current parsed dataTypeName is LowCardinality. If so, we can make pos cursor forward and return inner data type recursively. Examples below:

    public static IDataType<?, ?> get(SQLLexer lexer, NativeContext.ServerContext serverContext) throws SQLException {
        String dataTypeName = String.valueOf(lexer.bareWord());

        if (dataTypeName.equalsIgnoreCase("Tuple")) {
           ...
        } else if (dataTypeName.equalsIgnoreCase("Array")) {
           ...
        }  ...
        else if (dataTypeName.equalsIgnoreCase("LowCardinality")){
            Validate.isTrue(lexer.character() == '(');
            IDataType<?, ?> nestedDataType = DataTypeFactory.get(lexer, serverContext);
            Validate.isTrue(lexer.character() == ')');
            return nestedDataType;
        } else {
            IDataType<?, ?> dataType = dataTypes.get(dataTypeName.toLowerCase(Locale.ROOT));
            Validate.isTrue(dataType != null, "Unknown data type: " + dataTypeName);
            return dataType;
        }
    }

Are there any suggestions? And if this issue is reasonable, I can also make a PR.

@geniusjoe
Copy link
Author

@pan3793
Hello pan. Although most of this metadata related code was written by sundy-li, I feel that he is currently quite busy. So, could you please help me review this portion of the code?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants