Support IBM Db2 #2598
Conversation
Pull Request Overview
This PR adds comprehensive support for IBM Db2 (versions 11 and 12) to the JDBC module. Key changes include integrating DB2-specific connection properties and configuration constants, adapting SQL type definitions and error handling in the JDBC admin methods, and extending integration tests and CI workflows to cover DB2.
Reviewed Changes
Copilot reviewed 40 out of 41 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| core/src/main/java/com/scalar/db/storage/jdbc/JdbcUtils.java | Adds application of DB2 connection properties to data source initialization. |
| core/src/main/java/com/scalar/db/storage/jdbc/JdbcConfig.java | Introduces DB2-specific configuration constants and validation for key column sizes and time column defaults. |
| core/src/main/java/com/scalar/db/storage/jdbc/JdbcAdmin.java | Updates text and key type handling and adds SQL warning handling for improved index creation logic. |
| core/src/integration-test/... | Updates several integration tests to conditionally adjust behavior for DB2, including minimal value handling and index creation tests. |
| .github/workflows/ci.yaml | Adds new CI jobs to run integration tests on DB2 12.1 and 11.5 containers. |
Files not reviewed (1)
- core/build.gradle: Language not supported
Comments suppressed due to low confidence (1)
core/src/main/java/com/scalar/db/storage/jdbc/JdbcAdmin.java:676
- The variable 'keyDataTYpe' contains a typographical error; consider renaming it to 'keyDataType' for consistency.
String keyDataTYpe = rdbEngine.getDataTypeForKey(scalarDbColumnType);
| columns.put("col01", "SMALLINT"); | ||
| columns.put("col02", "INT"); | ||
| columns.put("col03", "BIGINT"); | ||
| columns.put("col04", "REAL"); | ||
| columns.put("col05", "FLOAT(24)"); // Maps to REAL if precision <=24 | ||
| columns.put("col06", "DOUBLE"); | ||
| columns.put("col07", "FLOAT"); | ||
| columns.put("col08", "FLOAT(25)"); // Maps to DOUBLE if precision => 25 | ||
| columns.put("col09", "CHAR(3)"); | ||
| columns.put("col10", "VARCHAR(512)"); | ||
| columns.put("col11", "CLOB"); | ||
| columns.put("col12", "GRAPHIC(3)"); | ||
| columns.put("col13", "VARGRAPHIC(512)"); | ||
| columns.put("col14", "DBCLOB(5)"); | ||
| columns.put("col15", "NCHAR(3)"); | ||
| columns.put("col16", "NVARCHAR(512)"); | ||
| columns.put("col17", "NCLOB(512)"); | ||
| columns.put("col18", "BINARY(5)"); | ||
| columns.put("col19", "VARBINARY(512)"); | ||
| columns.put("col20", "BLOB(1024)"); | ||
| columns.put("col21", "CHAR(5) FOR BIT DATA"); | ||
| columns.put("col22", "VARCHAR(512) FOR BIT DATA"); | ||
| columns.put("col23", "DATE"); | ||
| columns.put("col24", "TIME"); | ||
| columns.put("col25", "TIMESTAMP(6)"); // override to TIME | ||
| columns.put("col26", "TIMESTAMP(3)"); | ||
| columns.put("col27", "TIMESTAMP(3)"); // override to TIMESTAMPTZ | ||
| columns.put("col28", "BOOLEAN"); |
This lists all the types supported by the import table feature.
| "geometry", | ||
| "geography"); | ||
| static final List<String> UNSUPPORTED_DATA_TYPES_DB2 = | ||
| Arrays.asList("DECIMAL", "DECFLOAT", "XML"); |
These types cannot be imported with the import table feature.
```java
if (JdbcTestUtils.isDb2(rdbEngine)) {
  if (dataType == DataType.FLOAT) {
    return JdbcTestUtils.getMinDb2FloatValue(columnName);
  }
  if (dataType == DataType.DOUBLE) {
    return JdbcTestUtils.getMinDb2DoubleValue(columnName);
  }
}
```
Db2's FLOAT and DOUBLE data types do not support the subnormal number ranges below:
- FLOAT: the intervals [-1.17E-38, -1.4E-45] and [1.4E-45, 1.17E-38]
- DOUBLE: the intervals [-2.22E-308, -4.9E-324] and [4.9E-324, 2.22E-308]

In Java, these subnormal range boundaries are denoted by the constants Float.MIN_NORMAL and Float.MIN_VALUE, and Double.MIN_NORMAL and Double.MIN_VALUE.
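For illustration, here is a minimal sketch of the idea behind these helpers (the real JdbcTestUtils methods take a column name and may differ in detail; this is an assumption about their intent, not the PR's code):

```java
// Sketch only: the smallest magnitudes that Db2 REAL/DOUBLE columns can store are the
// smallest *normal* IEEE 754 values, so tests must avoid Java's subnormal minimums.
final class Db2MinValueSketch {
  static float minPositiveFloatForDb2() {
    // Float.MIN_VALUE (1.4E-45) is subnormal and not representable in Db2 REAL;
    // Float.MIN_NORMAL (~1.17E-38) is the smallest usable positive value.
    return Float.MIN_NORMAL;
  }

  static double minPositiveDoubleForDb2() {
    // Double.MIN_VALUE (4.9E-324) is subnormal; Db2 DOUBLE bottoms out around 2.22E-308.
    return Double.MIN_NORMAL;
  }
}
```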
```diff
       hasDifferentClusteringOrders(metadata), schema, table, metadata, ifNotExists);
   try {
-    execute(connection, stmts);
+    execute(connection, stmts, ifNotExists ? null : rdbEngine::throwIfDuplicatedIndexWarning);
```
If the index already exists, the create index ... SQL statement will return a warning instead of an error, which differs from other storages.
So we need to check the warning and throw it as a SQLException if a duplicate index exists.
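For context, here is a minimal sketch of how such a warning check can be wired into statement execution (the handler type and method shape are assumptions for illustration, not the PR's exact signatures):

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.SQLWarning;
import java.sql.Statement;

final class WarningCheckSketch {
  @FunctionalInterface
  interface SqlWarningHandler {
    void handle(SQLWarning warning) throws SQLException;
  }

  // Executes the statement, then walks the warning chain; a handler such as
  // rdbEngine::throwIfDuplicatedIndexWarning can promote a Db2 "duplicate index"
  // warning to a SQLException, matching the behavior of the other storages.
  static void execute(Connection connection, String sql, SqlWarningHandler handler)
      throws SQLException {
    try (Statement stmt = connection.createStatement()) {
      stmt.execute(sql);
      if (handler != null) {
        for (SQLWarning w = stmt.getWarnings(); w != null; w = w.getNextWarning()) {
          handler.handle(w);
        }
      }
    }
  }
}
```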
> SQL statement will return a warning instead of an error

This issue happens only with DB2, right? My slight concern is that this change might affect existing production services that use ScalarDB's other JDBC adapters, where negligible warnings have been emitted silently. How about passing the lambda only when using the DB2 adapter?
Yes, only Db2 works like that.
To make sure we are on the same page, I'd like to explain the current implementation in detail.
An exception is thrown only with Db2, when the duplicate-index error code is matched.
scalardb/core/src/main/java/com/scalar/db/storage/jdbc/RdbEngineDb2.java
Lines 281 to 292 in adcf4d5
```java
public void throwIfDuplicatedIndexWarning(SQLWarning warning) throws SQLException {
  assert warning != null;
  if (warning.getErrorCode() == 605) {
    // Only a warning is raised when the index already exists but no exception is thrown by the
    // driver.
    // To match the behavior of other storages, we throw an exception in this case.
    //
    // SQL error code 605: The index name already exists
    throw new SQLException(
        warning.getMessage(), warning.getSQLState(), warning.getErrorCode(), warning.getCause());
  }
}
```
But for other storages, the warning won't be parsed.
scalardb/core/src/main/java/com/scalar/db/storage/jdbc/RdbEngineStrategy.java
Lines 220 to 228 in adcf4d5
```java
/**
 * Throws an exception if the given SQLWarning is a duplicate index warning.
 *
 * @param warning the SQLWarning to check
 * @throws SQLException if the warning is a duplicate index warning
 */
default void throwIfDuplicatedIndexWarning(SQLWarning warning) throws SQLException {
  // Do nothing
}
```
Regarding your concern:

> My slight concern is this change might affect existing production services that use ScalarDB's other JDBC adapters, where negligible warnings have been emitted silently

This implementation should not impact other storages, even if warnings are emitted. The warnings will indeed be looped through, but never processed.
What do you think?
> But for other storages, the warning won't be parsed.

Ah sorry, I overlooked the default one. So the current implementation of this PR looks good!
| PREFIX + "mysql.variable_key_column_size"; | ||
| public static final String ORACLE_VARIABLE_KEY_COLUMN_SIZE = | ||
| PREFIX + "oracle.variable_key_column_size"; | ||
| public static final String DB2_VARIABLE_KEY_COLUMN_SIZE = PREFIX + "db2.variable_key_column_size"; |
We need to reduce the length of key and index columns, as we do for Oracle and MySQL. The length is configurable with the scalar.db.jdbc.db2.variable_key_column_size configuration.
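For example, a configuration along these lines could be used (the JDBC URL and the size value below are illustrative placeholders, not defaults from this PR):

```java
import java.util.Properties;

// Illustrative ScalarDB configuration for Db2.
public class Db2ConfigExample {
  public static Properties build() {
    Properties props = new Properties();
    props.setProperty("scalar.db.storage", "jdbc");
    props.setProperty("scalar.db.contact_points", "jdbc:db2://localhost:50000/scalardb");
    // Shrink variable-length key/index columns, analogous to the MySQL and Oracle settings.
    props.setProperty("scalar.db.jdbc.db2.variable_key_column_size", "64");
    return props;
  }
}
```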
```java
public static final String DB2_TIME_COLUMN_DEFAULT_DATE_COMPONENT =
    PREFIX + "db2.time_column.default_date_component";
```
The ScalarDB TIME type supports microsecond precision, but the Db2 TIME type does not support fractional seconds. For this reason, I chose to map the ScalarDB TIME type to the Db2 TIMESTAMP type by default. The drawback is that a fixed date component is stored with the TIME value, which increases the data volume.
Similarly to Oracle, the fixed date component used when storing values in the TIMESTAMP column is configurable with scalar.db.jdbc.db2.time_column.default_date_component.
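To illustrate the idea, here is a sketch only; the helper names below are assumptions, not the PR's actual code:

```java
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

// Sketch: a TIME value is stored in a Db2 TIMESTAMP(6) column by attaching a fixed,
// configurable date component, preserving microsecond precision; the date is dropped on read.
final class TimeColumnSketch {
  static LocalDateTime encode(LocalTime time, LocalDate configuredDateComponent) {
    return LocalDateTime.of(configuredDateComponent, time);
  }

  static LocalTime decode(LocalDateTime stored) {
    return stored.toLocalTime();
  }
}
```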
| for (Entry<String, String> entry : rdbEngine.getConnectionProperties().entrySet()) { | ||
| dataSource.addConnectionProperty(entry.getKey(), entry.getValue()); | ||
| } |
Now we add custom JDBC connection properties, if set in the RdbEngine, to all JDBC data sources used by:
- DistributedStorage => added previously
- DistributedStorageAdmin => newly added
- TableMetadataManager => newly added

This is motivated by the fact that the Db2 driver requires the configuration property retrieveMessagesFromServerOnGetMessage to print a human-friendly error message rather than only the error codes.
```java
case BIGINT:
  return "BIGINT";
case BLOB:
  return "VARBINARY(32672)";
case BOOLEAN:
  return "BOOLEAN";
case FLOAT:
  return "REAL";
case DOUBLE:
  return "DOUBLE";
case INT:
  return "INT";
case TEXT:
  return "VARCHAR(32672)";
case DATE:
  return "DATE";
case TIME:
  return "TIMESTAMP(6)";
case TIMESTAMP:
case TIMESTAMPTZ:
  return "TIMESTAMP(3)";
```
This lists the default type mappings for the ScalarDB data types.
```java
@Override
public Map<String, String> getConnectionProperties() {
  ImmutableMap.Builder<String, String> props = new ImmutableMap.Builder<>();
```
Add several JDBC driver properties; see the code comments for the reasoning.
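For reference, a sketch of the kind of property returned here (retrieveMessagesFromServerOnGetMessage is a documented IBM Db2 JDBC driver property; the exact set of properties in the PR may differ):

```java
import com.google.common.collect.ImmutableMap;
import java.util.Map;

final class Db2ConnectionPropertiesSketch {
  // With this driver property enabled, SQLException/SQLWarning messages retrieved via
  // getMessage() include the human-readable server text instead of only error codes.
  static Map<String, String> connectionProperties() {
    return ImmutableMap.<String, String>builder()
        .put("retrieveMessagesFromServerOnGetMessage", "true")
        .build();
  }
}
```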
```java
case BIGINT:
  return "BIGINT";
case BLOB:
  return "VARBINARY(32672)";
```
32KB seems small to me. Is it possible to use DB2's BLOB type, which has a maximum size of 2GB?
It is possible to use BLOB, but this data type cannot be indexed.
And when creating an index on an existing table, altering a BLOB column to VARBINARY is not allowed.
Similarly, for Oracle, I assume that's why it was decided to map the ScalarDB BLOB type to Oracle RAW(2000), even though Oracle also provides a BLOB type that allows a much bigger size than RAW.
So it all boils down to whether there is a use case for creating an index on a BLOB column; if not, then we can choose to use BLOB types for the default mapping. And to stay consistent, we should do the same for Oracle.
I see. That's an interesting limitation.
From the user perspective of ScalarDB, especially as the developer of SSR (which serializes a group of many items and stores it in a BLOB column), the 32KB limitation could be a blocker for that use case.
How about using DB2 VARBINARY for partition/clustering keys and secondary indexes, and DB2 BLOB for regular columns? As for cases where a new secondary index is added on a DB2 BLOB column, I think it should fail by validation, though.
If it works well, migrating the Oracle RAW type to the BLOB type while keeping compatibility might be an option in the future?
Indeed, the approach you're suggesting makes sense too. It only forbids creating an index on a BLOB column of an existing table.
@feeblefakie @brfrn169 From a product requirement perspective, what do you think?
Let me clarify my understanding as follows:
First of all, as noted in the following documentation, some adapters already use different data types for partition key, clustering key, and secondary index key columns compared to regular columns:
That said, the reason we don’t use Oracle BLOB or DB2 BLOB for regular columns is due to an issue that arises when creating indexes on those columns.
When creating an index on a column that uses a different data type for secondary index keys, we need to alter the column’s data type before creating the index. However, Oracle and DB2 do not allow altering the data type from BLOB to REAL or VARBINARY, and vice versa.
Due to this limitation, we decided to use REAL or VARBINARY for regular columns as well, to avoid type-alteration issues during index creation.
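For illustration, here is the two-step flow that index creation on an existing table implies (table, column, and index names are hypothetical; this is not the PR's code):

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

final class AlterThenIndexSketch {
  // Step 1: alter the column to the key/index data type; step 2: create the index.
  // Db2 and Oracle reject altering BLOB to VARBINARY/RAW (and vice versa), which is why
  // a BLOB-typed regular column could not be indexed later through this path.
  static void createIndexOnExistingColumn(Connection connection) throws SQLException {
    try (Statement stmt = connection.createStatement()) {
      stmt.execute("ALTER TABLE ns.tbl ALTER COLUMN col SET DATA TYPE VARBINARY(128)");
      stmt.execute("CREATE INDEX idx_ns_tbl_col ON ns.tbl (col)");
    }
  }
}
```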
I think there’s one more point to consider.
We currently have two ways to create an index:
1. Using the createTable() API by specifying secondary indexes
2. Using the createIndex() API on columns of an existing table
For case 2, we need to alter the data type as mentioned above. But for case 1, we don’t need to alter the data type because the column is created as an index key column from the beginning.
Given that, I think we have two possible options:
1. As @Torch3333 mentioned, we forbid creating an index on a BLOB column of an existing table. In this case, users can create an index through the createTable() API by specifying secondary indexes, but not via the createIndex() API.
2. As we discussed in the meeting, we disallow creating indexes on BLOB columns altogether so that we can consistently use Oracle BLOB or DB2 BLOB.
Personally, I feel option 2 is more consistent and would cause less confusion for users. What do you think?
I discussed it with Toshi, and we want to go with option 2 if you are OK with it.
I think non-indexable BLOB data is acceptable for most users.
For this PR, I think we can merge it without the change.
(And let's work on expanding the BLOB data size and adding some validation in another PR, if that's OK.)
Alright, thanks.
Let's work on changes in a separate PR.
Overall, looking good.
I left one minor naming suggestion.
I'm also concerned about the blob type size.
Left a few comments. Please take a look when you have time!
Pull Request Overview
This PR adds support for IBM Db2 LUW (versions 11 and 12) by introducing a new JDBC adapter and updating configurations, integration tests, and CI workflows. Key changes include:
- Adding Db2-specific connection properties in the data source initialization routines.
- Extending JdbcConfig to include Db2 default settings and variable key column size validations.
- Updating integration tests, admin utilities, and CI workflows to accommodate Db2-specific behaviors.
Reviewed Changes
Copilot reviewed 40 out of 41 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| core/src/main/java/com/scalar/db/storage/jdbc/JdbcUtils.java | Updated to include Db2 connection properties during data source initialization. |
| core/src/main/java/com/scalar/db/storage/jdbc/JdbcConfig.java | Added Db2 configuration constants and validation logic with updated comments. |
| core/src/main/java/com/scalar/db/storage/jdbc/JdbcAdmin.java | Modified creation of metadata tables and index creation logic to support Db2. |
| Various integration test files | Introduced Db2-specific handling in test utility methods and adjusted test conditions, including index creation tests. |
| .github/workflows/ci.yaml | Added CI jobs for Db2 11.5 and Db2 12.1 integration tests. |
Files not reviewed (1)
- core/build.gradle: Language not supported
LGTM! Thank you!
LGTM! Thank you!
LGTM, thank you!
# Conflicts:
#	core/src/integration-test/java/com/scalar/db/storage/jdbc/ConsensusCommitAdminIntegrationTestWithJdbcDatabase.java
#	core/src/integration-test/java/com/scalar/db/storage/jdbc/JdbcAdminIntegrationTest.java
#	core/src/integration-test/java/com/scalar/db/storage/jdbc/JdbcAdminTestUtils.java
#	core/src/integration-test/java/com/scalar/db/storage/jdbc/SingleCrudOperationTransactionAdminIntegrationTestWithJdbcDatabase.java
#	core/src/integration-test/java/com/scalar/db/transaction/jdbc/JdbcTransactionAdminIntegrationTest.java
#	core/src/main/java/com/scalar/db/storage/jdbc/JdbcAdmin.java
#	core/src/main/java/com/scalar/db/storage/jdbc/JdbcConfig.java
#	core/src/test/java/com/scalar/db/storage/jdbc/JdbcAdminTestBase.java
#	integration-test/src/main/java/com/scalar/db/api/DistributedStorageAdminRepairTableIntegrationTestBase.java
Description
This adds support for IBM Db2 LUW, versions 11 and 12.
Related issues and/or PRs
Changes made
Add a new JDBC adapter for Db2. Please look at my review comments below for the highlights of the changes.
Checklist
Additional notes (optional)
N/A
Release notes
Add support for IBM Db2 LUW 11 and 12