Conversation

@Torch3333
Contributor

@Torch3333 Torch3333 commented Apr 16, 2025

Description

This adds support for IBM Db2 LUW versions 11 and 12.

Related issues and/or PRs

Changes made

Add a new JDBC adapter for Db2. Please look at my review comments below for the highlights of the changes.

Checklist

The following is a best-effort checklist. If any items in this checklist are not applicable to this PR or are dependent on other, unmerged PRs, please still mark the checkboxes after you have read and understood each item.

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • I have considered whether similar issues could occur in other products, components, or modules if this PR is for bug fixes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

N/A

Release notes

Add support for IBM Db2 LUW 11 and 12

@Torch3333 Torch3333 self-assigned this Apr 16, 2025
@Torch3333 Torch3333 changed the title from "Db2 adapter" to "Support IBM Db2" Apr 16, 2025
@Torch3333 Torch3333 added the enhancement New feature or request label Apr 30, 2025
@feeblefakie feeblefakie requested a review from Copilot April 30, 2025 01:44
Contributor

Copilot AI left a comment


Pull Request Overview

This PR adds comprehensive support for IBM Db2 (versions 11 and 12) to the JDBC module. Key changes include integrating DB2-specific connection properties and configuration constants, adapting SQL type definitions and error handling in the JDBC admin methods, and extending integration tests and CI workflows to cover DB2.

Reviewed Changes

Copilot reviewed 40 out of 41 changed files in this pull request and generated 1 comment.

Summary per file:
  • core/src/main/java/com/scalar/db/storage/jdbc/JdbcUtils.java: Adds application of DB2 connection properties to data source initialization.
  • core/src/main/java/com/scalar/db/storage/jdbc/JdbcConfig.java: Introduces DB2-specific configuration constants and validation for key column sizes and time column defaults.
  • core/src/main/java/com/scalar/db/storage/jdbc/JdbcAdmin.java: Updates text and key type handling and adds SQL warning handling for improved index creation logic.
  • core/src/integration-test/...: Updates several integration tests to conditionally adjust behavior for DB2, including minimal value handling and index creation tests.
  • .github/workflows/ci.yaml: Adds new CI jobs to run integration tests on DB2 12.1 and 11.5 containers.
Files not reviewed (1)
  • core/build.gradle: Language not supported
Comments suppressed due to low confidence (1)

core/src/main/java/com/scalar/db/storage/jdbc/JdbcAdmin.java:676

  • The variable 'keyDataTYpe' contains a typographical error; consider renaming it to 'keyDataType' for consistency.
String keyDataTYpe = rdbEngine.getDataTypeForKey(scalarDbColumnType);

Comment on lines +466 to +493
columns.put("col01", "SMALLINT");
columns.put("col02", "INT");
columns.put("col03", "BIGINT");
columns.put("col04", "REAL");
columns.put("col05", "FLOAT(24)"); // Maps to REAL if precision <=24
columns.put("col06", "DOUBLE");
columns.put("col07", "FLOAT");
columns.put("col08", "FLOAT(25)"); // Maps to DOUBLE if precision => 25
columns.put("col09", "CHAR(3)");
columns.put("col10", "VARCHAR(512)");
columns.put("col11", "CLOB");
columns.put("col12", "GRAPHIC(3)");
columns.put("col13", "VARGRAPHIC(512)");
columns.put("col14", "DBCLOB(5)");
columns.put("col15", "NCHAR(3)");
columns.put("col16", "NVARCHAR(512)");
columns.put("col17", "NCLOB(512)");
columns.put("col18", "BINARY(5)");
columns.put("col19", "VARBINARY(512)");
columns.put("col20", "BLOB(1024)");
columns.put("col21", "CHAR(5) FOR BIT DATA");
columns.put("col22", "VARCHAR(512) FOR BIT DATA");
columns.put("col23", "DATE");
columns.put("col24", "TIME");
columns.put("col25", "TIMESTAMP(6)"); // override to TIME
columns.put("col26", "TIMESTAMP(3)");
columns.put("col27", "TIMESTAMP(3)"); // override to TIMESTAMPTZ
columns.put("col28", "BOOLEAN");
Contributor Author


This lists all the types supported by the import table feature.

"geometry",
"geography");
static final List<String> UNSUPPORTED_DATA_TYPES_DB2 =
    Arrays.asList("DECIMAL", "DECFLOAT", "XML");
Contributor Author


These types cannot be imported with the import table feature.

Comment on lines +44 to +51
if (JdbcTestUtils.isDb2(rdbEngine)) {
  if (dataType == DataType.FLOAT) {
    return JdbcTestUtils.getMinDb2FloatValue(columnName);
  }
  if (dataType == DataType.DOUBLE) {
    return JdbcTestUtils.getMinDb2DoubleValue(columnName);
  }
}
Contributor Author


Db2 FLOAT and DOUBLE data types do not support the subnormal number ranges below:

  • FLOAT: the intervals [-1.17E-38; -1.4E-45] and [1.4E-45; 1.17E-38]
  • DOUBLE: the intervals [-2.22E-308; -4.9E-324] and [4.9E-324; 2.22E-308]

In Java, the subnormal range boundaries are denoted by the constants Float.MIN_NORMAL and Float.MIN_VALUE, and Double.MIN_NORMAL and Double.MIN_VALUE.
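
For reference, these boundaries can be checked directly in Java; the values in the comments are the exact IEEE 754 constants:

public class SubnormalBoundaries {
  public static void main(String[] args) {
    // Everything between MIN_VALUE and MIN_NORMAL (exclusive) is subnormal
    // and cannot be stored in a Db2 FLOAT or DOUBLE column.
    System.out.println(Float.MIN_NORMAL);   // 1.17549435E-38
    System.out.println(Float.MIN_VALUE);    // 1.4E-45 (smallest subnormal)
    System.out.println(Double.MIN_NORMAL);  // 2.2250738585072014E-308
    System.out.println(Double.MIN_VALUE);   // 4.9E-324 (smallest subnormal)
  }
}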

hasDifferentClusteringOrders(metadata), schema, table, metadata, ifNotExists);
try {
execute(connection, stmts);
execute(connection, stmts, ifNotExists ? null : rdbEngine::throwIfDuplicatedIndexWarning);
Contributor Author

@Torch3333 Torch3333 Apr 30, 2025


If the index already exists, the create index ... SQL statement returns a warning instead of an error, which differs from other storages.
So we need to check the warning and rethrow it as an SQLException if a duplicate index exists.
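
For context, here is a minimal sketch of how such a check can be wired into statement execution. The SqlWarningHandler interface and the method names are hypothetical stand-ins for the PR's actual execute() signature and the rdbEngine::throwIfDuplicatedIndexWarning method reference:

import java.sql.Connection;
import java.sql.SQLException;
import java.sql.SQLWarning;
import java.sql.Statement;

@FunctionalInterface
interface SqlWarningHandler {
  void throwIfMatch(SQLWarning warning) throws SQLException;
}

final class WarningAwareExecutor {
  private WarningAwareExecutor() {}

  // Executes the statement and, if a handler is given, walks the warning chain.
  // With Db2, a duplicate index surfaces as an SQLWarning rather than an
  // SQLException, so the handler can rethrow it to match other storages.
  static void execute(Connection connection, String sql, SqlWarningHandler handler)
      throws SQLException {
    try (Statement statement = connection.createStatement()) {
      statement.execute(sql);
      if (handler == null) {
        return;
      }
      for (SQLWarning w = statement.getWarnings(); w != null; w = w.getNextWarning()) {
        handler.throwIfMatch(w);
      }
    }
  }
}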

Contributor


SQL statement will return a warning instead of an error

This issue happens only with DB2, right? My slight concern is this change might affect existing production services that use ScalarDB's other JDBC adapters, where negligible warnings have been emitted silently. How about passing the lambda only when using DB2 adapter?

Copy link
Contributor Author

@Torch3333 Torch3333 May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, only Db2 works like that.
To make sure we are on the same page, I'd like to explain the current implementation in detail.

An exception is thrown only with Db2, when the duplicate index error code is matched.

public void throwIfDuplicatedIndexWarning(SQLWarning warning) throws SQLException {
  assert warning != null;
  if (warning.getErrorCode() == 605) {
    // Only a warning is raised when the index already exists, but no exception is thrown by
    // the driver. To match the behavior of other storages, we throw an exception in this case.
    //
    // SQL error code 605: The index name already exists
    throw new SQLException(
        warning.getMessage(), warning.getSQLState(), warning.getErrorCode(), warning.getCause());
  }
}

But for other storages, the warning won't be processed.

/**
 * Throws an exception if the given SQLWarning is a duplicate index warning.
 *
 * @param warning the SQLWarning to check
 * @throws SQLException if the warning is a duplicate index warning
 */
default void throwIfDuplicatedIndexWarning(SQLWarning warning) throws SQLException {
  // Do nothing
}

Regarding your concern,

My slight concern is this change might affect existing production services that use ScalarDB's other JDBC adapters, where negligible warnings have been emitted silently

This implementation should not impact other storages, even if warnings are emitted. Warnings will indeed be looped through, but never processed.
What do you think?

Contributor


But for other storages, the warning won't be processed.

Ah sorry, I overlooked the default one. So, the current implementation of this PR looks good!

PREFIX + "mysql.variable_key_column_size";
public static final String ORACLE_VARIABLE_KEY_COLUMN_SIZE =
PREFIX + "oracle.variable_key_column_size";
public static final String DB2_VARIABLE_KEY_COLUMN_SIZE = PREFIX + "db2.variable_key_column_size";
Contributor Author


We need to reduce the length of key and index columns, as we do for Oracle and MySQL. The length is configurable with the scalar.db.jdbc.db2.variable_key_column_size configuration.
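
For example, a JDBC configuration could look like the following (the size value is hypothetical; the default and valid range are defined in JdbcConfig):

scalar.db.storage=jdbc
scalar.db.contact_points=jdbc:db2://localhost:50000/scalardb
scalar.db.jdbc.db2.variable_key_column_size=64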

Comment on lines +59 to +60
public static final String DB2_TIME_COLUMN_DEFAULT_DATE_COMPONENT =
PREFIX + "db2.time_column.default_date_component";
Contributor Author


The ScalarDB TIME type supports microsecond precision, but the Db2 TIME type does not support fractional seconds. For this reason, I chose to map the ScalarDB TIME type to the Db2 TIMESTAMP type by default. The drawback is that a fixed date component is stored with the TIME value, which increases the data volume.

Similarly to Oracle, the fixed date component used when storing values in the TIMESTAMP type is configurable with scalar.db.jdbc.db2.time_column.default_date_component.
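
As a sketch of the resulting mapping (1970-01-01 is an assumed date component here; the actual default is defined in JdbcConfig):

import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;

public class TimeColumnMapping {
  public static void main(String[] args) {
    LocalDate dateComponent = LocalDate.of(1970, 1, 1);      // assumed default
    LocalTime time = LocalTime.of(12, 34, 56, 123_456_000);  // microseconds kept
    // The TIMESTAMP value actually stored for the ScalarDB TIME value
    LocalDateTime stored = LocalDateTime.of(dateComponent, time);
    System.out.println(stored); // 1970-01-01T12:34:56.123456
  }
}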

Comment on lines +95 to +97
for (Entry<String, String> entry : rdbEngine.getConnectionProperties().entrySet()) {
  dataSource.addConnectionProperty(entry.getKey(), entry.getValue());
}
Contributor Author

@Torch3333 Torch3333 Apr 30, 2025


Now we add custom JDBC connection properties, if set in the RdbEngine, to all JDBC data sources used by:

  • DistributedStorage => added previously
  • DistributedStorageAdmin => newly added
  • TableMetadataManager => newly added

This is motivated by the fact that the Db2 driver requires the retrieveMessagesFromServerOnGetMessage configuration property in order to print human-friendly error messages rather than only error codes.
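
A minimal sketch of the wiring, assuming commons-dbcp2's BasicDataSource as used in JdbcUtils (the URL and property set are illustrative; the engine's getConnectionProperties() may define more):

import com.google.common.collect.ImmutableMap;
import java.util.Map;
import org.apache.commons.dbcp2.BasicDataSource;

public class ConnectionPropsSketch {
  public static void main(String[] args) {
    // Makes the Db2 driver fetch human-readable error text from the server
    // instead of returning only error codes.
    Map<String, String> props =
        ImmutableMap.of("retrieveMessagesFromServerOnGetMessage", "true");

    BasicDataSource dataSource = new BasicDataSource();
    dataSource.setUrl("jdbc:db2://localhost:50000/scalardb"); // hypothetical URL
    for (Map.Entry<String, String> entry : props.entrySet()) {
      dataSource.addConnectionProperty(entry.getKey(), entry.getValue());
    }
  }
}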

Comment on lines +65 to +85
case BIGINT:
  return "BIGINT";
case BLOB:
  return "VARBINARY(32672)";
case BOOLEAN:
  return "BOOLEAN";
case FLOAT:
  return "REAL";
case DOUBLE:
  return "DOUBLE";
case INT:
  return "INT";
case TEXT:
  return "VARCHAR(32672)";
case DATE:
  return "DATE";
case TIME:
  return "TIMESTAMP(6)";
case TIMESTAMP:
case TIMESTAMPTZ:
  return "TIMESTAMP(3)";
Contributor Author


This lists the default type mapping for the ScalarDB data types.
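
For illustration only (hypothetical table and key size; the actual DDL is generated by JdbcAdmin), a table created with these defaults would look roughly like:

CREATE TABLE ns.tbl (
  pk VARCHAR(128) NOT NULL, -- TEXT used as a key column (reduced, configurable size)
  col_bigint BIGINT,
  col_blob VARBINARY(32672),
  col_bool BOOLEAN,
  col_double DOUBLE,
  col_float REAL,
  col_int INT,
  col_text VARCHAR(32672),
  col_date DATE,
  col_time TIMESTAMP(6),
  col_ts TIMESTAMP(3),
  PRIMARY KEY (pk)
)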


@Override
public Map<String, String> getConnectionProperties() {
  ImmutableMap.Builder<String, String> props = new ImmutableMap.Builder<>();
Contributor Author

@Torch3333 Torch3333 Apr 30, 2025


Add several JDBC driver properties; see the code comments for the reasoning.

@Torch3333 Torch3333 marked this pull request as ready for review April 30, 2025 02:34
@Torch3333 Torch3333 requested review from a team, brfrn169, feeblefakie and komamitsu and removed request for a team April 30, 2025 02:34
case BIGINT:
  return "BIGINT";
case BLOB:
  return "VARBINARY(32672)";
Contributor


32 KB seems small to me. Is it possible to use DB2's BLOB type, which has a maximum size of 2 GB?

https://www.ibm.com/docs/en/db2/11.5.x?topic=sql-xml-limits

Contributor Author

@Torch3333 Torch3333 May 2, 2025


It is possible to use BLOB, but this data type cannot be indexed.
And for the purpose of creating an index on an existing table, altering a BLOB column to VARBINARY is not allowed.

Similarly, for Oracle, I assume that's why it was decided to map ScalarDB BLOB to Oracle RAW(2000), while Oracle also provides a BLOB type that allows a much bigger size than RAW.

So, it all boils down to whether there is a use case for creating an index on a BLOB column; if not, then we can choose to use BLOB types for the default mapping. And to stay consistent, we should do the same for Oracle.

Contributor


I see. That's an interesting limitation.

From the user perspective of ScalarDB, especially as the developer of SSR (which serializes a group of many items and stores it in a BLOB column), the 32KB limitation could be a blocker for that use case.

How about using DB2 VARBINARY for partition/clustering keys and secondary indexes, and DB2 BLOB for regular columns? As for cases where a new secondary index is added on a DB2 BLOB column, I think it should fail validation, though.

If it works well, migrating the Oracle RAW type to the BLOB type while keeping compatibility might be an option in the future?

Contributor Author


Indeed, the approach you're suggesting makes sense too. It only forbids creating an index on a BLOB column of an existing table.

@feeblefakie @brfrn169 From a product requirement perspective, what do you think?

Collaborator

@brfrn169 brfrn169 May 7, 2025


Let me clarify my understanding as follows:

First of all, as noted in the following documentation, some adapters already use different data types for partition key, clustering key, and secondary index key columns compared to regular columns:

https://scalardb.scalar-labs.com/docs/latest/schema-loader/#data-type-mapping-between-scalardb-and-other-databases

That said, the reason we don’t use Oracle BLOB or DB2 BLOB for regular columns is due to an issue that arises when creating indexes on those columns.

When creating an index on a column that uses a different data type for secondary index keys, we need to alter the column's data type before creating the index. However, Oracle and DB2 do not allow altering the data type from BLOB to RAW or VARBINARY, and vice versa.

Due to this limitation, we decided to use RAW or VARBINARY for regular columns as well, to avoid type-alteration issues during index creation.
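
For instance (illustrative statement only), an alteration like the following is rejected by Db2 when the column was created as BLOB, and likewise in the other direction:

ALTER TABLE ns.tbl ALTER COLUMN col SET DATA TYPE VARBINARY(32672)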

I think there’s one more point to consider.

We currently have two ways to create an index:

  1. Using the createTable() API by specifying secondary indexes
  2. Using the createIndex() API on columns of an existing table

For case 2, we need to alter the data type as mentioned above. But for case 1, we don’t need to alter the data type because the column is created as an index key column from the beginning.

Given that, I think we have two possible options:

  1. As @Torch3333 mentioned, we forbid creating an index on a BLOB column of an existing table. In this case, users can create an index through the createTable() API by specifying secondary indexes, but not via the createIndex() API.
  2. As we discussed in the meeting, we disallow creating indexes on BLOB columns altogether so that we can consistently use Oracle BLOB or DB2 BLOB.

Personally, I feel option 2 is more consistent and would cause less confusion for users. What do you think?

Contributor


I discussed it with Toshi, and we want to go with option 2 if you are OK.
I think non-indexable Blob data is acceptable for most users.
For this PR, I think we can merge it without the change.
(And, let's work on expanding the Blob data size and adding some validation in another PR if OK.)

Contributor Author


Alright, thanks.
Let's work on changes in a separate PR.

Contributor

@feeblefakie feeblefakie left a comment


Overall, looking good.
I left one minor naming suggestion.
I'm also concerned about the blob type size.

Collaborator

@brfrn169 brfrn169 left a comment


Left a few comments. Please take a look when you have time!

Contributor

Copilot AI left a comment


Pull Request Overview

This PR adds support for IBM Db2 LUW versions 11 and 12 by introducing a new JDBC adapter and updating configurations, integration tests, and CI workflows. Key changes include:

  • Adding Db2-specific connection properties in the data source initialization routines.
  • Extending JdbcConfig to include Db2 default settings and variable key column size validations.
  • Updating integration tests, admin utilities, and CI workflows to accommodate Db2-specific behaviors.

Reviewed Changes

Copilot reviewed 40 out of 41 changed files in this pull request and generated no comments.

Summary per file:
  • core/src/main/java/com/scalar/db/storage/jdbc/JdbcUtils.java: Updated to include Db2 connection properties during data source initialization.
  • core/src/main/java/com/scalar/db/storage/jdbc/JdbcConfig.java: Added Db2 configuration constants and validation logic with updated comments.
  • core/src/main/java/com/scalar/db/storage/jdbc/JdbcAdmin.java: Modified creation of metadata tables and index creation logic to support Db2.
  • Various integration test files: Introduced Db2-specific handling in test utility methods and adjusted test conditions, including index creation tests.
  • .github/workflows/ci.yaml: Added CI jobs for Db2 11.5 and Db2 12.1 integration tests.
Files not reviewed (1)
  • core/build.gradle: Language not supported

Contributor

@feeblefakie feeblefakie left a comment


LGTM! Thank you!

Collaborator

@brfrn169 brfrn169 left a comment


LGTM! Thank you!

Contributor

@komamitsu komamitsu left a comment


LGTM, thank you!

@brfrn169 brfrn169 merged commit f14554c into master May 8, 2025
55 checks passed
@brfrn169 brfrn169 deleted the db2_adapter branch May 8, 2025 06:58
Torch3333 added a commit that referenced this pull request May 8, 2025
# Conflicts:
#	core/src/integration-test/java/com/scalar/db/storage/jdbc/ConsensusCommitAdminIntegrationTestWithJdbcDatabase.java
#	core/src/integration-test/java/com/scalar/db/storage/jdbc/JdbcAdminIntegrationTest.java
#	core/src/integration-test/java/com/scalar/db/storage/jdbc/JdbcAdminTestUtils.java
#	core/src/integration-test/java/com/scalar/db/storage/jdbc/SingleCrudOperationTransactionAdminIntegrationTestWithJdbcDatabase.java
#	core/src/integration-test/java/com/scalar/db/transaction/jdbc/JdbcTransactionAdminIntegrationTest.java
#	core/src/main/java/com/scalar/db/storage/jdbc/JdbcAdmin.java
#	core/src/main/java/com/scalar/db/storage/jdbc/JdbcConfig.java
#	core/src/test/java/com/scalar/db/storage/jdbc/JdbcAdminTestBase.java
#	integration-test/src/main/java/com/scalar/db/api/DistributedStorageAdminRepairTableIntegrationTestBase.java