Conversation


@Torch3333 Torch3333 commented Oct 17, 2025

Description

This changes the default mapping for the Oracle BLOB column to a data type that allows storing up to 2GB. The former data type, RAW(2000), allowed a maximum data size of 2000 bytes. This change addresses a concern similar to the one raised for Db2, where the original data type only allowed storing up to 2KB, which was deemed too restrictive.

The new Oracle data type is named BLOB and its maximum capacity is 2GB, which coincides with the maximum capacity that the byte[] backing a BlobColumn object can hold.

Because of limitations of the Oracle BLOB data type, a ScalarDB BLOB column can no longer be used as any of the following (an illustrative example follows the list):

  • primary key
  • clustering key
  • secondary index
  • a column of a where condition in a cross-partition scan, i.e., ScanAll, operation
  • an ordering column in a cross-partition scan
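
For illustration, a schema like the following, which uses a BLOB column as the partition key, is expected to be rejected on Oracle after this change. This is only a sketch using ScalarDB's TableMetadata builder; the table and column names are made up.

import com.scalar.db.api.TableMetadata;
import com.scalar.db.io.DataType;

class BlobPartitionKeyExample {
  // Hypothetical schema: after this change, a BLOB partition key is rejected on Oracle.
  static TableMetadata blobPartitionKeyMetadata() {
    return TableMetadata.newBuilder()
        .addColumn("image", DataType.BLOB) // the large binary payload
        .addColumn("name", DataType.TEXT)
        .addPartitionKey("image") // no longer allowed on Oracle
        .build();
  }
}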

Related issues and/or PRs

Changes made

  • Changed the default mapping for the ScalarDB BLOB column in Oracle from RAW(2000) to BLOB
  • Disabled some integration tests that can no longer be run on a BLOB column because of the limitations mentioned above
  • Added an integration test for putting and reading a 100MB BLOB for Oracle and Db2

Checklist

The following is a best-effort checklist. If any items in this checklist are not applicable to this PR or are dependent on other, unmerged PRs, please still mark the checkboxes after you have read and understood each item.

  • I have commented my code, particularly in hard-to-understand areas.
  • I have updated the documentation to reflect the changes.
  • I have considered whether similar issues could occur in other products, components, or modules if this PR is for bug fixes.
  • Any remaining open issues linked to this PR are documented and up-to-date (Jira, GitHub, etc.).
  • Tests (unit, integration, etc.) have been added for the changes.
  • My changes generate no new warnings.
  • Any dependent changes in other PRs have been merged and published.

Additional notes (optional)

Provide any additional information or notes that may be relevant to the reviewers or stakeholders.

Release notes

When using Oracle, the default data type for the ScalarDB BLOB column is changed from RAW(2000) to BLOB to allow storing data up to 2GB. This introduces new limitations: a BLOB column can no longer be used as a partition key, clustering key, or secondary index, nor as a condition or ordering column in a cross-partition scan, i.e., ScanAll, operation.

@Torch3333 Torch3333 self-assigned this Oct 17, 2025
Comment on lines +101 to +104
@EnabledIf("isDb2OrOracle")
@ParameterizedTest()
@MethodSource("provideBlobSizes")
public void put_largeBlobData_ShouldWorkCorrectly(int blobSize, String humanReadableBlobSize)
@Torch3333 Torch3333 Oct 20, 2025


I added an integration test that verifies that inserting and then getting a 100MB blob works correctly.
This test fails for other storages for various reasons that require time to investigate. Since this is not of the highest priority, I will look into those issues at a later time.
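
For reference, the shape of the @MethodSource provider might look like the sketch below; the sizes and labels are illustrative, not the actual values used by the test.

import java.util.stream.Stream;
import org.junit.jupiter.params.provider.Arguments;

// Illustrative provider; the real provideBlobSizes() may use different sizes and labels.
private static Stream<Arguments> provideBlobSizes() {
  return Stream.of(
      Arguments.of(1024, "1KB"),
      Arguments.of(1024 * 1024, "1MB"),
      Arguments.of(100 * 1024 * 1024, "100MB"));
}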

options {
    systemProperties(System.getProperties().findAll { it.key.toString().startsWith("scalardb") })
}
maxHeapSize = "4g"
Contributor Author

I needed to increase the heap size because the newly added large BLOB integration tests for Oracle were failing otherwise.

Comment on lines 245 to +253
  case BLOB:
-   return "RAW(2000)";
+   return "BLOB";
@Torch3333 Torch3333 Oct 20, 2025


Updates the ScalarDB BLOB data type mapping for Oracle from RAW(2000) (allows a max of 2000 bytes) to BLOB (allows a max of 2GB).

Because of limitations of the Oracle BLOB data type, a ScalarDB BLOB column can no longer be used as:

  • primary key
  • clustering key
  • secondary index
  • a column of a condition in a cross-partition scan, i.e., ScanAll, operation (see the simplified check sketch after this list)
  • an ordering column in a cross-partition scan
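
To make the enforcement concrete, here is a simplified, self-contained sketch of the kind of check added for conditions on BLOB columns in cross-partition scans. The actual implementation lives in RdbEngineOracle and JdbcOperationChecker and operates on ScalarDB's internal Scan and TableMetadata types, so the names and signatures below are assumptions for illustration only.

import java.util.Map;
import java.util.Set;

class BlobConditionCheckSketch {
  enum DataType { TEXT, INT, BLOB }

  // Simplified stand-in for the new check: reject a cross-partition scan (ScanAll)
  // whose WHERE conditions reference a BLOB column on Oracle.
  static void throwIfConditionOnBlobColumn(
      Set<String> conditionColumnNames, Map<String, DataType> columnTypes) {
    for (String column : conditionColumnNames) {
      if (columnTypes.get(column) == DataType.BLOB) {
        throw new IllegalArgumentException(
            "A condition on a BLOB column is not supported in a cross-partition scan on Oracle: "
                + column);
      }
    }
  }
}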

Comment on lines 507 to 534
public void bindBlobColumnToPreparedStatement(
    PreparedStatement preparedStatement, int index, byte[] bytes) throws SQLException {
  // When writing to the BLOB data type with a BLOB size greater than 32766 bytes using a
  // MERGE INTO statement, an internal error ORA-03137 occurs on the server side, so we need a
  // workaround. This has been confirmed to be a limitation by AWS support.
  // Below is a detailed explanation of the workaround.
  //
  // Depending on the byte array size, the JDBC driver automatically chooses one of the following
  // modes to transfer the BLOB data to the server:
  // - DIRECT: the most efficient mode. It is used when the byte array length is less than 32767.
  // - STREAM: this mode is less efficient. It is used when the byte array length is greater than
  //   32766.
  // - LOB BINDING: this mode is the least efficient. It is used when an input stream is passed
  //   without specifying its length.
  //
  // When the driver selects the STREAM mode, the ORA-03137 error occurs. So, we work around the
  // issue by making sure to use the driver in a way such that it never selects the STREAM mode.
  // For more details about the modes, see the following documentation:
  // https://docs.oracle.com/en/database/oracle/oracle-database/23/jjdbc/LOBs-and-BFiles.html#GUID-8FD40D53-8D64-4187-9F6F-FF78242188AD
  if (bytes.length <= 32766) {
    // The DIRECT mode is used to send BLOB data of small size.
    preparedStatement.setBytes(index, bytes);
  } else {
    // The LOB BINDING mode is used to send BLOB data of large size.
    InputStream inputStream = new ByteArrayInputStream(bytes);
    preparedStatement.setBinaryStream(index, inputStream);
  }
Contributor Author

As the comments explain, there is a limitation with large BLOB values when executing a ScalarDB Put operation that is translated to a MERGE INTO SQL statement; this limitation requires special handling when putting BLOB data.
I added several integration tests in core/src/integration-test/java/com/scalar/db/storage/jdbc/JdbcDatabaseColumnValueIntegrationTest.java to cover the values that could be problematic for the implemented workaround.
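
As a rough illustration of what those boundary cases look like, the interesting values straddle the 32766-byte cutoff between the DIRECT and LOB-binding paths. The sketch below is self-contained and only illustrative; the helper putAndGet is hypothetical and stands in for the storage round trip the real integration test performs.

import java.util.Arrays;
import java.util.concurrent.ThreadLocalRandom;

class BlobBoundaryRoundTripSketch {
  // Sizes straddling the 32766-byte cutoff between the DIRECT and LOB-binding paths.
  static final int[] BOUNDARY_SIZES = {32765, 32766, 32767, 32768};

  static void roundTripAll() {
    for (int size : BOUNDARY_SIZES) {
      byte[] original = new byte[size];
      ThreadLocalRandom.current().nextBytes(original);
      byte[] readBack = putAndGet(original); // hypothetical helper: put the value, then get it back
      if (!Arrays.equals(original, readBack)) {
        throw new AssertionError("BLOB round trip failed for size " + size);
      }
    }
  }

  // Placeholder for the put-then-get round trip performed by the real test.
  static byte[] putAndGet(byte[] value) {
    return value;
  }
}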

@Torch3333

/gemini review

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

This pull request changes the default data type for BLOB columns in Oracle from RAW(2000) to BLOB to support storing up to 2GB of data. This change aligns Oracle's BLOB handling with that of Db2 and addresses previous concerns about storage limitations. As a result of using the native BLOB type, several limitations are introduced for Oracle: BLOB columns can no longer be used as primary keys, clustering keys, secondary indexes, or in conditions and orderings for cross-partition scans. The changes include updating the RdbEngine for Oracle, adding new error codes, and modifying integration tests to respect these new limitations. A specific workaround for handling large BLOBs (>32KB) in Oracle has also been implemented. The changes are thorough and well-tested. I have one minor suggestion to improve a comment's clarity.

Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
@Torch3333 Torch3333 marked this pull request as ready for review October 20, 2025 00:27
@Copilot Copilot AI review requested due to automatic review settings October 20, 2025 00:27
Copilot AI left a comment


Pull Request Overview

This PR updates the Oracle JDBC adapter to change the default data type mapping for BLOB columns from RAW(2000) to native Oracle BLOB type, which can store up to 2GB instead of just 2KB. This addresses the restrictive storage limitation of the previous mapping.

  • Changes Oracle BLOB column mapping from RAW(2000) to BLOB for increased storage capacity
  • Introduces restrictions preventing BLOB columns from being used as keys, secondary indexes, or in cross-partition scan conditions/ordering
  • Updates integration tests to disable BLOB-related functionality where Oracle no longer supports it

Reviewed Changes

Copilot reviewed 28 out of 28 changed files in this pull request and generated 4 comments.

Summary per file:

  • RdbEngineOracle.java: Implements the new BLOB data type mapping and validation methods for Oracle limitations
  • JdbcOperationChecker.java: Adds cross-partition scan condition checking for BLOB columns
  • Multiple test files: Updates integration tests to handle Oracle BLOB limitations and adds new BLOB size testing
  • CoreError.java: Adds error messages for Oracle BLOB limitations
  • Multiple admin/storage test files: Updates to disable BLOB column features where Oracle doesn't support them


Torch3333 and others added 3 commits October 20, 2025 09:30
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@Torch3333 Torch3333 requested review from a team, brfrn169, feeblefakie and komamitsu and removed request for a team October 22, 2025 01:49
@komamitsu komamitsu left a comment


LGTM, thank you!

protected void checkConjunctions(Selection selection, TableMetadata metadata) {
  super.checkConjunctions(selection, metadata);
  if (selection instanceof ScanAll) {
    rdbEngine.throwIfCrossPartitionScanConditionOnBlobColumnNotSupported(
Collaborator


Why do we check conjunctions only for cross-partition scan operations? Shouldn’t we also check conjunctions for partition scan and get operations?

@feeblefakie feeblefakie left a comment


LGTM other than what Toshi pointed out.
Thank you!
