
RocksDB: segfault in org.rocksdb.WriteBatch::delete called from org.apache.bookkeeper.bookie.storage.ldb.EntryLocationIndex#removeOffsetFromDeletedLedgers #3734

Closed
dlg99 opened this issue Jan 11, 2023 · 14 comments · Fixed by #3793

@dlg99 (Contributor) commented Jan 11, 2023

BUG REPORT

Describe the bug

A prod server crashed because of a segfault in RocksDB.
Unfortunately, the crash dump is lost. Logs point to org.rocksdb.WriteBatch::delete called from org.apache.bookkeeper.bookie.storage.ldb.EntryLocationIndex#removeOffsetFromDeletedLedgers.

Without the crash dump it is hard to pinpoint the issue or match it to a specific RocksDB bug. I cannot reproduce the problem in a unit test, and even if I could, I wouldn't know whether it is the exact same problem.

So far the crash has happened only once. The timing and code roughly correlate with an upgrade to an (internal) version (BK 4.14.x uses RocksDB 6.16.4) that brought in the use of range deletion with RocksDB (#3653).

After some research I have a gut feeling that the problem is related to the fix for "a bug in iterator refresh which could segfault for DeleteRange users" (facebook/rocksdb#10739).
That fix should be included in RocksDB 7.8.0; I do not see it in the 6.x versions. Instead, I see that 6.29.0 "Added API warning against using Iterator::Refresh() together with DB::DeleteRange(), which are incompatible and have always risked causing the refreshed iterator to return incorrect results."

With that said, we have the following options:

  1. Do nothing and hope the problem is extremely rare. Collect more info if/when it reoccurs.
  2. Revert "Bring back deleteRange for RocksDB to improve location delete performance" (#3653). cc @hangc0276: do you have any perf test results showing how much this PR improved performance? That would help decide whether we want to avoid reverting it.
  3. Upgrade RocksDB to 7.8.0+. An upgrade to 7.x was attempted in "Issue 3567: Upgrade rocksdb version to avoid checksum mismatch error" (#3568), but it will need more work for backwards-compatibility tests (at least), assuming there is no data incompatibility. I see some changes around dropping some data format options that may affect downgrade, so there is a risk.
  4. Upgrade to RocksDB 6.29.5. It sounds like option 1 with extra steps, but there are multiple fixes between 6.16.4 (or even the 6.29.4.1 used by BK 4.16) and 6.29.5 that might reduce the chances of the problem surfacing, e.g.:
     - Fixed a bug caused by a race among flush, incoming writes, and taking snapshots. Queries on snapshots created under this race condition can return incorrect results, e.g. resurfacing deleted data.
     - Fixed a bug where DisableManualCompaction may assert when disabling an unscheduled manual compaction.
     - Fixed a bug where Iterator::Refresh() reads stale keys after DeleteRange() is performed.
     - Fixed a race condition when disabling and re-enabling manual compaction.
     - Fixed a race condition when cancelling a manual compaction with DisableManualCompaction. Also, DB close can cancel the manual compaction thread.
     - Fixed a data race on versions_ between DBImpl::ResumeImpl() and threads waiting for recovery to complete (#9496).
     - Fixed a read-after-free bug in DB::GetMergeOperands().
     - Fixed a data loss bug for 2PC write-committed transactions caused by concurrent transaction commit and memtable switch.
     - Fixed a major bug in which batched MultiGet could return old values for keys deleted by DeleteRange when the memtable Bloom filter is enabled.

To Reproduce

Cannot reproduce.

Expected behavior

No segfault.

@hangc0276 (Contributor)

Hi @dlg99, thanks for raising this issue. From the information you provided, the segfault seems to be caused by the deleteRange operation.

> After some research I have a gut feeling that the problem is related to fix of "a bug in iterator refresh which could segfault for DeleteRange users" facebook/rocksdb#10739

RocksDB PR facebook/rocksdb#10739 does not look like a fix for the issue we encountered; that issue was only introduced in RocksDB 7.7 (facebook/rocksdb#10739 (comment)).

We need to get the core dump file to investigate the root cause of the RocksDB segfault issue.

I suggest reverting PR #3653 on branch-4.14 and branch-4.15. For the master branch, we keep the PR and try upgrading the RocksDB version to 7.8+ to see if the segfault issue is resolved.

@merlimat @eolivelli Do you have any ideas?

@hangc0276 (Contributor)

I tested the deletion performance, and deleteRange gives a huge performance improvement in key deletion. The number of deleted keys is half of the total keys.

| Total keys (million) | 0.1 | 1 | 10 | 50 | 100 |
|---|---|---|---|---|---|
| deleteRange | 0.02s | 0.021s | 0.024s | 0.03s | 0.033s |
| deleteBatch | 0.089s | 0.514s | 6.985s | 38.896s | 98.511s |
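The gap makes sense given the index's key layout: there is one (ledgerId, entryId) key per entry, so deleteBatch writes one tombstone per entry while deleteRange writes a single range tombstone per ledger. A minimal pure-Java sketch of that layout (not BookKeeper code; the 16-byte big-endian encoding and the exclusive end bound are simplifying assumptions for illustration):

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class KeyRangeDemo {
    // Hypothetical key encoding: big-endian ledgerId followed by big-endian entryId,
    // so all entries of one ledger are contiguous under lexicographic byte order.
    static byte[] encode(long ledgerId, long entryId) {
        return ByteBuffer.allocate(16).putLong(ledgerId).putLong(entryId).array();
    }

    // Unsigned lexicographic comparison, like RocksDB's default bytewise comparator.
    static int compare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    public static void main(String[] args) {
        long ledgerId = 42;
        byte[] first = encode(ledgerId, 0);      // range start, inclusive
        byte[] last = encode(ledgerId + 1, 0);   // range end, exclusive

        // Any entry of ledger 42 falls inside [first, last): one deleteRange call
        // covers what would otherwise be one delete call per entry.
        byte[] someEntry = encode(ledgerId, 9_999);
        System.out.println(compare(first, someEntry) <= 0 && compare(someEntry, last) < 0);
        // An entry of a different ledger is outside the range.
        System.out.println(compare(encode(ledgerId + 7, 3), last) < 0);
    }
}
```

Deleting a ledger then costs a constant number of tombstones instead of one per entry, which matches the near-flat deleteRange timings above.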

This is the test code. You can put the code in EntryLocationIndexTest.java to reproduce the results.

```java
@Test
public void deleteBatchLedgersTest() throws Exception {
    File tmpDir = File.createTempFile("bkTest", ".dir");
    tmpDir.delete();
    tmpDir.mkdir();
    tmpDir.deleteOnExit();

    EntryLocationIndex idx = new EntryLocationIndex(serverConfiguration, KeyValueStorageRocksDB.factory,
        tmpDir.getAbsolutePath(), NullStatsLogger.INSTANCE);

    int numLedgers = 10000;
    int numEntriesPerLedger = 10000;

    int location = 0;
    KeyValueStorage.Batch batch = idx.newBatch();
    for (int entryId = 0; entryId < numEntriesPerLedger; ++entryId) {
        for (int ledgerId = 0; ledgerId < numLedgers; ++ledgerId) {
            idx.addLocation(batch, ledgerId, entryId, location);
            location++;
        }
    }
    batch.flush();
    batch.close();

    // Mark every other ledger as deleted, then drop their entry-location offsets.
    for (int ledgerId = 0; ledgerId < numLedgers; ++ledgerId) {
        if (ledgerId % 2 == 0) {
            idx.delete(ledgerId);
        }
    }

    idx.removeOffsetFromDeletedLedgers();
    idx.close();
}
```

@hangc0276 (Contributor)

I tested upgrading the RocksDB version from 6.10.2 to 7.9.2 and then rolling back to 6.10.2.

  • The upgrade process works fine in the following cases:

    • a Pulsar producer keeps producing messages to a Pulsar topic
    • a Pulsar consumer consumes messages from the Pulsar topic, and the Pulsar broker fetches messages from the BookKeeper cluster
    • compaction is triggered to clean up expired ledgers
  • After rolling back the RocksDB version from 7.9.2 to 6.10.2, the bookie failed to start up with the following exception:

```
2023-01-29T17:09:27,794+0800 [main] ERROR org.apache.bookkeeper.server.Main - Failed to build bookie server
java.io.IOException: Error open RocksDB database
        at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:200) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
        at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:89) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
        at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.lambda$static$0(KeyValueStorageRocksDB.java:63) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
        at org.apache.bookkeeper.bookie.storage.ldb.LedgerMetadataIndex.<init>(LedgerMetadataIndex.java:68) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
        at org.apache.bookkeeper.bookie.storage.ldb.SingleDirectoryDbLedgerStorage.<init>(SingleDirectoryDbLedgerStorage.java:170) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
        at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.newSingleDirectoryDbLedgerStorage(DbLedgerStorage.java:150) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
        at org.apache.bookkeeper.bookie.storage.ldb.DbLedgerStorage.initialize(DbLedgerStorage.java:129) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
        at org.apache.bookkeeper.bookie.Bookie.<init>(Bookie.java:819) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
        at org.apache.bookkeeper.proto.BookieServer.newBookie(BookieServer.java:152) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
        at org.apache.bookkeeper.proto.BookieServer.<init>(BookieServer.java:120) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
        at org.apache.bookkeeper.server.service.BookieService.<init>(BookieService.java:52) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
        at org.apache.bookkeeper.server.Main.buildBookieServer(Main.java:304) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
        at org.apache.bookkeeper.server.Main.doMain(Main.java:226) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
        at org.apache.bookkeeper.server.Main.main(Main.java:208) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
Caused by: org.rocksdb.RocksDBException: unknown checksum type 4 in data/bookkeeper/ledgers/current/ledgers/000025.sst offset 1078 size 33
        at org.rocksdb.RocksDB.open(Native Method) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?]
        at org.rocksdb.RocksDB.open(RocksDB.java:239) ~[org.rocksdb-rocksdbjni-6.10.2.jar:?]
        at org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorageRocksDB.<init>(KeyValueStorageRocksDB.java:197) ~[org.apache.bookkeeper-bookkeeper-server-4.14.6.jar:4.14.6]
        ... 13 more
```

The root cause of this exception is that RocksDB 7.9.2 uses kXXH3 by default, and kXXH3 is only supported since RocksDB 6.27:
https://github.com/facebook/rocksdb/blob/79e57a39a33dbe17c8f51167e40e66d6c91f8eb4/include/rocksdb/table.h#L56
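For reference, the ChecksumType enum in include/rocksdb/table.h assigns kNoChecksum = 0, kCRC32c = 1, kxxHash = 2, kxxHash64 = 3, kXXH3 = 4, which matches the "unknown checksum type 4" in the stack trace above. A small pure-Java sketch (not RocksDB code; readerSupports is a hypothetical helper for illustration) of why the rollback fails:

```java
public class ChecksumCompatDemo {
    // Checksum type ids as defined in RocksDB's include/rocksdb/table.h.
    static final int K_NO_CHECKSUM = 0;
    static final int K_CRC32C = 1;
    static final int K_XXHASH = 2;
    static final int K_XXHASH64 = 3;
    static final int K_XXH3 = 4;   // the 7.x default; only supported since RocksDB 6.27

    // A reader only understands the checksum ids it was compiled with; an SST
    // footer carrying a higher id fails to open with "unknown checksum type N".
    static boolean readerSupports(int maxKnownId, int checksumTypeInFile) {
        return checksumTypeInFile <= maxKnownId;
    }

    public static void main(String[] args) {
        // RocksDB 6.10.2 predates kXXH3 (the error above shows it rejects id 4).
        int rocksdb6102MaxId = K_XXHASH64;
        // SST files written by 7.9.2 with the default kXXH3 cannot be read back.
        System.out.println(readerSupports(rocksdb6102MaxId, K_XXH3));
        // Pinning the checksum type to kxxHash keeps rollback possible.
        System.out.println(readerSupports(rocksdb6102MaxId, K_XXHASH));
    }
}
```

This is why the eventual fix pins both tables' checksum type to kxxHash (id 2), which every version on the upgrade/rollback path understands.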

For the BookKeeper master branch, we have upgraded RocksDB to 6.29.4.1, which supports upgrading to RocksDB 7.9.2 and rolling back to 6.29.4.1.

For RocksDB < 6.27, we can push a fix to ensure RocksDB 7.9.2 does not use the newest checksum type, kXXH3.

In short, I'll send an email to discuss the RocksDB upgrade.

@eolivelli (Contributor)

@hangc0276 is it possible to set the old checksum format and make it configurable, in order to allow easy rollback?

@hangc0276 (Contributor) commented Jan 30, 2023

> @hangc0276 is it possible to set the old checksum format and make it configurable in order to allow rollback easily ?

@eolivelli Yes, I will make the checksum type configurable.

@dlg99 (Contributor, Author) commented Jan 31, 2023

@hangc0276 Thank you for looking at this problem!

> I suggest reverting the PR #3653 on branch-4.14 and branch-4.15. For the master branch, we keep the PR and try to upgrade the RocksDB version to 7.8+ to see if the segfault issue is resolved.

This would push the time to confirm the fix into the remote future; Pulsar 2.10/2.11 use BK 4.15, IIRC.

I think we should still try to upgrade RocksDB. I'd be OK with the upgraded DB backported to 4.14/4.15 if we can guarantee a safe downgrade.

We've currently downgraded BK in prod, so this problem is no longer happening. Unfortunately, that means I don't have any logs/dumps, and it really happened only once.

I've spent some time experimenting with code/injecting errors.

With this:

```diff
diff --git a/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java b/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java
index 3f6d1ae55b..03acfecc87 100644
--- a/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java
+++ b/bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java
@@ -26,6 +26,8 @@ import java.io.IOException;
 import java.util.Map.Entry;
 import java.util.Set;
 import java.util.concurrent.TimeUnit;
+
+import lombok.SneakyThrows;
 import org.apache.bookkeeper.bookie.Bookie;
 import org.apache.bookkeeper.bookie.EntryLocation;
 import org.apache.bookkeeper.bookie.storage.ldb.KeyValueStorage.Batch;
@@ -189,6 +191,7 @@ public class EntryLocationIndex implements Closeable {
         deletedLedgers.add(ledgerId);
     }
 
+    @SneakyThrows
     public void removeOffsetFromDeletedLedgers() throws IOException {
         LongPairWrapper firstKeyWrapper = LongPairWrapper.get(-1, -1);
         LongPairWrapper lastKeyWrapper = LongPairWrapper.get(-1, -1);
@@ -202,6 +205,7 @@ public class EntryLocationIndex implements Closeable {
         log.info("Deleting indexes for ledgers: {}", ledgersToDelete);
         long startTime = System.nanoTime();
 
+        locationsDb.close();
         try (Batch batch = locationsDb.newBatch()) {
             for (long ledgerId : ledgersToDelete) {
                 if (log.isDebugEnabled()) {
@@ -213,7 +217,6 @@ public class EntryLocationIndex implements Closeable {
 
                 batch.deleteRange(firstKeyWrapper.array, lastKeyWrapper.array);
             }
-
             batch.flush();
             for (long ledgerId : ledgersToDelete) {
                 deletedLedgers.remove(ledgerId);
```

I got a RocksDB segfault:

```
---------------  T H R E A D  ---------------

Current thread (0x00007f9dc800d000):  JavaThread "main" [_thread_in_native, id=6147, stack(0x0000700003b4f000,0x0000700003c4f000)]

Stack: [0x0000700003b4f000,0x0000700003c4f000],  sp=0x0000700003c4d2c0,  free space=1016k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [librocksdbjni13563433824350328902.jnilib+0x22e1c]  Java_org_rocksdb_RocksDB_write0+0x1c
j  org.rocksdb.RocksDB.write0(JJJ)V+0
#
```

with this dump

This does not look exactly like the original case and is more similar to #3043, but the question is: is it possible that some other RocksDB calls should not run concurrently, such as an index update on a deleted range?
I've tried injecting a few other errors and running various operations concurrently, but so far without additional success.

@hangc0276 (Contributor)

> I think we still should try to upgrade RocksDB. I'd be ok with upgraded db backported to 4.14/4.15 if we can guarantee safe downgrade.

I agree with upgrading the RocksDB version, but I'm not sure it will be OK to backport it to the patch-release branches branch-4.14 / branch-4.15.

@hangc0276 (Contributor)

In your test code, you closed locationsDb before the flush, which means you flush data into a closed DB, and that will throw exceptions.

@dlg99 (Contributor, Author) commented Jan 31, 2023

Yeah, closing the DB is not the best way to introduce an error.

@hangc0276 (Contributor)

@merlimat @eolivelli @dlg99 any thoughts about this suggestion? #3734 (comment)

@hangc0276 (Contributor)

If there are no objections, I will revert PR #3653 on branch-4.14 and branch-4.15 and trigger a new release for 4.14.7.

hangc0276 added a commit that referenced this issue Feb 8, 2023
### Motivation
This PR resolves issue #3734 in branch-4.14 by following this suggestion: #3734 (comment)

### Modifications
1. Revert #3653
2. Bring #3646 to branch-4.14 to make delete entries batch size configurable.
@hangc0276 (Contributor)

I have pushed PR #3768 to revert #3653 on branch-4.14, and I will trigger a new release for 4.14.7.

@muni-chada

Which version of Pulsar will incorporate the 4.14.7 release?

@hangc0276 (Contributor)

> which version of Pulsar would incorporate 4.14.7 release?

@muni-chada It will be introduced in Pulsar 2.8, 2.9, and 2.10.

hezhangjian pushed a commit that referenced this issue Feb 22, 2023
### Motivation
Related to #3734

### Modification
Upgrade RocksDB version to 7.9.2
hangc0276 added a commit that referenced this issue Mar 6, 2023
### Motivation
Fix #3734 (comment)

We have two RocksDB tables, one for the ledger index and another for the entry log location.
 - ledger index RocksDB table: uses the default table options, so the checksum is `kCRC32c`
 - entry log location RocksDB table: uses the configured table options, and the checksum is `kxxHash`

When we upgraded the RocksDB version from 6.10.2 to 7.9.2, the new version's default table checksum changed from `kCRC32c` to `kXXH3`, and `kXXH3` is only supported since RocksDB 6.27. Rolling RocksDB back to 6.10.2 therefore fails, because RocksDB 6.10.2 doesn't support the `kXXH3` checksum type.

### Modifications
In this PR I make the RocksDB checksum type configurable. Note one change: the ledger index RocksDB table's checksum type moves from the default `kCRC32c` to `kxxHash`. I have tested the compatibility of the two checksum types in and between multiple RocksDB versions, and it works fine.

After setting both RocksDB tables' checksum type to `kxxHash`, upgrading RocksDB from 6.10.2 to 7.9.2 and rolling back to 6.10.2 works fine.

### More to discuss
When I wrote a unit test to read the table checksum type from RocksDB configuration files, it failed. I found the related RocksDB issue: facebook/rocksdb#5297
The related PR: facebook/rocksdb#10826

It means we still can't load RocksDB table options from configuration files. Maybe I missed some parts about reading RocksDB table options from the configuration file.

While this issue exists, we do **NOT** recommend that users configure RocksDB through configuration files.

@merlimat @eolivelli @dlg99 Please help take a look, thanks.
nicoloboschi pushed a commit to datastax/bookkeeper that referenced this issue Mar 13, 2023
…e#3768)

### Motivations
This PR is to resolve the issue apache#3734 in branch-4.14 by following this suggestion. apache#3734 (comment)

### Modifications
1. Revert apache#3653
2. Bring apache#3646 to branch-4.14 to make delete entries batch size configurable.

(cherry picked from commit e56d6d6)
nicoloboschi pushed a commit to datastax/bookkeeper that referenced this issue Mar 13, 2023
…e#3768)

### Motivations
This PR is to resolve the issue apache#3734 in branch-4.14 by following this suggestion. apache#3734 (comment)

### Modifications
1. Revert apache#3653
2. Bring apache#3646 to branch-4.14 to make delete entries batch size configurable.

(cherry picked from commit e56d6d6)
RobertIndie pushed a commit to apache/pulsar that referenced this issue Apr 12, 2023
…ocksDB dependency (#20072)

### Motivation
BookKeeper has upgraded the RocksDB dependency to 7.9.2, related discussion:
https://lists.apache.org/thread/8j90y4vrvgz1nvt5pb0xdjjy3o8z57z7

apache/bookkeeper#3734

However, Pulsar also has a RocksDB dependency, and it overrides BookKeeper's RocksDB dependency version. That leads to the release package still using the old RocksDB version (6.29.4.1).

### Modifications
Upgrade Pulsar's RocksDB dependency to 7.9.2 to keep it in sync with BookKeeper's RocksDB dependency.
hezhangjian pushed a commit that referenced this issue May 9, 2023
### Motivation
Upgrade the RocksDB version to 6.29.4.1 to make sure BookKeeper 4.16.0 can roll back 4.14.x

Refer to: #3734 (comment)

dlg99 pushed a commit to datastax/bookkeeper that referenced this issue Jun 28, 2024
### Motivation
Fix apache#3734 (comment): make the RocksDB checksum type configurable (same change as the commit above).

(cherry picked from commit 3844bf1)
dlg99 pushed a commit to datastax/bookkeeper that referenced this issue Jun 28, 2024
### Motivation
Upgrade the RocksDB version to 6.29.4.1 to make sure BookKeeper 4.16.0 can roll back to 4.14.x.

Refer to: apache#3734 (comment)

(cherry picked from commit c3a60bb)
dlg99 pushed a commit to datastax/bookkeeper that referenced this issue Jul 2, 2024
### Motivation
Fix apache#3734 (comment): make the RocksDB checksum type configurable (same change as the commit above).

(cherry picked from commit 3844bf1)
dlg99 pushed a commit to datastax/bookkeeper that referenced this issue Jul 2, 2024
### Motivation
Upgrade the RocksDB version to 6.29.4.1 to make sure BookKeeper 4.16.0 can roll back to 4.14.x.

Refer to: apache#3734 (comment)

(cherry picked from commit c3a60bb)
Ghatage pushed a commit to sijie/bookkeeper that referenced this issue Jul 12, 2024
### Motivation
Related to apache#3734

### Modification
Upgrade RocksDB version to 7.9.2
Ghatage pushed a commit to sijie/bookkeeper that referenced this issue Jul 12, 2024
### Motivation
Fix apache#3734 (comment): make the RocksDB checksum type configurable (same change as the commit above).
lhotari pushed a commit to apache/pulsar-sql that referenced this issue Oct 18, 2024
…ocksDB dependency (#20072)

### Motivation
BookKeeper has upgraded its RocksDB dependency to 7.9.2 (apache/bookkeeper#3734); upgrade Pulsar's RocksDB dependency to match (same change as the apache/pulsar commit above).