Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BlockBasedTableConfig is not loaded from Options file #5297

Closed
caronzh03 opened this issue May 10, 2019 · 4 comments
Closed

BlockBasedTableConfig is not loaded from Options file #5297

caronzh03 opened this issue May 10, 2019 · 4 comments
Assignees
Labels

Comments

@caronzh03
Copy link

Expected behavior

Given an options file auto-generated by a previously built rocksdb instance, OptionsUtil.loadLatestOptions() should load that options file and deserialize the [TableOptions/BlockBasedTable "default"] section into a BlockBasedTableConfig object.

Actual behavior

OptionsUtil.loadLatestOptions() does not parse the [TableOptions/BlockBasedTable "default"] section of a given Options file. As a result, the TableFormatConfig field in each ColumnFamilyOption is null.

Steps to reproduce the behavior

I'm using rocksdb's Java library.

// Construct DBOptions by loading an Options file on disk
DBOptions loadedDBOptions = new DBOptions();
List<ColumnFamilyDescriptor> loadedCFDescriptors = new ArrayList<>();
OptionsUtil.loadLatestOptions(rocksDbPath, Env.getDefault(), loadedDBOptions, loadedCFDescriptors);

// check if `TableFormatConfig` object is correctly populated in each `ColumnFamilyOption`
TableFormatConfig tableFormatConfig = loadedCFDescriptors.get(0).getOptions().tableFormatConfig();

// tableFormatConfig == null !

Here's the auto-generated Options file: https://gist.github.com/caronzh03/317e2e703a0261f1710c50a79ba5d81e

According to wiki, the BlockBasedTableOptions should be deserialized, no?

@sagar0 sagar0 self-assigned this May 21, 2019
@mrambacher
Copy link
Contributor

@sagar0 @adamretter Is this still an issue?

@adamretter
Copy link
Collaborator

@mrambacher I could take a look at this in the next few days if you want confirmation?

@adamretter
Copy link
Collaborator

adamretter commented Oct 31, 2021

@caronzh03 I think there are two issues here:

  1. The Java and C++ APIs have diverged here somewhat. Java talks about TableConfigs but C++ talks about TableFactory.

  2. The Java API for TableConfigs underneath relies on the C++ API, so we are (currently) limited to what the C++ API allows us to read back in-terms of options.

The C++ API does not allow you to directly retrieve the options for the TableFactory, it does allow you to print out the options that have been loaded though. I wrote out some C++ which is equivalent to your Java and that shows that the options are loaded from your options file:

#include "rocksdb/db.h"
#include "rocksdb/env.h"
#include "rocksdb/options.h"
#include "rocksdb/utilities/options_util.h"
#include "table/block_based/block_based_table_factory.h"

#include <cstring>
#include <iostream>
#include <string>

using namespace ROCKSDB_NAMESPACE;
using namespace std;

int main(int /*argv*/, char*[] /*argc[]*/) {

    string options_file_name("/tmp/rocksdb_options_file.ini");
    Env* env = Env::Default();

    DBOptions* db_options = new DBOptions();
    vector<ColumnFamilyDescriptor>* cf_descs = new vector<ColumnFamilyDescriptor>();

    //LoadLatestOptions
    Status s = LoadOptionsFromFile(options_file_name, env, db_options, cf_descs);
    cout << "OK=" << s.ok() << endl;

    // check if `TableFormatConfig` object is correctly populated in each `ColumnFamilyOption`
    ColumnFamilyDescriptor default_cf = cf_descs->at(0);
    shared_ptr<TableFactory> table_factory = default_cf.options.table_factory;
    cout << "table_factory(is null)=" << (table_factory.get() == nullptr) << endl;

    cout << "## table_factory: " << table_factory->Name() << endl;

    if (strcmp(table_factory->Name(), TableFactory::kBlockBasedTableName()) == 0) {
        auto* block_base_table_factory = reinterpret_cast<BlockBasedTableFactory*>(table_factory.get());
        string printable_options = block_base_table_factory->GetPrintableOptions();
        
        cout << "### printable_options" << endl;
        cout << printable_options << endl;
    } else {
        cout << "## NOT IMPLEMENTED" << endl;
    }

    return 0;
}

I think it would be best here to align the Java API to that of the C++ API, and I have opened a new issue for that - #9101

If the C++ API needs to offer more in future than it does currently, then I think it would be better to discuss that as a separate issue.

@mrambacher
Copy link
Contributor

@caronzh03 I think there are two issues here:

  1. The Java and C++ APIs have diverged here somewhat. Java talks about TableConfigs but C++ talks about TableFactory.
  2. The Java API for TableConfigs underneath relies on the C++ API, so we are (currently) limited to what the C++ API allows us to read back in-terms of options.

The C++ API does not allow you to directly retrieve the options for the TableFactory, it does allow you to print out the options that have been loaded though. I wrote out some C++ which is equivalent to your Java and that shows that the options are loaded from your options file:

#include "rocksdb/db.h"
#include "rocksdb/env.h"
#include "rocksdb/options.h"
#include "rocksdb/utilities/options_util.h"
#include "table/block_based/block_based_table_factory.h"

#include <cstring>
#include <iostream>
#include <string>

using namespace ROCKSDB_NAMESPACE;
using namespace std;

int main(int /*argv*/, char*[] /*argc[]*/) {

    string options_file_name("/tmp/rocksdb_options_file.ini");
    Env* env = Env::Default();

    DBOptions* db_options = new DBOptions();
    vector<ColumnFamilyDescriptor>* cf_descs = new vector<ColumnFamilyDescriptor>();

    //LoadLatestOptions
    Status s = LoadOptionsFromFile(options_file_name, env, db_options, cf_descs);
    cout << "OK=" << s.ok() << endl;

    // check if `TableFormatConfig` object is correctly populated in each `ColumnFamilyOption`
    ColumnFamilyDescriptor default_cf = cf_descs->at(0);
    shared_ptr<TableFactory> table_factory = default_cf.options.table_factory;
    cout << "table_factory(is null)=" << (table_factory.get() == nullptr) << endl;

    cout << "## table_factory: " << table_factory->Name() << endl;

    if (strcmp(table_factory->Name(), TableFactory::kBlockBasedTableName()) == 0) {
        auto* block_base_table_factory = reinterpret_cast<BlockBasedTableFactory*>(table_factory.get());
        string printable_options = block_base_table_factory->GetPrintableOptions();
        
        cout << "### printable_options" << endl;
        cout << printable_options << endl;
    } else {
        cout << "## NOT IMPLEMENTED" << endl;
    }

    return 0;
}

I think it would be best here to align the Java API to that of the C++ API, and I have opened a new issue for that - #9101

If the C++ API needs to offer more in future than it does currently, then I think it would be better to discuss that as a separate issue.

@adamretter You can do (instead of reinterpret_cast).
auto *block_based_table_factory = table_factory->CheckedCast();
If the table factory is not a BlockBasedTableFactory, it will return nullptr.

Additionally, you can get at the underlying options object (but are not allowed to modify it):
auto *block_based_table_options = table_factory->GetOptions();
Again, if the class is not a BlockBasedTableFactory or does not have BlockBasedTableOptions, this will return nullptr.

I do agree that the C++ and Java APIs have strayed and should be unified.

hangc0276 added a commit to apache/bookkeeper that referenced this issue Mar 6, 2023
### Motivation
Fix #3734 (comment)

We have two rocksDB tables, one for the ledger index, and another for the entry log location.
 - ledger index RocksDB table: Use the default table option, and the checksum is `kCRC32c`
 - entry log location RocksDb table: Use configured table option, and the checksum is `kxxHash` 

When we upgrade the RocksDB version from 6.10.2 to 7.9.2, the new RocksDB version's default table checksum has changed from `kCRC32c` to `kXXH3`, and `kXXH3` only supported since RocksDB 6.27. The RocksDB version rollback to 6.10.2 will be failed due to RocksDB 6.10.2 doesn't support the `kXXH3` checksum type.

### Modifications
In this PR, I make the RocksDB checksum type configurable. But there is one change that will change the ledger index RocksDB table's checksum type from the default `kCRC32c` to `kxxHash`. I have tested the compatibility of the two checksum types in and between multiple RocksDB versions, it works fine.

After setting the two RocksDB table's checksum type to `kxxHash`, the RocksDB's version upgraded from 6.10.2 to 7.9.2, and rolling back to 6.10.2 works fine.

### More to discuss
When writing the unit test to read the table checksum type from RocksDB configuration files, it failed. I found the related issue on RocksDB: facebook/rocksdb#5297
The related PR: facebook/rocksdb#10826

It means we still can't load RocksDB table options from configuration files. Maybe I missed some parts about reading RocksDB table options from the configuration file. 

If this issue exists, we do **NOT** recommend users configure RocksDB configurations through configuration files.

@merlimat @eolivelli @dlg99 Please help take a look, thanks.
dlg99 pushed a commit to datastax/bookkeeper that referenced this issue Jun 28, 2024
### Motivation
Fix apache#3734 (comment)

We have two rocksDB tables, one for the ledger index, and another for the entry log location.
 - ledger index RocksDB table: Use the default table option, and the checksum is `kCRC32c`
 - entry log location RocksDb table: Use configured table option, and the checksum is `kxxHash`

When we upgrade the RocksDB version from 6.10.2 to 7.9.2, the new RocksDB version's default table checksum has changed from `kCRC32c` to `kXXH3`, and `kXXH3` only supported since RocksDB 6.27. The RocksDB version rollback to 6.10.2 will be failed due to RocksDB 6.10.2 doesn't support the `kXXH3` checksum type.

### Modifications
In this PR, I make the RocksDB checksum type configurable. But there is one change that will change the ledger index RocksDB table's checksum type from the default `kCRC32c` to `kxxHash`. I have tested the compatibility of the two checksum types in and between multiple RocksDB versions, it works fine.

After setting the two RocksDB table's checksum type to `kxxHash`, the RocksDB's version upgraded from 6.10.2 to 7.9.2, and rolling back to 6.10.2 works fine.

### More to discuss
When writing the unit test to read the table checksum type from RocksDB configuration files, it failed. I found the related issue on RocksDB: facebook/rocksdb#5297
The related PR: facebook/rocksdb#10826

It means we still can't load RocksDB table options from configuration files. Maybe I missed some parts about reading RocksDB table options from the configuration file.

If this issue exists, we do **NOT** recommend users configure RocksDB configurations through configuration files.

@merlimat @eolivelli @dlg99 Please help take a look, thanks.

(cherry picked from commit 3844bf1)
dlg99 pushed a commit to datastax/bookkeeper that referenced this issue Jul 2, 2024
### Motivation
Fix apache#3734 (comment)

We have two rocksDB tables, one for the ledger index, and another for the entry log location.
 - ledger index RocksDB table: Use the default table option, and the checksum is `kCRC32c`
 - entry log location RocksDb table: Use configured table option, and the checksum is `kxxHash`

When we upgrade the RocksDB version from 6.10.2 to 7.9.2, the new RocksDB version's default table checksum has changed from `kCRC32c` to `kXXH3`, and `kXXH3` only supported since RocksDB 6.27. The RocksDB version rollback to 6.10.2 will be failed due to RocksDB 6.10.2 doesn't support the `kXXH3` checksum type.

### Modifications
In this PR, I make the RocksDB checksum type configurable. But there is one change that will change the ledger index RocksDB table's checksum type from the default `kCRC32c` to `kxxHash`. I have tested the compatibility of the two checksum types in and between multiple RocksDB versions, it works fine.

After setting the two RocksDB table's checksum type to `kxxHash`, the RocksDB's version upgraded from 6.10.2 to 7.9.2, and rolling back to 6.10.2 works fine.

### More to discuss
When writing the unit test to read the table checksum type from RocksDB configuration files, it failed. I found the related issue on RocksDB: facebook/rocksdb#5297
The related PR: facebook/rocksdb#10826

It means we still can't load RocksDB table options from configuration files. Maybe I missed some parts about reading RocksDB table options from the configuration file.

If this issue exists, we do **NOT** recommend users configure RocksDB configurations through configuration files.

@merlimat @eolivelli @dlg99 Please help take a look, thanks.

(cherry picked from commit 3844bf1)
Ghatage pushed a commit to sijie/bookkeeper that referenced this issue Jul 12, 2024
### Motivation
Fix apache#3734 (comment)

We have two rocksDB tables, one for the ledger index, and another for the entry log location.
 - ledger index RocksDB table: Use the default table option, and the checksum is `kCRC32c`
 - entry log location RocksDb table: Use configured table option, and the checksum is `kxxHash` 

When we upgrade the RocksDB version from 6.10.2 to 7.9.2, the new RocksDB version's default table checksum has changed from `kCRC32c` to `kXXH3`, and `kXXH3` only supported since RocksDB 6.27. The RocksDB version rollback to 6.10.2 will be failed due to RocksDB 6.10.2 doesn't support the `kXXH3` checksum type.

### Modifications
In this PR, I make the RocksDB checksum type configurable. But there is one change that will change the ledger index RocksDB table's checksum type from the default `kCRC32c` to `kxxHash`. I have tested the compatibility of the two checksum types in and between multiple RocksDB versions, it works fine.

After setting the two RocksDB table's checksum type to `kxxHash`, the RocksDB's version upgraded from 6.10.2 to 7.9.2, and rolling back to 6.10.2 works fine.

### More to discuss
When writing the unit test to read the table checksum type from RocksDB configuration files, it failed. I found the related issue on RocksDB: facebook/rocksdb#5297
The related PR: facebook/rocksdb#10826

It means we still can't load RocksDB table options from configuration files. Maybe I missed some parts about reading RocksDB table options from the configuration file. 

If this issue exists, we do **NOT** recommend users configure RocksDB configurations through configuration files.

@merlimat @eolivelli @dlg99 Please help take a look, thanks.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants