Make Comparator into a Customizable Object #8336

Closed
wants to merge 8 commits into from
Changes from 3 commits
13 changes: 12 additions & 1 deletion include/rocksdb/comparator.h
@@ -10,6 +10,7 @@

#include <string>

#include "rocksdb/customizable.h"
#include "rocksdb/rocksdb_namespace.h"

namespace ROCKSDB_NAMESPACE {
@@ -20,7 +21,7 @@ class Slice;
// used as keys in an sstable or a database. A Comparator implementation
// must be thread-safe since rocksdb may invoke its methods concurrently
// from multiple threads.
class Comparator {
class Comparator : public Customizable {
Contributor:
As this is a public interface, is there potential ABI breakage?

Contributor (PR author):
I am not sure what the ABI breakage would be or how to test for it. Ideas?

Contributor:
I think it's fine to break binary compatibility, especially for a non-patch release. I found a website that tracks our API binary compatibility: https://abi-laboratory.pro/?view=timeline&l=rocksdb
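
One common way a change like this can affect binary compatibility is object layout: adding a base class that carries data members changes `sizeof` and member offsets, so code compiled against the old header can disagree with a library built from the new one. The sketch below illustrates that effect only. `OldComparator`, `FakeCustomizable`, and `NewComparator` are hypothetical stand-ins, not the real RocksDB classes, and the real `Customizable` may hold different state.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the class before the change (not RocksDB code).
class OldComparator {
 public:
  virtual ~OldComparator() = default;
  virtual const char* Name() const { return "old"; }

 protected:
  std::size_t timestamp_size_ = 0;
};

// Assumption for illustration: a base class that carries some state.
class FakeCustomizable {
 public:
  virtual ~FakeCustomizable() = default;

 protected:
  std::string options_;  // any data member here grows every derived object
};

// Hypothetical stand-in for the class after the change.
class NewComparator : public FakeCustomizable {
 public:
  virtual const char* Name() const { return "new"; }

 protected:
  std::size_t timestamp_size_ = 0;
};

// True when the two layouts differ in size, which is one symptom of an
// ABI-incompatible change.
constexpr bool LayoutChanged() {
  return sizeof(NewComparator) != sizeof(OldComparator);
}
```

A check like `static_assert(LayoutChanged(), "")` only detects size differences; tools such as the ABI tracker linked above also compare vtables and symbol signatures.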

Contributor:
Customizable increases the size of the comparator. What would the performance impact be?

Contributor (PR author):
Here is a comparison of master and this branch for some operations from benchmark.sh (same performance tests as the performance Wiki):

Master:
readrandom : 461.911 micros/op 138552 ops/sec; 35.1 MB/s (7364222 of 11651999 found)
multireadrandom : 461.621 micros/op 138639 ops/sec; (7369984 of 11654990 found)
overwrite : 694.754 micros/op 92117 ops/sec; 36.9 MB/s
readwhilewriting : 659.770 micros/op 97001 ops/sec; 30.8 MB/s (6496957 of 8195999 found)

Change:
readrandom : 461.858 micros/op 138568 ops/sec; 35.1 MB/s (7365481 of 11649999 found)
multireadrandom : 461.689 micros/op 138619 ops/sec; (7366962 of 11652990 found)
overwrite : 698.994 micros/op 91558 ops/sec; 36.7 MB/s
readwhilewriting : 661.834 micros/op 96698 ops/sec; 30.7 MB/s (6455689 of 8151999 found)

This change does not appear to have a noticeable performance impact.
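
To put a number on "noticeable", the relative throughput change can be computed from the ops/sec figures quoted above. This is a small sketch rather than part of the PR; `PercentChange`, `BenchResult`, and `MaxAbsPercentChange` are names made up for illustration.

```cpp
#include <cmath>
#include <cstddef>

// Relative change in percent when going from `before` to `after`.
inline double PercentChange(double before, double after) {
  return (after - before) / before * 100.0;
}

struct BenchResult {
  const char* name;
  double master_ops_per_sec;
  double change_ops_per_sec;
};

// ops/sec values copied from the benchmark output quoted above.
static const BenchResult kResults[] = {
    {"readrandom", 138552, 138568},
    {"multireadrandom", 138639, 138619},
    {"overwrite", 92117, 91558},
    {"readwhilewriting", 97001, 96698},
};

// Largest absolute relative change across the four benchmarks, in percent.
inline double MaxAbsPercentChange() {
  double worst = 0.0;
  for (std::size_t i = 0; i < sizeof(kResults) / sizeof(kResults[0]); ++i) {
    const double pct = std::fabs(PercentChange(
        kResults[i].master_ops_per_sec, kResults[i].change_ops_per_sec));
    if (pct > worst) worst = pct;
  }
  return worst;
}
```

With these figures the largest swing is in `overwrite`, at roughly 0.6%, and every benchmark stays under 1%. A single run per branch cannot separate a change of that size from run-to-run variation.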

public:
Comparator() : timestamp_size_(0) {}

@@ -37,7 +38,17 @@ class Comparator {

virtual ~Comparator() {}

static Status CreateFromString(const ConfigOptions& opts,
const std::string& id,
const Comparator** comp);
static const char* Type() { return "Comparator"; }
static const char* kBytewiseClassName() {
return "leveldb.BytewiseComparator";
}
static const char* kReverseBytewiseClassName() {
return "rocksdb.ReverseBytewiseComparator";
}
Contributor:
Shouldn't these be defined in the derived class?

Contributor (PR author):
They are defined here so they can be used publicly. If they are in the derived class, then these constants cannot be used as strings/inputs to CreateFromString. Generally speaking, I have thought the names of the builtin Customizable objects should be public, but I can go either way.

Contributor:
It looks odd to me that the derived class name is defined in the base class. How about other comparators?

Contributor (PR author):
The sample is small so far, as there are not that many Customizable classes.

For TableFactory, the derived names (Block, Plain, Cuckoo) are in the base class. This is necessary because there is often casting that goes on based on the type and the derived classes are not in the public API.

For Env, I could see the potential of needing the "default" or "posix" names in the base class. I am not certain that other names will have to be there. Default might be necessary so that one could cast an Env to the default or base implementation.

If the name is not in the base class, then constructing a derived class can be more problematic. The options are:

  • Create a unique constructor for every type (e.g., BytewiseComparator(), ReverseBytewiseComparator(), etc.). This is obviously not sustainable for all permutations.
  • Document the names used to create each type (e.g., 'Comparator has two built-in types, "leveldb.BytewiseComparator" and "rocksdb.ReverseBytewiseComparator". These names can be passed to CreateFromString to create ...'). This changes the problem from having a constant in the public API to having the behavior in documentation.
  • Make the implementation classes public. Yuck.

I am not opposed to making it a "documentation" problem. I have also thought about ways of creating a tool that would allow a user to see what "instances" are available (via the ObjectRegistry) to discover what these names are and possibly make the features self-documenting.

Since the names (for Comparators) are not really necessary in the public API at the moment, I can move them to the implementation classes if that is better.
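
The trade-off discussed in this thread can be sketched in miniature: the base class exposes the built-in names as public constants, the implementation classes stay hidden, and callers create instances by name. Everything below (`MiniComparator`, `BytewiseImpl`, `ReverseImpl`, and the simplified `CreateFromString` signature) is hypothetical and much simpler than the PR's API, which takes a `ConfigOptions` and returns a `Status`.

```cpp
#include <string>

// Hypothetical miniature of the pattern; not the RocksDB API.
class MiniComparator {
 public:
  virtual ~MiniComparator() = default;
  virtual const char* Name() const = 0;
  virtual int Compare(const std::string& a, const std::string& b) const = 0;

  // Public name constants: callers can refer to the built-ins without
  // ever seeing the implementation classes.
  static const char* kBytewiseClassName() {
    return "leveldb.BytewiseComparator";
  }
  static const char* kReverseBytewiseClassName() {
    return "rocksdb.ReverseBytewiseComparator";
  }

  // Returns true and sets *result when `id` names a built-in comparator.
  static bool CreateFromString(const std::string& id,
                               const MiniComparator** result);
};

namespace {
// Implementation classes stay private to this translation unit.
class BytewiseImpl : public MiniComparator {
 public:
  const char* Name() const override { return kBytewiseClassName(); }
  int Compare(const std::string& a, const std::string& b) const override {
    return a.compare(b);
  }
};

class ReverseImpl : public MiniComparator {
 public:
  const char* Name() const override { return kReverseBytewiseClassName(); }
  int Compare(const std::string& a, const std::string& b) const override {
    return b.compare(a);  // reversed argument order flips the ordering
  }
};
}  // namespace

bool MiniComparator::CreateFromString(const std::string& id,
                                      const MiniComparator** result) {
  static const BytewiseImpl bytewise{};
  static const ReverseImpl reverse{};
  if (id == kBytewiseClassName()) {
    *result = &bytewise;
    return true;
  }
  if (id == kReverseBytewiseClassName()) {
    *result = &reverse;
    return true;
  }
  return false;  // unknown name; *result is left untouched
}
```

A caller would write `MiniComparator::CreateFromString(MiniComparator::kReverseBytewiseClassName(), &cmp)`. Under the "documentation" alternative, the same call would pass the string literal instead, with nothing at compile time to catch a misspelled name.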


// Three-way comparison. Returns value:
// < 0 iff "a" < "b",
// == 0 iff "a" == "b",
1 change: 0 additions & 1 deletion include/rocksdb/utilities/options_type.h
@@ -35,7 +35,6 @@ enum class OptionType {
kCompactionPri,
kSliceTransform,
kCompressionType,
kComparator,
kCompactionFilter,
kCompactionFilterFactory,
kCompactionStopStyle,
40 changes: 24 additions & 16 deletions options/cf_options.cc
@@ -535,22 +535,30 @@ static std::unordered_map<std::string, OptionTypeInfo>
OptionVerificationType::kNormal, OptionTypeFlags::kNone,
{0, OptionType::kCompressionType})},
{"comparator",
{offset_of(&ImmutableCFOptions::user_comparator),
OptionType::kComparator, OptionVerificationType::kByName,
OptionTypeFlags::kCompareLoose,
// Parses the string and sets the corresponding comparator
[](const ConfigOptions& opts, const std::string& /*name*/,
const std::string& value, void* addr) {
auto old_comparator = static_cast<const Comparator**>(addr);
const Comparator* new_comparator = *old_comparator;
Status status =
opts.registry->NewStaticObject(value, &new_comparator);
if (status.ok()) {
*old_comparator = new_comparator;
return status;
}
return Status::OK();
}}},
OptionTypeInfo::AsCustomRawPtr<const Comparator>(
offset_of(&ImmutableCFOptions::user_comparator),
OptionVerificationType::kByName, OptionTypeFlags::kCompareLoose,
// Serializes a Comparator
[](const ConfigOptions& /*opts*/, const std::string&,
const void* addr, std::string* value) {
// it's a const pointer of const Comparator*
const auto* ptr = static_cast<const Comparator* const*>(addr);

// Since the user-specified comparator will be wrapped by
// InternalKeyComparator, we should persist the user-specified
// one instead of InternalKeyComparator.
if (*ptr == nullptr) {
*value = kNullptrString;
} else {
const Comparator* root_comp = (*ptr)->GetRootComparator();
if (root_comp == nullptr) {
root_comp = (*ptr);
}
*value = root_comp->Name();
}
return Status::OK();
},
/* Use the default match function*/ nullptr)},
{"memtable_insert_with_hint_prefix_extractor",
{offset_of(
&ImmutableCFOptions::memtable_insert_with_hint_prefix_extractor),
34 changes: 33 additions & 1 deletion options/customizable_test.cc
@@ -707,6 +707,14 @@ static int RegisterTestObjects(ObjectLibrary& library,
guard->reset(new mock::MockTableFactory());
return guard->get();
});
library.Register<Comparator>(
test::SimpleSuffixReverseComparator::kClassName(),
[](const std::string& /*uri*/, std::unique_ptr<Comparator>* /*guard*/,
std::string* /* errmsg */) {
static test::SimpleSuffixReverseComparator ssrc;
return &ssrc;
});

return static_cast<int>(library.GetFactoryCount(&num_types));
}

@@ -716,6 +724,7 @@ static int RegisterLocalObjects(ObjectLibrary& library,
// Load any locally defined objects here
return static_cast<int>(library.GetFactoryCount(&num_types));
}
#endif // !ROCKSDB_LITE

class LoadCustomizableTest : public testing::Test {
public:
@@ -755,7 +764,30 @@ TEST_F(LoadCustomizableTest, LoadTableFactoryTest) {
ASSERT_STREQ(factory->Name(), "MockTable");
}
}
#endif // !ROCKSDB_LITE

TEST_F(LoadCustomizableTest, LoadComparatorTest) {
const Comparator* result = nullptr;
ASSERT_NOK(Comparator::CreateFromString(
config_options_, test::SimpleSuffixReverseComparator::kClassName(),
&result));
ASSERT_OK(Comparator::CreateFromString(
config_options_, Comparator::kBytewiseClassName(), &result));
ASSERT_NE(result, nullptr);
ASSERT_STREQ(result->Name(), Comparator::kBytewiseClassName());
ASSERT_OK(Comparator::CreateFromString(
config_options_, Comparator::kReverseBytewiseClassName(), &result));
ASSERT_NE(result, nullptr);
ASSERT_STREQ(result->Name(), Comparator::kReverseBytewiseClassName());

if (RegisterTests("Test")) {
ASSERT_OK(Comparator::CreateFromString(
config_options_, test::SimpleSuffixReverseComparator::kClassName(),
&result));
ASSERT_NE(result, nullptr);
ASSERT_STREQ(result->Name(),
test::SimpleSuffixReverseComparator::kClassName());
}
}

} // namespace ROCKSDB_NAMESPACE
int main(int argc, char** argv) {
17 changes: 0 additions & 17 deletions options/options_helper.cc
@@ -562,23 +562,6 @@ bool SerializeSingleOptionHelper(const void* opt_address,
: kNullptrString;
break;
}
case OptionType::kComparator: {
// it's a const pointer of const Comparator*
const auto* ptr = static_cast<const Comparator* const*>(opt_address);
// Since the user-specified comparator will be wrapped by
// InternalKeyComparator, we should persist the user-specified one
// instead of InternalKeyComparator.
if (*ptr == nullptr) {
*value = kNullptrString;
} else {
const Comparator* root_comp = (*ptr)->GetRootComparator();
if (root_comp == nullptr) {
root_comp = (*ptr);
}
*value = root_comp->Name();
}
break;
}
case OptionType::kCompactionFilter: {
// it's a const pointer of const CompactionFilter*
const auto* ptr =
6 changes: 2 additions & 4 deletions test_util/testutil.h
@@ -98,10 +98,8 @@ class PlainInternalKeyComparator : public InternalKeyComparator {
class SimpleSuffixReverseComparator : public Comparator {
public:
SimpleSuffixReverseComparator() {}

virtual const char* Name() const override {
return "SimpleSuffixReverseComparator";
}
static const char* kClassName() { return "SimpleSuffixReverseComparator"; }
virtual const char* Name() const override { return kClassName(); }

virtual int Compare(const Slice& a, const Slice& b) const override {
Slice prefix_a = Slice(a.data(), 8);
88 changes: 83 additions & 5 deletions util/comparator.cc
@@ -8,20 +8,26 @@
// found in the LICENSE file. See the AUTHORS file for names of contributors.

#include "rocksdb/comparator.h"

#include <stdint.h>

#include <algorithm>
#include <memory>
#include <mutex>

#include "options/configurable_helper.h"
#include "port/port.h"
#include "rocksdb/slice.h"
#include "rocksdb/utilities/object_registry.h"

namespace ROCKSDB_NAMESPACE {

namespace {
class BytewiseComparatorImpl : public Comparator {
public:
BytewiseComparatorImpl() { }

const char* Name() const override { return "leveldb.BytewiseComparator"; }
static const char* kClassName() { return kBytewiseClassName(); }
const char* Name() const override { return kClassName(); }

int Compare(const Slice& a, const Slice& b) const override {
return a.compare(b);
@@ -139,9 +145,8 @@ class ReverseBytewiseComparatorImpl : public BytewiseComparatorImpl {
public:
ReverseBytewiseComparatorImpl() { }

const char* Name() const override {
return "rocksdb.ReverseBytewiseComparator";
}
static const char* kClassName() { return kReverseBytewiseClassName(); }
const char* Name() const override { return kClassName(); }

int Compare(const Slice& a, const Slice& b) const override {
return -a.compare(b);
@@ -220,4 +225,77 @@ const Comparator* ReverseBytewiseComparator() {
return &rbytewise;
}

#ifndef ROCKSDB_LITE
static int RegisterBuiltinComparators(ObjectLibrary& library,
const std::string& /*arg*/) {
library.Register<const Comparator>(
BytewiseComparatorImpl::kClassName(),
[](const std::string& /*uri*/,
std::unique_ptr<const Comparator>* /*guard */,
std::string* /* errmsg */) { return BytewiseComparator(); });
library.Register<const Comparator>(
ReverseBytewiseComparatorImpl::kClassName(),
[](const std::string& /*uri*/,
std::unique_ptr<const Comparator>* /*guard */,
std::string* /* errmsg */) { return ReverseBytewiseComparator(); });
return 2;
}
#endif // ROCKSDB_LITE

Status Comparator::CreateFromString(const ConfigOptions& config_options,
const std::string& value,
const Comparator** result) {
Contributor:
It's counterintuitive to have the returned result be const. Is it because the option is const?

const Comparator* comparator = BytewiseComparator();

Contributor (PR author):
Yes, it is defined here as const because the option is const. This actually makes everything else slightly harder to do. If I make them non-const, it also will break (at least UBSAN) compatibility with any code that currently uses the ObjectRegistry to register their comparators.
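
The constraint described here follows from C++ pointer conversion rules: when the option field is a pointer-to-const, its address has type "pointer to pointer-to-const", and that cannot be passed to a function expecting a pointer to a mutable pointer. A minimal sketch, using hypothetical stand-ins (`MiniCmp`, `MiniOptions`, `MiniCreate`) rather than the RocksDB types:

```cpp
#include <type_traits>

// Hypothetical stand-ins for illustration; not the RocksDB types.
struct MiniCmp {
  int id;
};

struct MiniOptions {
  // Mirrors an option declared as a pointer-to-const comparator.
  const MiniCmp* comparator = nullptr;
};

// The out-parameter matches the option's type, so callers can pass
// &options.comparator directly.
inline bool MiniCreate(int id, const MiniCmp** result) {
  static const MiniCmp kBuiltins[] = {{0}, {1}};
  if (id < 0 || id > 1) return false;
  *result = &kBuiltins[id];
  return true;
}

// A factory taking `MiniCmp** result` could not accept &options.comparator:
// the conversion would silently drop const, so the language forbids it.
static_assert(!std::is_convertible<const MiniCmp**, MiniCmp**>::value,
              "const MiniCmp** must not convert to MiniCmp**");
```

The cost shows up on the other side, as in the `const_cast` near the end of `CreateFromString` in this diff, where the newly created object has to be configured through a non-const pointer.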

#ifndef ROCKSDB_LITE
static std::once_flag once;
std::call_once(once, [&]() {
RegisterBuiltinComparators(*(ObjectLibrary::Default().get()), "");
});
#endif // ROCKSDB_LITE
std::string id;
std::unordered_map<std::string, std::string> opt_map;
Status status =
ConfigurableHelper::GetOptionsMap(value, *result, &id, &opt_map);
if (!status.ok()) { // GetOptionsMap failed
return status;
}
std::string curr_opts;
#ifndef ROCKSDB_LITE
if (*result != nullptr && (*result)->GetId() == id) {
// Try to get the existing options, ignoring any errors
ConfigOptions embedded = config_options;
embedded.delimiter = ";";
(*result)->GetOptionString(embedded, &curr_opts).PermitUncheckedError();
}
#endif
if (id == BytewiseComparatorImpl::kClassName()) {
*result = BytewiseComparator();
} else if (id == ReverseBytewiseComparatorImpl::kClassName()) {
*result = ReverseBytewiseComparator();
} else if (value.empty()) {
// No Id and no options. Clear the object
*result = nullptr;
return Status::OK();
} else if (id.empty()) { // We have no Id but have options. Not good
return Status::NotSupported("Cannot reset object ", id);
} else {
#ifndef ROCKSDB_LITE
status = config_options.registry->NewStaticObject(id, result);
#else
status = Status::NotSupported("Cannot load object in LITE mode ", id);
#endif // ROCKSDB_LITE
if (!status.ok()) {
if (config_options.ignore_unsupported_options &&
status.IsNotSupported()) {
return Status::OK();
} else {
return status;
}
} else if (!curr_opts.empty() || !opt_map.empty()) {
Comparator* comparator = const_cast<Comparator*>(*result);
status = ConfigurableHelper::ConfigureNewObject(
config_options, comparator, id, curr_opts, opt_map);
}
}
return status;
}
} // namespace ROCKSDB_NAMESPACE
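
The once-only registration of built-ins followed by a lookup by id can be sketched in isolation. Note the swap: the diff uses `std::call_once` with a `std::once_flag`, while this sketch uses a function-local static, which C++11 also guarantees is initialized exactly once even under concurrent first use. `ToyComparator`, `ToyRegistry`, `RegisterBuiltins`, and `Lookup` are hypothetical names, and the real `ObjectLibrary`/`ObjectRegistry` machinery does considerably more.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical stand-in; a real comparator would have virtual methods.
struct ToyComparator {
  const char* name;
};

using ToyRegistry = std::unordered_map<std::string, const ToyComparator*>;

// Counts how many times the built-ins were registered (for demonstration).
inline int& RegistrationCount() {
  static int count = 0;
  return count;
}

inline void RegisterBuiltins(ToyRegistry& registry) {
  static const ToyComparator bytewise{"leveldb.BytewiseComparator"};
  static const ToyComparator reverse{"rocksdb.ReverseBytewiseComparator"};
  registry[bytewise.name] = &bytewise;
  registry[reverse.name] = &reverse;
  ++RegistrationCount();
}

// The lambda runs once, on first use, no matter how many callers race here.
inline const ToyRegistry& DefaultRegistry() {
  static const ToyRegistry registry = [] {
    ToyRegistry r;
    RegisterBuiltins(r);
    return r;
  }();
  return registry;
}

// Returns the registered comparator for `id`, or nullptr if none exists.
inline const ToyComparator* Lookup(const std::string& id) {
  const ToyRegistry& registry = DefaultRegistry();
  const auto it = registry.find(id);
  return it == registry.end() ? nullptr : it->second;
}
```

Repeated calls to `Lookup` reuse the same registry, so `RegistrationCount()` stays at 1. An unregistered name such as the test-only `"SimpleSuffixReverseComparator"` yields `nullptr` here, which loosely corresponds to the `ASSERT_NOK` case in `LoadComparatorTest` before the test objects are registered.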