Extend StringAppendTESTOperator to use string delimiter of variable length #4806

Open
wants to merge 3 commits into
base: main
22 changes: 22 additions & 0 deletions java/rocksjni/merge_operator.cc
@@ -36,6 +36,28 @@ jlong Java_org_rocksdb_StringAppendOperator_newSharedStringAppendOperator(
return reinterpret_cast<jlong>(sptr_string_append_op);
}

/*
* Class: org_rocksdb_StringAppendOperator
* Method: newSharedStringAppendTESTOperator
* Signature: ([B)J
*/
jlong Java_org_rocksdb_StringAppendOperator_newSharedStringAppendTESTOperator(
JNIEnv* env, jclass, jbyteArray jdelim) {
jboolean has_exception = JNI_FALSE;
std::string delim = rocksdb::JniUtil::byteString<std::string>(
env, jdelim,
[](const char* str, const size_t len) { return std::string(str, len); },
&has_exception);
if (has_exception == JNI_TRUE) {
// exception occurred
return 0;
}

auto* sptr_string_append_test_op = new std::shared_ptr<rocksdb::MergeOperator>(
rocksdb::MergeOperators::CreateStringAppendTESTOperator(delim));
return reinterpret_cast<jlong>(sptr_string_append_test_op);
}

/*
* Class: org_rocksdb_StringAppendOperator
* Method: disposeInternal
7 changes: 6 additions & 1 deletion java/src/main/java/org/rocksdb/StringAppendOperator.java
@@ -14,10 +14,15 @@ public StringAppendOperator() {
this(',');
}

public StringAppendOperator(char delim) {
public StringAppendOperator(final char delim) {
super(newSharedStringAppendOperator(delim));
}

public StringAppendOperator(final byte[] delim) {
super(newSharedStringAppendTESTOperator(delim));

It seems a bad idea that a different implementation of the string append operator is used depending on the constructor argument. I wonder if it would be better to either i) have a different class in the Java API that always goes against StringAppendTESTOperator, or ii) replace SharedStringAppendOperator with SharedStringAppendTESTOperator in all places here. Is there any scenario where a user would prefer SharedStringAppendOperator over SharedStringAppendTESTOperator?

Contributor Author

True. While implementing this, I was not sure about all the implications for the API.

ii) I do not know the exact origin of the name or all the backwards-compatibility implications of using only SharedStringAppendTESTOperator, but it seems to me that it is more performant and that it uses the newer C++ merge interface. Maybe @sagar0 could help answer this question.

If ii) is not applicable, i) is what I suggested in my other comment: add a separate Java class, StringAppendOperatorWithVariableDelimitor, which uses SharedStringAppendTESTOperator, if this looks OK for the API.

Collaborator

I think for the time being this approach is okay.

  • StringAppendOperator implements the Associative Merge Operator
  • StringAppendTESTOperator implements the Non-Associative Merge Operator API
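
To make the distinction concrete, here is a minimal standalone sketch of the two merge styles applied to string appending. It does not use the RocksDB headers; `AssocAppend` and `FullAppend` are hypothetical names chosen for illustration, and the sketch assumes the usual shape of the two APIs: the associative style folds one operand into the existing value per call, while the full-merge style sees the existing value and all pending operands at once.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the associative style: each call folds a single
// operand into the value accumulated so far.
std::string AssocAppend(const std::string& existing, const std::string& operand,
                        const std::string& delim) {
  return existing + delim + operand;
}

// Hypothetical stand-in for the non-associative (full merge) style: the
// existing value and every pending operand are visible in one call.
std::string FullAppend(const std::string& existing,
                       const std::vector<std::string>& operands,
                       const std::string& delim) {
  // Because all operands are known up front, the final size can be computed
  // before any bytes are copied.
  std::size_t total = existing.size();
  for (const auto& op : operands) total += delim.size() + op.size();

  std::string out;
  out.reserve(total);  // one allocation for the final value
  out.append(existing);
  for (const auto& op : operands) {
    out.append(delim);
    out.append(op);
  }
  assert(out.size() == total);
  return out;
}
```

For string appending both styles produce the same bytes; the difference is how many intermediate strings are built along the way.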

}

private native static long newSharedStringAppendOperator(final char delim);
private native static long newSharedStringAppendTESTOperator(final byte[] delim);
Collaborator

Why is this named with "TEST"?

Contributor Author

This might be something to discuss. In general, I tried not to change too much.

As I understand it, there are two string-appending merge operators in C++ at the moment: StringAppendOperator and StringAppendTESTOperator. The former is the first implementation, based on MergeOperator.FullMerge, from before MergeOperator.FullMergeV2 existed. The latter is more efficient and is based on the newer MergeOperator.FullMergeV2 interface. I changed the newer one in C++. In Java, StringAppendOperator.newSharedStringAppendOperator() creates the older one and newSharedStringAppendTESTOperator creates the newer one. So the Java code is now in sync with the C++ code to some extent, but it might be a bit confusing.

I thought about creating a separate Java MergeOperator class backed by StringAppendTESTOperator. It could be called, e.g., StringAppendOperatorWithVariableDelimitor. I can refactor it that way if it makes sense.
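
For reference, the header change in this PR stores the delimiter as a std::string and turns a char delimiter into a one-character string. The standalone sketch below illustrates that idea with a hypothetical `Appender` class (not the RocksDB operator; the `Append` method and the check function are assumptions made for illustration). It suggests that a single string-based implementation could serve both constructor forms.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical, simplified appender; it only mirrors the constructor pair
// from the patched header.
class Appender {
 public:
  // String delimiter of any length, including empty.
  explicit Appender(std::string delim) : delim_(std::move(delim)) {}

  // A char delimiter is just a one-character string delimiter.
  explicit Appender(char delim) : Appender(std::string(1, delim)) {}

  std::string Append(const std::string& existing,
                     const std::string& operand) const {
    return existing + delim_ + operand;
  }

 private:
  std::string delim_;
};

// Sanity check: both constructor forms yield the same merged value.
inline void CheckCharMatchesString() {
  assert(Appender(',').Append("aa", "bb") ==
         Appender(std::string(",")).Append("aa", "bb"));
}
```

Whether the Java API should expose this as one class or two is a separate question; the sketch only shows that the char case needs no separate merge logic.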

@Override protected final native void disposeInternal(final long handle);
}
30 changes: 30 additions & 0 deletions java/src/test/java/org/rocksdb/MergeTest.java
@@ -15,6 +15,7 @@
import org.junit.Test;
import org.junit.rules.TemporaryFolder;

import static java.nio.charset.StandardCharsets.UTF_8;
import static org.assertj.core.api.Assertions.assertThat;

public class MergeTest {
@@ -183,6 +184,35 @@ public void operatorOption()
}
}

@Test
public void emptyStringDelimiter() throws RocksDBException {
stringDelimiter("");
}

@Test
public void stringDelimiter() throws RocksDBException {
stringDelimiter("DELIM");
}

private void stringDelimiter(String delim) throws RocksDBException {
try (final StringAppendOperator stringAppendOperator = new StringAppendOperator(delim.getBytes(UTF_8));
final Options opt = new Options()
.setCreateIfMissing(true)
.setMergeOperator(stringAppendOperator);
final RocksDB db = RocksDB.open(opt, dbFolder.getRoot().getAbsolutePath())) {
// Writing aa under key
db.put("key".getBytes(UTF_8), "aa".getBytes(UTF_8));

// Writing bb under key
db.merge("key".getBytes(UTF_8), "bb".getBytes(UTF_8));

final byte[] value = db.get("key".getBytes(UTF_8));
final String strValue = new String(value, UTF_8);

assertThat(strValue).isEqualTo("aa" + delim + "bb");
}
}

@Test
public void uint64AddOperatorOption()
throws InterruptedException, RocksDBException {
1 change: 1 addition & 0 deletions utilities/merge_operators.h
@@ -21,6 +21,7 @@ class MergeOperators {
static std::shared_ptr<MergeOperator> CreateStringAppendOperator();
static std::shared_ptr<MergeOperator> CreateStringAppendOperator(char delim_char);
static std::shared_ptr<MergeOperator> CreateStringAppendTESTOperator();
static std::shared_ptr<MergeOperator> CreateStringAppendTESTOperator(std::string delim);
static std::shared_ptr<MergeOperator> CreateMaxOperator();
static std::shared_ptr<MergeOperator> CreateBytesXOROperator();

28 changes: 14 additions & 14 deletions utilities/merge_operators/string_append/stringappend2.cc
@@ -15,11 +15,6 @@

namespace rocksdb {

// Constructor: also specify the delimiter character.
StringAppendTESTOperator::StringAppendTESTOperator(char delim_char)
: delim_(delim_char) {
}

// Implementation for the merge operation (concatenates two strings)
bool StringAppendTESTOperator::FullMergeV2(
const MergeOperationInput& merge_in,
@@ -37,7 +32,7 @@ bool StringAppendTESTOperator::FullMergeV2(
size_t numBytes = 0;
for (auto it = merge_in.operand_list.begin();
it != merge_in.operand_list.end(); ++it) {
numBytes += it->size() + 1; // Plus 1 for the delimiter
numBytes += it->size() + delim_.size(); // Plus one delimiter
}

// Only print the delimiter after the first entry has been printed
@@ -48,20 +43,20 @@ bool StringAppendTESTOperator::FullMergeV2(
merge_out->new_value.reserve(numBytes + merge_in.existing_value->size());
merge_out->new_value.append(merge_in.existing_value->data(),
merge_in.existing_value->size());
printDelim = true;
printDelim = !delim_.empty();
} else if (numBytes) {
merge_out->new_value.reserve(
numBytes - 1); // Minus 1 since we have one less delimiter
numBytes - delim_.size()); // Minus 1 delimiter since we have one less delimiter
}

// Concatenate the sequence of strings (and add a delimiter between each)
for (auto it = merge_in.operand_list.begin();
it != merge_in.operand_list.end(); ++it) {
if (printDelim) {
merge_out->new_value.append(1, delim_);
merge_out->new_value.append(delim_);
}
merge_out->new_value.append(it->data(), it->size());
printDelim = true;
printDelim = !delim_.empty();
}

return true;
@@ -87,17 +82,17 @@ bool StringAppendTESTOperator::_AssocPartialMergeMulti(
// Determine and reserve correct size for *new_value.
size_t size = 0;
for (const auto& operand : operand_list) {
size += operand.size();
size += operand.size() + delim_.size();
}
size += operand_list.size() - 1; // Delimiters
size -= delim_.size(); // since we have one less delimiter
new_value->reserve(size);

// Apply concatenation
new_value->assign(operand_list.front().data(), operand_list.front().size());

for (std::deque<Slice>::const_iterator it = operand_list.begin() + 1;
for (auto it = operand_list.begin() + 1;
it != operand_list.end(); ++it) {
new_value->append(1, delim_);
new_value->append(delim_);
new_value->append(it->data(), it->size());
}

@@ -114,4 +109,9 @@ MergeOperators::CreateStringAppendTESTOperator() {
return std::make_shared<StringAppendTESTOperator>(',');
}

std::shared_ptr<MergeOperator>
MergeOperators::CreateStringAppendTESTOperator(std::string delim) {
return std::make_shared<StringAppendTESTOperator>(delim);
}

} // namespace rocksdb
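
The size accounting in the hunks above can be checked in isolation. The following is a minimal sketch, assuming a hypothetical free function `JoinOperands` that stands in for the patched FullMergeV2 logic: it counts one delimiter per operand, subtracts one delimiter when there is no existing value, and asserts that the reserved size matches the bytes actually written. It is not the RocksDB code and omits the MergeOperationInput and Slice types.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of RocksDB: joins an optional existing value
// and a list of operands with a delimiter of any length.
std::string JoinOperands(const std::string* existing,
                         const std::vector<std::string>& operands,
                         const std::string& delim) {
  // Count one delimiter per operand, as the patched loop does.
  std::size_t num_bytes = 0;
  for (const auto& op : operands) num_bytes += op.size() + delim.size();

  std::string out;
  if (existing != nullptr) {
    // With an existing value, every operand is preceded by a delimiter.
    num_bytes += existing->size();
    out.reserve(num_bytes);
    out.append(*existing);
  } else if (!operands.empty()) {
    // Without one, the first operand has no delimiter in front of it.
    num_bytes -= delim.size();
    out.reserve(num_bytes);
  }

  bool print_delim = (existing != nullptr);
  for (const auto& op : operands) {
    if (print_delim) out.append(delim);  // appending "" is a no-op
    out.append(op);
    print_delim = true;
  }

  // The precomputed size must equal what was written.
  assert(out.size() == num_bytes);
  return out;
}
```

With an existing value "aa", one operand "bb" and the delimiter "DELIM", this yields the same "aa" + delim + "bb" shape that the new MergeTest cases expect, including the empty-delimiter case.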
24 changes: 13 additions & 11 deletions utilities/merge_operators/string_append/stringappend2.h
@@ -13,26 +13,28 @@
#pragma once
#include <deque>
#include <string>

#include <utility>
#include "rocksdb/merge_operator.h"
#include "rocksdb/slice.h"

namespace rocksdb {

class StringAppendTESTOperator : public MergeOperator {
public:
// Constructor with delimiter
explicit StringAppendTESTOperator(char delim_char);
// Constructor with string delimiter
explicit StringAppendTESTOperator(std::string delim_str) : delim_(std::move(delim_str)) {};

// Constructor with char delimiter
explicit StringAppendTESTOperator(char delim_char) : delim_(std::string(1, delim_char)) {};

virtual bool FullMergeV2(const MergeOperationInput& merge_in,
MergeOperationOutput* merge_out) const override;
bool FullMergeV2(const MergeOperationInput& merge_in,
MergeOperationOutput* merge_out) const override;

virtual bool PartialMergeMulti(const Slice& key,
const std::deque<Slice>& operand_list,
std::string* new_value, Logger* logger) const
override;
bool PartialMergeMulti(const Slice& key,
const std::deque<Slice>& operand_list,
std::string* new_value, Logger* logger) const override;

virtual const char* Name() const override;
const char* Name() const override;

private:
// A version of PartialMerge that actually performs "partial merging".
@@ -41,7 +43,7 @@ class StringAppendTESTOperator : public MergeOperator {
const std::deque<Slice>& operand_list,
std::string* new_value, Logger* logger) const;

char delim_; // The delimiter is inserted between elements
std::string delim_; // The delimiter is inserted between elements

};
