Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make InternalKeyComparator not configurable #10342

Closed
wants to merge 3 commits into from

Conversation

siying
Copy link
Contributor

@siying siying commented Jul 11, 2022

Summary:
InternalKeyComparator is an internal class which is a simple wrapper of Comparator. #8336 made Comparator customizeable. As a side effect, internal key comparator was made configurable too. This introduces overhead to this simple wrapper. For example, every InternalKeyComparator will have an std::vector attached to it, which consumes memory and possible allocation overhead too.
We remove InternalKeyComparator from being customizable by making InternalKeyComparator not a subclass of Comparator.

Test Plan: Run existing CI tests and make sure it doesn't fail

@facebook-github-bot
Copy link
Contributor

@siying has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

HISTORY.md Outdated
@@ -23,6 +23,7 @@
* Fix a bug in which backup/checkpoint can include a WAL deleted by RocksDB.
* Fix a bug where concurrent compactions might cause unnecessary further write stalling. In some cases, this might cause write rate to drop to minimum.
* Fix a bug in Logger where if dbname and db_log_dir are on different filesystems, dbname creation would fail wrt to db_log_dir path returning an error and fails to open the DB.
* Fix the regression introduce by https://github.com/facebook/rocksdb/pull/8336 which made InternalKeyComparator configurable as an unintended side effect.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CPU efficiency (kind of regression)

@@ -17,6 +17,20 @@ namespace ROCKSDB_NAMESPACE {

class Slice;

class CompareInterface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a comment for why this exists, such as "used internally"

@@ -386,7 +386,7 @@ void MergingIterator::SwitchToForward() {
if (child.status() == Status::TryAgain()) {
continue;
}
if (child.Valid() && comparator_->Equal(target, child.key())) {
if (child.Valid() && comparator_->Compare(target, child.key()) == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I could see this being a performance regression on large keys where there's a good chance a.size() != b.size().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Potentially yes but now InternalKeyComparator just uses a default implementation of Equal() which does exactly what is done here. Improving efficiency of InternalKeyComparator::Equal() sounds outside the scope of the PR and I don't see a need to mix the two.

@mrambacher
Copy link
Contributor

While I agree that it is less-than-ideal to have an unused vector here taking up space, I am wondering if this is the best solution or what problem is being solved here. What is the performance difference with this change?

Many of the methods that are part of the Comparator have not been moved to the ComparatorInterface. Is there a reason these were skipped?

What would happen if instead the std::vector was changed into a std::unique_ptrstd::vector, and constructors (and guards) were added that allowed for this vector to be null? Would the equivalent performance/size gain be accomplished for all classes/implementations that chose to use that new constructor (and not have any options)?

@siying
Copy link
Contributor Author

siying commented Jul 12, 2022

While I agree that it is less-than-ideal to have an unused vector here taking up space, I am wondering if this is the best solution or what problem is being solved here. What is the performance difference with this change?

Many of the methods that are part of the Comparator have not been moved to the ComparatorInterface. Is there a reason these were skipped?

What would happen if instead the std::vector was changed into a std::unique_ptrstd::vector, and constructors (and guards) were added that allowed for this vector to be null? Would the equivalent performance/size gain be accomplished for all classes/implementations that chose to use that new constructor (and not have any options)?

We know that BlockIter is quite sensitive to memory size. The reason is that it would usually use memory allocated by ArenaWrappedDBIter. We experienced performance regression at least twice for object size increase in BlockIter. A whole InternalKeyComparator will likely to be put in BlockIter after we fix #9611 for #10340 . Even 8 byte should ideally be removed. Fundamentally, I don't see a point of making a non-configurable class configurable. It is simply confusing and is hard to maintain.

Regarding not putting more functions in CompareInterface. Ideally, Comparator and InternalKeyComparator don't need to share any function. They operate on different things and don't need to share. I am trying to be conscious on what they share and minimize it. It would also make it easy to get ride of CompareInterface in the future.

@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@siying has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

db/dbformat.h Outdated
Comment on lines 265 to 266
void FindShortestSeparator(std::string* start, const Slice& limit) const;
void FindShortSuccessor(std::string* key) const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these methods still needed/used? If so, can they be renamed so that it is clear they are not the same/related to the ones in Comparator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These functions are still used index_builder.h. Let me see whether I can move it to clear InternalKeyComparator.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved the functions to IndexBuilder.

db/dbformat.h Outdated
@@ -273,7 +277,7 @@ class InternalKeyComparator
// value `kDisableGlobalSequenceNumber`.
int Compare(const Slice& a, SequenceNumber a_global_seqno, const Slice& b,
SequenceNumber b_global_seqno) const;
virtual const Comparator* GetRootComparator() const override {
virtual const Comparator* GetRootComparator() const {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this method still needed? If so, can it be renamed to prevent confusion with the method in Comparator?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me see whether I can delete the function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed it.

@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

siying added 3 commits July 13, 2022 16:13
Summary:
InternalKeyComparator is an internal class which is a simple wrapper of Comparator. facebook#8336 made Comparator customizeable. As a side effect, internal key comparator was made configurable too. This introduces overhead to this simple wrapper. For example, every InternalKeyComparator will have an std::vector attached to it, which consumes memory and possible allocation overhead too.
We remove InternalKeyComparator from being customizable by making InternalKeyComparator not a subclass of Comparator.

Test Plan: Run existing CI tests and make sure it doesn't fail
@siying siying force-pushed the internal_key_name branch from 1c30c38 to 7342554 Compare July 13, 2022 23:13
@facebook-github-bot
Copy link
Contributor

@siying has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@siying has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

Copy link
Contributor

@mrambacher mrambacher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the changes

@@ -9,6 +9,7 @@

#include "db/dbformat.h"

#include "table/block_based/index_builder.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This should probably include compaator.h as well.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants