Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

compaction: send file size and range segments to compaction partitioner context #341

Merged
merged 17 commits into from
Sep 7, 2023

Conversation

YuJuncen
Copy link

This PR added a fields describing the overlapped file info of "next level of output level" (i.e. output_level + 1) info to the partitioner factory if there is.

Signed-off-by: hillium <yujuncen@pingcap.com>
Signed-off-by: hillium <yujuncen@pingcap.com>
Signed-off-by: hillium <yujuncen@pingcap.com>
Signed-off-by: hillium <yujuncen@pingcap.com>
Signed-off-by: hillium <yujuncen@pingcap.com>
This reverts commit dde965f.

Signed-off-by: hillium <yujuncen@pingcap.com>
Signed-off-by: hillium <yujuncen@pingcap.com>
@@ -577,6 +619,8 @@ std::unique_ptr<SstPartitioner> Compaction::CreateSstPartitioner() const {
context.output_level = output_level_;
context.smallest_user_key = smallest_user_key_;
context.largest_user_key = largest_user_key_;
context.output_next_level_segments =
CreateSegmentsForLevel(output_level_ + 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should consider start and end key.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The start key and end key of this compaction can be accessed during creating segments, could you describe more details about the advice?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see.

if (cmp->Compare(iter->smallest_key, largest_user_key_) > 0) {
break;
}
segs.emplace_back(iter->fd.GetFileSize(), iter->largest_key);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know why test work, but this doesn't copy string, the string will be destructed once iter is out of scope.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how to own a Slice, it looks like a pure reference type(The content is a const char*)... Should we use PinSlice here (Or are there other types presents a owned slice?)?

For this case, I'm guessing the slice should live as long as VersionStorageInfo. I'm not sure whether that is true, if I got wrong, perhaps it is needed to use a owned type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are probably right.

@@ -62,6 +62,13 @@ class SstPartitioner {
virtual bool CanDoTrivialMove(const Slice& smallest_user_key,
const Slice& largest_user_key) = 0;

struct Segment {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is cumbersome to implement in ffi. You should use two separate vectors instead.

Copy link
Member

@tabokie tabokie left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LG, you should test this out in TiKV before merge.

// The level shall be empty.
return std::vector<SstPartitioner::Segment>();
}
const auto files = vsi->LevelFilesBrief(in_level);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const auto files = vsi->LevelFilesBrief(in_level);
const auto& files = vsi->LevelFilesBrief(in_level);

}
const auto files = vsi->LevelFilesBrief(in_level);
const auto cmp = immutable_options()->user_comparator;
const auto bgn = std::lower_bound(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
const auto bgn = std::lower_bound(
const auto start = std::lower_bound(

[cmp](FdWithKeyRange& fd, const Slice& slice) {
return cmp->Compare(fd.largest_key, slice) < 0;
});
const auto end = files.files + files.num_files;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move this line up.

}

std::vector<SstPartitioner::Segment> segs;
segs.emplace_back(0, bgn->smallest_key);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should only be added if there exists at least one file in range.

Signed-off-by: hillium <yujuncen@pingcap.com>
Signed-off-by: hillium <yujuncen@pingcap.com>
Signed-off-by: hillium <yujuncen@pingcap.com>
Signed-off-by: hillium <yujuncen@pingcap.com>
Signed-off-by: hillium <yujuncen@pingcap.com>
Signed-off-by: hillium <yujuncen@pingcap.com>
Signed-off-by: hillium <yujuncen@pingcap.com>
db/compaction/compaction.cc Outdated Show resolved Hide resolved
Co-authored-by: Xinye Tao <xy.tao@outlook.com>
Signed-off-by: hillium <yujuncen@pingcap.com>
@YuJuncen YuJuncen force-pushed the next-level-seg-on-compaction branch from d644dd8 to 9134efe Compare August 23, 2023 06:59
Copy link
Member

@Connor1996 Connor1996 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@@ -566,6 +567,52 @@ std::unique_ptr<CompactionFilter> Compaction::CreateCompactionFilter(
context);
}

std::pair<std::vector<Slice>, std::vector<uint64_t>>
Compaction::CreateSegmentsForLevel(int in_level) const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why call it in_level?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This creates the segment of the input level, so the argument is named in_level, perhaps a plain level might be fine. 🤔️

Signed-off-by: hillium <yujuncen@pingcap.com>
@YuJuncen
Copy link
Author

YuJuncen commented Sep 7, 2023

@Connor1996 could you help to merge this?

@Connor1996 Connor1996 merged commit e2f6ec7 into tikv:6.29.tikv Sep 7, 2023
2 checks passed
v01dstar pushed a commit to v01dstar/rocksdb that referenced this pull request May 21, 2024
…er context (tikv#341)

compaction: send file size and range segments to compaction partitioner context

Signed-off-by: hillium <yujuncen@pingcap.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants