-
Notifications
You must be signed in to change notification settings - Fork 594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(storage): use exist_table_id to filter exist key when compaction #3038
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add some tests at least for the following cases:
- Data of all tables in an dropped MV will be cleared after compaction.
- Data of all tables in any existing MV will not be cleared after compaction.
- Shared buffer / local compaction works as usual
proto/hummock.proto
Outdated
@@ -140,6 +140,9 @@ message CompactTask { | |||
repeated common.ParallelUnitMapping vnode_mappings = 11; | |||
// compaction group the task belongs to | |||
uint64 compaction_group_id = 12; | |||
|
|||
// exist_table_id for compaction drop key | |||
repeated uint32 exist_table_id = 13; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nits: exist_table_id
-> existing_table_ids
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix it.
and i add some test below
- check drop all keys without existing_table_id_set, we can see after compaction, no sstable produce
- check to compaction diffence keyspace (table_id prefix) with existing_table_id_set, we can see that the key which have drop_table_id prefix has been drop, and the other prefix (with exist_table_id prefix) remain. the test check all keys by hummock_scan, to avoid unexpectd data lost
- local compaction logic and workflow not change , i think we test it on above unit test and e2e
src/storage/src/hummock/compactor.rs
Outdated
// iter.next().await?; | ||
// continue; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
remove commented lines
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fix it
src/storage/src/hummock/compactor.rs
Outdated
@@ -160,6 +160,7 @@ impl Compactor { | |||
// VNode mappings are not required when compacting shared buffer to L0 | |||
vnode_mappings: vec![], | |||
compaction_group_id: StaticCompactionGroupId::StateDefault.into(), | |||
exist_table_id: vec![], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will cause shared buffer compaction producing empty SSTs. We should disable the check for dropped table ids for shared buffer compaction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, i think that we can distinguish shared_buffer_compation and normal_compaction by using strategy like CompactionFilter (next commit)
818fec6
to
244f571
Compare
244f571
to
4b39ef2
Compare
Codecov Report
@@ Coverage Diff @@
## main #3038 +/- ##
==========================================
+ Coverage 73.49% 73.57% +0.07%
==========================================
Files 733 733
Lines 99536 99784 +248
==========================================
+ Hits 73157 73412 +255
+ Misses 26379 26372 -7
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more |
b046871
to
8980df5
Compare
src/storage/src/hummock/compactor.rs
Outdated
} | ||
} | ||
|
||
// in our design, frontend avoid to access keys which had be deleted, so we dont need to | ||
// consider the epoch when the compaction_filter match (it means that mv had drop) | ||
if !compaction_filter.filter(iter_key) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if !compaction_filter.filter(iter_key) { | |
if !drop && !compaction_filter.filter(iter_key) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good job! LGTM.
What's changed and what's your intention?
To support clean expired key when compaction
Please explain IN DETAIL what the changes are in this PR and why they are needed:
Checklist
Refer to a related PR or issue link (optional)