Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shortens blocks' last_key in the SSTable block index. #1361

Merged
merged 3 commits into from
May 6, 2022

Conversation

fulmicoton
Copy link
Collaborator

Right now we store last key in the blocks of the SSTable index.
This PR replaces the last key by a shorter string that is greater or
equal and still lesser than the next key.
This property is sufficiently to ensure the block index
works properly.

// It is possible to do one character shorter in some case,
// but it is not worth the extra complexity
for pos in (common_len + 1)..left.len() {
if left[pos] != 255u8 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's the meaning of 255 here?

the comment says string, but we operate on bytes, what about utf-8?

Copy link
Collaborator Author

@fulmicoton fulmicoton May 6, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

255 is the max value for a u8.

It is not especially utf8. All of these are byte strings.
There are already u64 byte representation in there.

We do not expose the last term in block so it should be safe.

@fulmicoton fulmicoton force-pushed the shorter-last-key branch 3 times, most recently from 2c47e3c to 937b155 Compare May 6, 2022 05:48
@codecov-commenter
Copy link

codecov-commenter commented May 6, 2022

Codecov Report

Merging #1361 (2cb1bd3) into main (386ffab) will increase coverage by 0.00%.
The diff coverage is 98.21%.

@@           Coverage Diff            @@
##             main    #1361    +/-   ##
========================================
  Coverage   94.28%   94.29%            
========================================
  Files         234      234            
  Lines       42528    42634   +106     
========================================
+ Hits        40099    40201   +102     
- Misses       2429     2433     +4     
Impacted Files Coverage Δ
...termdict/sstable_termdict/sstable/sstable_index.rs 99.02% <97.72%> (-0.98%) ⬇️
src/termdict/sstable_termdict/sstable/mod.rs 100.00% <100.00%> (ø)
src/query/query_parser/query_parser.rs 95.15% <0.00%> (-0.22%) ⬇️
query-grammar/src/query_grammar.rs 99.65% <0.00%> (+0.01%) ⬆️
src/store/index/mod.rs 98.37% <0.00%> (+0.54%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 386ffab...2cb1bd3. Read the comment docs.

Right now we store last key in the blocks of the SSTable index.
This PR replaces the last key by a shorter string that is greater or
equal and still lesser than the next key.
This property is sufficiently to ensure the block index
works properly.

Related to quickwit#1366
fulmicoton and others added 2 commits May 6, 2022 16:39
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
Co-authored-by: PSeitz <PSeitz@users.noreply.github.com>
@fulmicoton fulmicoton merged commit e304497 into main May 6, 2022
@fulmicoton fulmicoton deleted the shorter-last-key branch May 6, 2022 08:29
This was referenced Jan 13, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants