This repository has been archived by the owner on Aug 13, 2019. It is now read-only.

postings compression exploration #629

Draft: wants to merge 18 commits into base: master


naivewong
Contributor

  1. Store the first posting, then store every other posting as a delta to that first value (not to the previous element). Find the minimum number of bits required to hold the largest delta, then bit-pack all deltas using that many bits. Because every delta has the same width, binary search also works on this layout.
  2. Use blocks of postings, e.g. 4KB blocks, and store each value as a delta to the previous number. Binary search to the right block, then iterate within it.
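
To make idea 1 concrete, here is a minimal sketch, not the code from this PR. The names `basePacked`, `packBase`, `at`, and `seek` are hypothetical, and the input is assumed to be sorted in ascending order. It stores the first posting as the base, bit-packs each delta to that base with a fixed width, and uses the fixed width for random access during binary search.

```go
package main

import (
	"fmt"
	"math/bits"
	"sort"
)

// basePacked is a hypothetical container: a base value plus
// fixed-width deltas to that base, packed MSB first.
type basePacked struct {
	base  uint64
	width int // bits per delta
	n     int // number of postings
	data  []byte
}

// packBase encodes sorted postings as deltas to the first posting.
func packBase(postings []uint64) basePacked {
	if len(postings) == 0 {
		return basePacked{}
	}
	base := postings[0]
	// The last posting is the largest, so it needs the widest delta.
	width := bits.Len64(postings[len(postings)-1] - base)
	p := basePacked{base: base, width: width, n: len(postings)}
	p.data = make([]byte, (width*len(postings)+7)/8)
	for i, v := range postings {
		d := v - base
		for b := 0; b < width; b++ {
			if d>>uint(width-1-b)&1 == 1 {
				pos := i*width + b
				p.data[pos/8] |= 1 << uint(7-pos%8)
			}
		}
	}
	return p
}

// at decodes the i-th posting without touching its neighbours.
func (p basePacked) at(i int) uint64 {
	var d uint64
	for b := 0; b < p.width; b++ {
		pos := i*p.width + b
		d = d<<1 | uint64(p.data[pos/8]>>uint(7-pos%8)&1)
	}
	return p.base + d
}

// seek returns the index of the first posting >= target, or n if none.
func (p basePacked) seek(target uint64) int {
	return sort.Search(p.n, func(i int) bool { return p.at(i) >= target })
}

func main() {
	p := packBase([]uint64{100, 103, 110, 1000, 1023})
	fmt.Println(p.width, p.at(3), p.seek(104))
}
```

The trade-off in this sketch is that one large gap near the end of the list widens every delta, which is part of the motivation for the block-based idea 2.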

Signed-off-by: naivewong <867245430@qq.com>
@naivewong
Contributor Author

index/postings.go (outdated)

startOff := it.offset / (deltaBlockSize * 8) * deltaBlockSize
num := len(it.bs.bstream) / deltaBlockSize - it.offset / (deltaBlockSize * 8)
// Do binary search between current position and end.
Contributor

Try checking if the next block is too big before doing the search.
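
One possible reading of this suggestion, sketched under assumptions: before binary searching over the remaining blocks, look at the first posting of the next block, and if it is already larger than the target, stay in the current block. The sketch below is not the PR's `Seek()`; the `block` type, `perBlock`, `encode`, and `seek` are hypothetical, and it uses uvarint deltas to the previous posting instead of the bitstream in the diff.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"sort"
)

// perBlock is a hypothetical postings-per-block count, kept tiny for the demo.
const perBlock = 4

// block keeps its first posting uncompressed; the rest are uvarint
// deltas to the previous posting.
type block struct {
	first  uint64
	deltas []byte
}

// encode splits sorted postings into blocks of perBlock entries.
func encode(postings []uint64) []block {
	var out []block
	var buf [binary.MaxVarintLen64]byte
	for i := 0; i < len(postings); i += perBlock {
		end := i + perBlock
		if end > len(postings) {
			end = len(postings)
		}
		b := block{first: postings[i]}
		prev := postings[i]
		for _, v := range postings[i+1 : end] {
			n := binary.PutUvarint(buf[:], v-prev)
			b.deltas = append(b.deltas, buf[:n]...)
			prev = v
		}
		out = append(out, b)
	}
	return out
}

// seek returns the first posting >= target at or after block cur,
// together with the index of the block it was found in.
func seek(blocks []block, cur int, target uint64) (uint64, int, bool) {
	if cur >= len(blocks) {
		return 0, cur, false
	}
	idx := cur
	// Cheap check first: only search further blocks if the next block
	// starts at or below the target.
	if cur+1 < len(blocks) && blocks[cur+1].first <= target {
		rest := blocks[cur+1:]
		j := sort.Search(len(rest), func(i int) bool { return rest[i].first > target })
		idx = cur + j // last block whose first posting is <= target
	}
	// Linear scan inside the chosen block.
	v := blocks[idx].first
	if v >= target {
		return v, idx, true
	}
	for d := blocks[idx].deltas; len(d) > 0; {
		delta, n := binary.Uvarint(d)
		d = d[n:]
		v += delta
		if v >= target {
			return v, idx, true
		}
	}
	// Everything in this block is smaller; the answer starts the next block.
	if idx+1 < len(blocks) {
		return blocks[idx+1].first, idx + 1, true
	}
	return 0, len(blocks), false
}

func main() {
	blocks := encode([]uint64{1, 3, 7, 20, 25, 30, 31, 40, 100, 101})
	fmt.Println(seek(blocks, 0, 21))
}
```

When consecutive `Seek()` targets are close together, the extra comparison avoids the binary search entirely, which is presumably the case the comment has in mind.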

@naivewong
Contributor Author

  • Fixed a bug in Seek(); the benchmark results look much better now.
  • Added another block-postings variant that stores each delta relative to the block's base instead of the previous value, so we can also binary search inside the block.
  • Updated the benchmark results.
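
The base-delta block variant in the second bullet can be sketched roughly as follows. This is an illustration under assumptions, not the PR's implementation: `baseBlock`, `blockLen`, `encodeBase`, and `seekBase` are hypothetical names, and deltas are stored as plain `uint32` values (assumed to fit) rather than bit-packed, which is enough to show the two-level binary search.

```go
package main

import (
	"fmt"
	"sort"
)

// blockLen is a hypothetical postings-per-block count, kept tiny for the demo.
const blockLen = 4

// baseBlock stores every posting as a delta to the block's first value.
// Fixed-width deltas give O(1) access, so the block itself is searchable.
type baseBlock struct {
	base   uint64
	deltas []uint32 // deltas[i] = posting[i] - base; assumed to fit in 32 bits
}

// encodeBase splits sorted postings into base-delta blocks.
func encodeBase(postings []uint64) []baseBlock {
	var out []baseBlock
	for i := 0; i < len(postings); i += blockLen {
		end := i + blockLen
		if end > len(postings) {
			end = len(postings)
		}
		b := baseBlock{base: postings[i]}
		for _, v := range postings[i:end] {
			b.deltas = append(b.deltas, uint32(v-b.base))
		}
		out = append(out, b)
	}
	return out
}

// seekBase returns the first posting >= target.
func seekBase(blocks []baseBlock, target uint64) (uint64, bool) {
	// Outer search: first block whose last posting is >= target.
	bi := sort.Search(len(blocks), func(i int) bool {
		b := blocks[i]
		return b.base+uint64(b.deltas[len(b.deltas)-1]) >= target
	})
	if bi == len(blocks) {
		return 0, false
	}
	b := blocks[bi]
	if target <= b.base {
		return b.base, true
	}
	// Inner search: binary search on the deltas inside the block.
	want := target - b.base
	di := sort.Search(len(b.deltas), func(i int) bool { return uint64(b.deltas[i]) >= want })
	return b.base + uint64(b.deltas[di]), true
}

func main() {
	blocks := encodeBase([]uint64{1, 3, 7, 20, 25, 30, 31, 40, 100, 101})
	fmt.Println(seekBase(blocks, 31))
}
```

Compared with deltas to the previous value, this layout trades somewhat larger deltas for the ability to skip the linear scan inside a block.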

Signed-off-by: naivewong <867245430@qq.com>
@naivewong naivewong force-pushed the postings-compression branch 2 times, most recently from ef22dcd to 11f2b1b Compare June 17, 2019 16:52
Signed-off-by: naivewong <867245430@qq.com>
@naivewong naivewong force-pushed the postings-compression branch from 11f2b1b to 7cfcf3d Compare June 17, 2019 17:00
Signed-off-by: naivewong <867245430@qq.com>
@naivewong naivewong force-pushed the postings-compression branch from e11da7c to 99ce72a Compare June 18, 2019 16:16
@naivewong naivewong force-pushed the postings-compression branch from d130757 to b3f2b5e Compare June 28, 2019 07:46
naivewong added 2 commits July 2, 2019 16:58
@naivewong naivewong force-pushed the postings-compression branch 2 times, most recently from c4b6f0f to c7505ed Compare July 4, 2019 14:49
Signed-off-by: naivewong <867245430@qq.com>
@naivewong naivewong force-pushed the postings-compression branch from c7505ed to 95f3c79 Compare July 5, 2019 02:52
naivewong added 4 commits July 8, 2019 23:56
…comments)


3 participants