Paired Block Bloom Filter Algorithm #29

udi-speedb · 2022-06-26T11:38:04Z

Why:

Reduce the false-positive rate (FPR) while using the same amount of memory.

What:

Develop a filter that is fast and light on CPU, yet offers a better trade-off between memory footprint and FPR.

Technical detail:

In a traditional Bloom filter there is a trade-off between memory usage and performance. RocksDB's blocked Bloom filter is faster but consumes extra memory.

The Ribbon filter, on the other hand, uses ~30% less memory but is much slower than the Bloom filter (by a factor of about 4).

The idea is to reduce the Bloom filter's memory consumption while keeping it highly performant.
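To make the memory/FPR trade-off concrete, here is a minimal sketch that compares the textbook FPR approximation for a standard Bloom filter with a simple model of a cache-line-blocked filter at the same bits per key (BPK). The 512-bit block size, the Poisson load model, and the choice of probe count are assumptions for illustration; this is not RocksDB's or Speedb's actual implementation.

```python
import math


def optimal_probes(bits_per_key: float) -> int:
    """Probe count that roughly minimizes the FPR of a standard Bloom filter."""
    return max(1, round(bits_per_key * math.log(2)))


def standard_bloom_fpr(bits_per_key: float) -> float:
    """Textbook approximation (1 - e^(-k/bpk))^k for a non-blocked filter."""
    k = optimal_probes(bits_per_key)
    return (1.0 - math.exp(-k / bits_per_key)) ** k


def blocked_bloom_fpr(bits_per_key: float, block_bits: int = 512) -> float:
    """Modeled FPR when each key is confined to one block of `block_bits` bits.

    Assumption: the number of keys per block is Poisson distributed with
    mean block_bits / bits_per_key, and the per-block FPR is averaged over
    that distribution.
    """
    k = optimal_probes(bits_per_key)
    mean_keys = block_bits / bits_per_key
    # Sum far enough into the Poisson tail that the remainder is negligible.
    upper = int(mean_keys + 12 * math.sqrt(mean_keys) + 20)
    total = 0.0
    for n in range(upper + 1):
        log_pmf = -mean_keys + n * math.log(mean_keys) - math.lgamma(n + 1)
        fill = 1.0 - (1.0 - 1.0 / block_bits) ** (k * n)
        total += math.exp(log_pmf) * fill ** k
    return total


if __name__ == "__main__":
    for bpk in (8, 16, 24):
        print(bpk, standard_bloom_fpr(bpk), blocked_bloom_fpr(bpk))
```

In this model the blocked filter always has a higher FPR than the standard one at the same BPK, because some blocks receive more keys than average. That gap is the extra memory a blocked filter has to spend to match a given FPR.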

Who:

The proposed filter should be most beneficial when a very small FPR is needed. Typically this is the case when the penalty of a false positive is very large compared to the filter test time (e.g., the database is on disk), and when true positives are rare.

Integrate a new type of filter policy: Paired Block Bloom Filter
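The use case above can be sketched as a simple cost model: for a key that is not in the file, the average lookup cost is the filter test time plus the FPR times the penalty of a wasted read. The function name and all the numbers in the example are hypothetical and chosen only to illustrate the reasoning. They are not measurements.

```python
def expected_negative_lookup_cost(filter_time_us: float,
                                  fpr: float,
                                  miss_penalty_us: float) -> float:
    """Average cost (in microseconds) of looking up an absent key.

    Every lookup pays the filter test; a fraction `fpr` of lookups also
    pays the penalty of a read that finds nothing.
    """
    if not 0.0 <= fpr <= 1.0:
        raise ValueError("fpr must be between 0 and 1")
    return filter_time_us + fpr * miss_penalty_us


if __name__ == "__main__":
    # Hypothetical numbers: a 10 ms disk read as the false-positive penalty.
    penalty_us = 10_000.0
    faster_filter = expected_negative_lookup_cost(0.1, 1e-2, penalty_us)
    lower_fpr = expected_negative_lookup_cost(0.4, 1e-4, penalty_us)
    print(faster_filter, lower_fpr)
```

With a large penalty, the FPR term dominates, so a filter with a lower FPR can win even if its test takes longer. When the data is cached and the penalty is small, the filter test time dominates instead.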

erez-speedb · 2022-07-13T15:08:25Z

Specific tests:

1. Standard tests with 24 BPK on both main and the branch.
2. Filters: fast and ribbon.
3. Compare Paired to the best from before.
4. Compare 8 BPK to 24 BPK on main.

Action for now: create the baseline on main for 1, 2, 4.

erez-speedb · 2022-07-17T07:56:37Z

Additional tests
bloom + pair
Worst-case scenario:
DB in cache, all keys exist.
Test: fill up sequentially, then random reads.
Best-case scenario:
DB not in cache, get before write.
Test: TBD
Fill up n keys, then random reads of 10000X keys (small objects).
Overwrite without fillup?

udi-speedb · 2022-07-17T08:12:00Z

The flag that sets the filter type in db_bench is filter_uri.
Paired bloom (new): -filter_uri spdb.PairedBloomFilter:BPK (e.g., -filter_uri spdb.PairedBloomFilter:23.4)
Fast Local Bloom: -filter_uri rocksdb.internal.FastLocalBloomFilter:BPK
Ribbon: -filter_uri rocksdb.internal.Standard128RibbonFilter:BPK
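A small helper can build these flags consistently when scripting benchmark runs. This is a sketch only: the short names `paired`, `fast`, and `ribbon` and the function name are hypothetical, while the URI prefixes and the `-filter_uri` flag are the ones listed above.

```python
# URI prefixes as listed in this comment; the dictionary keys are hypothetical
# short names used only by this helper.
FILTER_URIS = {
    "paired": "spdb.PairedBloomFilter",
    "fast": "rocksdb.internal.FastLocalBloomFilter",
    "ribbon": "rocksdb.internal.Standard128RibbonFilter",
}


def filter_uri_flag(kind: str, bits_per_key: float) -> str:
    """Return a db_bench flag of the form -filter_uri=<URI>:<BPK>."""
    if kind not in FILTER_URIS:
        raise KeyError(f"unknown filter kind: {kind!r}")
    if bits_per_key <= 0:
        raise ValueError("bits_per_key must be positive")
    # Exactly one ':' separates the filter name from the BPK value.
    return f"-filter_uri={FILTER_URIS[kind]}:{bits_per_key:g}"


if __name__ == "__main__":
    for kind, bpk in (("paired", 23.4), ("fast", 24), ("ribbon", 24)):
        print(filter_uri_flag(kind, bpk))
```

Generating the flag in one place keeps the separator and BPK formatting identical across the runs being compared.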

udi-speedb · 2022-07-20T14:29:21Z

@erez-speedb I have pushed the branch rebased on latest main.
Please go ahead with the basic performance tests we have agreed upon. Thanks

erez-speedb · 2022-07-22T19:38:26Z

./db_bench --compression_type=None -db=/data/ -num=80000000 -value_size=1000 -key_size=16 --delayed_write_rate=536870912 -report_interval_seconds=1 -max_write_buffer_number=0 -histogram -duration=900 --use_existing_db -threads=50 -seek_nexts=100 -report_file=seekrandomwriterandom.csv -benchmark_read_rate_limit=0 -benchmark_write_rate_limit=0 --benchmarks=seekrandomwriterandom -filter_uri=spdb.PairedBloomFilter::23.4 -readwritepercent=95

failure creating filter policy[spdb.PairedBloomFilter::23.4]: Not implemented: Could not load FilterPolicy: spdb.PairedBloomFilter::23.4

udi-speedb · 2022-07-24T06:20:32Z

@erez-speedb - Sorry, my mistake in the example. There should be a single ':', not '::':
-filter_uri=spdb.PairedBloomFilter:23.4

erez-speedb · 2022-07-24T07:10:57Z

Rerunning tests

udi-speedb · 2022-07-27T06:53:30Z

One thing that needs attention:
The performance of the filter is heavily affected by whether the processor supports AVX2.

We must make sure to compare RocksDB and Speedb on platforms that have the same AVX2 support.
We may wish to compare RocksDB and Speedb both with and without AVX2.
The filter is most beneficial with AVX2, so AVX2 is a requirement for the best-case scenario.
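One way to record AVX2 availability alongside each benchmark run is to check the CPU flags before starting. The sketch below assumes a Linux x86 host, where `/proc/cpuinfo` exposes a `flags` line. The function names are hypothetical, and this check says nothing about whether the binary itself was compiled with AVX2 enabled.

```python
from typing import Optional


def cpuinfo_has_avx2(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists avx2."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return "avx2" in value.split()
    return False


def host_has_avx2(path: str = "/proc/cpuinfo") -> Optional[bool]:
    """Check the local host; return None if the file cannot be read."""
    try:
        with open(path, encoding="utf-8") as f:
            return cpuinfo_has_avx2(f.read())
    except OSError:
        return None


if __name__ == "__main__":
    print("AVX2 available:", host_has_avx2())
```

Logging this value with each result makes it easier to confirm that two runs being compared were made on hosts with the same AVX2 support.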

isaac-io · 2022-08-02T13:12:25Z

Blocked by #101.

isaac-io · 2022-08-16T12:53:26Z

Didn't show an improvement with #101, so we need to define a good test to show the value of the feature.

erez-speedb · 2022-09-06T08:52:04Z

Running the test on a single HDD (simulating the disk as the bottleneck) with a DB larger than RAM showed a clear benefit, with no additional memory usage compared to the default Bloom filter at the same BPK.

isaac-io · 2022-09-12T13:35:54Z

Depends on #71 and on #123.

Yuval-Ariel · 2022-10-19T10:55:40Z

QA passed on 4cf14cb

erez-speedb · 2022-10-24T13:47:49Z

Passed performance tests.

as part of - Speedb's Paired Block Bloom (#29)

udi-speedb self-assigned this Jun 26, 2022

udi-speedb added the enhancement New feature or request label Jun 26, 2022

udi-speedb linked a pull request Jul 13, 2022 that will close this issue

29 paired block bloom filter algorithm #54

Merged

erez-speedb self-assigned this Jul 13, 2022

isaac-io mentioned this issue Jul 17, 2022

OSS Documentation for the new Paired Block Bloom Filter #42

Closed

isaac-io mentioned this issue Aug 2, 2022

db_bench: allow restricting the range of keys for a read benchmark to the range of database keys #101

Closed

isaac-io unassigned udi-speedb Aug 16, 2022

erez-speedb assigned assaf-speedb and unassigned erez-speedb Sep 6, 2022

isaac-io added this to the v2.1.0 milestone Sep 21, 2022

isaac-io closed this as completed in #54 Oct 24, 2022

bosmatt added this to Speedb Roadmap Nov 3, 2022

udi-speedb added a commit that referenced this issue Nov 23, 2022

Speedb's Paired Block Bloom (#29)

03572d4

Yuval-Ariel pushed a commit that referenced this issue Nov 25, 2022

Speedb's Paired Block Bloom (#29)

f9c553c

Yuval-Ariel mentioned this issue Jan 17, 2023

ZSTD stress asan error #367

Open

Yuval-Ariel pushed a commit that referenced this issue Apr 30, 2023

Speedb's Paired Block Bloom (#29)

bb66d75

Yuval-Ariel added a commit that referenced this issue May 3, 2023

Rebase 8.1.1: fix for speedb_db_bloom_filter_test

24a2adb

as part of - Speedb's Paired Block Bloom (#29)

Yuval-Ariel pushed a commit that referenced this issue May 4, 2023

Speedb's Paired Block Bloom (#29)

e0ed0ec

Yuval-Ariel added a commit that referenced this issue May 4, 2023

Rebase 8.1.1: fix for speedb_db_bloom_filter_test

835f46c

as part of - Speedb's Paired Block Bloom (#29)

udi-speedb added a commit that referenced this issue Nov 13, 2023

Speedb's Paired Block Bloom (#29)

672bc49

udi-speedb added a commit that referenced this issue Nov 13, 2023

Rebase 8.6.7: Fix compilation of Speedb's Paired Block Bloom (#29)

3801d12

udi-speedb pushed a commit that referenced this issue Nov 14, 2023

Rebase 8.1.1: fix for speedb_db_bloom_filter_test

149af01

as part of - Speedb's Paired Block Bloom (#29)

udi-speedb pushed a commit that referenced this issue Nov 15, 2023

Rebase 8.1.1: fix for speedb_db_bloom_filter_test

fc419e9

as part of - Speedb's Paired Block Bloom (#29)

udi-speedb added a commit that referenced this issue Dec 3, 2023

Speedb's Paired Block Bloom (#29)

5c6f913

udi-speedb added a commit that referenced this issue Dec 3, 2023

Rebase 8.6.7: Fix compilation of Speedb's Paired Block Bloom (#29)

4364f02

udi-speedb pushed a commit that referenced this issue Dec 5, 2023

Rebase 8.1.1: fix for speedb_db_bloom_filter_test

6c3bb9a

as part of - Speedb's Paired Block Bloom (#29)
