Fix and generalize framework for filtering range queries, etc. #13005
Conversation
Force-pushed from 670da65 to c8f1074
Summary: There was a subtle design/contract bug in the previous version of range filtering in experimental.h. If someone implemented a key segments extractor with "all or nothing" fixed-size segments, that could result in unsafe range filtering. TODO: example

I have re-worked the contract to make clear what does work, and implemented a standard extractor for fixed-size segments, CappedKeySegmentsExtractor. The safe approach for filtering is to consume as much as is available for a segment in the case of a short key.

I have also (contractually / mathematically) generalized the framework to comparators other than the byte-wise comparator, and made other generalizations to tie the extractor limitations more explicitly to the particular filters and filtering used, at least in the description. (TODO: better detection/enforcement?)

Test Plan: added a sizeable unit test for the capped extractor
Force-pushed from c8f1074 to 1007d15
@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
LGTM! Thanks for adding this improvement.
// class filter on the sequence of segments.
//
// GENERALIZING FILTERS (INDIRECT):
// * Point queries can utilize essentially any kind of filter by extracting
This is the theory, but in practice we don't need this capability to filter tables for point queries, right? Since we check the table's bounds.
The minimum and maximum keys on a table essentially provide a min-max filter on the whole key. You can think of this as a one-dimensional filter that represents only a single range within that dimension. Keeping with one dimension, Bloom filters on a prefix or the whole key complement that by greatly narrowing down the set of keys that might be in the SST file. And of course higher-dimensional filters can also operate on point queries. For some workloads, a min-max filter on a segment might be more effective than a Bloom filter, even for point queries, and much more space/memory efficient.
I'll see if I can make this clearer, perhaps with such examples.
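To make the comparison concrete, here is a minimal sketch (hypothetical names, not the RocksDB API) of the one-dimensional min-max filter check described above: a file records only the minimum and maximum value of one key segment, and a point query whose extracted segment falls outside that interval can skip the file entirely. Two stored values can be far more space/memory efficient than a Bloom filter over all keys.

```cpp
#include <cassert>
#include <string>

// Hypothetical min-max filter check on a single key segment. A file
// stores [seg_min, seg_max] for that segment; a lookup whose extracted
// segment value lies outside the interval cannot be in the file.
bool SegmentMightMatch(const std::string& seg_min,
                       const std::string& seg_max,
                       const std::string& query_seg) {
  return seg_min <= query_seg && query_seg <= seg_max;
}
```

For a workload where one segment is highly selective, this check can exclude files that a whole-key min/max bound cannot, at a fraction of a Bloom filter's footprint.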
include/rocksdb/experimental.h (Outdated)
//
// Beyond point queries, we generally expect the key comparator to be a
// lexicographic / big endian ordering at a high level, while each segment
// can use an arbitrary comparator.
What does this "can use arbitrary comparator" mean? Lexicographic ordering is what's used in building the filter and for checking bounds against the filter.
For example, each segment could be a little-endian 4-byte value, but ordering between segments is still lexicographic. We can make min-max filters aware of other segment comparators, much as we added reverse byte-wise support in this PR.
I'll expand this paragraph.
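A small illustration of the little-endian case mentioned above (helper name hypothetical): the byte-wise order of little-endian encodings disagrees with numeric order, so a min-max filter must be told the segment's comparator even though ordering between segments stays lexicographic.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical encoder for a 4-byte little-endian integer segment.
// Least significant byte comes first, so byte-wise comparison of the
// encodings does not match numeric comparison of the values.
std::string EncodeLE32(uint32_t v) {
  std::string s(4, '\0');
  for (int i = 0; i < 4; ++i) {
    s[i] = static_cast<char>((v >> (8 * i)) & 0xff);
  }
  return s;
}
```

For example, 1 encodes as `01 00 00 00` and 256 as `00 01 00 00`, so byte-wise the encoding of 1 sorts *after* the encoding of 256; a min-max filter unaware of the segment comparator would draw wrong conclusions from those bounds.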
// * Order-based filters on segments (rather than whole key) can apply to range
// queries (with "whole key" bounds). Specifically, an order-based filter on
// segments i through j and category set s is applicable to a range query from
// lb to ub if
nit: lb and ub also both need to be in category set s, right?
I forgot to mention that this time! Yes, thanks!
@pdillinger has updated the pull request. You must reimport the pull request before landing.
@pdillinger has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@pdillinger merged this pull request in 10984e8.
Summary: There was a subtle design/contract bug in the previous version of range filtering in experimental.h. If someone implemented a key segments extractor with "all or nothing" fixed-size segments, that could result in unsafe range filtering. For example, with two segments of width 3:
Segment 1 of y (empty) is out of order with segment 1 of x and z.
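The concrete keys from the original example are not preserved here, but the hazard can be sketched with hypothetical keys (chosen for illustration): byte-wise ordered keys x < y < z where y is too short to yield segment 1 under "all or nothing" extraction, so segment-1 order fails to track key order.

```cpp
#include <cassert>
#include <string>

// Hypothetical "all or nothing" extraction of segment 1 (bytes [3, 6)):
// the segment is produced only if the key has all 6 bytes; otherwise it
// is missing (represented here as empty).
std::string AllOrNothingSeg1(const std::string& key) {
  return key.size() >= 6 ? key.substr(3, 3) : std::string();
}

// Illustrative keys, byte-wise ordered x < y < z:
//   x = "abcppp" -> segment 1 = "ppp"
//   y = "abd"    -> too short: segment 1 missing (empty)
//   z = "abeqqq" -> segment 1 = "qqq"
// y sits between x and z in key order, but its segment 1 ("") sorts
// before both "ppp" and "qqq". A min-max filter on segment 1 for a file
// containing only y would record ["", ""], so a range query between x
// and z could unsafely exclude that file even though it contains y.
```

This is exactly the kind of out-of-order segment the reworked contract rules out.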
I have re-worked the contract to make it clear what does work, and implemented a standard extractor for fixed-size segments, CappedKeySegmentsExtractor. The safe approach for filtering is to consume as much as is available for a segment in the case of a short key.
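The safe "consume as much as is available" behavior can be sketched as follows (a simplified illustration, not the actual CappedKeySegmentsExtractor implementation): a short key yields a short or empty final segment instead of dropping it, which keeps segment ordering consistent with whole-key byte-wise ordering.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Simplified sketch of capped fixed-width segment extraction: each
// segment takes up to `width` bytes; a short key yields a truncated
// (possibly empty) segment rather than none at all.
std::vector<std::string> CappedSegments(const std::string& key,
                                        size_t width, size_t count) {
  std::vector<std::string> segments;
  for (size_t i = 0; i < count; ++i) {
    size_t begin = std::min(i * width, key.size());
    size_t end = std::min(begin + width, key.size());
    segments.push_back(key.substr(begin, end - begin));
  }
  return segments;
}
```

With width 3 and two segments, "abcdef" yields {"abc", "def"} while the short key "ab" yields {"ab", ""}, and truncated segments still order consistently with the keys they came from.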
I have also added support for min-max filtering with the reverse byte-wise comparator, which is probably the 2nd most common comparator for RocksDB users (because of MySQL). It might seem that a min-max filter doesn't care about forward or reverse ordering, but it does when trying to determine whether an input range from segment values v1 to v2, where it so happens that v2 is byte-wise less than v1, is an empty forward interval or a non-empty reverse interval. At least in the current setup, we don't have that context.
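The ambiguity can be illustrated with a hypothetical pair of bounds (function names are illustrative, not the RocksDB API): (v1 = "bb", v2 = "aa") is an empty interval under forward byte-wise order, but a non-empty interval, containing "ab" for instance, under reverse byte-wise order. Without knowing the comparator direction, the filter cannot interpret the pair safely.

```cpp
#include <cassert>
#include <string>

// Under the forward byte-wise comparator, [v1, v2] is non-empty only
// when v1 <= v2 byte-wise.
bool InForwardInterval(const std::string& v, const std::string& v1,
                       const std::string& v2) {
  return v1 <= v && v <= v2;
}

// Under the reverse byte-wise comparator, v1 precedes v2 when v1 is
// byte-wise *greater*, so the same bound pair covers a different set.
bool InReverseInterval(const std::string& v, const std::string& v1,
                       const std::string& v2) {
  return v1 >= v && v >= v2;
}
```

The same (v1, v2) pair thus excludes everything in one ordering and matches keys in the other, which is why the min-max filter has to be made comparator-aware.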
A new unit test (with some refactoring) tests CappedKeySegmentsExtractor, reverse byte-wise comparator, and the corresponding min-max filter.
I have also (contractually / mathematically) generalized the framework to comparators other than the byte-wise comparator, and made other generalizations to tie the extractor limitations more explicitly to the particular filters and filtering used, at least in the description.
Test Plan: added unit tests as described