Composite approach for checking in-filter values set in column dictionary #13133

rohangarg · 2022-09-21T11:25:36Z

Currently, to check the in-filter values against a column dictionary, we run one binary search per value in the set. That works well for small value-sets but slows down as the number of values in the set increases. To accommodate large value-sets arising from large in-filters or from joins pushed down as in-filters, this PR uses a sorted-merge algorithm to merge the set with the dictionary when the value-set is large.
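
As a rough illustration of the two lookup styles described above, here is a minimal sketch. It is not the code in this PR: it assumes the dictionary and the filter values are both sorted lists of distinct strings, and it returns dictionary indexes rather than bitmaps. The class and method names are hypothetical; only the 0.12 ratio is taken from the diff under review.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; names are illustrative and not taken from Druid.
final class InFilterLookupSketch
{
  // Ratio taken from the diff in this PR; everything else is an assumption.
  static final double SORTED_MERGE_RATIO_THRESHOLD = 0.12D;

  /** One binary search per filter value: roughly O(m log n). */
  static List<Integer> binarySearchLookup(List<String> sortedDictionary, List<String> sortedValues)
  {
    List<Integer> matches = new ArrayList<>();
    for (String value : sortedValues) {
      int index = Collections.binarySearch(sortedDictionary, value);
      if (index >= 0) {
        matches.add(index);
      }
    }
    return matches;
  }

  /** A single pass over both sorted inputs: roughly O(m + n). */
  static List<Integer> sortedMergeLookup(List<String> sortedDictionary, List<String> sortedValues)
  {
    List<Integer> matches = new ArrayList<>();
    int d = 0;
    int v = 0;
    while (d < sortedDictionary.size() && v < sortedValues.size()) {
      int cmp = sortedDictionary.get(d).compareTo(sortedValues.get(v));
      if (cmp == 0) {
        matches.add(d);
        d++;
        v++;
      } else if (cmp < 0) {
        d++; // dictionary entry is smaller, advance the dictionary
      } else {
        v++; // filter value is not in the dictionary, skip it
      }
    }
    return matches;
  }

  /** Composite approach: small value-sets use binary search, large ones use the merge. */
  static List<Integer> lookup(List<String> sortedDictionary, List<String> sortedValues)
  {
    if (sortedValues.size() < SORTED_MERGE_RATIO_THRESHOLD * sortedDictionary.size()) {
      return binarySearchLookup(sortedDictionary, sortedValues);
    }
    return sortedMergeLookup(sortedDictionary, sortedValues);
  }
}
```

Both methods return the same indexes for the same sorted, distinct inputs; only the cost profile differs, which is what the benchmark is meant to measure.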

The following benchmark was run to find the cutoff point:

Benchmark     (dictionarySize)  (filterToDictionaryPercentage)  (selectivityPercentage)  Mode  Cnt        Score       Error  Units
binarySearch           1000000                               1                       10  avgt   10    11271.575 ±   621.620  us/op
binarySearch           1000000                               1                      100  avgt   10    16751.127 ±   580.550  us/op
binarySearch           1000000                               2                       10  avgt   10    25387.245 ±  1795.582  us/op
binarySearch           1000000                               2                      100  avgt   10    19707.334 ±   973.384  us/op
binarySearch           1000000                               5                       10  avgt   10    37256.779 ±  2287.203  us/op
binarySearch           1000000                               5                      100  avgt   10    47391.511 ±  1689.971  us/op
binarySearch           1000000                              10                       10  avgt   10    76204.615 ±  6515.056  us/op
binarySearch           1000000                              10                      100  avgt   10    71483.416 ±  7197.376  us/op
binarySearch           1000000                              12                       10  avgt   10   139481.091 ± 15513.357  us/op
binarySearch           1000000                              12                      100  avgt   10   142881.846 ±  9511.367  us/op
binarySearch           1000000                              15                       10  avgt   10   113786.273 ±  4123.158  us/op
binarySearch           1000000                              15                      100  avgt   10   165300.278 ± 10555.479  us/op
binarySearch           1000000                              20                       10  avgt   10   138410.942 ± 14330.367  us/op
binarySearch           1000000                              20                      100  avgt   10   137543.621 ±  9273.845  us/op
binarySearch           1000000                              30                       10  avgt   10   206512.608 ± 13497.954  us/op
binarySearch           1000000                              30                      100  avgt   10   305317.908 ± 12452.686  us/op
binarySearch           1000000                              50                       10  avgt   10   328867.893 ± 14594.249  us/op
binarySearch           1000000                              50                      100  avgt   10   332823.291 ± 26349.158  us/op
binarySearch           1000000                             100                       10  avgt   10   668720.906 ± 49193.818  us/op
binarySearch           1000000                             100                      100  avgt   10  1014546.405 ± 30709.670  us/op
sortedMerge            1000000                               1                       10  avgt   10    47220.743 ±  2598.755  us/op
sortedMerge            1000000                               1                      100  avgt   10    53634.485 ±  2886.296  us/op
sortedMerge            1000000                               2                       10  avgt   10    51201.356 ±  2745.801  us/op
sortedMerge            1000000                               2                      100  avgt   10    53046.058 ±  4500.987  us/op
sortedMerge            1000000                               5                       10  avgt   10    58501.742 ±  7320.870  us/op
sortedMerge            1000000                               5                      100  avgt   10    65597.519 ±  7356.548  us/op
sortedMerge            1000000                              10                       10  avgt   10    75347.468 ±  9417.556  us/op
sortedMerge            1000000                              10                      100  avgt   10    74601.584 ±  5251.078  us/op
sortedMerge            1000000                              12                       10  avgt   10   64838.734 ±  2644.738  us/op
sortedMerge            1000000                              12                      100  avgt   10   80342.737 ±  5414.306  us/op
sortedMerge            1000000                              15                       10  avgt   10    83345.836 ±  6034.488  us/op
sortedMerge            1000000                              15                      100  avgt   10    78405.299 ±  1651.375  us/op
sortedMerge            1000000                              20                       10  avgt   10   111307.577 ± 10924.456  us/op
sortedMerge            1000000                              20                      100  avgt   10    89371.173 ± 10347.814  us/op
sortedMerge            1000000                              30                       10  avgt   10   116740.355 ±  6003.945  us/op
sortedMerge            1000000                              30                      100  avgt   10   117312.763 ±  6325.076  us/op
sortedMerge            1000000                              50                       10  avgt   10   132145.411 ± 23121.259  us/op
sortedMerge            1000000                              50                      100  avgt   10   192802.722 ± 39032.876  us/op
sortedMerge            1000000                             100                       10  avgt   10   216079.634 ± 28725.333  us/op
sortedMerge            1000000                             100                      100  avgt   10   236561.476 ± 14369.673  us/op
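
One possible way to read a cutoff out of the table is sketched below. This is an assumption about how the numbers could be interpreted, not a description of how the threshold was actually chosen. It copies the average scores for the first six `filterToDictionaryPercentage` settings, ignores the error margins, and reports the first percentage at which `sortedMerge` is faster than `binarySearch` at both selectivities. The class and method names are hypothetical.

```java
// Hypothetical sketch; the data is copied from the benchmark table, the names are illustrative.
final class CutoffFromBenchmark
{
  // filterToDictionaryPercentage values for the first six settings in the table.
  static final int[] PERCENTAGES = {1, 2, 5, 10, 12, 15};

  // Average us/op; column 0 is selectivity 10, column 1 is selectivity 100.
  static final double[][] BINARY_SEARCH = {
      {11271.575, 16751.127}, {25387.245, 19707.334}, {37256.779, 47391.511},
      {76204.615, 71483.416}, {139481.091, 142881.846}, {113786.273, 165300.278}
  };
  static final double[][] SORTED_MERGE = {
      {47220.743, 53634.485}, {51201.356, 53046.058}, {58501.742, 65597.519},
      {75347.468, 74601.584}, {64838.734, 80342.737}, {83345.836, 78405.299}
  };

  /** Returns the first percentage where sortedMerge is faster at both selectivities, or -1. */
  static int firstPercentageWhereMergeWins()
  {
    for (int i = 0; i < PERCENTAGES.length; i++) {
      boolean mergeWins = SORTED_MERGE[i][0] < BINARY_SEARCH[i][0]
                          && SORTED_MERGE[i][1] < BINARY_SEARCH[i][1];
      if (mergeWins) {
        return PERCENTAGES[i];
      }
    }
    return -1;
  }

  public static void main(String[] args)
  {
    System.out.println(firstPercentageWhereMergeWins()); // prints 12
  }
}
```

At 10 percent the two approaches are within each other's error margins, so a stricter reading that accounts for the error column could reasonably land on a slightly different value.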

This PR has:

- been self-reviewed.
   - using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
- added documentation for new or modified features or behaviors.
- added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
- added or updated version, license, or notice information in licenses.yaml
- added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
- added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
- added integration tests.
- been tested in a test Druid cluster.

FrankChen021 · 2022-09-22T02:04:18Z

...ssing/src/main/java/org/apache/druid/segment/serde/DictionaryEncodedStringIndexSupplier.java

-                  break;
+              // if the size of in-filter values is less than the threshold percentage of dictionary size, then use binary search
+              // based lookup per value. The algorithm works well for smaller number of values.
+              if (size < SORTED_MERGE_RATIO_THRESHOLD * dictionary.size()) {


We can determine the strategy at the point where the Iterator<ImmutableBitmap> object is instantiated.

And it would be much better if we split the returned Iterator<ImmutableBitmap> into two inner classes, one is for the binary search, the other is for the sorted merge.

Yes, that would be much more readable.

Makes sense, done. I had done it that way earlier but changed it at the last moment.
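
The structure suggested in this thread could look roughly like the sketch below: the strategy is picked once, when the iterator is created, and each strategy lives in its own nested class. This is a simplified illustration under stated assumptions, not the merged code. It iterates over dictionary indexes instead of `ImmutableBitmap` objects, works on sorted `String[]` arrays of distinct values, and all class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.NoSuchElementException;

// Hypothetical sketch; names are illustrative and not taken from Druid.
final class DictionaryMatchIterators
{
  static final double SORTED_MERGE_RATIO_THRESHOLD = 0.12D; // ratio from the diff

  /** Chooses the strategy once, at instantiation time. */
  static Iterator<Integer> create(String[] sortedDictionary, String[] sortedValues)
  {
    if (sortedValues.length < SORTED_MERGE_RATIO_THRESHOLD * sortedDictionary.length) {
      return new BinarySearchIterator(sortedDictionary, sortedValues);
    }
    return new SortedMergeIterator(sortedDictionary, sortedValues);
  }

  /** Shared look-ahead plumbing; subclasses only define how to find the next match. */
  abstract static class MatchIterator implements Iterator<Integer>
  {
    final String[] dictionary;
    final String[] values;
    int valuePos = 0;
    private int nextMatch = -1;
    private boolean exhausted = false;

    MatchIterator(String[] dictionary, String[] values)
    {
      this.dictionary = dictionary;
      this.values = values;
    }

    /** Returns the next matching dictionary index, or -1 when there are no more. */
    abstract int findNext();

    @Override
    public boolean hasNext()
    {
      if (nextMatch < 0 && !exhausted) {
        nextMatch = findNext();
        exhausted = nextMatch < 0;
      }
      return nextMatch >= 0;
    }

    @Override
    public Integer next()
    {
      if (!hasNext()) {
        throw new NoSuchElementException();
      }
      int result = nextMatch;
      nextMatch = -1;
      return result;
    }
  }

  static final class BinarySearchIterator extends MatchIterator
  {
    BinarySearchIterator(String[] dictionary, String[] values)
    {
      super(dictionary, values);
    }

    @Override
    int findNext()
    {
      while (valuePos < values.length) {
        int index = Arrays.binarySearch(dictionary, values[valuePos++]);
        if (index >= 0) {
          return index;
        }
      }
      return -1;
    }
  }

  static final class SortedMergeIterator extends MatchIterator
  {
    private int dictPos = 0;

    SortedMergeIterator(String[] dictionary, String[] values)
    {
      super(dictionary, values);
    }

    @Override
    int findNext()
    {
      while (dictPos < dictionary.length && valuePos < values.length) {
        int cmp = dictionary[dictPos].compareTo(values[valuePos]);
        if (cmp == 0) {
          valuePos++;
          return dictPos++;
        } else if (cmp < 0) {
          dictPos++;
        } else {
          valuePos++;
        }
      }
      return -1;
    }
  }
}
```

Keeping the threshold check in the factory method means neither iterator has to branch on the strategy inside `hasNext()` or `next()`.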

FrankChen021 · 2022-09-22T02:05:09Z

...ssing/src/main/java/org/apache/druid/segment/serde/DictionaryEncodedStringIndexSupplier.java

@@ -175,6 +177,8 @@ public String getValue(int index)
      extends BaseGenericIndexedDictionaryEncodedIndex<ByteBuffer> implements StringValueSetIndex, Utf8ValueSetIndex
  {
    private static final int SIZE_WORTH_CHECKING_MIN = 8;
+    private static final double SORTED_MERGE_RATIO_THRESHOLD = 0.12D;


Could you add some javadoc to explain why the default threshold is 0.12?

We could link the issue that has the benchmarks.

Added a Javadoc explanation for the threshold.

kfaraz

Left some comments.

kfaraz · 2022-09-21T14:43:48Z

processing/src/main/java/org/apache/druid/segment/data/GenericIndexed.java

@@ -826,4 +862,28 @@ public void inspectRuntimeShape(RuntimeShapeInspector inspector)
      }
    };
  }
+
+  public class ValueWithIndex


I guess it would be cleaner to just use a ListIterator which provides nextIndex().
You wouldn't be able to peek the next index though, and you might have to work around that.
(That could be easier to do if we go with @FrankChen021's suggestion to separate the two kinds of
searches into two different iterables.)

Another alternative could be to just use Pair but I am not a fan of it.

If you do decide to use this class, however, I would suggest making it a top-level class in druid-core/org.apache.druid.java.util.common, as other parts of the code might have similar requirements.

Removed the iterator itself, so this is no longer needed.
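
For reference, the `ListIterator` idea raised in this thread can be sketched as follows. It assumes the values are available as a `java.util.List`, which is an assumption made for illustration only and may not hold for `GenericIndexed`. The peek helper shows one possible workaround for the missing peek operation: step forward, then step back. The class and method names are hypothetical.

```java
import java.util.List;
import java.util.ListIterator;

// Hypothetical sketch; assumes a java.util.List, not Druid's own indexed types.
final class ListIteratorIndexSketch
{
  /** Returns the index of the first element equal to target, or -1 if absent. */
  static int indexOf(List<String> values, String target)
  {
    ListIterator<String> it = values.listIterator();
    while (it.hasNext()) {
      int index = it.nextIndex(); // index of the element that next() is about to return
      if (it.next().equals(target)) {
        return index;
      }
    }
    return -1;
  }

  /** Peek workaround: read the next element, then move the cursor back. */
  static String peek(ListIterator<String> it)
  {
    String value = it.next();
    it.previous();
    return value;
  }
}
```

With this approach no separate value-plus-index holder class is needed, since the iterator itself reports the position.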

kfaraz · 2022-09-22T03:56:01Z

...ssing/src/main/java/org/apache/druid/segment/serde/DictionaryEncodedStringIndexSupplier.java

-                  break;
+              // if the size of in-filter values is less than the threshold percentage of dictionary size, then use binary search
+              // based lookup per value. The algorithm works well for smaller number of values.
+              if (size < SORTED_MERGE_RATIO_THRESHOLD * dictionary.size()) {


Yes, that would be much more readable.

abhishekagarwal87 · 2022-09-24T10:23:05Z

@rohangarg - thanks for putting together these benchmarks. In terms of query latencies, what difference have you observed?

rohangarg · 2022-10-07T09:02:22Z

I force-pushed because I rebased onto the latest master. The new changes are in a separate new commit and aren't squashed into the old ones.

rohangarg · 2022-10-07T09:05:12Z

> @rohangarg - thanks for putting together these benchmarks. In terms of query latencies, what difference have you observed?

I have not benchmarked full queries so far, since the size of the difference would depend on the data and the type of query. The added benchmark measures the improvement of the in-filter operation in isolation.

cheddar

You've got one stale comment in there and the CI isn't passing for some reason, but the code looks good.

kfaraz added the Area - Querying label Sep 21, 2022

FrankChen021 reviewed Sep 22, 2022

kfaraz reviewed Sep 22, 2022

rohangarg added 2 commits October 6, 2022 15:30

Composite approach for checking in-filter values set in column dictionary

951b203

fixup! Composite approach for checking in-filter values set in column dictionary

d53680b

rohangarg force-pushed the in_filter_perf branch from a83d939 to d53680b Compare October 7, 2022 08:59

Remove comparison method from GenericIndexed

d5ff341

cheddar approved these changes Oct 12, 2022

fixup! Composite approach for checking in-filter values set in column dictionary

32d7d33

rohangarg merged commit 45dfd67 into apache:master Oct 13, 2022

kfaraz added this to the 25.0 milestone Nov 22, 2022

This was referenced Dec 18, 2022

[Draft] 25.0.0 Release Notes #13592

Closed

Add SegmentAllocationQueue to batch allocation actions #13369

Merged
