add support for ColumnConfig skipValuePredicateIndexScale and skipValueRangeIndexScale to traditional string columns #15551

clintropolis · 2023-12-13T08:28:41Z

Description

Adds the enhancements from #13977 to traditional string columns, improving filter processing performance by allowing obviously expensive bitmap operations to be skipped. I haven't run the benchmarks, but I expect string columns that are not created with the 'auto' indexer to see an improvement similar to the one measured in #13977. Previously this performance enhancement was only available for 'auto' and 'json' columns.
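As a rough illustration of the idea, the check can be thought of as comparing the number of bitmap operations a filter would need against a fraction of the segment's row count. The sketch below is a hypothetical Java rendering, not the code in this PR: the class and method names, the `scale * numRows` threshold formula, and the example scale of 0.001 are all assumptions made for illustration.

```java
// Hypothetical sketch; names and the threshold formula are assumptions, not Druid's actual API.
final class SkipIndexSketch {
    /**
     * Returns true when building the index result would need more bitmap
     * operations than scale * numRows, in which case testing rows directly
     * is assumed to be cheaper than combining that many bitmaps.
     */
    static boolean shouldSkipIndex(double scale, int numRows, int bitmapOperations) {
        int threshold = (int) Math.ceil(scale * numRows);
        return bitmapOperations > threshold;
    }

    public static void main(String[] args) {
        int numRows = 1_000_000;
        // A range filter spanning 5,000 dictionary values would need to union 5,000 bitmaps.
        int bitmapOperations = 5_000;

        // With a small (illustrative) scale the index is skipped.
        System.out.println(shouldSkipIndex(0.001, numRows, bitmapOperations));
        // With a scale of 1.0 the threshold equals the row count, so the index is used.
        System.out.println(shouldSkipIndex(1.0, numRows, bitmapOperations));
    }
}
```

Under this assumed formula, a scale of 1.0 sets the threshold at the full row count, which is consistent with 1.0 being the value that restores the old always-use-indexes behavior.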

This is related to #15550 to help replace the 'auto' search strategy.

I haven't documented the configs behind this in this PR yet because I'm still working out how to explain what they do in an understandable way without requiring deep knowledge of how query processing actually works. For the most part these settings should probably not be tweaked by users, but I will try to think of something and add docs in a follow-up PR.

Release note

String columns created with the 'string' column indexer now have an enhancement for filter match processing that was previously available only to columns created by the 'auto' indexer: Druid automatically skips obviously expensive index computation for filters that would require a very large number of bitmap operations. This should improve performance, particularly when filter clauses contain a mix of simple and complex filters, since it allows the complex filters to be evaluated only against the rows that the simple filters have already matched. Previously Druid would always utilize indexes if they were available; this behavior can be restored by setting druid.processing.skipValuePredicateIndexScale and druid.processing.skipValueRangeIndexScale to 1.0.
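To make the mixed-filter case concrete, here is a minimal sketch of what skipping an index can look like: the simple filter is answered from a bitmap, and the complex filter is then tested row by row, but only on the rows that bitmap selected. This is an illustration under stated assumptions, not Druid's implementation; `MixedFilterSketch`, `evaluate`, and the use of `java.util.BitSet` in place of Druid's bitmap types are all hypothetical.

```java
import java.util.BitSet;
import java.util.function.Predicate;

// Hypothetical sketch; java.util.BitSet stands in for a real bitmap index.
final class MixedFilterSketch {
    /**
     * Applies complexFilter only to the rows already selected by
     * simpleFilterBitmap, instead of first computing a bitmap for
     * the complex filter over every distinct value.
     */
    static BitSet evaluate(String[] rows, BitSet simpleFilterBitmap, Predicate<String> complexFilter) {
        BitSet result = new BitSet(rows.length);
        for (int row = simpleFilterBitmap.nextSetBit(0); row >= 0; row = simpleFilterBitmap.nextSetBit(row + 1)) {
            if (complexFilter.test(rows[row])) {
                result.set(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        String[] rows = {"apple", "banana", "avocado", "cherry"};

        // Pretend an index-backed simple filter matched rows 0, 2 and 3.
        BitSet simple = new BitSet(rows.length);
        simple.set(0);
        simple.set(2);
        simple.set(3);

        // The complex filter runs on three candidate rows rather than all four.
        BitSet matched = evaluate(rows, simple, value -> value.startsWith("a"));
        System.out.println(matched);
    }
}
```

The more selective the simple filter is, the fewer rows the complex filter has to inspect, which is where the saving over a large bitmap union would come from.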

This PR has:

been self-reviewed.
added documentation for new or modified features or behaviors.
a release note entry in the PR description.
added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
been tested in a test Druid cluster.

clintropolis · 2024-02-06T04:59:29Z

replaced by #15838

add support for ColumnConfig skipValueRangeIndexScale and skipValueRangeIndexScale to traditional string columns (commit c321d1c)

clintropolis added the Performance and Area - Querying labels on Dec 13, 2023

github-actions bot added the Area - Segment Format and Ser/De label and removed the Area - Querying label on Dec 13, 2023

clintropolis mentioned this pull request on Dec 21, 2023 in: expression virtual column indexes #15585 (merged)

clintropolis mentioned this pull request on Feb 6, 2024 in: adaptive filter partitioning #15838 (merged)

clintropolis closed this Feb 6, 2024
