Add support for ColumnConfig skipValuePredicateIndexScale and skipValueRangeIndexScale to traditional string columns #15551
Description
Adds the enhancements from #13977 to traditional string columns, improving filter processing performance by allowing obviously expensive bitmap operations to be skipped. I haven't run the benchmarks, but I expect string columns that are not created with the 'auto' indexer to see an improvement similar to the one seen in #13977. Previously this performance enhancement was only available for 'auto' and 'json' columns.
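To make the idea above concrete, here is a minimal sketch of a scale-based skip check. It is not the code from this PR: the class name, method name, and the exact threshold formula (scale times row count, rounded up) are assumptions for illustration only. It shows why a scale of 1.0 effectively means "always use the index": the number of bitmaps to merge cannot exceed the number of rows.

```java
// Hypothetical illustration only; names and formula are assumptions,
// not the actual Druid implementation.
public final class IndexSkipSketch {
    private IndexSkipSketch() {}

    /**
     * Returns true when computing the index result would need more bitmap
     * operations than the scale-based threshold allows, in which case the
     * filter would be evaluated row by row instead of through the index.
     *
     * @param scale          value of a skip*IndexScale setting, between 0.0 and 1.0
     * @param numRows        number of rows in the segment
     * @param bitmapsToMerge number of per-value bitmaps the filter would combine
     */
    public static boolean shouldSkipIndex(double scale, int numRows, int bitmapsToMerge) {
        // Assumed threshold: a fraction of the row count, rounded up.
        int threshold = (int) Math.ceil(scale * numRows);
        return bitmapsToMerge > threshold;
    }

    public static void main(String[] args) {
        int numRows = 1_000_000;
        // Many bitmaps relative to the threshold: skip the index.
        System.out.println(shouldSkipIndex(0.25, numRows, 400_000)); // true
        // Few bitmaps: use the index.
        System.out.println(shouldSkipIndex(0.25, numRows, 50_000));  // false
        // Scale 1.0: the threshold equals the row count, so the index is never skipped.
        System.out.println(shouldSkipIndex(1.0, numRows, numRows));  // false
    }
}
```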
This is related to #15550 to help replace the 'auto' search strategy.
I haven't documented the configs behind this in this PR yet because I'm still thinking about how to explain what they do in an understandable way without requiring deep knowledge of how query processing actually works. For the most part users should probably not tweak these settings, but I will try to think of something and add docs in a follow-up PR.
Release note
String columns created with the 'string' column indexer now have an enhancement for filter match processing that was previously only available to columns created by the 'auto' indexer, and will automatically skip obviously expensive index computation for filters that would require a very large number of bitmap operations. This should improve performance, particularly when filter clauses contain a mix of simple and complex filters, since it allows the complex filters to skip index computation. Previously Druid would always use indexes if they were available; that behavior can be restored by setting `druid.processing.skipValuePredicateIndexScale` and `druid.processing.skipValueRangeIndexScale` to `1.0`.
This PR has: