We are trying to use bloom filters to reduce the latency of queries that have a long IN clause (about 50 elements). However, we see that the bloom filters are not taking effect.
After digging into it, we found a server config that disables pruning if the number of values in the IN predicate is larger than 10 (the default).
Do we know the reason for setting this default to 10? Applying pruning to a large IN clause leads to diminishing returns, but even taking this into consideration, 10 looks too conservative to me.
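To make the behavior concrete, here is a minimal sketch of how a threshold-gated bloom filter check for an IN predicate could work. It is not the server's actual implementation: the `BloomFilter` class, the `can_prune_segment` function, and the `threshold` parameter are hypothetical names used for illustration only, and the real config key is not shown here.

```python
import hashlib


class BloomFilter:
    """Toy bloom filter (hypothetical, for illustration only)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, value):
        # Derive num_hashes bit positions from seeded SHA-256 digests.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = True

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(value))


def can_prune_segment(bloom, in_values, threshold=10):
    """Return True if the segment can be skipped for `col IN (in_values)`.

    Assumption: pruning is not attempted at all when the IN list is longer
    than `threshold`, mirroring the behavior described in this issue.
    """
    if len(in_values) > threshold:
        return False  # too many values: skip the bloom filter check entirely
    # The segment is prunable only if every IN value is definitely absent.
    return not any(bloom.might_contain(v) for v in in_values)
```

With a threshold of 10, a 50-element IN list never reaches the bloom filter, which matches what we observe; raising the threshold above 50 would let the check run.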
Yes, in our case we see a substantial improvement with bloom filter pruning added to a high-cardinality, dictionary-enabled column (@UOETianleZhang can probably share some anonymized numbers here). This suggests that, for a dictionary-enabled column, binary search is slower than hashing (bloom filter). The gain would therefore probably be more prominent when the number of values in the IN clause is larger, unless we are sure that these values exist in every segment we query.
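As a rough way to reason about the diminishing returns mentioned in the question, the sketch below estimates how often a segment stays prunable as the IN list grows. It rests on a simplifying assumption that is not from this thread: each IN value is present in a given segment independently with the same probability `p`. The function name is hypothetical.

```python
def prunable_probability(p, num_values):
    """Probability that none of the IN values is present in a segment.

    Assumption: each value is present independently with probability p,
    so the segment is prunable with probability (1 - p) ** num_values.
    Bloom filter false positives are ignored in this estimate.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be in [0, 1]")
    return (1.0 - p) ** num_values


# Compare a short and a long IN list for a sparse column (p = 1%).
for n in (10, 50):
    print(n, round(prunable_probability(0.01, n), 3))
```

Under this model, pruning stays worthwhile for long IN lists whenever `p` is small, which is the high-cardinality case described above; it stops paying off only when the values are likely to appear in most segments.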