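When a table has many partitions, the list partition pruner does not perform well enough. Consider, for example, a table with 1000 partitions in which each partition has 50 values.
The following sections describe the current implementation and ideas for improvement. For simplicity, the explanation uses a small table t with columns (id, a, b) and three list partitions: p0, p1 and p2.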
Current implementation
Currently, we build one expression per partition and use it to locate the partition for a row, such as:
the expression for partition p0 is: id in (0,1)
the expression for partition p1 is: id in (2,3)
the expression for partition p2 is: id in (4,5) or id is NULL
Then, when inserting a row such as (id, a, b) values (2,2,2) into the table t, the list partition pruner tries to locate the partition by evaluating the partition expressions one by one, starting from partition p0:
when id is 2, the expression for partition p0, id in (0,1), is false. The row is not in this partition, so the pruner goes on;
when id is 2, the expression for partition p1, id in (2,3), is true. This is the right partition for id = 2, so the locate process can finish.
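As you can see, when there are many partitions and each partition has many values, the list partition pruner performs poorly because it has to evaluate so many expressions. In the worst case it evaluates the expression of every partition. For 1000 partitions with 50 values each, that can mean up to 50,000 value comparisons for a single row if each in list is checked value by value.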
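The walk-through above can be sketched in a few lines of Python. This is only an illustration of the one-by-one evaluation, not the actual pruner code: the names `partition_exprs` and `locate_by_expressions` are hypothetical, `None` stands in for SQL NULL, and Python's `in` is used as a rough stand-in for the SQL `in` expression.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical stand-ins for the per-partition expressions in the example.
# None plays the role of SQL NULL.
PartitionExpr = Callable[[Optional[int]], bool]

partition_exprs: List[Tuple[str, PartitionExpr]] = [
    ("p0", lambda id_: id_ in (0, 1)),
    ("p1", lambda id_: id_ in (2, 3)),
    ("p2", lambda id_: id_ in (4, 5) or id_ is None),
]


def locate_by_expressions(id_: Optional[int]) -> Tuple[Optional[str], int]:
    """Return the matching partition name and how many expressions were evaluated."""
    evaluated = 0
    for name, expr in partition_exprs:
        evaluated += 1
        if expr(id_):
            # First expression that is true wins; stop scanning.
            return name, evaluated
    # No partition accepts this value.
    return None, evaluated


partition, evaluated = locate_by_expressions(2)
print(partition, evaluated)
```

The number of evaluated expressions grows with the position of the matching partition, so a value that lives in the last of 1000 partitions costs 1000 expression evaluations in this sketch.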
Ideas for improvement
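We can use a single map from partition value to partition, covering the values of all partitions, instead of one expression per partition. For the example above, such a map would send 0 and 1 to p0, 2 and 3 to p1, and 4, 5 and NULL to p2.
To locate id = 2, the list partition pruner now checks the map only once and learns that id = 2 belongs to partition p1. Locating a partition through a `map` lookup is much faster than evaluating `in` expressions.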
Development Task
Related to #20678