optimize the list partition pruner performance in huge partitions #21778

crazycs520 · 2020-12-15T03:09:35Z

Development Task

Related to #20678

When there are many partitions, the list partition pruner performance is not good enough. such as the below table has 1000 partitions, each partition has 50 values:

CREATE TABLE `t` (
  `id` int(11) DEFAULT NULL,
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  `name` varchar(10) DEFAULT NULL,
  UNIQUE KEY `id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
PARTITION BY LIST COLUMNS(id) (
  PARTITION `p0` VALUES IN (0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49),
  PARTITION `p1` VALUES IN (50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99),
  PARTITION `p2` VALUES IN (100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149),
...
...
...
  PARTITION `p999` VALUES IN (49950,49951,49952,49953,49954,49955,49956,49957,49958,49959,49960,49961,49962,49963,49964,49965,49966,49967,49968,49969,49970,49971,49972,49973,49974,49975,49976,49977,49978,49979,49980,49981,49982,49983,49984,49985,49986,49987,49988,49989,49990,49991,49992,49993,49994,49995,49996,49997,49998,49999)
)

The following sections describe the current implementation and ideas for improvement.

For simplicity, the following table is used as an example to explain：

CREATE TABLE `t` (
  `id` int(11) DEFAULT NULL,
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
PARTITION BY LIST COLUMNS(id) (
  PARTITION `p0` VALUES IN (0,1),
  PARTITION `p1` VALUES IN (2,3),
  PARTITION `p2` VALUES IN (4,5, NULL)
)

Current implementation

Currently, we build an expression for each partition to locate partition, such as:

the expression for partition p0 is: id in (0,1)
the expression for partition p1 is: id in (2,3)
the expression for partition p2 is: id in (4,5) or id is NULL

then, when insert into the table t with row like (id, a , b) values (2,2,2), the list partition pruner try to locate the partition by evaluating the partition expression one by one, from the partition p0:

when id is 2, the expression for partition p0 id in (0,1) is false. it means not in this partition, go on;
when id is 2, the expression for partition p1 id in (2,3) is true. Great, it's is the right partition for id = 2, we can finish locate process now.

As you can see, when there are many partitions and the values for each partition very large, the performance of the list partition pruner will be bad since it has to evaluate so many expressions.

Ideas for improvement

We can use a map instead of an expression for all partition values. such as:

0 => p0
1 => p0
2 => p1
3 => p1
4 => p2
5 => p2
NULL => p2

The list partition pruner now to locate id = 2 will only check the map once, then it will know the partition of id = 2 is partition p1. The performance of using mapto locate is much better than evaluatein` expressions.

The text was updated successfully, but these errors were encountered:

crazycs520 added the type/enhancement The issue or PR belongs to an enhancement. label Dec 15, 2020

crazycs520 self-assigned this Dec 15, 2020

This was referenced Dec 15, 2020

Support list (columns) partition table #20678

Closed

planner: add special partition pruner for list columns partition #21577

Merged

crazycs520 closed this as completed in #21577 Dec 23, 2020

crazycs520 mentioned this issue Dec 24, 2020

planner: add special partition pruner for list columns partition (#21577) #21997

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

optimize the list partition pruner performance in huge partitions #21778

optimize the list partition pruner performance in huge partitions #21778

crazycs520 commented Dec 15, 2020 •

edited

Loading

optimize the list partition pruner performance in huge partitions #21778

optimize the list partition pruner performance in huge partitions #21778

Comments

crazycs520 commented Dec 15, 2020 • edited Loading

Development Task

Current implementation

Ideas for improvement

crazycs520 commented Dec 15, 2020 •

edited

Loading