Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optimize the list partition pruner performance in huge partitions #21778

Closed
crazycs520 opened this issue Dec 15, 2020 · 0 comments · Fixed by #21577 or #21997
Closed

optimize the list partition pruner performance in huge partitions #21778

crazycs520 opened this issue Dec 15, 2020 · 0 comments · Fixed by #21577 or #21997
Assignees
Labels
type/enhancement The issue or PR belongs to an enhancement.

Comments

@crazycs520
Copy link
Contributor

crazycs520 commented Dec 15, 2020

Development Task

Related to #20678

When there are many partitions, the list partition pruner performance is not good enough. such as the below table has 1000 partitions, each partition has 50 values:

CREATE TABLE `t` (
  `id` int(11) DEFAULT NULL,
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL,
  `name` varchar(10) DEFAULT NULL,
  UNIQUE KEY `id` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
PARTITION BY LIST COLUMNS(id) (
  PARTITION `p0` VALUES IN (0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49),
  PARTITION `p1` VALUES IN (50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99),
  PARTITION `p2` VALUES IN (100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149),
...
...
...
  PARTITION `p999` VALUES IN (49950,49951,49952,49953,49954,49955,49956,49957,49958,49959,49960,49961,49962,49963,49964,49965,49966,49967,49968,49969,49970,49971,49972,49973,49974,49975,49976,49977,49978,49979,49980,49981,49982,49983,49984,49985,49986,49987,49988,49989,49990,49991,49992,49993,49994,49995,49996,49997,49998,49999)
)

The following sections describe the current implementation and ideas for improvement.

For simplicity, the following table is used as an example to explain:

CREATE TABLE `t` (
  `id` int(11) DEFAULT NULL,
  `a` int(11) DEFAULT NULL,
  `b` int(11) DEFAULT NULL
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_bin
PARTITION BY LIST COLUMNS(id) (
  PARTITION `p0` VALUES IN (0,1),
  PARTITION `p1` VALUES IN (2,3),
  PARTITION `p2` VALUES IN (4,5, NULL)
)

Current implementation

Currently, we build an expression for each partition to locate partition, such as:

  • the expression for partition p0 is: id in (0,1)
  • the expression for partition p1 is: id in (2,3)
  • the expression for partition p2 is: id in (4,5) or id is NULL

then, when insert into the table t with row like (id, a , b) values (2,2,2), the list partition pruner try to locate the partition by evaluating the partition expression one by one, from the partition p0:

  1. when id is 2, the expression for partition p0 id in (0,1) is false. it means not in this partition, go on;
  2. when id is 2, the expression for partition p1 id in (2,3) is true. Great, it's is the right partition for id = 2, we can finish locate process now.

As you can see, when there are many partitions and the values for each partition very large, the performance of the list partition pruner will be bad since it has to evaluate so many expressions.

Ideas for improvement

We can use a map instead of an expression for all partition values. such as:

0 => p0
1 => p0
2 => p1
3 => p1
4 => p2
5 => p2
NULL => p2

The list partition pruner now to locate id = 2 will only check the map once, then it will know the partition of id = 2 is partition p1. The performance of using mapto locate is much better than evaluatein` expressions.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type/enhancement The issue or PR belongs to an enhancement.
Projects
None yet
1 participant