table: use evalBuffer to improve performance of locatePartition #18818

imtbkcat · 2020-07-28T06:39:36Z

What problem does this PR solve?

Issue Number: close #16667

Problem Summary: MutRowFromDatums is expensive enough that locatePartition performs worse than expected. This is most visible in LOAD DATA, where a partitioned table takes nearly twice as long as a general table.

What is changed and how it works?

What's Changed:

For a single-column partition key, such as partition by hash(col) or partition by range(col), read the integer value directly from the row.
For an expression partition key, use evalBuffer to avoid calling MutRowFromDatums repeatedly.

How it Works:
Both approaches avoid repeated MutRowFromDatums calls, which improves the write performance of partitioned tables.
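
The two paths described above can be sketched roughly as follows. This is a simplified model, not the TiDB code: datum, rowBuffer, expr, column, plusExpr, and locator are hypothetical stand-ins for the real datum, row buffer, expression, and partitioned-table types, and keeping the reusable buffer in a sync.Pool is an assumption made here so that concurrent writers never share one buffer.

```go
package main

import (
	"fmt"
	"sync"
)

// datum is a hypothetical stand-in for a typed column value.
type datum struct {
	i    int64
	null bool
}

// rowBuffer is a hypothetical reusable row used for expression evaluation.
type rowBuffer struct{ vals []datum }

// set overwrites the buffer in place, reusing its backing array when possible.
func (b *rowBuffer) set(r []datum) { b.vals = append(b.vals[:0], r...) }

// expr is a hypothetical partition expression evaluated against a buffered row.
type expr interface {
	evalInt(b *rowBuffer) (val int64, isNull bool)
}

// column is a partition key that is a bare column reference.
type column struct{ index int }

func (c *column) evalInt(b *rowBuffer) (int64, bool) {
	d := b.vals[c.index]
	return d.i, d.null
}

// plusExpr is an example of a non-trivial partition expression: col + add.
type plusExpr struct {
	index int
	add   int64
}

func (p *plusExpr) evalInt(b *rowBuffer) (int64, bool) {
	d := b.vals[p.index]
	return d.i + p.add, d.null
}

// locator holds the partition key and a pool of reusable buffers.
type locator struct {
	key  expr
	pool sync.Pool // holds *rowBuffer
}

func newLocator(key expr) *locator {
	return &locator{key: key, pool: sync.Pool{New: func() interface{} { return &rowBuffer{} }}}
}

// keyValue returns the integer value of the partition key for row r.
func (l *locator) keyValue(r []datum) (int64, bool) {
	// Fast path: a single-column key is read straight from the row.
	if c, ok := l.key.(*column); ok {
		return r[c.index].i, r[c.index].null
	}
	// Slow path: evaluate the expression against a pooled, reused buffer
	// instead of building a fresh row on every call.
	buf := l.pool.Get().(*rowBuffer)
	defer l.pool.Put(buf)
	buf.set(r)
	return l.key.evalInt(buf)
}

func main() {
	row := []datum{{i: 7}, {i: 42}}
	v1, _ := newLocator(&column{index: 1}).keyValue(row)
	v2, _ := newLocator(&plusExpr{index: 0, add: 3}).keyValue(row)
	fmt.Println(v1, v2) // 42 10
}
```

The fast path allocates nothing per row, and the slow path pays only for copying the datums into a buffer that already exists.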

Related changes

Need to cherry-pick to the release branch

Check List

Tests

Unit test (will be added later)
Manual test (add detailed scripts or steps below)

Side effects

None

Release note

table: improve the write performance of partitioned tables

imtbkcat · 2020-07-28T07:01:37Z

Here is a simple LOAD DATA benchmark result:

Before this fix:

range partition table:

MySQL [test]> LOAD DATA LOCAL INFILE '2017Q3-capitalbikeshare-tripdata.csv' INTO TABLE trips_range FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES (duration, start_date, end_date, start_station_number, start_station, end_station_number, end_station, bike_number, member_type);
Query OK, 1191585 rows affected (25.168 sec)
Records: 1191585  Deleted: 0  Skipped: 0  Warnings: 0

hash partition table:

MySQL [test]> LOAD DATA LOCAL INFILE '2017Q3-capitalbikeshare-tripdata.csv' INTO TABLE trips_hash FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES (duration, start_date, end_date, start_station_number, start_station, end_station_number, end_station, bike_number, member_type);
Query OK, 1191585 rows affected (15.620 sec)
Records: 1191585  Deleted: 0  Skipped: 0  Warnings: 0

general table:

MySQL [test]> LOAD DATA LOCAL INFILE '2017Q3-capitalbikeshare-tripdata.csv' INTO TABLE trips_general FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES (duration, start_date, end_date, start_station_number, start_station, end_station_number, end_station, bike_number, member_type);
Query OK, 1191585 rows affected (9.432 sec)
Records: 1191585  Deleted: 0  Skipped: 0  Warnings: 0

After this optimization:

range partition table:

MySQL [test]> LOAD DATA LOCAL INFILE '2017Q3-capitalbikeshare-tripdata.csv' INTO TABLE trips_range FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES (duration, start_date, end_date, start_station_number, start_station, end_station_number, end_station, bike_number, member_type);
Query OK, 1191585 rows affected (9.566 sec)
Records: 1191585  Deleted: 0  Skipped: 0  Warnings: 0

hash partition table:

MySQL [test]> LOAD DATA LOCAL INFILE '2017Q3-capitalbikeshare-tripdata.csv' INTO TABLE trips_hash FIELDS TERMINATED BY ',' ENCLOSED BY '\"' LINES TERMINATED BY '\r\n' IGNORE 1 LINES (duration, start_date, end_date, start_station_number, start_station, end_station_number, end_station, bike_number, member_type);
Query OK, 1191585 rows affected (10.437 sec)
Records: 1191585  Deleted: 0  Skipped: 0  Warnings: 0

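The improvement is clear in LOAD DATA: the range partition table drops from 25.168 sec to 9.566 sec and the hash partition table from 15.620 sec to 10.437 sec, against 9.432 sec for the general table.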
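
To put the timings above into ratios, here is a small sketch that computes the speedup of each partitioned load and its remaining overhead relative to the general table. The helper names speedup and overhead are made up for this example, and the general-table figure is the single 9.432 sec measurement reported above, which was not re-run after the change. With these inputs the range load comes out roughly 2.6x faster and the hash load roughly 1.5x faster, leaving about 1% and 11% overhead respectively.

```go
package main

import "fmt"

// speedup returns how many times faster the "after" run is than the "before" run.
func speedup(beforeSec, afterSec float64) float64 { return beforeSec / afterSec }

// overhead returns the extra time of a run relative to a baseline, as a fraction.
func overhead(runSec, baselineSec float64) float64 { return runSec/baselineSec - 1 }

func main() {
	const general = 9.432 // seconds, general table (measured once)

	fmt.Printf("range: %.2fx faster, %.1f%% over general\n",
		speedup(25.168, 9.566), 100*overhead(9.566, general))
	fmt.Printf("hash:  %.2fx faster, %.1f%% over general\n",
		speedup(15.620, 10.437), 100*overhead(10.437, general))
}
```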

codecov · 2020-07-28T07:12:12Z

Codecov Report

Merging #18818 into master will not change coverage.
The diff coverage is n/a.

@@             Coverage Diff             @@
##             master     #18818   +/-   ##
===========================================
  Coverage   79.1417%   79.1417%           
===========================================
  Files           550        550           
  Lines        149749     149749           
===========================================
  Hits         118514     118514           
  Misses        21689      21689           
  Partials       9546       9546

tiancaiamao · 2020-07-29T02:35:21Z

table/tables/partition.go

+func (t *partitionedTable) locateRangePartition(ctx sessionctx.Context, pi *model.PartitionInfo, r []types.Datum) (int, error) {
+	var ret int64
+	if col, ok := t.partitionExpr.Expr.(*expression.Column); ok {
+		ret = r[col.Index].GetInt64()


Could the column be null?

imtbkcat · 2020-08-04

I added some tests to cover these cases, PTAL @tiancaiamao
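
One way to handle the null case on the single-column fast path is to check for null before reading the integer. The sketch below is illustrative and is not the code under review: datum, errNoPartition, and locateRange are hypothetical names, and it assumes MySQL's documented rule that a NULL range key is treated as less than any non-NULL value, so it lands in the first partition.

```go
package main

import (
	"errors"
	"fmt"
	"sort"
)

// datum is a hypothetical stand-in for a typed column value.
type datum struct {
	i    int64
	null bool
}

var errNoPartition = errors.New("no partition for value")

// locateRange returns the index of the first partition whose upper bound
// (VALUES LESS THAN) is strictly greater than the key. lessThan must be sorted.
func locateRange(lessThan []int64, key datum) (int, error) {
	if key.null {
		// Assumption: NULL sorts below every bound, so it maps to partition 0.
		return 0, nil
	}
	pos := sort.Search(len(lessThan), func(i int) bool { return lessThan[i] > key.i })
	if pos == len(lessThan) {
		return 0, errNoPartition
	}
	return pos, nil
}

func main() {
	bounds := []int64{10, 20, 30}
	// Probe null plus values one below, at, and one above the first bound,
	// and a value at the last bound, which fits no partition.
	for _, k := range []datum{{null: true}, {i: 9}, {i: 10}, {i: 11}, {i: 30}} {
		idx, err := locateRange(bounds, k)
		fmt.Println(k, idx, err)
	}
}
```

Without the null check, reading the integer from a null datum would silently produce whatever value the datum happens to hold, which is why the check comes first.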

tiancaiamao · 2020-07-29T02:42:26Z

table/tables/partition.go

 // TODO: supports linear hashing
 func (t *partitionedTable) locateHashPartition(ctx sessionctx.Context, pi *model.PartitionInfo, r []types.Datum) (int, error) {
-	ret, isNull, err := t.partitionExpr.Expr.EvalInt(ctx, chunk.MutRowFromDatums(r).ToRow())
+	if col, ok := t.partitionExpr.Expr.(*expression.Column); ok {
+		ret := r[col.Index].GetInt64()


What if r[col.Index] is null?
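
The same question applies to the hash path. A minimal sketch, assuming MySQL's documented rule that a NULL hash key behaves like 0; datum and locateHash are hypothetical names, not the functions in this diff. It also folds negative remainders into range, since Go's % operator keeps the sign of the dividend.

```go
package main

import "fmt"

// datum is a hypothetical stand-in for a typed column value.
type datum struct {
	i    int64
	null bool
}

// locateHash maps a key to one of num partitions (num must be > 0).
func locateHash(num int64, key datum) int {
	if key.null {
		// Assumption: a NULL hash key behaves like 0.
		return 0
	}
	m := key.i % num
	if m < 0 {
		// Take the remainder first, then negate, so the most negative
		// int64 cannot overflow.
		m = -m
	}
	return int(m)
}

func main() {
	for _, k := range []datum{{null: true}, {i: 0}, {i: 5}, {i: -5}, {i: 13}} {
		fmt.Println(k, locateHash(4, k))
	}
}
```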

tiancaiamao · 2020-07-29T02:46:48Z

Please add more tests to cover the new code,
especially the ±1 cases and null values.

tiancaiamao · 2020-08-18T03:11:20Z

LGTM

lysu

LGTM

table/tables/partition.go

lysu · 2020-08-27T11:49:01Z

/merge

ti-srebot · 2020-08-27T11:49:03Z

Your auto merge job has been accepted, waiting for:

19527
19528
19485

ti-srebot · 2020-08-27T12:13:22Z

/run-all-tests

imtbkcat · 2020-09-01T04:32:33Z

/run-all-tests

lysu

LGTM

imtbkcat · 2020-09-01T06:37:50Z

/merge

ti-srebot · 2020-09-01T06:41:38Z

/run-all-tests

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>

ti-srebot · 2020-09-01T06:50:52Z

cherry pick to release-3.0 in PR #19647

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>

ti-srebot · 2020-09-01T06:52:24Z

cherry pick to release-4.0 in PR #19649

imtbkcat added the type/enhancement, sig/transaction, and needs-cherry-pick-3.0 labels Jul 28, 2020

imtbkcat requested review from lysu, tiancaiamao and jackysp July 28, 2020 06:39

tiancaiamao reviewed Jul 29, 2020

imtbkcat force-pushed the acc_partition_insert branch from 311d51a to fd0cddc on August 4, 2020 08:20

ti-srebot added the status/LGT1 label Aug 18, 2020

lysu reviewed Aug 19, 2020

imtbkcat force-pushed the acc_partition_insert branch from ae241e8 to c4e4227 on August 25, 2020 11:01

imtbkcat requested a review from a team as a code owner August 25, 2020 11:01

imtbkcat requested review from lzmhhh123 and removed request for a team August 25, 2020 11:01

imtbkcat force-pushed the acc_partition_insert branch from 704a42d to 7be36cf on August 26, 2020 05:35

ti-srebot removed the status/LGT1 label Aug 27, 2020

ti-srebot previously approved these changes Aug 27, 2020

ti-srebot added the status/LGT2 label Aug 27, 2020

ti-srebot added the status/can-merge label Aug 27, 2020

imtbkcat added 12 commits September 1, 2020 12:31

go fmt

fb00659

format import

64e8e7a

fix ci test

96caa21

use pool to avoid race

b7cd93c

use pointer

3777069

add some test

8655243

fix range partition bug

9ffbd23

add multi table test case

bd3bff8

fix test case fail

3911ebf

fix mysql test case

24356ea

fix mysql test case

38c7f00

add addition column for handle

56733c0

imtbkcat force-pushed the acc_partition_insert branch from faeaecc to 56733c0 on September 1, 2020 04:32

lysu approved these changes Sep 1, 2020

Merge branch 'master' into acc_partition_insert

54b7b8d

ti-srebot merged commit 349adf8 into pingcap:master Sep 1, 2020

ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Sep 1, 2020

cherry pick pingcap#18818 to release-3.0

ba443b3

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>

ti-srebot mentioned this pull request Sep 1, 2020

table: use evalBuffer to improve performance of locatePartition (#18818) #19647

Merged

ti-srebot pushed a commit to ti-srebot/tidb that referenced this pull request Sep 1, 2020

cherry pick pingcap#18818 to release-4.0

a3aafe7

Signed-off-by: ti-srebot <ti-srebot@pingcap.com>

ti-srebot mentioned this pull request Sep 1, 2020

table: use evalBuffer to improve performance of lo ... (#18818) #19649

Merged

zz-jason pushed a commit that referenced this pull request Sep 3, 2020

table: use evalBuffer to improve performance of lo ... (#18818) (#19649)

104d372

ti-srebot added a commit that referenced this pull request Sep 5, 2020

table: use evalBuffer to improve performance of locatePartition (#18818…

bd63653

…) (#19647) Signed-off-by: ti-srebot <ti-srebot@pingcap.com>
