Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ddl: scatter the regions of table when creating them #10980

Merged
merged 3 commits into from
Jul 10, 2019

Conversation

jackysp
Copy link
Member

@jackysp jackysp commented Jun 28, 2019

Signed-off-by: Shuaipeng Yu jackysp@gmail.com

What problem does this PR solve?

It’s a bit late to schedule the region when PD found the hotspot. It is better to scatter them when they are empty.

What is changed and how it works?

Scatter the regions of the tables when creating them.

Check List

Tests

  • Manual test (add detailed scripts or steps below)
    Created 100 tables and compare the regions distribution.
    master:

屏幕快照 2019-06-28 下午7 11 10

This PR:

屏幕快照 2019-06-29 下午10 51 37

Code changes

  • Has exported variable/fields change

Side effects

  • Increased code complexity

@codecov
Copy link

codecov bot commented Jun 28, 2019

Codecov Report

Merging #10980 into master will decrease coverage by 0.139%.
The diff coverage is 48.75%.

@@               Coverage Diff                @@
##             master     #10980        +/-   ##
================================================
- Coverage   81.1735%   81.0344%   -0.1391%     
================================================
  Files           421        422         +1     
  Lines         89916      89689       -227     
================================================
- Hits          72988      72679       -309     
- Misses        11687      11749        +62     
- Partials       5241       5261        +20

@jackysp jackysp changed the title ddl: scatter the regions of table and index when creating them ddl: scatter the regions of table when creating them Jun 29, 2019
@jackysp jackysp force-pushed the scatter_table branch 2 times, most recently from 3f48412 to 8f75385 Compare June 30, 2019 03:41
@jackysp
Copy link
Member Author

jackysp commented Jun 30, 2019

/run-integration-common-test

@jackysp
Copy link
Member Author

jackysp commented Jun 30, 2019

/run-all-tests

@jackysp
Copy link
Member Author

jackysp commented Jun 30, 2019

/run-integration-common-test tidb-test=pr/844

@jackysp
Copy link
Member Author

jackysp commented Jun 30, 2019

For 100 tables, waiting to scatter finish costs 24 min, and no waiting cost 6 min. Try to make the default value of tidb_wait_split_region_finish to off.

@jackysp
Copy link
Member Author

jackysp commented Jul 1, 2019

Change tidb_wait_split_region_finish to global scope. It is a global only variable, so it keeps the same behavior as the global only variables in MySQL.

$ mysql -uroot test
Reading table information for completion of table and column names
You can turn off this feature to get a quicker startup with -A

Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 15
Server version: 8.0.16 Homebrew

Copyright (c) 2000, 2019, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> load data local infile "~/tmp.csv" into table t;
Query OK, 1 row affected (0.07 sec)
Records: 1  Deleted: 0  Skipped: 0  Warnings: 0

mysql> set global local_infile=0;
Query OK, 0 rows affected (0.00 sec)

mysql> load data local infile "~/tmp.csv" into table t;
ERROR 1148 (42000): The used command is not allowed with this MySQL version

-- set local_infile to on in another session.
mysql> load data local infile "~/tmp.csv" into table t;
Query OK, 1 row affected (0.05 sec)
Records: 1  Deleted: 0  Skipped: 0  Warnings: 0

@jackysp
Copy link
Member Author

jackysp commented Jul 1, 2019

/run-all-tests

}

func (s *tikvStore) splitRegion(splitKey kv.Key) (*metapb.Region, error) {
func (s *tikvStore) SplitRegion(splitKey kv.Key, scatter bool) (uint64, error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need comment for the returned uint64 value.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think SplitRegion is better just used for splitting region, scatter region is another action. Use a boolean to indicate different actions is not clean code. Keep SplitRegionAndScatter is better?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SplitRegionAndScatter is even bad. I think it’s worse to combine the names of the actions together as the function name. And we also need an interface named SplitRegion. The best way is to add a standalone interface for scattering, but now we have at least one interface saved, and split and scatter are really strongly related. If we need to use scatter separately in the future, I think we can separate them again.

@@ -697,7 +697,7 @@ var defaultSysVars = []*SysVar{
{ScopeGlobal | ScopeSession, TiDBOptJoinReorderThreshold, strconv.Itoa(DefTiDBOptJoinReorderThreshold)},
{ScopeSession, TiDBCheckMb4ValueInUTF8, BoolToIntStr(config.GetGlobalConfig().CheckMb4ValueInUTF8)},
{ScopeSession, TiDBSlowQueryFile, ""},
{ScopeSession, TiDBWaitSplitRegionFinish, BoolToIntStr(DefTiDBWaitSplitRegionFinish)},
{ScopeGlobal, TiDBWaitSplitRegionFinish, BoolToIntStr(DefTiDBWaitSplitRegionFinish)},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why change to Global scope?

@jackysp
Copy link
Member Author

jackysp commented Jul 1, 2019

/run-all-tests

@jackysp
Copy link
Member Author

jackysp commented Jul 2, 2019

PTAL @crazycs520 @winkyao @coocood

@coocood
Copy link
Member

coocood commented Jul 2, 2019

LGTM

} else {
regionIDs = append(regionIDs, splitRecordRegion(store, tbInfo.ID, scatter))
}
regionIDs = append(regionIDs, splitIndexRegion(store, tbInfo, scatter)...)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Split index region for a common table, Is it expected?

It used to only split a table region for a common table, doesn't split the index regions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

emm... Yes, it is expected.

Copy link
Contributor

@crazycs520 crazycs520 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@crazycs520
Copy link
Contributor

@winkyao PTAL

@jackysp
Copy link
Member Author

jackysp commented Jul 9, 2019

PTAL @winkyao

Signed-off-by: Shuaipeng Yu <jackysp@gmail.com>
ddl/ddl_api.go Outdated Show resolved Hide resolved
ddl/ddl_api.go Outdated
}
} else if atomic.LoadUint32(&EnableSplitTableRegion) != 0 {
scatterRegion := variable.TiDBOptOn(val)
Copy link
Contributor

@winkyao winkyao Jul 10, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the variable.GetGlobalSystemVar return err, scatterRegion should be false?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think so.

Copy link
Contributor

@winkyao winkyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM

Signed-off-by: Shuaipeng Yu <jackysp@gmail.com>
Copy link
Contributor

@winkyao winkyao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@winkyao winkyao added require-LGT3 Indicates that the PR requires three LGTM. status/LGT3 The PR has already had 3 LGTM. and removed require-LGT3 Indicates that the PR requires three LGTM. labels Jul 10, 2019
@winkyao
Copy link
Contributor

winkyao commented Jul 10, 2019

/run-all-tests

@jackysp jackysp merged commit 89baed8 into pingcap:master Jul 10, 2019
@crazycs520
Copy link
Contributor

@jackysp Please cherry-pick this PR to v3.0 and v2.1.

jackysp added a commit to jackysp/tidb that referenced this pull request Jul 11, 2019
 Conflicts:
	ddl/ddl_api.go
	ddl/table.go
	sessionctx/variable/tidb_vars.go
	sessionctx/variable/varsutil.go
	store/tikv/split_region.go
jackysp added a commit to jackysp/tidb that referenced this pull request Jul 11, 2019
 Conflicts:
	ddl/ddl_api.go
	ddl/table.go
	sessionctx/variable/tidb_vars.go
	sessionctx/variable/varsutil.go
	store/tikv/split_region.go
@jackysp jackysp deleted the scatter_table branch February 27, 2020 13:32
@you06 you06 added the sig/sql-infra SIG: SQL Infra label Mar 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
sig/sql-infra SIG: SQL Infra status/LGT3 The PR has already had 3 LGTM.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants