executor: improve wide table insert & update performance #7935
/run-all-tests
LGTM
Please add a performance test to the integration tests.
There are some bugs in the first version of the modification. It works well if [...] but when [...] so [...]. Now we add [...]. The perfect way is to modify the flag of every builtinFunc implementation, but that is too much work, so for now we just modify [...].
So, after this PR, SQL like [...] @zz-jason @tiancaiamao PTAL
executor/insert_common.go
Outdated
@@ -216,6 +240,13 @@ func (e *InsertValues) evalRow(cols []*table.Column, list []expression.Expressio

offset := cols[i].Offset
row[offset], hasValue[offset] = val1, true
if expr.Flag()&expression.FlagChunkReused > 0 {
The FlagChunkReused confuses me: if the expression has the reuse flag, then we create a new chunk?
Yes, the expression reuses the chunk, so we cannot reuse the old chunk for the following eval.
How about expression.HoldChunkMemory? If the returned result of an expression needs to hold the memory of the chunk, we need to allocate a new chunk.
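The concern in this thread, a result that still points into memory about to be overwritten, can be shown in isolation. The following is a minimal sketch and not TiDB code: `scratch`, `set`, `view`, and `clone` are hypothetical names, and a byte slice stands in for chunk memory.

```go
package main

import "fmt"

// scratch is a hypothetical reusable buffer standing in for chunk memory.
type scratch struct {
	data []byte
}

// set overwrites the buffer in place, reusing its backing array when it fits.
func (s *scratch) set(v string) {
	s.data = append(s.data[:0], v...)
}

// view returns a slice that aliases the buffer: no copy is made,
// so later writes to the buffer show through.
func (s *scratch) view() []byte {
	return s.data
}

// clone returns an independent copy that survives later writes.
func (s *scratch) clone() []byte {
	return append([]byte(nil), s.data...)
}

func main() {
	var s scratch
	s.set("abc")
	aliased := s.view()
	copied := s.clone()

	s.set("xyz") // the buffer is reused for the next evaluation

	// Prints: aliased=xyz copied=abc
	fmt.Printf("aliased=%s copied=%s\n", aliased, copied)
}
```

A result that holds the buffer's memory behaves like `aliased` here, which is why it needs either a fresh buffer or a copy before the buffer is written again.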
executor/insert_common.go
Outdated
if e.evalBuffer != nil {
return
}
tpsLen := len(e.Table.Cols())
how about s/tpsLen/numCols/
executor/insert_common.go
Outdated
if e.hasExtraHandle {
e.evalBufferTypes[len(e.evalBufferTypes)-1] = types.NewFieldType(mysql.TypeLonglong)
}
mutChunk := chunk.MutRowFromTypes(e.evalBufferTypes)
how about s/mutChunk/mutRow/
expression/expression.go
Outdated
const (
// FlagHoldChunkMemory indicates expression return result maybe returned by unsafe operation over chunk.
FlagHoldChunkMemory Flag = 1 << iota
By default, this is true. I think we can remove this flag and the Flag() API from the expression interface.
After a macro benchmark, always copying the result gives these insert_rows timings:
1.059124ms
1.113391ms
1.037529ms
1.018771ms
998.436µs
1.010835ms
929.046µs
1.502075ms
972.367µs
984.169µs
990.929µs
967.04µs
1.797491ms
1.506823ms
957.703µs
979.513µs
1.001699ms
1.062672ms
That differs little from the not-copy-flag solution, so we choose to always copy for now.
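For readers who want a single number out of the samples above, here is a minimal sketch that parses the quoted timings and reports their median. The helper names `parseAll` and `median` are hypothetical; only the sample values come from the comment.

```go
package main

import (
	"fmt"
	"sort"
	"time"
)

// samples are the insert_rows timings quoted in the comment above.
var samples = []string{
	"1.059124ms", "1.113391ms", "1.037529ms", "1.018771ms", "998.436µs",
	"1.010835ms", "929.046µs", "1.502075ms", "972.367µs", "984.169µs",
	"990.929µs", "967.04µs", "1.797491ms", "1.506823ms", "957.703µs",
	"979.513µs", "1.001699ms", "1.062672ms",
}

// parseAll converts the quoted strings into time.Duration values.
func parseAll(in []string) ([]time.Duration, error) {
	out := make([]time.Duration, 0, len(in))
	for _, s := range in {
		d, err := time.ParseDuration(s)
		if err != nil {
			return nil, err
		}
		out = append(out, d)
	}
	return out, nil
}

// median returns the middle value, or the mean of the two middle values
// when the number of samples is even. It does not modify its argument.
func median(ds []time.Duration) time.Duration {
	sorted := append([]time.Duration(nil), ds...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i] < sorted[j] })
	n := len(sorted)
	if n == 0 {
		return 0
	}
	if n%2 == 1 {
		return sorted[n/2]
	}
	return (sorted[n/2-1] + sorted[n/2]) / 2
}

func main() {
	ds, err := parseAll(samples)
	if err != nil {
		panic(err)
	}
	fmt.Println("samples:", len(ds), "median:", median(ds))
}
```

The median is used rather than the mean because three of the 18 samples sit around 1.5 to 1.8 ms and would pull a mean upward; the median of these samples is about 1.006 ms.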
PTAL @tiancaiamao @zz-jason
expression/column.go
Outdated
@@ -47,7 +47,7 @@ func (col *CorrelatedColumn) Eval(row chunk.Row) (types.Datum, error) {
 }

 // EvalInt returns int representation of CorrelatedColumn.
-func (col *CorrelatedColumn) EvalInt(ctx sessionctx.Context, row chunk.Row) (int64, bool, error) {
+func (col *CorrelatedColumn) EvalInt(ctx sessionctx.Context, _ chunk.Row) (int64, bool, error) {
The change to this file is irrelevant; can we revert it?
executor/update.go
Outdated
@@ -141,6 +142,8 @@ func (e *UpdateExec) fetchChunkRows(ctx context.Context) error {
fields := e.children[0].retTypes()
globalRowIdx := 0
chk := e.children[0].newFirstChunk()
mutChunk := chunk.MutRowFromTypes(fields)
s/mutChunk/mutRow/
executor/update.go
Outdated
@@ -42,6 +42,7 @@ type UpdateExec struct {
// columns2Handle stores relationship between column ordinal to its table handle.
// the columns ordinals is present in ordinal range format, @see executor.cols2Handle
columns2Handle cols2HandleSlice
evalBuffer *chunk.MutRow
evalBuffer can be declared as chunk.MutRow
expression/builtin.go
Outdated
@@ -49,11 +49,12 @@ func newBaseBuiltinFunc(ctx sessionctx.Context, args []Expression) baseBuiltinFu
 if ctx == nil {
 panic("ctx should not be nil")
 }
-return baseBuiltinFunc{
+fn := baseBuiltinFunc{
The change to this file is irrelevant; can we revert it?
expression/expression.go
Outdated
@@ -36,6 +36,14 @@ const (
scalarFunctionFlag byte = 3
)

// Flag represents expression's property.
type Flag byte
Can this be removed?
It seems to be used in the next few lines: FlagHoldChunkMemory Flag = 1 << iota
Yeah, I mean this line and the following few lines can all be removed.
executor/insert_common.go
Outdated
@@ -117,9 +119,30 @@ func (e *InsertValues) getColumns(tableCols []*table.Column) ([]*table.Column, e
return nil, errors.Trace(err)
}

e.initEvalBuffer()
Can this function call be moved to InsertExec.Open()?
That may not be convenient, because we need e.hasExtraHandle, which is currently assigned in getColumns.
getColumns() is called in the Next() function. Maybe we can move it to the Open() function:
InsertExec.Open:
    getColumns()
    initEvalBuffer()
InsertExec.Next:
    e.insertRowsFromSelect(), or:
    e.insertRows()
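The outline above can be read as "resolve columns and allocate the buffer once in Open, then only reuse them in Next". The following is a minimal sketch under that reading, not TiDB's InsertExec: the `insertExec` type, its fields, and the `bufferInits` counter are hypothetical, and plain slices stand in for column metadata and the eval buffer.

```go
package main

import (
	"errors"
	"fmt"
)

// insertExec is a hypothetical, simplified executor; it is not TiDB's InsertExec.
type insertExec struct {
	tableCols   []string // stand-in for the table's column metadata
	columns     []string // resolved once in Open
	evalBuffer  []int64  // allocated once in Open, reused by every Next
	bufferInits int      // counts how often the buffer was allocated
}

// Open does the one-time setup: column resolution, then buffer allocation.
func (e *insertExec) Open() error {
	if len(e.tableCols) == 0 {
		return errors.New("table has no columns")
	}
	e.columns = append([]string(nil), e.tableCols...) // stand-in for getColumns()
	e.evalBuffer = make([]int64, len(e.columns))      // stand-in for initEvalBuffer()
	e.bufferInits++
	return nil
}

// Next handles one row and only overwrites the existing buffer.
func (e *insertExec) Next(vals []int64) error {
	if len(vals) != len(e.evalBuffer) {
		return fmt.Errorf("got %d values for %d columns", len(vals), len(e.evalBuffer))
	}
	copy(e.evalBuffer, vals)
	return nil
}

func main() {
	e := &insertExec{tableCols: []string{"a", "b", "c"}}
	if err := e.Open(); err != nil {
		panic(err)
	}
	for i := int64(0); i < 3; i++ {
		if err := e.Next([]int64{i, i + 1, i + 2}); err != nil {
			panic(err)
		}
	}
	fmt.Println("buffer initialisations:", e.bufferInits)
}
```

Because the buffer size depends on the resolved columns, the column step has to run before the buffer step, which matches the ordering in the outline.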
Another thing: judging by the callers of the function getColumns(), maybe we can also improve the performance of the replace statement:
3 executor/insert.go|132 col 17| cols, err := e.getColumns(e.Table.Cols())
4 executor/replace.go|181 col 17| cols, err := e.getColumns(e.Table.Cols())
5 executor/builder.go|576 col 28| columns, err := insertVal.getColumns(tableCols)
Finally, I moved getColumns to buildExecutor, just as loadData does, moved initEvalBuffer to Open, and replaced the pointer with a direct reference.
PTAL @zz-jason, although CI doesn't work.
/run-all-tests
LGTM @zz-jason
executor/insert_common.go
Outdated
@@ -50,9 +50,13 @@ type InsertValues struct {
GenColumns []*ast.ColumnName
GenExprs []expression.Expression

InsertColumns []*table.Column
no need to export it? s/InsertColumns/insertColumns/
executor/insert_common.go
Outdated
@@ -114,10 +120,25 @@ func (e *InsertValues) getColumns(tableCols []*table.Column) ([]*table.Column, e
 // Check column whether is specified only once.
 err = table.CheckOnce(cols)
 if err != nil {
-return nil, errors.Trace(err)
+return errors.Trace(err)
no need to Trace any more?
/run-all-tests PTAL @zz-jason
LGTM
/run-all-tests
/run-all-tests
What problem does this PR solve?
According to one of our customers, INSERT into a wide table is slower in 2.1 than in 2.0.
The problem is that evalRow calls chunk.MutRowFromDatums values-len x column-num times, and MutRowFromDatums allocates heavily.
The CPU pprof result was taken from a 58-column INSERT executed in an endless loop.
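The cost described here, one buffer construction per column per row, can be sketched outside TiDB. This is a minimal illustration under stated assumptions and not the PR's code: `datum`, `rowBuffer`, `newRowBuffer`, `evalRowRebuild`, `evalRowReuse`, and the `builds` counter are all hypothetical stand-ins, with `newRowBuffer` playing the role of the expensive constructor.

```go
package main

import "fmt"

// datum is a hypothetical stand-in for a typed column value.
type datum struct{ i int64 }

// rowBuffer is a hypothetical stand-in for a mutable single-row buffer.
type rowBuffer struct{ cols []datum }

// builds counts how many row buffers were constructed.
var builds int

// newRowBuffer copies vals into a fresh buffer (the expensive step).
func newRowBuffer(vals []datum) *rowBuffer {
	builds++
	return &rowBuffer{cols: append([]datum(nil), vals...)}
}

// expr computes one column from the row evaluated so far.
type expr func(row *rowBuffer) datum

// evalRowRebuild constructs a new buffer before every column evaluation.
func evalRowRebuild(exprs []expr) []datum {
	row := make([]datum, len(exprs))
	for i, e := range exprs {
		row[i] = e(newRowBuffer(row))
	}
	return row
}

// evalRowReuse evaluates against one long-lived buffer and patches only the
// column that was just computed. buf must have one slot per expression.
func evalRowReuse(buf *rowBuffer, exprs []expr) []datum {
	row := make([]datum, len(exprs))
	for i := range buf.cols {
		buf.cols[i] = datum{} // clear values left over from the previous row
	}
	for i, e := range exprs {
		row[i] = e(buf)
		buf.cols[i] = row[i]
	}
	return row
}

// demoExprs models "x = 3, y = x + 1": y reads the x computed just before it.
func demoExprs() []expr {
	return []expr{
		func(*rowBuffer) datum { return datum{i: 3} },
		func(r *rowBuffer) datum { return datum{i: r.cols[0].i + 1} },
	}
}

func main() {
	exprs := demoExprs()
	for n := 0; n < 1000; n++ {
		evalRowRebuild(exprs)
	}
	fmt.Println("rebuild:", builds, "buffer builds")

	builds = 0
	buf := newRowBuffer(make([]datum, len(exprs)))
	for n := 0; n < 1000; n++ {
		evalRowReuse(buf, exprs)
	}
	fmt.Println("reuse:", builds, "buffer builds")
}
```

In the rebuild variant the number of constructions grows with rows times columns, while the reuse variant constructs the buffer once; both return the same values for the two-column example.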
What is changed and how it works?
The eval buffer can be reused and partly modified (for situations like x = 3, y = x + 1).
Check List
Tests
A single thread executes one SQL statement in an endless loop, inserting into a 58-column table, while monitoring the insertValues#insertRows time.
time.before this PR:
after this PR:
Code changes
Side effects
Related changes
Other Question
MutRowFromDatums is evil; we should avoid using it and delete its other usages too.