-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
stats: dynamically update the average column size #6170
Conversation
executor/write.go
Outdated
|
||
ctx.GetSessionVars().TxnCtx.UpdateDeltaForTable(t.Meta().ID, 0, 1) | ||
colSize := make(map[int64]int64) | ||
for id, col := range t.WritableCols() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should also consider the column state?
executor/write.go
Outdated
ctx.GetSessionVars().TxnCtx.UpdateDeltaForTable(t.Meta().ID, 0, 1) | ||
colSize := make(map[int64]int64) | ||
for id, col := range t.WritableCols() { | ||
colSize[col.ID] = int64(len(newData[id].GetBytes()) - len(oldData[id].GetBytes())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to record the col size for fix length type.
statistics/handle_test.go
Outdated
hist.TotColSize = temp | ||
|
||
c.Assert(hist.CMSketch, NotNil) | ||
c.Assert(newStatsTbl.Columns[id].CMSketch, NotNil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is already checked at CMSketch.Equal
.
statistics/update.go
Outdated
} | ||
updated := h.ctx.GetSessionVars().StmtCtx.AffectedRows() > 0 | ||
_, err = h.ctx.(sqlexec.SQLExecutor).Execute(ctx, "commit") | ||
return updated, errors.Trace(err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the error is not nil, then updated will be always true.
statistics/update.go
Outdated
version := h.ctx.Txn().StartTS() | ||
|
||
for key, val := range delta.ColSize { | ||
sql := fmt.Sprintf("update mysql.stats_histograms set version = %d, tot_col_size = tot_col_size + %d where hist_id = %d and table_id = %d", version, val, key, id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may be two histograms with the same hist_id
and table_id
.
executor/write.go
Outdated
ctx.GetSessionVars().TxnCtx.UpdateDeltaForTable(t.Meta().ID, 0, 1) | ||
colSize := make(map[int64]int64) | ||
for id, col := range t.WritableCols() { | ||
colSize[col.ID] = int64(len(newData[id].GetBytes()) - len(oldData[id].GetBytes())) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
GetBytes
is only valid when the Kind of the Datum is KindString or KindBytes.
Maybe we need to add a method for Datum to return the size?
executor/write.go
Outdated
colSize := make(map[int64]int64) | ||
for id, col := range t.WritableCols() { | ||
if col.State == model.StatePublic && newData[id].Size() != 0 { | ||
colSize[col.ID] = int64(newData[id].Size() - oldData[id].Size()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not right? If the new data happens to be null, but the old data is not null.
executor/write.go
Outdated
colSize := make(map[int64]int64) | ||
for id, col := range t.WritableCols() { | ||
if col.State == model.StatePublic { | ||
colSize[col.ID] = -int64(data[id].Size()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the Size
is 0, it is not needed to update it?
table/tables/tables.go
Outdated
colSize := make(map[int64]int64) | ||
for id, col := range t.WritableCols() { | ||
if col.State == model.StatePublic { | ||
colSize[col.ID] = int64(r[id].Size()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ditto.
types/datum.go
Outdated
@@ -122,6 +122,11 @@ func (d *Datum) IsNull() bool { | |||
return d.k == KindNull | |||
} | |||
|
|||
// Size gets the length of byte of the datum. | |||
func (d *Datum) Size() int { | |||
return len(d.b) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to switch on d.k
to determine the size of the datum.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or if we only care about varlen size, then we don't need this method, use len(d.GetBytes())
will do.
@fipped Please resolve the conflicts. |
@lamxTyler PTAL |
statistics/update.go
Outdated
if err = h.dumpTableStatColSizeToKV(id, item); err != nil { | ||
return errors.Trace(err) | ||
} | ||
m := h.globalMap[id] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not need to update the map if the updated
is true, just delete it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
/run-all-tests |
PTAL @coocood |
sessionctx/variable/session.go
Outdated
@@ -101,13 +101,21 @@ type TransactionContext struct { | |||
} | |||
|
|||
// UpdateDeltaForTable updates the delta info for some table. | |||
func (tc *TransactionContext) UpdateDeltaForTable(tableID int64, delta int64, count int64) { | |||
func (tc *TransactionContext) UpdateDeltaForTable(tableID int64, delta int64, count int64, colSize *map[int64]int64) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to use pointer for a map.
executor/write.go
Outdated
|
||
ctx.GetSessionVars().TxnCtx.UpdateDeltaForTable(t.Meta().ID, 0, 1) | ||
colSize := make(map[int64]int64) | ||
for id, col := range t.WritableCols() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can just range over t.Cols()
, it will return public columns only.
executor/write.go
Outdated
@@ -392,7 +400,16 @@ func (e *DeleteExec) removeRow(ctx sessionctx.Context, t table.Table, h int64, d | |||
} | |||
ctx.StmtAddDirtyTableOP(DirtyTableDeleteRow, t.Meta().ID, h, nil) | |||
ctx.GetSessionVars().StmtCtx.AddAffectedRows(1) | |||
ctx.GetSessionVars().TxnCtx.UpdateDeltaForTable(t.Meta().ID, -1, 1) | |||
colSize := make(map[int64]int64) | |||
for id, col := range t.WritableCols() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
LGTM |
This PR is about the work of updating the tot_col_size of histograms when executing insert, delete and update statement. Therefore, a correct average column size can be computed when we use
show stats_histograms
statement.The tot_col_size we dynamically maintain will be a little different from that computed by
analyze
statement since we get the size from the value directly instead of computing from the coded bytes, that is we leave out the length byte.PTAL @lamxTyler @winoros