*: update stats using query feedback #6197

Merged
merged 12 commits into from
Apr 10, 2018

Conversation

alivxxx
Contributor

@alivxxx alivxxx commented Apr 2, 2018

This PR dumps the collected feedback into KV and uses that feedback to update the stats, including the histogram and CM Sketch.
PTAL @coocood @winoros @shenli

@alivxxx
Contributor Author

alivxxx commented Apr 2, 2018

/run-all-tests

domain/domain.go Outdated
continue
}
err = statsHandle.HandleUpdateStats()
if err != nil {
Member

Add a metric here.

Contributor Author

@alivxxx alivxxx Apr 3, 2018

I think it is better to add metrics in a separate PR; this one is already too large.

}
case <-dumpFeedbackTicker.C:
err = statsHandle.DumpStatsFeedbackToKV()
if err != nil {
Member

ditto

Contributor Author

I will do it in a separate PR.

@@ -54,6 +54,17 @@ func (c *CMSketch) InsertBytes(bytes []byte) {
}
}

func (c *CMSketch) setValue(h1, h2 uint64, count uint32) {
Member

Add a comment for this function.

return buf.Bytes(), nil
}

func decodeFeedback(val []byte, q *QueryFeedback, c *CMSketch) error {
Member

Add comments for the following code logic.

}

for _, t := range tests {
Member

Add a comment for the test case.

@@ -190,6 +191,100 @@ func (h *Handle) dumpTableStatDeltaToKV(id int64, delta variable.TableDelta) (bo
return updated, errors.Trace(err)
}

// DumpStatsFeedbackToKV dumps the stats feedback to KV.
Member

Do we need to add a few metrics for the following operations?

Contributor Author

@alivxxx alivxxx Apr 3, 2018

Yes, I will do it in a separate PR.

- // For now, we do not use the query feedback, so just set it to 1.
- const maxQueryFeedBackCount = 1
+ // MaxQueryFeedbackCount is the max number of feedback that cache in memory.
+ var MaxQueryFeedbackCount = 1000
Member

How about 1 << 10?

if err != nil {
return errors.Trace(err)
}
sql := fmt.Sprintf("delete from mysql.stats_feedback where table_id = %d and hist_id = %d and is_index = %d", tableID, hist.ID, isIndex)
Member

Could the delete exceed the transaction limit?

@@ -493,3 +493,127 @@ func buildNewHistogram(h *Histogram, buckets []bucket) *Histogram {
}
return hist
}

// QueryFeedbackPB is used to serialize the QueryFeedback.
type QueryFeedbackPB struct {
Member

The name contains PB, but it's not protobuf.
Do you plan to change it to protobuf in the future?

@@ -243,7 +243,7 @@ func (s *testSuite) TestAggregation(c *C) {

result = tk.MustQuery("select count(*) from information_schema.columns")
// When adding new memory columns in information_schema, please update this variable.
- columnCountOfAllInformationSchemaTables := "737"
+ columnCountOfAllInformationSchemaTables := "741"
Member

5 columns added instead of 4?

Contributor Author

I think 4 columns were added.

Member

Yeah, I misread it.

@coocood
Member

coocood commented Apr 4, 2018

LGTM

@coocood coocood added the status/LGT1 Indicates that a PR has LGTM 1. label Apr 4, 2018
@@ -54,6 +54,18 @@ func (c *CMSketch) InsertBytes(bytes []byte) {
}
}

// setValue sets the count for value that hashed into (h1, h2).
Member

Is (h1, h2) an interval?

Contributor Author

No, it means the hash value pair.
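
To illustrate what a hash value pair means for a Count-Min Sketch, here is a minimal sketch in Go. It is not the code from this PR: the `sketch` type and the `newSketch`, `slot`, and `query` helpers are hypothetical names, and both the row-indexing formula `(h1 + h2*i) % width` and the delta-based update are assumptions chosen for illustration.

```go
package main

import "fmt"

// sketch is a hypothetical, simplified Count-Min Sketch; it is not TiDB's CMSketch.
type sketch struct {
	width uint64
	table [][]uint32 // one row of counters per derived hash function
}

func newSketch(depth int, width uint64) *sketch {
	t := make([][]uint32, depth)
	for i := range t {
		t[i] = make([]uint32, width)
	}
	return &sketch{width: width, table: t}
}

// slot derives the column for row i from the hash value pair (h1, h2).
// The formula is an assumption (a common double-hashing scheme).
func (s *sketch) slot(i int, h1, h2 uint64) uint64 {
	return (h1 + h2*uint64(i)) % s.width
}

// query returns the smallest counter that the pair maps to across all rows.
func (s *sketch) query(h1, h2 uint64) uint32 {
	lowest := s.table[0][s.slot(0, h1, h2)]
	for i := 1; i < len(s.table); i++ {
		if v := s.table[i][s.slot(i, h1, h2)]; v < lowest {
			lowest = v
		}
	}
	return lowest
}

// setValue shifts every counter the pair maps to so that query returns count.
// The uint32 subtraction wraps around on purpose when count is below the old estimate.
func (s *sketch) setValue(h1, h2 uint64, count uint32) {
	delta := count - s.query(h1, h2)
	for i := range s.table {
		s.table[i][s.slot(i, h1, h2)] += delta
	}
}

func main() {
	s := newSketch(4, 16)
	s.setValue(3, 7, 10)
	fmt.Println(s.query(3, 7)) // prints 10
	s.setValue(3, 7, 4)
	fmt.Println(s.query(3, 7)) // prints 4
}
```

Here (h1, h2) are two hash values of the same item rather than the bounds of a range; each row combines them differently to pick its counter.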

@alivxxx
Contributor Author

alivxxx commented Apr 9, 2018

PTAL @winoros

func encodeFeedback(q *QueryFeedback) ([]byte, error) {
var pb *queryFeedback
var err error
if q.hist.tp.Tp == mysql.TypeLong {
Member

@winoros winoros Apr 9, 2018

Can we be sure that Tp here can only be TypeLong? Could it also be TypeLongLong?


// HandleUpdateStats update the stats using feedback.
func (h *Handle) HandleUpdateStats(is infoschema.InfoSchema) error {
sql := fmt.Sprintf("select table_id, hist_id, is_index, feedback from mysql.stats_feedback order by table_id, hist_id, is_index")
Member

No need for fmt.Sprintf here.
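
As a small illustration of the remark above: when a format string has no verbs and no arguments, `fmt.Sprintf` returns it unchanged, so a plain string is enough. The constant name `selectFeedbackSQL` and the helper `sprintfIsNoOp` are hypothetical and not taken from the PR.

```go
package main

import "fmt"

// selectFeedbackSQL is a hypothetical name; the query text is the one quoted in the review.
const selectFeedbackSQL = "select table_id, hist_id, is_index, feedback from mysql.stats_feedback order by table_id, hist_id, is_index"

// sprintfIsNoOp reports whether formatting the verb-free constant leaves it unchanged.
func sprintfIsNoOp() bool {
	return fmt.Sprintf(selectFeedbackSQL) == selectFeedbackSQL
}

func main() {
	fmt.Println(sprintfIsNoOp()) // prints true
}
```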

}
sql := fmt.Sprintf("insert into mysql.stats_feedback (table_id, hist_id, is_index, feedback) values "+
"(%d, %d, %d, X'%X')", fb.tableID, fb.hist.ID, isIndex, vals)
_, err = h.ctx.(sqlexec.SQLExecutor).Execute(context.TODO(), sql)
Member

Why not use RestrictedSQLExecutor?
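
The insert statement quoted above embeds the encoded feedback as a hexadecimal literal. The following is a minimal sketch of how `X'%X'` renders a byte slice, using made-up values in place of the real `fb` fields. The function `buildInsertSQL` and its parameters are hypothetical stand-ins, not part of the PR.

```go
package main

import "fmt"

// buildInsertSQL reuses the format string quoted in the review.
// Its name and parameters are assumptions made for this example.
func buildInsertSQL(tableID, histID int64, isIndex int, vals []byte) string {
	return fmt.Sprintf("insert into mysql.stats_feedback (table_id, hist_id, is_index, feedback) values "+
		"(%d, %d, %d, X'%X')", tableID, histID, isIndex, vals)
}

func main() {
	// %X prints each byte of the slice as two uppercase hex digits.
	fmt.Println(buildInsertSQL(1, 2, 0, []byte{0xde, 0xad, 0x01}))
	// prints: insert into mysql.stats_feedback (table_id, hist_id, is_index, feedback) values (1, 2, 0, X'DEAD01')
}
```

Because every byte becomes two hex digits, binary feedback can be carried in the statement text without needing string escaping.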

@alivxxx
Contributor Author

alivxxx commented Apr 10, 2018

PTAL @winoros @zz-jason

winoros
winoros previously approved these changes Apr 10, 2018
Member

@winoros winoros left a comment

LGTM

@winoros winoros added conflicting status/LGT2 Indicates that a PR has LGTM 2. and removed status/LGT1 Indicates that a PR has LGTM 1. labels Apr 10, 2018
@alivxxx
Contributor Author

alivxxx commented Apr 10, 2018

PTAL @winoros

@alivxxx alivxxx merged commit 19573c6 into pingcap:master Apr 10, 2018
@alivxxx alivxxx deleted the store branch April 10, 2018 11:12
Labels
component/statistics priority/P1 The issue has P1 priority. status/LGT2 Indicates that a PR has LGTM 2.
5 participants