Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: add system table mysql.stats_meta #2766

Merged
merged 10 commits into from
Mar 16, 2017
Merged

*: add system table mysql.stats_meta #2766

merged 10 commits into from
Mar 16, 2017

Conversation

hanfei1991
Copy link
Member

We add a new system table mysql.stats_meta to record table stats version.
In futher, this table will record the cardinality and distinct value of a table.
We load stats info in a loop. Everytime we just load the stats whose version is greator than last stats version.
Everytime we analyze a table, we update the version of that table stats.
@shenli @coocood @zimulala @tiancaiamao @lamxTyler PTAL

bootstrap.go Outdated
version bigint(64) unsigned NOT NULL,
table_id bigint(64) NOT NULL,
cardinality int(64) unsigned NOT NULL DEFAULT 0,
dist int(64) unsigned NOT NULL DEFAULT 0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cardinality is distince count

tbl: statsTbl,
version: version,
}
statsTblCache.cache[id] = si
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this concurrency safe?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

statsTblCache.m.Lock() already called. @lamxTyler

tbl, err = is.TableByName(model.NewCIStr("test"), model.NewCIStr("t"))
c.Assert(err, IsNil)
tbl, err = is.TableByName(model.NewCIStr("test"), model.NewCIStr("t"))
c.Assert(err, IsNil)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Those two lines is the same as the 56-57 lines.


var statsTblCache = statsCache{cache: map[int64]*statsInfo{}}

// GetStatisticsTableCache retrieves the statistics table from cache, and will refill cache if necessary.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment should be updated, it doesn't refill cache now.

if tpb != nil {
tbl, err = statistics.TableFromPB(tableInfo, tpb)
// Error is not nil may mean that there are some ddl changes on this table, so the origin
// statistics can not be used any more, we give it a pseudo one and save in cache.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here the pseudo one is not easily see, keep it nil is not the same as give it a pseudo one.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is a nil stats, we always return a pseudo one.

@@ -298,6 +307,8 @@ func doDDLWorks(s Session) {
mustExecute(s, CreateTiDBTable)
// Create help table.
mustExecute(s, CreateHelpTopic)
// Create stats_meta table.
mustExecute(s, CreateStatsMetaTable)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a compatibility problem here.
Older version TiDB already bootstraped, and it doesn't contain mysql.stats_meta table, when update to newer TiDB, this table is absent.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I will add the upgrade logic soon.

lease := do.DDL().GetLease()
if lease > 0 {
go func(do *Domain) {
ticker := time.NewTicker(lease)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why the reload interval equal to a lease?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When a user execute analyze, we must notify others as soon as possible. I think a lease is ok.

if err != nil {
return errors.Trace(err)
}
insertSQL := fmt.Sprintf("insert into mysql.stats_meta (version, table_id) values(%d, %d)", version, e.tblInfo.ID)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add a whitespace between values(%d, %d) ?

return si.tbl
// Update reads stats meta from store and updates the stats map.
func (h *Handle) Update(m *meta.Meta, is infoschema.InfoSchema) error {
sql := fmt.Sprintf("SELECT version, table_id from mysql.stats_meta where version > %d order by version", h.lastVersion)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not quite understand about the version, do you mean statistics information is out of date when infoschema changed ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stats info version will increase when the table is analyzed. The version is analyze command's startTS.

tbl: statsTbl,
version: version,
}
statsTblCache.cache[id] = si
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

statsTblCache.m.Lock() already called. @lamxTyler

session.go Outdated
@@ -114,6 +114,9 @@ type session struct {
parser *parser.Parser

sessionVars *variable.SessionVars

// sysSessionPool stores sessions for execute restricted sql for every domain.
sysSessionPool *sync.Pool
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why sysSessionPool in so many place? It's hard to maintain.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They all be fetched from same domain. Otherwise which place is ok?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's easy to define a function such as

func (s *session) sysSessionPool() *sync.Pool {
    return sessionctx.GetDomain(s).SysSessionPool()
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rest LGTM
@coocood @lamxTyler

@alivxxx
Copy link
Contributor

alivxxx commented Mar 13, 2017

LGTM

@hanfei1991
Copy link
Member Author

@tiancaiamao PTAL

@tiancaiamao
Copy link
Contributor

LGTM

@tiancaiamao
Copy link
Contributor

@coocood

bootstrap.go Outdated
version bigint(64) unsigned NOT NULL,
table_id bigint(64) NOT NULL,
distinct_ratio double unsigned NOT NULL DEFAULT 0,
count bigint(64) unsigned NOT NULL DEFAULT 0,
Copy link
Member

@coocood coocood Mar 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to add a modify_count column, modify_count can be updated along with count, no extra cost.

@coocood
Copy link
Member

coocood commented Mar 15, 2017

@hanfei1991

modify_count can be used to detect potential stats inaccuracy.

@@ -16,6 +16,7 @@ package executor
import (
"math/rand"

"fmt"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use goimports to format import.

version, tableID := row.Data[0].GetUint64(), row.Data[1].GetInt64()
table, ok := is.TableByID(tableID)
if !ok {
log.Warnf("Unknown table ID %d in stats meta table, maybe it has been dropped", tableID)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very common that a table has been dropped, should not log warning here.
This is an error only when we inserted a wrong table ID, so I think debug level log is enough.

c.Assert(err, IsNil)
tableInfo = tbl.Meta()
statsTbl = statscache.GetStatisticsTableCache(tableInfo)
c.Assert(statsTbl.Pseudo, IsTrue)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When the schema has changed, we should not get a Pseudo table stats.
If the stats doesn't match the new schema, we should adapt the stats for the new schema.
Or we will keep using the pseudo stats for a long time.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should adapt the stats after doing ddl. I will implement it in future.

bootstrap.go Outdated
@@ -120,7 +120,7 @@ const (
CreateStatsMetaTable = `CREATE TABLE if not exists mysql.stats_meta (
version bigint(64) unsigned NOT NULL,
table_id bigint(64) NOT NULL,
distinct_ratio double unsigned NOT NULL DEFAULT 0,
modify_count bigint(64) NOT NULL DEFAULT 0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why removed distinct_ratio?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

distinct_ratio is for every column, which should be added to another table.

@coocood
Copy link
Member

coocood commented Mar 16, 2017

LGTM

@hanfei1991 hanfei1991 merged commit 66085c3 into master Mar 16, 2017
@hanfei1991 hanfei1991 deleted the hanfei/stats branch March 16, 2017 06:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants