Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

*: support automatically create sql baselines #12434

Merged
merged 11 commits into from
Oct 17, 2019

Conversation

alivxxx
Copy link
Contributor

@alivxxx alivxxx commented Sep 27, 2019

What problem does this PR solve?

Support automatically capture sql baselines.

What is changed and how it works?

This PR uses a background job to loop over the statements summary to find selects that occured more than once and add sql bindings for them. The backgroud job can be controlled by variable "tidb_capture_plan_baselines", and since this feature relys on statements, "tidb_enable_stmt_summary" should also be set in order to enable the feature.

Check List

Tests

  • Unit test

Code changes

  • Has exported function/method change

Side effects

  • None

Related changes

  • None

Release note

  • support automatically capture sql baselines

@codecov
Copy link

codecov bot commented Sep 27, 2019

Codecov Report

Merging #12434 into master will increase coverage by 0.2428%.
The diff coverage is n/a.

@@               Coverage Diff                @@
##             master     #12434        +/-   ##
================================================
+ Coverage   80.0043%   80.2471%   +0.2428%     
================================================
  Files           462        462                
  Lines        106273     107301      +1028     
================================================
+ Hits          85023      86106      +1083     
+ Misses        14938      14892        -46     
+ Partials       6312       6303         -9

Copy link
Member

@winoros winoros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm
Is there some TP type benchmark result to show its influence?

@alivxxx
Copy link
Contributor Author

alivxxx commented Oct 10, 2019

/bench

@sre-bot
Copy link
Contributor

sre-bot commented Oct 10, 2019

@@                               Benchmark Diff                               @@
================================================================================
--- tidb: 8392e1be6b15b0e276356e9d02517ab08945a637
+++ tidb: 7700dab95cc4f6c943e525ecdf50f6830b2277f7
tikv: e231d45170f2efeddf855c81b6f90d2a7f094d1b
pd: 4acaa8c715d40a07f521dd85ef7dcd4118064289
================================================================================
test-1: < oltp_point_select >
    * QPS : 78243.60 ± 1.9166% (std=1049.51) delta: -0.72% (p=0.505)
    * AvgMs : 3.27 ± 1.8349% (std=0.04) delta: 1.40%
    * PercentileMs99 : 6.74 ± 1.0679% (std=0.06) delta: 1.08%
            
test-2: < oltp_read_write >
    * QPS : 37395.25 ± 0.4093% (std=89.09) delta: -2.23% (p=0.002)
    * AvgMs : 137.40 ± 0.4731% (std=0.40) delta: 2.41%
    * PercentileMs99 : 267.41 ± 0.0000% (std=0.00) delta: 3.67%
            
test-3: < oltp_insert >
    * QPS : 22192.97 ± 0.9017% (std=143.59) delta: -0.18% (p=0.950)
    * AvgMs : 11.53 ± 0.8848% (std=0.08) delta: 0.18%
    * PercentileMs99 : 23.27 ± 1.0830% (std=0.21) delta: -1.07%
            
test-4: < oltp_update_index >
    * QPS : 17394.34 ± 0.5992% (std=75.64) delta: -2.18% (p=0.000)
    * AvgMs : 14.71 ± 0.6524% (std=0.07) delta: 2.31%
    * PercentileMs99 : 30.59 ± 1.0788% (std=0.27) delta: -2.49%
            
test-5: < oltp_update_non_index >
    * QPS : 29210.34 ± 0.0594% (std=12.36) delta: -2.49% (p=0.000)
    * AvgMs : 8.76 ± 0.0000% (std=0.00) delta: 2.55%
    * PercentileMs99 : 18.95 ± 0.0000% (std=0.00) delta: -10.23%
            

https://perf.pingcap.com

@alivxxx
Copy link
Contributor Author

alivxxx commented Oct 11, 2019

@winoros You can take a look at the bench result.

domain/domain.go Outdated Show resolved Hide resolved
domain/domain.go Outdated
defer recoverInDomain("loadBindInfoLoop", false)
defer recoverInDomain("globalBindHandleWorkerLoop", false)
loadTicker := time.NewTicker(bindinfo.Lease)
captureBaselineTicker := time.NewTicker(bindinfo.Lease)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just use a single ticker for 2 actions?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, we can.

stmtsummary.StmtSummaryByDigestMap.SetEnabled(sVal, false)
break
case variable.TiDBCapturePlanBaseline:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shall we add dependency check between these 2 variables? if not, would there be possible panics when accessing structures of statements?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it won't panic because the StmtSummaryByDigestMap is initilized as a global variable, so it is always valid.

continue
}
h.sctx.Lock()
err = h.sctx.RefreshTxnCtx(context.TODO())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this function call?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It will create a new txn and the txn maybe used by the optimizer when generating hints, since it may execute some sub queries, and without it, the test will panic.


// GetVal gets the value.
func (v *serverGlobalVariable) GetVal() string {
v.Lock()
Copy link
Contributor

@lzmhhh123 lzmhhh123 Oct 16, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we just set read lock here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is not necessary because we do not expect the server global variables be set or read very frequent, a mutex is enough now.

@@ -230,6 +230,26 @@ func (ssMap *stmtSummaryByDigestMap) ToDatum() [][]types.Datum {
return rows
}

// GetMoreThanOnceSelect gets select SQLs that occurred more than once.
func (ssMap *stmtSummaryByDigestMap) GetMoreThanOnceSelect() ([]string, []string) {
ssMap.Lock()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a mutex, not rwmutex.

for _, value := range values {
summary := value.(*stmtSummaryByDigest)
summary.Lock()
if strings.HasPrefix(summary.normalizedSQL, "select") && summary.execCount > 1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity, can we make the SQL occurrent number under control instead of 1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can, but we can do it when only when it is necessary because doing it would require another session variables, for now, it is enough.

Copy link
Contributor

@lzmhhh123 lzmhhh123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@alivxxx alivxxx added the status/can-merge Indicates a PR has been approved by a committer. label Oct 17, 2019
@sre-bot
Copy link
Contributor

sre-bot commented Oct 17, 2019

/run-all-tests

@alivxxx
Copy link
Contributor Author

alivxxx commented Oct 17, 2019

/run-all-tests

@alivxxx alivxxx merged commit 740b776 into pingcap:master Oct 17, 2019
@alivxxx alivxxx deleted the capture-baseline branch October 17, 2019 06:16
XiaTianliang pushed a commit to XiaTianliang/tidb that referenced this pull request Dec 21, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
status/can-merge Indicates a PR has been approved by a committer. type/new-feature type/usability
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants