-
Notifications
You must be signed in to change notification settings - Fork 5.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
session: migrate test-infra to testify for tidb_test.go and bootstrap_test.go #28555
Conversation
…_test.go This closes pingcap#28322 . This closes pingcap#28329 . Signed-off-by: tison <wander4096@gmail.com>
[REVIEW NOTIFICATION] This pull request has been approved by:
To complete the pull request process, please ask the reviewers in the list to review by filling The full list of commands accepted by this bot can be found here. Reviewer can indicate their review by submitting an approval review. |
/cc @lance6716 @morgo @lance6716 this PR changes the failpoint you introduced in #18714 to avoid goroutine leak. Please review. |
@@ -569,7 +569,7 @@ func (d *ddl) doDDLJob(ctx sessionctx.Context, job *model.Job) error { | |||
i := 0 | |||
for { | |||
failpoint.Inject("storeCloseInLoop", func(_ failpoint.Value) { | |||
d.cancel() | |||
_ = d.Stop() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lance6716 at this line only, so that we don't just cancel the context but trigger a graceful shutdown. I'm unsure whether or not it is the original failpoint want to do. In another way, we can ignore the leak.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with this change.
The original failpoint is to imitate that the TiDB-as-a-library is being closed when handling a DDL, to check if the goroutine is stuck inside this for-loop. My original test code is #18714 , but after some iterations it becomes current invasive failpoint 😂
I'll review rest of code tomorrow
oomAction := config.GetGlobalConfig().OOMAction | ||
defer func() { | ||
config.UpdateGlobal(func(conf *config.Config) { | ||
conf.OOMAction = oomAction | ||
}) | ||
}() | ||
ctx := context.Background() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add this store and restore for
- TestUpgrade
- TestIssue17979_1
- TestIssue20900_2
Otherwise TestIssue17979_2 failed as TestUpgrade set the default oomAction to "log". Previously we don't meet this issue possibly due to different tests executed order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
names like TestIssue17979_1 is not good right? how about give them proper name and add link in comments?
} | ||
|
||
for _, ca := range cases { | ||
func() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wrap in a function so that defer
works properly.
s.dbName = "test_bootstrap" | ||
s.dbNameBootstrap = "test_main_db_bootstrap" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't actually make use of them
SetIndexUsageSyncLease(1) | ||
defer SetIndexUsageSyncLease(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes global variable and thus run in serial.
require.NoError(t, err) | ||
return se |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
se.Auth is for TestForIssue23387
only.
func removeStore(c *C, dbPath string) { | ||
os.RemoveAll(dbPath) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not really in use.
Signed-off-by: tison <wander4096@gmail.com>
/run-check_dev_2 |
1 similar comment
/run-check_dev_2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking good
if v.Scope != variable.ScopeSession { | ||
count++ | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit mask of options, testing for ScopeSession might be buggy if we introduce any other scopes like ScopeInstance. Using helper funcs is better suited:
tidb/sessionctx/variable/sysvar.go
Lines 234 to 247 in d458059
// HasNoneScope returns true if the scope for the sysVar is None. | |
func (sv *SysVar) HasNoneScope() bool { | |
return sv.Scope == ScopeNone | |
} | |
// HasSessionScope returns true if the scope for the sysVar includes session. | |
func (sv *SysVar) HasSessionScope() bool { | |
return sv.Scope&ScopeSession != 0 | |
} | |
// HasGlobalScope returns true if the scope for the sysVar includes global. | |
func (sv *SysVar) HasGlobalScope() bool { | |
return sv.Scope&ScopeGlobal != 0 | |
} |
I did not realize it, but ScopeNone
variables are also persisted to the table (i.e. != ScopeSession is both Global and None). This looks to be intentional because system_time_zone
needs to be persisted to the cluster. There might actually be bugs here because a Get on a ScopeNone
returns sv.Value
and doesn't check the persisted value:
tidb/sessionctx/variable/varsutil.go
Lines 172 to 182 in d458059
// GetSessionOrGlobalSystemVar gets a system variable. | |
// If it is a session only variable, use the default value defined in code. | |
// Returns error if there is no such variable. | |
func GetSessionOrGlobalSystemVar(s *SessionVars, name string) (string, error) { | |
sv := GetSysVar(name) | |
if sv == nil { | |
return "", ErrUnknownSystemVar.GenWithStackByArgs(name) | |
} | |
if sv.HasNoneScope() { | |
return sv.Value, nil | |
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did some additional research, and all the ScopeNone
variables look to be safe to non-persist in the system table. They are usually overwritten by each TiDB server during startup.
I've forked this to #28667
Co-authored-by: Morgan Tocker <tocker@gmail.com>
Signed-off-by: tison <wander4096@gmail.com>
/run-check_dev_2 |
Signed-off-by: tison <wander4096@gmail.com>
@morgo CI fixed. Please give this PR one more round review. |
/merge |
This pull request has been accepted and is ready to merge. Commit hash: ac6a0db
|
conf.TiKVClient.AsyncCommit.SafeWindow = 0 | ||
conf.TiKVClient.AsyncCommit.AllowedClockDrift = 0 | ||
}) | ||
tikv.EnableFailpoints() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See the chart here http://www.zenlife.tk:18081/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tiancaiamao is there a way configuring tests and benchmarks separately? Previous we turn on failpoints in test, but it is a global state change. I guess we don't observe "regression" then because it occasionally run benchmark at first?
@disksing IIUC we discuss about tikv failpoints regressions ever, could you also take a look at this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tiancaiamao emmm, it is because we already run benchmark & tests separately, but TestMain affects either.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to check if it is running bench or test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tiancaiamao hacking and found several candidates to address this regression, please check #28709
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This closes #28322 .
This closes #28329 .
Release note