Endtoend: stress tests for VTGate FOREIGN KEY support #13799

Merged on Sep 20, 2023 (33 commits)

Commits
78656be
Endtoend: stress tests for VTGate FOREIGN KEY support
shlomi-noach Aug 16, 2023
ea1120c
better broken replication message
shlomi-noach Aug 16, 2023
8c7e056
generalize reference actions
shlomi-noach Aug 17, 2023
a80a873
Refactor validateMetrics()
shlomi-noach Aug 17, 2023
235eb1c
refactored TestInitialSetup()
shlomi-noach Aug 17, 2023
928d2e2
refactored executeFKTest()
shlomi-noach Aug 17, 2023
d0ca43b
only one time GITHUB setup
shlomi-noach Aug 17, 2023
2b15119
ExecuteFKTest is public
shlomi-noach Aug 17, 2023
afa9364
prepare for onUpdateAction
shlomi-noach Aug 17, 2023
762ae44
running variety of ON UPDATE actions
shlomi-noach Aug 17, 2023
4206ef3
Only seed tables once, then re-used a cloned image of that seed
shlomi-noach Aug 17, 2023
33393cc
UPDATEs can modify 'id' column, affecting ON UPDATE SET NULL, ON UPDA…
shlomi-noach Aug 17, 2023
ccedd35
provisional support for OnlineDDL, though not really using it
shlomi-noach Aug 17, 2023
0d78ad3
creating a vschema with 'foreignKeyMode: FK_MANAGED', using WaitForCo…
shlomi-noach Aug 23, 2023
a63faba
include parent table, name the test
shlomi-noach Aug 23, 2023
7323624
validate no errors for UPDATEs with ON UPDATE CASCADE. Validate no er…
shlomi-noach Aug 27, 2023
9c8650a
Refactor test
shlomi-noach Aug 27, 2023
adca64a
clear fk failure metrics
shlomi-noach Aug 27, 2023
d0449d6
isFKError: reverse filtering logic. Assume everything is a FK error u…
shlomi-noach Aug 29, 2023
a746c64
Merge branch 'main' into endtoend-fk-vtgate
shlomi-noach Aug 29, 2023
b0dfdca
support ERTooManyUserConnections
shlomi-noach Aug 30, 2023
d527cf0
go mod tidy
shlomi-noach Aug 30, 2023
564cac6
resolved conflict
shlomi-noach Aug 31, 2023
725ca88
general preparation for Online DDL (though not enabled). Support more…
shlomi-noach Aug 31, 2023
4a072bb
Merge branch 'main' into endtoend-fk-vtgate
shlomi-noach Sep 11, 2023
319e2bc
update workflow file
shlomi-noach Sep 11, 2023
f5637fe
collect sample errors; reduce number of tests
shlomi-noach Sep 12, 2023
866c8cb
Merge branch 'main' into endtoend-fk-vtgate
shlomi-noach Sep 15, 2023
3e8d61e
allow deadlock error
shlomi-noach Sep 18, 2023
58afc31
Merge branch 'main' into endtoend-fk-vtgate
shlomi-noach Sep 18, 2023
3b86380
restore full ON DELETE <-> ON UPDATE combination types
shlomi-noach Sep 18, 2023
61de371
normalizing WaitForReplicationPos
shlomi-noach Sep 18, 2023
a33b746
WaitForReplicationPos: support 'validateReplication'
shlomi-noach Sep 18, 2023
148 changes: 148 additions & 0 deletions .github/workflows/cluster_endtoend_vtgate_foreignkey_stress.yml
@@ -0,0 +1,148 @@
# DO NOT MODIFY: THIS FILE IS GENERATED USING "make generate_ci_workflows"

name: Cluster (vtgate_foreignkey_stress)
on: [push, pull_request]
concurrency:
  group: format('{0}-{1}', ${{ github.ref }}, 'Cluster (vtgate_foreignkey_stress)')
  cancel-in-progress: true

permissions: read-all

env:
  LAUNCHABLE_ORGANIZATION: "vitess"
  LAUNCHABLE_WORKSPACE: "vitess-app"
  GITHUB_PR_HEAD_SHA: "${{ github.event.pull_request.head.sha }}"

jobs:
  build:
    name: Run endtoend tests on Cluster (vtgate_foreignkey_stress)
    runs-on: gh-hosted-runners-4cores-1

    steps:
    - name: Skip CI
      run: |
        if [[ "${{contains( github.event.pull_request.labels.*.name, 'Skip CI')}}" == "true" ]]; then
          echo "skipping CI due to the 'Skip CI' label"
          exit 1
        fi

    - name: Check if workflow needs to be skipped
      id: skip-workflow
      run: |
        skip='false'
        if [[ "${{github.event.pull_request}}" ==  "" ]] && [[ "${{github.ref}}" != "refs/heads/main" ]] && [[ ! "${{github.ref}}" =~ ^refs/heads/release-[0-9]+\.[0-9]$ ]] && [[ ! "${{github.ref}}" =~ "refs/tags/.*" ]]; then
          skip='true'
        fi
        echo Skip ${skip}
        echo "skip-workflow=${skip}" >> $GITHUB_OUTPUT

        PR_DATA=$(curl \
          -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" \
          -H "Accept: application/vnd.github.v3+json" \
          "https://api.github.com/repos/${{ github.repository }}/pulls/${{ github.event.pull_request.number }}")
        draft=$(echo "$PR_DATA" | jq .draft -r)
        echo "is_draft=${draft}" >> $GITHUB_OUTPUT

    - name: Check out code
      if: steps.skip-workflow.outputs.skip-workflow == 'false'
      uses: actions/checkout@v3

    - name: Check for changes in relevant files
      if: steps.skip-workflow.outputs.skip-workflow == 'false'
      uses: frouioui/paths-filter@main
      id: changes
      with:
        token: ''
        filters: |
          end_to_end:
            - 'go/**/*.go'
            - 'test.go'
            - 'Makefile'
            - 'build.env'
            - 'go.sum'
            - 'go.mod'
            - 'proto/*.proto'
            - 'tools/**'
            - 'config/**'
            - 'bootstrap.sh'
            - '.github/workflows/cluster_endtoend_vtgate_foreignkey_stress.yml'

    - name: Set up Go
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      uses: actions/setup-go@v4
      with:
        go-version: 1.21.0

    - name: Set up python
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      uses: actions/setup-python@v4

    - name: Tune the OS
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      run: |
        # Limit local port range to not use ports that overlap with server side
        # ports that we listen on.
        sudo sysctl -w net.ipv4.ip_local_port_range="22768 65535"
        # Increase the asynchronous non-blocking I/O. More information at https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_use_native_aio
        echo "fs.aio-max-nr = 1048576" | sudo tee -a /etc/sysctl.conf
        sudo sysctl -p /etc/sysctl.conf

    - name: Get dependencies
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      run: |

        # Get key to latest MySQL repo
        sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 467B942D3A79BD29
        # Setup MySQL 8.0
        wget -c https://dev.mysql.com/get/mysql-apt-config_0.8.24-1_all.deb
        echo mysql-apt-config mysql-apt-config/select-server select mysql-8.0 | sudo debconf-set-selections
        sudo DEBIAN_FRONTEND="noninteractive" dpkg -i mysql-apt-config*
        sudo apt-get update
        # Install everything else we need, and configure
        sudo apt-get install -y mysql-server mysql-client make unzip g++ etcd curl git wget eatmydata xz-utils libncurses5

        sudo service mysql stop
        sudo service etcd stop
        sudo ln -s /etc/apparmor.d/usr.sbin.mysqld /etc/apparmor.d/disable/
        sudo apparmor_parser -R /etc/apparmor.d/usr.sbin.mysqld
        go mod download

        # install JUnit report formatter
        go install github.com/vitessio/go-junit-report@HEAD

    - name: Setup launchable dependencies
      if: steps.skip-workflow.outputs.is_draft == 'false' && steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true' && github.base_ref == 'main'
      run: |
        # Get Launchable CLI installed. If you can, make it a part of the builder image to speed things up
        pip3 install --user launchable~=1.0 > /dev/null

        # verify that launchable setup is all correct.
        launchable verify || true

        # Tell Launchable about the build you are producing and testing
        launchable record build --name "$GITHUB_RUN_ID" --no-commit-collection --source .

    - name: Run cluster endtoend test
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      timeout-minutes: 45
      run: |
        # We set the VTDATAROOT to the /tmp folder to reduce the file path of mysql.sock file
        # which musn't be more than 107 characters long.
        export VTDATAROOT="/tmp/"
        source build.env

        set -exo pipefail

        # run the tests however you normally do, then produce a JUnit XML file
        eatmydata -- go run test.go -docker=false -follow -shard vtgate_foreignkey_stress | tee -a output.txt | go-junit-report -set-exit-code > report.xml

    - name: Print test output and Record test result in launchable if PR is not a draft
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true' && always()
      run: |
        if [[ "${{steps.skip-workflow.outputs.is_draft}}" ==  "false" ]]; then
          # send recorded tests to launchable
          launchable record tests --build "$GITHUB_RUN_ID" go-test . || true
        fi

        # print test output
        cat output.txt
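
The "Check if workflow needs to be skipped" step above packs its decision into a single shell condition. As a reading aid, here is a minimal Go sketch of that decision. The `shouldSkip` helper and its parameters are hypothetical and not part of the PR. One assumption is worth flagging: the shell test quotes the `refs/tags/.*` pattern, which bash treats as literal text rather than a regular expression, so the sketch models what appears to be the intent (never skip tag refs) and may not match the script's exact behavior on tag pushes.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// releaseBranch uses the same pattern as the workflow step.
var releaseBranch = regexp.MustCompile(`^refs/heads/release-[0-9]+\.[0-9]$`)

// shouldSkip is a hypothetical helper: it reports whether the workflow
// would be skipped for a given event. isPullRequest stands in for
// "github.event.pull_request is non-empty".
func shouldSkip(isPullRequest bool, ref string) bool {
	if isPullRequest {
		return false // pull requests always run
	}
	if ref == "refs/heads/main" {
		return false // pushes to main always run
	}
	if releaseBranch.MatchString(ref) {
		return false // pushes to release branches always run
	}
	// Assumption: tags are meant to be matched by prefix.
	if strings.HasPrefix(ref, "refs/tags/") {
		return false
	}
	return true // any other push is skipped
}

func main() {
	cases := []struct {
		pr  bool
		ref string
	}{
		{true, "refs/heads/my-feature"},
		{false, "refs/heads/my-feature"},
		{false, "refs/heads/main"},
		{false, "refs/heads/release-17.0"},
	}
	for _, c := range cases {
		fmt.Printf("pr=%v ref=%s skip=%v\n", c.pr, c.ref, shouldSkip(c.pr, c.ref))
	}
}
```

Note that the release pattern accepts a single-digit minor version only, so a ref such as `refs/heads/release-17.10` would not match it.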
1 change: 1 addition & 0 deletions go.mod
@@ -107,6 +107,7 @@ require (
 	github.com/spf13/jwalterweatherman v1.1.0
 	github.com/xlab/treeprint v1.2.0
 	go.uber.org/goleak v1.2.1
+	golang.org/x/exp v0.0.0-20230817173708-d852ddb80c63
 	golang.org/x/sync v0.3.0
 	modernc.org/sqlite v1.20.3
 )
2 changes: 2 additions & 0 deletions go.sum
@@ -677,6 +677,8 @@ golang.org/x/exp v0.0.0-20191227195350-da58074b4299/go.mod h1:2RIsYlXP63K8oxa1u0
 golang.org/x/exp v0.0.0-20200119233911-0405dc783f0a/go.mod h1:2RIsYlXP63K8oxa1u096TMicItID8zy7Y6sNkU49FU4=
 golang.org/x/exp v0.0.0-20200207192155-f17229e696bd/go.mod h1:J/WKrq2StrnmMY6+EHIKF9dgMWnmCNThgcyBT1FY9mM=
 golang.org/x/exp v0.0.0-20200224162631-6cc2880d07d6/go.mod h1:3jZMyOhIsHpP37uCMkUooju7aAi5cS1Q23tOzKc+0MU=
+golang.org/x/exp v0.0.0-20230817173708-d852ddb80c63 h1:m64FZMko/V45gv0bNmrNYoDEq8U5YUhetc9cBWKS1TQ=
+golang.org/x/exp v0.0.0-20230817173708-d852ddb80c63/go.mod h1:0v4NqG35kSWCMzLaMeX+IQrlSnVE/bqGSyC2cz/9Le8=
 golang.org/x/image v0.0.0-20190227222117-0694c2d4d067/go.mod h1:kZ7UVZpmo3dzQBMxlp+ypCbDeSB+sBbTgSJuh5dn5js=
 golang.org/x/image v0.0.0-20190802002840-cff245a6509b/go.mod h1:FeLwcggjj3mMvU+oOTbSwawSJRM1uh48EjtB4UJZlP0=
 golang.org/x/lint v0.0.0-20181026193005-c67002cb31c3/go.mod h1:UVdnD1Gm6xHRNCYTkRU2/jEulfH38KcIWyp/GAMgvoE=
51 changes: 39 additions & 12 deletions go/test/endtoend/cluster/cluster_util.go
@@ -223,26 +223,53 @@ func filterResultWhenRunsForCoverage(input string) string {
 	return result
 }
 
+func ValidateReplicationIsHealthy(t *testing.T, tablet *Vttablet) bool {
+	query := "show replica status"
+	rs, err := tablet.VttabletProcess.QueryTablet(query, "", true)
+	assert.NoError(t, err)
+	row := rs.Named().Row()
+	require.NotNil(t, row)
+
+	ioRunning := row.AsString("Replica_IO_Running", "")
+	require.NotEmpty(t, ioRunning)
+	ioHealthy := assert.Equalf(t, "Yes", ioRunning, "Replication is broken. Replication status: %v", row)
+	sqlRunning := row.AsString("Replica_SQL_Running", "")
+	require.NotEmpty(t, sqlRunning)
+	sqlHealthy := assert.Equalf(t, "Yes", sqlRunning, "Replication is broken. Replication status: %v", row)
+
+	return ioHealthy && sqlHealthy
+}
+
 // WaitForReplicationPos will wait for replication position to catch-up
-func WaitForReplicationPos(t *testing.T, tabletA *Vttablet, tabletB *Vttablet, hostname string, timeout float64) {
+func WaitForReplicationPos(t *testing.T, tabletA *Vttablet, tabletB *Vttablet, validateReplication bool, timeout time.Duration) {
+	hostname := "localhost"
+	ctx, cancel := context.WithTimeout(context.Background(), timeout)
+	defer cancel()
+	ticker := time.NewTicker(10 * time.Millisecond)
+	defer ticker.Stop()
+
 	replicationPosA, _ := GetPrimaryPosition(t, *tabletA, hostname)
 	for {
+		if validateReplication {
+			if !ValidateReplicationIsHealthy(t, tabletB) {
+				assert.FailNowf(t, "Replication broken on tablet %v. Will not wait for position", tabletB.Alias)
+			}
+			if t.Failed() {
+				return
+			}
+		}
 		replicationPosB, _ := GetPrimaryPosition(t, *tabletB, hostname)
 		if positionAtLeast(t, tabletA, replicationPosB, replicationPosA) {
-			break
+			return
 		}
 		msg := fmt.Sprintf("%s's replication position to catch up to %s's;currently at: %s, waiting to catch up to: %s", tabletB.Alias, tabletA.Alias, replicationPosB, replicationPosA)
-		waitStep(t, msg, timeout, 0.01)
-	}
-}
-
-func waitStep(t *testing.T, msg string, timeout float64, sleepTime float64) float64 {
-	timeout = timeout - sleepTime
-	if timeout < 0.0 {
-		t.Errorf("timeout waiting for condition '%s'", msg)
+		select {
+		case <-ctx.Done():
+			assert.FailNowf(t, "Timeout waiting for condition '%s'", msg)
+			return
+		case <-ticker.C:
+		}
 	}
-	time.Sleep(time.Duration(sleepTime) * time.Second)
-	return timeout
 }
 
 func positionAtLeast(t *testing.T, tablet *Vttablet, a string, b string) bool {
4 changes: 2 additions & 2 deletions go/test/endtoend/onlineddl/vtgate_util.go
@@ -207,7 +207,7 @@ func CheckLaunchAllMigrations(t *testing.T, vtParams *mysql.ConnParams, expectCo
 }
 
 // CheckMigrationStatus verifies that the migration indicated by given UUID has the given expected status
-func CheckMigrationStatus(t *testing.T, vtParams *mysql.ConnParams, shards []cluster.Shard, uuid string, expectStatuses ...schema.OnlineDDLStatus) {
+func CheckMigrationStatus(t *testing.T, vtParams *mysql.ConnParams, shards []cluster.Shard, uuid string, expectStatuses ...schema.OnlineDDLStatus) bool {
 	query, err := sqlparser.ParseAndBind("show vitess_migrations like %a",
 		sqltypes.StringBindVariable(uuid),
 	)
@@ -229,7 +229,7 @@ func CheckMigrationStatus(t *testing.T, vtParams *mysql.ConnParams, shards []clu
 			}
 		}
 	}
-	assert.Equal(t, len(shards), count)
+	return assert.Equal(t, len(shards), count)
 }
 
 // WaitForMigrationStatus waits for a migration to reach either provided statuses (returns immediately), or eventually time out
@@ -51,7 +51,7 @@ func failoverExternalReparenting(t *testing.T, clusterInstance *cluster.LocalPro
 	primary.VttabletProcess.QueryTablet(demoteQuery, keyspaceUnshardedName, true)
 
 	// Wait for replica to catch up to primary.
-	cluster.WaitForReplicationPos(t, primary, replica, "localhost", 60.0)
+	cluster.WaitForReplicationPos(t, primary, replica, false, time.Minute)
 
 	duration := time.Since(start)
 	minUnavailabilityInS := 1.0