Dynamic tablet throttler config: enable/disable, set metrics query/threshold #11604

Merged
86 commits
77eb4b2
Support syntax: ALTER VITESS_MIGRATION {DISABLE|ENABLE|THRESHOLD='...')
shlomi-noach Sep 22, 2022
225da72
vtgate supports ALTER VITESS_THROTTLER syntax
shlomi-noach Sep 22, 2022
fc5f4ae
initial support for AlterThrottler in query executor
shlomi-noach Sep 22, 2022
2854e46
towards enable/disable in throttler
shlomi-noach Sep 22, 2022
18052e7
no cleanup ticker for caches, single initialization
shlomi-noach Sep 22, 2022
8898537
init metrics query and threshold (temporary)
shlomi-noach Sep 22, 2022
d01cafb
replace if-else with switch
shlomi-noach Sep 22, 2022
c581f5d
parser: support 'show vitess_throttler status' command
shlomi-noach Sep 22, 2022
02fc818
vtgate: support 'show vitess_throttler status' command
shlomi-noach Sep 22, 2022
5030be6
implement SHOW VITESS_THROTTLER STATUS
shlomi-noach Sep 22, 2022
9e9f227
ugly quick hack for reducing flakiness
shlomi-noach Sep 22, 2022
2cbc682
undo hack
shlomi-noach Sep 22, 2022
2a8a6f9
IsEnabled() now implies IsOpen
shlomi-noach Sep 22, 2022
27688bb
Merge branch 'main' into alter-vitess-throttler
shlomi-noach Sep 28, 2022
c8480b0
THRESHOLD is decimal/integer
shlomi-noach Sep 28, 2022
b62bf4d
experimental, accessing topo for config
shlomi-noach Oct 12, 2022
907cd9b
make proto
shlomi-noach Oct 12, 2022
0abbd80
merge main, resolve conflict
shlomi-noach Oct 12, 2022
f4fe865
store throttler config is SrvKeyspace in all cells
shlomi-noach Oct 20, 2022
da41583
apply throttler config
shlomi-noach Oct 20, 2022
bdd9c0f
AlterThrottler does not get handled by QueryExecutor
shlomi-noach Oct 23, 2022
8bceb64
no schema, removed file
shlomi-noach Oct 27, 2022
3d22665
unsupport ALTER VITESS_THROTTLER syntax
shlomi-noach Oct 27, 2022
01c93be
repltracker enables heartbeats if ondemand is positive
shlomi-noach Oct 27, 2022
fcaf2a7
throttler does not requests heartbeats when disabled
shlomi-noach Oct 27, 2022
73ec72a
endtoend test to cover throttler config by topo
shlomi-noach Oct 30, 2022
a9b159d
make generate_ci_workflows
shlomi-noach Oct 30, 2022
bc1d195
graceful sleep
shlomi-noach Oct 30, 2022
61d2429
Enable throttler in Open() when not reading from topo (maintain backw…
shlomi-noach Oct 30, 2022
6b19ece
improve test logic with WaitGroup
shlomi-noach Oct 30, 2022
ff117dc
adapt test logic to new on-demand behavior. Add more test cases
shlomi-noach Oct 30, 2022
91bb390
Merge branch 'main' into vitess-throttler-dynamic-config
shlomi-noach Oct 30, 2022
32c781e
removed AlterThrottler from vtgate
shlomi-noach Oct 30, 2022
9fcf602
removed AlterThrottler from ast
shlomi-noach Oct 30, 2022
7f4eab9
update test
shlomi-noach Oct 30, 2022
679a072
test custom config
shlomi-noach Oct 30, 2022
b0f526d
validate return from custom query to default query
shlomi-noach Oct 30, 2022
67d2873
updated workflow
shlomi-noach Oct 30, 2022
e7d2a6e
update endtoend cli docs
shlomi-noach Oct 30, 2022
1601c57
update endtoend cli docs: --throttle_config_via_topo
shlomi-noach Oct 30, 2022
1f01d1f
vtadmin web proto
shlomi-noach Oct 30, 2022
211dd81
Merge branch 'main' into vitess-throttler-dynamic-config
shlomi-noach Nov 2, 2022
521843d
s/throttle_config_via_topo/throttler_config_via_topo
shlomi-noach Nov 2, 2022
cd95be2
do not listen on KeyspaceSrv if not configured to read from topo
shlomi-noach Nov 2, 2022
6283cd2
reverse --throttle_threshold default increase
shlomi-noach Nov 2, 2022
4c23a73
dashes: --throttler-config-via-topo
shlomi-noach Nov 3, 2022
929078a
throttler check: flag for skipping heartbeat request (avoids renewing…
shlomi-noach Nov 3, 2022
df68096
wait-based status checks
shlomi-noach Nov 3, 2022
eef3f7e
wait-based TestLag
shlomi-noach Nov 3, 2022
bc4811a
reorganizing, cleanup
shlomi-noach Nov 3, 2022
0a5c287
remove excessive sleep
shlomi-noach Nov 3, 2022
f924ef9
vtctl subflags use dashes
shlomi-noach Nov 3, 2022
f07f7e4
validate mutually exclusive flags
shlomi-noach Nov 3, 2022
25d9077
fix cli test
shlomi-noach Nov 3, 2022
e8603b1
empty commit to kick CI
shlomi-noach Nov 3, 2022
4570500
disable active reparents on tests
shlomi-noach Nov 7, 2022
c8bea17
do not return list of cells. Instead, just return exit code
shlomi-noach Nov 7, 2022
f576275
improve error message
shlomi-noach Nov 7, 2022
af0a070
typo
shlomi-noach Nov 7, 2022
d2a9823
copyright year
shlomi-noach Nov 7, 2022
06d34ec
vtctldclient: support UpdateThrottlerConfig
shlomi-noach Nov 7, 2022
6080903
vt admin proto
shlomi-noach Nov 7, 2022
b536bd0
refactored throttler code to its own file
shlomi-noach Nov 8, 2022
33c6aaf
simplify: options are of UpdateThrottlerConfigRequest type
shlomi-noach Nov 8, 2022
8e58298
added keyspace
shlomi-noach Nov 8, 2022
fbb28ed
implementing server-side UpdateThrottlerConfig
shlomi-noach Nov 8, 2022
93ed7c0
validating that vtctldclient works
shlomi-noach Nov 8, 2022
9d21abf
copyright year
shlomi-noach Nov 9, 2022
1f902ed
safer Open() flow that does not block execution
shlomi-noach Nov 9, 2022
ee8ac2e
expected CLI
shlomi-noach Nov 9, 2022
07ba21b
vt admin proto
shlomi-noach Nov 9, 2022
65b4b45
CustomQuerySet: identify existence of flag
shlomi-noach Nov 9, 2022
3ca7102
vt admin proto
shlomi-noach Nov 9, 2022
cd66ccd
merge main, resolve conflict
shlomi-noach Nov 10, 2022
43d0f2d
update ci workflow
shlomi-noach Nov 10, 2022
9fa03c0
Merge branch 'main' into vitess-throttler-dynamic-config
shlomi-noach Nov 13, 2022
a4a92c0
go mod tidy
shlomi-noach Nov 13, 2022
c70aed5
check 'Changed' after parsing
shlomi-noach Nov 14, 2022
b1f1a79
more verbose message; do give throttler timt to update config
shlomi-noach Nov 14, 2022
b0ba6fe
metricsQuery is atomic; report metrics query in SHOW VITESS_THROTTLER…
shlomi-noach Nov 14, 2022
4733345
cobra comment
shlomi-noach Nov 15, 2022
829bfa6
merge main, resolve conflict
shlomi-noach Nov 15, 2022
e82479f
goimport
shlomi-noach Nov 15, 2022
d82619b
fix CLI docs
shlomi-noach Nov 15, 2022
c3319ec
release notes
shlomi-noach Nov 17, 2022
72e775e
merge main, resolve conflict
shlomi-noach Nov 20, 2022
134 changes: 134 additions & 0 deletions .github/workflows/cluster_endtoend_tabletmanager_throttler_topo.yml
@@ -0,0 +1,134 @@
# DO NOT MODIFY: THIS FILE IS GENERATED USING "make generate_ci_workflows"

name: Cluster (tabletmanager_throttler_topo)
on: [push, pull_request]
concurrency:
  group: format('{0}-{1}', ${{ github.ref }}, 'Cluster (tabletmanager_throttler_topo)')
  cancel-in-progress: true

env:
  LAUNCHABLE_ORGANIZATION: "vitess"
  LAUNCHABLE_WORKSPACE: "vitess-app"
  GITHUB_PR_HEAD_SHA: "${{ github.event.pull_request.head.sha }}"

jobs:
  build:
    name: Run endtoend tests on Cluster (tabletmanager_throttler_topo)
    runs-on: ubuntu-20.04

    steps:
    - name: Skip CI
      run: |
        if [[ "${{contains( github.event.pull_request.labels.*.name, 'Skip CI')}}" == "true" ]]; then
          echo "skipping CI due to the 'Skip CI' label"
          exit 1
        fi

    - name: Check if workflow needs to be skipped
      id: skip-workflow
      run: |
        skip='false'
        if [[ "${{github.event.pull_request}}" == "" ]] && [[ "${{github.ref}}" != "refs/heads/main" ]] && [[ ! "${{github.ref}}" =~ ^refs/heads/release-[0-9]+\.[0-9]$ ]] && [[ ! "${{github.ref}}" =~ "refs/tags/.*" ]]; then
          skip='true'
        fi
        echo Skip ${skip}
        echo "::set-output name=skip-workflow::${skip}"

    - name: Check out code
      if: steps.skip-workflow.outputs.skip-workflow == 'false'
      uses: actions/checkout@v3

    - name: Check for changes in relevant files
      if: steps.skip-workflow.outputs.skip-workflow == 'false'
      uses: frouioui/paths-filter@main
      id: changes
      with:
        token: ''
        filters: |
          end_to_end:
            - 'go/**/*.go'
            - 'test.go'
            - 'Makefile'
            - 'build.env'
            - 'go.[sumod]'
            - 'proto/*.proto'
            - 'tools/**'
            - 'config/**'
            - 'bootstrap.sh'
            - '.github/workflows/cluster_endtoend_tabletmanager_throttler_topo.yml'

    - name: Set up Go
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      uses: actions/setup-go@v3
      with:
        go-version: 1.19.3

    - name: Set up python
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      uses: actions/setup-python@v4

    - name: Tune the OS
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      run: |
        echo '1024 65535' | sudo tee -a /proc/sys/net/ipv4/ip_local_port_range
        # Increase the asynchronous non-blocking I/O. More information at https://dev.mysql.com/doc/refman/5.7/en/innodb-parameters.html#sysvar_innodb_use_native_aio
        echo "fs.aio-max-nr = 1048576" | sudo tee -a /etc/sysctl.conf
        sudo sysctl -p /etc/sysctl.conf

    - name: Get dependencies
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      run: |

        # Get key to latest MySQL repo
        sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-keys 467B942D3A79BD29
        # Setup MySQL 8.0
        wget -c https://dev.mysql.com/get/mysql-apt-config_0.8.20-1_all.deb
        echo mysql-apt-config mysql-apt-config/select-server select mysql-8.0 | sudo debconf-set-selections
        sudo DEBIAN_FRONTEND="noninteractive" dpkg -i mysql-apt-config*
        sudo apt-get update
        # Install everything else we need, and configure
        sudo apt-get install -y mysql-server mysql-client make unzip g++ etcd curl git wget eatmydata xz-utils

        sudo service mysql stop
        sudo service etcd stop
        sudo ln -s /etc/apparmor.d/usr.sbin.mysqld /etc/apparmor.d/disable/
        sudo apparmor_parser -R /etc/apparmor.d/usr.sbin.mysqld
        go mod download

        # install JUnit report formatter
        go install github.com/vitessio/go-junit-report@HEAD

    - name: Setup launchable dependencies
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      run: |
        # Get Launchable CLI installed. If you can, make it a part of the builder image to speed things up
        pip3 install --user launchable~=1.0 > /dev/null

        # verify that launchable setup is all correct.
        launchable verify || true

        # Tell Launchable about the build you are producing and testing
        launchable record build --name "$GITHUB_RUN_ID" --source .

    - name: Run cluster endtoend test
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true'
      timeout-minutes: 45
      run: |
        # We set the VTDATAROOT to the /tmp folder to reduce the file path of mysql.sock file
        # which mustn't be more than 107 characters long.
        export VTDATAROOT="/tmp/"
        source build.env

        set -x

        # run the tests however you normally do, then produce a JUnit XML file
        eatmydata -- go run test.go -docker=false -follow -shard tabletmanager_throttler_topo | tee -a output.txt | go-junit-report -set-exit-code > report.xml

    - name: Print test output and Record test result in launchable
      if: steps.skip-workflow.outputs.skip-workflow == 'false' && steps.changes.outputs.end_to_end == 'true' && always()
      run: |
        # send recorded tests to launchable
        launchable record tests --build "$GITHUB_RUN_ID" go-test . || true

        # print test output
        cat output.txt
39 changes: 39 additions & 0 deletions doc/releasenotes/16_0_0_summary.md
@@ -15,6 +15,14 @@

In [PR #11103](https://github.com/vitessio/vitess/pull/11103) we introduced the ability to resume a `VTGate` [`VStream` copy operation](https://vitess.io/docs/design-docs/vreplication/vstream/vscopy/). This is useful when a [`VStream` copy operation](https://vitess.io/docs/design-docs/vreplication/vstream/vscopy/) is interrupted due to e.g. a network failure or a server restart. The `VStream` copy operation can be resumed by specifying each table's last seen primary key value in the `VStream` request. Please see the [`VStream` docs](https://vitess.io/docs/16.0/reference/vreplication/vstream/) for more details.

### Tablet throttler

The tablet throttler can now be configured dynamically. Configuration is now stored in the topo service, and applies to all tablets in all shards and cells of a given keyspace. For backwards compatibility, `v16` still supports `vttablet`-based command line flags for throttler configuration.

It is possible to enable or disable the throttler, and to change the throttling threshold as well as the throttler query.

See https://github.com/vitessio/vitess/pull/11604

### Breaking Changes

#### Orchestrator Integration Deletion
@@ -43,6 +51,37 @@ Other aspects of the VReplication copy-phase logic are preserved:

Other phases, catchup, fast-forward, and replicating/"running", are unchanged.

### vttablet --throttler-config-via-topo

The flag `--throttler-config-via-topo` switches throttler configuration from `vttablet` flags to the topo service. This flag is `false` by default, for backwards compatibility. It will default to `true` in future versions.

### vtctldclient UpdateThrottlerConfig

Tablet throttler configuration is now supported in `topo`. Updating the throttler configuration is done via `vtctldclient UpdateThrottlerConfig` and applies to all tablets in all cells for a given keyspace.

Examples:
```shell
# disable throttler; all throttler checks will return with "200 OK"
$ vtctldclient UpdateThrottlerConfig --disable commerce

# enable throttler; checks respond with the appropriate status per current metrics
$ vtctldclient UpdateThrottlerConfig --enable commerce

# Both enable and set threshold in same command. Since no query is indicated, we assume the default check for replication lag
$ vtctldclient UpdateThrottlerConfig --enable --threshold 5.0 commerce

# Change threshold. Does not affect enabled/disabled state of the throttler
$ vtctldclient UpdateThrottlerConfig --threshold 1.5 commerce

# Use a custom query
$ vtctldclient UpdateThrottlerConfig --custom-query "show global status like 'threads_running'" --check-as-check-self --threshold 50 commerce

# Restore default query and threshold
$ vtctldclient UpdateThrottlerConfig --custom-query "" --check-as-check-shard --threshold 1.5 commerce
```

See https://github.com/vitessio/vitess/pull/11604
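
The semantics described in the examples above (a threshold change leaves the enabled state alone, an explicit empty custom query restores the default, a disabled throttler answers every check with "200 OK") can be sketched in Go as follows. This is a minimal illustration, not the server-side code of this PR: `throttlerConfig`, `updateRequest`, `applyUpdate` and `checkStatus` are hypothetical names, treating a zero threshold as "not provided" is an assumption, and so is the use of HTTP 429 for a rejected check.

```go
package main

import (
	"fmt"
	"net/http"
)

// throttlerConfig is a hypothetical, simplified stand-in for the
// per-keyspace throttler configuration kept in topo.
type throttlerConfig struct {
	Enabled     bool
	Threshold   float64
	CustomQuery string // empty means "use the default replication lag check"
}

// updateRequest is a hypothetical mirror of the UpdateThrottlerConfig flags.
type updateRequest struct {
	Enable         bool
	Disable        bool
	Threshold      float64
	CustomQuery    string
	CustomQuerySet bool // true only if --custom-query was given explicitly
}

// applyUpdate returns cfg with req merged in. Fields that were not
// supplied leave the existing configuration untouched.
func applyUpdate(cfg throttlerConfig, req updateRequest) throttlerConfig {
	if req.Enable {
		cfg.Enabled = true
	}
	if req.Disable {
		cfg.Enabled = false
	}
	// Assumption: a zero threshold means "flag not provided".
	if req.Threshold > 0 {
		cfg.Threshold = req.Threshold
	}
	// CustomQuerySet distinguishes "no --custom-query flag" from
	// `--custom-query ""`, which restores the default query.
	if req.CustomQuerySet {
		cfg.CustomQuery = req.CustomQuery
	}
	return cfg
}

// checkStatus returns the HTTP status a check would report for a metric.
// Assumption: 429 signals that the threshold is exceeded.
func checkStatus(cfg throttlerConfig, metric float64) int {
	if !cfg.Enabled || metric <= cfg.Threshold {
		return http.StatusOK
	}
	return http.StatusTooManyRequests
}

func main() {
	cfg := applyUpdate(throttlerConfig{}, updateRequest{Enable: true, Threshold: 5.0})
	fmt.Println(checkStatus(cfg, 7.2)) // 429: metric above threshold

	cfg = applyUpdate(cfg, updateRequest{Threshold: 1.5})
	fmt.Println(cfg.Enabled, cfg.Threshold) // true 1.5: still enabled

	cfg = applyUpdate(cfg, updateRequest{Disable: true})
	fmt.Println(checkStatus(cfg, 7.2)) // 200: a disabled throttler always passes
}
```

The separate `CustomQuerySet` boolean is the point of interest: a plain string field cannot tell an omitted flag from an intentionally empty one, so without it every threshold-only update would also reset the query.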

### Important bug fixes

#### Corrupted results for non-full-group-by queries with JOINs
61 changes: 61 additions & 0 deletions go/cmd/vtctldclient/command/throttler.go
@@ -0,0 +1,61 @@
/*
Copyright 2022 The Vitess Authors.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/

package command

import (
	"github.com/spf13/cobra"

	"vitess.io/vitess/go/cmd/vtctldclient/cli"

	vtctldatapb "vitess.io/vitess/go/vt/proto/vtctldata"
)

var (
	// UpdateThrottlerConfig makes an UpdateThrottlerConfig gRPC call to a vtctld.
	UpdateThrottlerConfig = &cobra.Command{
		Use:                   "UpdateThrottlerConfig [--enable|--disable] [--threshold=<float64>] [--custom-query=<query>] [--check-as-check-self|--check-as-check-shard] <keyspace>",
		Short:                 "Update the table throttler configuration for all cells and tablets of a given keyspace",
		DisableFlagsInUseLine: true,
		Args:                  cobra.ExactArgs(1),
		RunE:                  commandUpdateThrottlerConfig,
	}
)

// updateThrottlerConfigOptions is populated directly by the flags registered in init().
var updateThrottlerConfigOptions vtctldatapb.UpdateThrottlerConfigRequest

func commandUpdateThrottlerConfig(cmd *cobra.Command, args []string) error {
	keyspace := cmd.Flags().Arg(0)
	cli.FinishedParsing(cmd)

	// Record whether --custom-query was passed at all, so that an explicit
	// empty value can be told apart from an omitted flag.
	updateThrottlerConfigOptions.CustomQuerySet = cmd.Flags().Changed("custom-query")
	updateThrottlerConfigOptions.Keyspace = keyspace
	_, err := client.UpdateThrottlerConfig(commandCtx, &updateThrottlerConfigOptions)
	if err != nil {
		return err
	}
	return nil
}

func init() {
	UpdateThrottlerConfig.Flags().BoolVar(&updateThrottlerConfigOptions.Enable, "enable", false, "Enable the throttler")
	UpdateThrottlerConfig.Flags().BoolVar(&updateThrottlerConfigOptions.Disable, "disable", false, "Disable the throttler")
	UpdateThrottlerConfig.Flags().Float64Var(&updateThrottlerConfigOptions.Threshold, "threshold", 0, "threshold for either the default check (replication lag seconds) or a custom check")
	UpdateThrottlerConfig.Flags().StringVar(&updateThrottlerConfigOptions.CustomQuery, "custom-query", "", "custom throttler check query")
	UpdateThrottlerConfig.Flags().BoolVar(&updateThrottlerConfigOptions.CheckAsCheckSelf, "check-as-check-self", false, "/throttler/check requests behave as if /throttler/check-self was called")
	UpdateThrottlerConfig.Flags().BoolVar(&updateThrottlerConfigOptions.CheckAsCheckShard, "check-as-check-shard", false, "use standard behavior for /throttler/check requests")
	Root.AddCommand(UpdateThrottlerConfig)
}
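
The `Use` line above documents `--enable|--disable` and `--check-as-check-self|--check-as-check-shard` as alternatives, and the commit list includes "validate mutually exclusive flags", but the client file shown here performs no such check, so where the validation lives is not visible in this excerpt. A minimal sketch of what such a check could look like, assuming a hypothetical `flagValues` struct and `validate` helper (neither name comes from the PR):

```go
package main

import (
	"errors"
	"fmt"
)

// flagValues is a hypothetical container for the four boolean flags
// accepted by UpdateThrottlerConfig.
type flagValues struct {
	Enable            bool
	Disable           bool
	CheckAsCheckSelf  bool
	CheckAsCheckShard bool
}

var (
	errEnableDisable = errors.New("--enable and --disable are mutually exclusive")
	errCheckMode     = errors.New("--check-as-check-self and --check-as-check-shard are mutually exclusive")
)

// validate rejects contradictory flag combinations. Flags from different
// pairs may be combined freely, and passing none of them is valid.
func validate(f flagValues) error {
	if f.Enable && f.Disable {
		return errEnableDisable
	}
	if f.CheckAsCheckSelf && f.CheckAsCheckShard {
		return errCheckMode
	}
	return nil
}

func main() {
	// Contradictory pair: prints the --enable/--disable error.
	fmt.Println(validate(flagValues{Enable: true, Disable: true}))

	// Flags from different pairs: prints <nil>.
	fmt.Println(validate(flagValues{Enable: true, CheckAsCheckSelf: true}))
}
```

Running the check before the request is sent keeps a contradictory invocation from ever reaching the topo service; the same test could equally be applied on the server side.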
1 change: 1 addition & 0 deletions go/flags/endtoend/vtctldclient.txt
@@ -79,6 +79,7 @@ Available Commands:
TabletExternallyReparented Updates the topology record for the tablet's shard to acknowledge that an external tool made this tablet the primary.
UpdateCellInfo Updates the content of a CellInfo with the provided parameters, creating the CellInfo if it does not exist.
UpdateCellsAlias Updates the content of a CellsAlias with the provided parameters, creating the CellsAlias if it does not exist.
UpdateThrottlerConfig Update the table throttler configuration for all cells and tablets of a given keyspace
Validate Validates that all nodes reachable from the global replication graph, as well as all tablets in discoverable cells, are consistent.
ValidateKeyspace Validates that all nodes reachable from the specified keyspace are consistent.
ValidateSchemaKeyspace Validates that the schema on the primary tablet for shard 0 matches the schema on all other tablets in the keyspace.
1 change: 1 addition & 0 deletions go/flags/endtoend/vttablet.txt
@@ -302,6 +302,7 @@ Usage of vttablet:
--throttle_metrics_threshold float Override default throttle threshold, respective to -throttle_metrics_query (default 1.7976931348623157e+308)
--throttle_tablet_types string Comma separated VTTablet types to be considered by the throttler. default: 'replica'. example: 'replica,rdonly'. 'replica' aways implicitly included (default "replica")
--throttle_threshold duration Replication lag threshold for default lag throttling (default 1s)
--throttler-config-via-topo When 'true', read config from topo service and ignore throttle_threshold, throttle_metrics_threshold, throttle_metrics_query, throttle_check_as_check_self
--topo_consul_lock_delay duration LockDelay for consul session. (default 15s)
--topo_consul_lock_session_checks string List of checks for consul session. (default "serfHealth")
--topo_consul_lock_session_ttl string TTL for consul session.
1 change: 0 additions & 1 deletion go/test/endtoend/onlineddl/vrepl/onlineddl_vrepl_test.go
@@ -546,7 +546,6 @@ func TestSchemaChange(t *testing.T) {
err := clusterInstance.VtctlclientProcess.ExecuteCommand("PlannedReparentShard", "--", "--keyspace_shard", keyspaceName+"/-80", "--new_primary", shards[0].Vttablets[reparentTabletIndex].Alias)
require.NoError(t, err, "failed PRS: %v", err)
})

t.Run("unthrottle and expect completion", func(t *testing.T) {
for i := range shards {
var body string