Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slack-vitess-r15.0.5: backport required Transaction Throttler PRs #302

Conversation

timvaillancourt
Copy link
Member

@timvaillancourt timvaillancourt commented Apr 16, 2024

Description

This PR backports several required Transaction Throttler changes from upstream

PRs:

  1. Add basic metrics to vttablet transaction throttler vitessio/vitess#12418
  2. Fix transaction throttler ignoring the initial rate vitessio/vitess#12618
  3. Cleanup panics in txthrottler, reorder for readability vitessio/vitess#12901
  4. Emit per workload labels for existing per table vttablet metrics vitessio/vitess#12394
  5. Add flag to select tx throttler tablet type vitessio/vitess#12174
  6. tx throttler: healthcheck all cells if --tx-throttler-healthcheck-cells is undefined vitessio/vitess#12477
  7. Add priority support to transaction throttler vitessio/vitess#12662
  8. txthrottler: further code cleanup vitessio/vitess#12902
  9. TxThrottler support for transactions outside BEGIN/COMMIT vitessio/vitess#13040
  10. txthrottler: verify config at vttablet startup, consolidate funcs vitessio/vitess#13115
  11. txthrottler: add metrics for topoWatcher and healthCheckStreamer vitessio/vitess#13153
  12. Per workload TxThrottler metrics vitessio/vitess#13526
  13. Add dry-run/monitoring-only mode for TxThrottler vitessio/vitess#13604
  14. txthrottler: remove txThrottlerConfig struct, rely on tabletenv vitessio/vitess#13624

Finally, to fix CI flakiness, CI workflows were wrapped by the GitHub Action nick-fields/retry@v2, similar to v14. This required changes to test/templates, which caused changes to .github/workflows files

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on the CI
  • Documentation was added or is not required

Deployment Notes

timvaillancourt and others added 4 commits April 17, 2024 00:08
* Add basic stats to vttablet tx throttler

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* test new metrics

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* reorder

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* short names

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Add max rate

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Move NewGaugeFunc to under conditional

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Use env

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Remove env from TxThrottler struct

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Fix tests

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* PR suggestion

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Fix unit test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* reorder test vars

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
* Fix transaction throttler ignoring the initial rate

This addresses the issue reported in vitessio#12549

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Add missing override of max replication lag in `throttler.newThrottler()`

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Reorder functions to make diff easier to read

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix check for maxRate in `newThrottlerFromConfig()`

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix some CI pipeline issues

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Address PR comment.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix typo

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

---------

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>
Signed-off-by: Eduardo J. Ortega U. <5791035+ejortegau@users.noreply.github.com>
)

* Cleanup tx_throttler.go

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Cleanup tx_throttler.go #2

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Fix throttlerFactoryFunc

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Undo if-cond consolidation

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Undo struct shuffling

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* prove that disabled config returns nil error

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Improve test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
…essio#12394)

* Emit per workload labels for existing per table vttablet metrics

This adds the possibility to configure vttablet (via CLI flag) to also have a
workload label for existing per table metrics (query counts, query times, query
errors, query rows affected, query rows returned, query error counts). Workload
can be any string that makes sense for the client application. For example, API
endpoint name, controller, batch job name, application name or something else.

This is usefult to be able to gain observability about how the query load is
distributed across different workloads.

This is achieved with two new CLI flags, namely:

* `enable-per-workload-table-metrics`: whether to enable or disable per
  workload metric collection - disabled by default to preserve the current
  behavior, thus making the new feature opt-in only.
* `workload-label`: a string to look for in query comments to identify the
  workload running the current query.

The workload is obtained by parsing query comments of the form:

/* ... <workload_label>=<workload_name>; ... */

For example, if vttablet is started with

`--enable-per-workload-table-metrics --workload-label app_name`

anda query is issued with a comment like

/* ... app_name=shop; ... */

then metrics will look like

```
vttablet_query_counts{plan="Select",table="dual", workload="shop"} 15479
```

instead of

```
vttablet_query_counts{plan="Select",table="dual"} 15479
```

Query comment parsing only takes place if `--enable-per-workload-table-metrics`
is used, as to not incur parsing performance impact if the user does not want
per workload metrics.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* make linter happy

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* fix flags e2e test

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Address PR comments:

* Obtain workload information on the vtgate instead of the vttablet, avoiding
  double parsing.
* Treat workload name as a query directive.
* Send workload name from vtgate to vttablet as ExecuteOptions.

Additionally, annotate tabletserver's execution span with the workload name
to also enrich traces with workload name data, in addition to metrics.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* A few fixes:

1. Rebuild some files with `make proto`.
2. Protect against nil ExecuteOptions on the tabletserver.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix flags e2e test

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Address PR comments

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fixes

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix a comment

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix e2e flag test

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Update JS code for protobuf changes.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix QueryEngine unit test

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix e2e flag test

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix spurious tab in comment

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Address PR comment

Don't use dual format flag for new flags - stick with - separated ones.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

---------

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>
@github-actions github-actions bot added this to the v15.0.5 milestone Apr 16, 2024
@timvaillancourt timvaillancourt changed the title Bp txthrottler pt1 slack vitess r15.0.5 slack-vitess-r15.0.5: backport required Transaction Throttler PRs, pt. 1 Apr 16, 2024
timvaillancourt and others added 15 commits April 17, 2024 01:13
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
* Add support for criticality query directive, and have TxThrottler respect that

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Remove unused variable

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix CI pipeline

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Address PR comments.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Make linter happy & add extra test cases.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Address PR comments.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix circular import

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Make linter happy

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Address PR comments:

* Invalid criticality in query directive fails the query.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Address PR comments:

* Renamed criticality to priority.
* Change error handling when parsing the priority from string to integer.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Add missing piece of code that got lost during merge conflict resolution

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix vtadmin.js

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Address PR comments

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix unit tests (I think)

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Invert polarity of priority values

With this change, queries with PRIORITY=0 never get throttled, whereas those
with PRIORITY=100 always do (provided there is contention).

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Make linter happy

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix flag e2e test

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

---------

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>
* Add flag to select tx throttler tablet type

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* REPLICA and/or RDONLY only

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Update flag help msg

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Lowercase types in help/doc

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Help update

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* fix test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* No underscores in flag

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Fix test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Fix merge

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* PR suggestion, consolidate config logic

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Update go/vt/vttablet/tabletserver/tabletenv/config.go

Co-authored-by: Andrew Mason <amason@hey.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Use topoproto.TabletTypeListFlag to handle flag

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Fix unit test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Update go/vt/vttablet/tabletserver/tabletenv/config.go

Co-authored-by: Andrew Mason <amason@hey.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* improve test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* pr suggestions

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* go fmt

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Co-authored-by: Andrew Mason <amason@hey.com>
* txthrottler: further code cleanup

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Fix bad merge resolution

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
…3040)

* TxThrottler support for transactions outside BEGIN/COMMIT

This change allows the TxThrottler to throttle requests sent outside of
explicit transactions (i.e. explicit BEGIN/COMMIT blocks) when configured to do
so via a new config flag. Otherwise, it preserves the current/default behavior
of only throttling transactions inside BEGIN/COMMIT.

In addition, when this flag is passed, and because the call to throttle is done
in a context in which the execution plan is already known, this change uses the
plan type to make sure that throttling is triggered only when the query being
executed is INSERT/UPDATE/DELETE/LOAD, so that SELECTs and others no longer get
throttled unnecessarily, as they do not contribute to increasing replication
lag, which is what the TxThrottler aims at controlling.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix e2e flag tests & TxThrottler unit test

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Throttle auto-commit statements in QueryExecutor instead of TxPool

This changes where we call the transaction throttler:

1. Statements in `BEGIN/COMMIT` blocks keep being throttled in
   `TabletServer.begin()`.
2. Additionally, throttling is added in QueryExecutor.execAutocommit() and
   `QueryExecutor.execAsTransaction()`.

We also change things so that throttling in this new case is not opt-in via
configuration flag but instead is the new and only behavior.

Finally, we remove some previously added changes to example scripts that had
been added with the intention of testing and are not part of the PR.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Adds test cases for QueryExecutor.Execute() with TxThrottle throttling

To make unit testing simple here, we separated the interface and implementation
of the TxThrottle, and simply used a mock implementation of the interface in
the tests.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Add note on new TxThrottler behavior in v17 changelog

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Fix PR number in changelog entry for TxThrottler behavior change.

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Make linter happy

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

* Address PR comments

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>

---------

Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>
…tessio#13115)

* txthrottler: verify config at vttablet startup, consolidate funcs

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Use explicit dest in prototext.Unmarshal

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Use for loop for TestVerifyTxThrottlerConfig

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Cleanup test

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Fix go vet complaint

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Add back synonym flag

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Update go/vt/vttablet/tabletserver/txthrottler/tx_throttler.go

Co-authored-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

* Address staticcheck linter error

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>

---------

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Co-authored-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
…essio#13153)

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
…lls` is undefined (vitessio#12477)

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Co-authored-by: Shlomi Noach <2607934+shlomi-noach@users.noreply.github.com>
Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>
Signed-off-by: Eduardo J. Ortega U <5791035+ejortegau@users.noreply.github.com>
Signed-off-by: Eduardo J. Ortega U. <5791035+ejortegau@users.noreply.github.com>
…vitessio#13624)

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
@timvaillancourt timvaillancourt marked this pull request as ready for review April 17, 2024 21:29
@timvaillancourt timvaillancourt requested a review from a team as a code owner April 17, 2024 21:29
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
@timvaillancourt timvaillancourt changed the title slack-vitess-r15.0.5: backport required Transaction Throttler PRs, pt. 1 slack-vitess-r15.0.5: backport required Transaction Throttler PRs Apr 17, 2024
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
…ess-r15.0.5

Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
@timvaillancourt timvaillancourt marked this pull request as draft May 7, 2024 22:49
@timvaillancourt timvaillancourt deleted the bp-txthrottler-pt1-slack-vitess-r15.0.5 branch May 16, 2024 15:55
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants