admission control: support for multi-tenant environments #65954

Closed
2 of 3 tasks
sumeerbhola opened this issue Jun 1, 2021 · 4 comments
Labels
A-admission-control A-multitenancy Related to multi-tenancy C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions)

sumeerbhola commented Jun 1, 2021

The framework in the admission package is general enough to support admission control for multi-tenant KV nodes and single-tenant SQL nodes. However, we need:

Epic: CRDB-10304

Jira issue: CRDB-7811

@sumeerbhola sumeerbhola added C-enhancement Solution expected to add code/behavior + preserve backward-compat (pg compat issues are exception) A-admission-control labels Jun 1, 2021
sumeerbhola added a commit to sumeerbhola/cockroach that referenced this issue Jul 13, 2021
Informs cockroachdb#65954

Release note (ops change): Enabling admission.kv.enabled may provide
better inter-tenant isolation for multi-tenant KV nodes.
craig bot pushed a commit that referenced this issue Jul 13, 2021
67228: cloud: add child span in external storage WriteFile r=dt a=adityamaru

This change adds a child span to the `WriteFile` method
so that we can track the duration of an upload.

Release note: None

67533: server: use actual TenantID for multi-tenant KV admission control r=sumeerbhola a=sumeerbhola

Informs #65954

Release note (ops change): Enabling admission.kv.enabled may provide
better inter-tenant isolation for multi-tenant KV nodes.

67543: roachtest: fix tpchvec/smithcmp r=yuzefovich a=yuzefovich

The config file has recently been moved to a different location, and
`tpchvec/smithcmp` test was broken. This is now fixed.

Fixes: #67353.
Fixes: #67361.

Release note: None

Co-authored-by: Aditya Maru <adityamaru@gmail.com>
Co-authored-by: sumeerbhola <sumeer@cockroachlabs.com>
Co-authored-by: Yahor Yuzefovich <yahor@cockroachlabs.com>
@blathers-crl blathers-crl bot added the T-sql-foundations SQL Foundations Team (formerly SQL Schema + SQL Sessions) label Oct 5, 2021
@blathers-crl blathers-crl bot added the T-sql-queries SQL Queries Team label Oct 11, 2021
@RaduBerinde RaduBerinde added the A-multitenancy Related to multi-tenancy label Oct 19, 2021
@RaduBerinde RaduBerinde changed the title from "admission control: support for mult-tenant environments" to "admission control: support for multi-tenant environments" Oct 19, 2021
@cucaroach cucaroach self-assigned this Jan 6, 2022
cucaroach commented

Just to expand on this: I think what's wanted here is a simple kv95-style roachtest that runs a weak kvserver and hits it hard with high-concurrency SQL tenants. We would measure success as a reasonable amount of fairness in how throughput and latency vary among the pods.
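
A minimal sketch of how such a fairness check could be scored, assuming per-pod throughput and latency figures have already been collected. The function name `fairness_summary`, the result keys, and the choice of "largest relative deviation from the mean" as the spread measure are illustrative assumptions, not something taken from the roachtest itself.

```python
from statistics import mean


def fairness_summary(throughputs, latencies):
    """Summarize how evenly throughput and latency are spread across pods.

    Both arguments are lists with one number per SQL pod. The spread measure
    (largest relative deviation from the mean throughput) is an assumption
    chosen for illustration.
    """
    if not throughputs or len(throughputs) != len(latencies):
        raise ValueError("need one throughput and one latency value per pod")
    avg = mean(throughputs)
    # Guard against a zero mean so an idle run does not divide by zero.
    deviation = max(abs(t - avg) for t in throughputs) / avg if avg else 0.0
    return {
        "max_relative_deviation": deviation,
        "tput_range": (min(throughputs), max(throughputs)),
        "latency_range": (min(latencies), max(latencies)),
    }


# Hypothetical numbers for four pods; these are not measurements from a real run.
summary = fairness_summary([100.0, 110.0, 90.0, 100.0], [1.0, 0.9, 1.2, 1.0])
print(summary)
```

A test built on something like this could then fail when the deviation exceeds a chosen threshold; what counts as "reasonable" fairness is left open in the comment above.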

cucaroach added a commit to cucaroach/cockroach that referenced this issue Apr 11, 2022
Informs: cockroachdb#65954

Roachtests intended to validate that kv and store admission control
queues distribute database resources fairly.

Release note: None

wip
cucaroach added a commit to cucaroach/cockroach that referenced this issue May 16, 2022
Informs: cockroachdb#65954

Roachtests intended to validate that kv and store admission control
queues distribute database resources fairly.

There are 8 tests: 2 for "kv" (i.e., CPU stressing) and 2 for "store"
(intended to stress the LSM), each run with and without admission control.
One test is "same", where each of N SQL pods hits the kvserver with equal
concurrency, and the other is "concurrency-skew", which varies the
concurrency from each pod.

We measure the variation in throughput across the pods ("max_tput_delta")
and the max/min throughput and latency across the pods. Sample results:

```
multitenant/fairness/kv/concurrency-skew/admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.340925, "max_tput": 483.200000, "min_tput": 256.056667, "max_latency": 3.282347, "min_latency": 0.771148}
multitenant/fairness/kv/concurrency-skew/no-admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.330760, "max_tput": 205.740000, "min_tput": 108.903333, "max_latency": 7.151178, "min_latency": 1.618236}

multitenant/fairness/kv/same/admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.245294, "max_tput": 293.990000, "min_tput": 197.026667, "max_latency": 0.831686, "min_latency": 0.762475}
multitenant/fairness/kv/same/no-admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.031199, "max_tput": 132.443333, "min_tput": 124.676667, "max_latency": 1.915801, "min_latency": 1.776664}

multitenant/fairness/store/same/admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.018095, "max_tput": 139.950000, "min_tput": 136.336667, "max_latency": 0.346295, "min_latency": 0.341212}
multitenant/fairness/store/same/no-admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.001886, "max_tput": 149.296667, "min_tput": 148.878333, "max_latency": 0.306853, "min_latency": 0.303392}

multitenant/fairness/store/concurrency-skew/admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.024872, "max_tput": 143.875000, "min_tput": 138.346667, "max_latency": 1.094262, "min_latency": 0.346674}
multitenant/fairness/store/concurrency-skew/no-admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.005439, "max_tput": 148.848333, "min_tput": 147.500000, "max_latency": 1.007741, "min_latency": 0.313597}
```

Release note: None
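
One way to read the paired runs above is to reduce each `stats.json` record to a max/min ratio for throughput and for latency. The following is a minimal sketch using the two kv/concurrency-skew records copied from the sample results. The helper name `spread_ratios` is hypothetical, and the ratio is only a convenient reading aid, not a metric defined by the tests.

```python
import json

# Two records copied from the sample results (kv/concurrency-skew).
SAMPLES = {
    "admission": (
        '{ "max_tput_delta": 0.340925, "max_tput": 483.200000, '
        '"min_tput": 256.056667, "max_latency": 3.282347, "min_latency": 0.771148}'
    ),
    "no-admission": (
        '{ "max_tput_delta": 0.330760, "max_tput": 205.740000, '
        '"min_tput": 108.903333, "max_latency": 7.151178, "min_latency": 1.618236}'
    ),
}


def spread_ratios(record):
    """Return max/min ratios for throughput and latency from one stats record."""
    return {
        "tput_ratio": record["max_tput"] / record["min_tput"],
        "latency_ratio": record["max_latency"] / record["min_latency"],
    }


ratios = {name: spread_ratios(json.loads(raw)) for name, raw in SAMPLES.items()}
for name, r in ratios.items():
    print(f"{name}: tput max/min = {r['tput_ratio']:.2f}, "
          f"latency max/min = {r['latency_ratio']:.2f}")
```

For these two records the throughput ratios come out close to each other, while the absolute throughput figures are higher in the run with admission control enabled.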
craig bot pushed a commit that referenced this issue May 23, 2022
77481: sql: add multitenant fairness tests r=cucaroach a=cucaroach

Informs: #65954

Roachtests intended to validate that kv and store admission control
queues distribute database resources fairly.

There are 8 tests: 2 for "kv" (i.e., CPU stressing) and 2 for "store"
(intended to stress the LSM), each run with and without admission control.
One test is "same", where each of N SQL pods hits the kvserver with equal
concurrency, and the other is "concurrency-skew", which varies the
concurrency from each pod.

We measure the variation in throughput across the pods ("max_tput_delta")
and the max/min throughput and latency across the pods. Sample results:

```
multitenant/fairness/kv/concurrency-skew/admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.340925, "max_tput": 483.200000, "min_tput": 256.056667, "max_latency": 3.282347, "min_latency": 0.771148}
multitenant/fairness/kv/concurrency-skew/no-admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.330760, "max_tput": 205.740000, "min_tput": 108.903333, "max_latency": 7.151178, "min_latency": 1.618236}

multitenant/fairness/kv/same/admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.245294, "max_tput": 293.990000, "min_tput": 197.026667, "max_latency": 0.831686, "min_latency": 0.762475}
multitenant/fairness/kv/same/no-admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.031199, "max_tput": 132.443333, "min_tput": 124.676667, "max_latency": 1.915801, "min_latency": 1.776664}

multitenant/fairness/store/same/admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.018095, "max_tput": 139.950000, "min_tput": 136.336667, "max_latency": 0.346295, "min_latency": 0.341212}
multitenant/fairness/store/same/no-admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.001886, "max_tput": 149.296667, "min_tput": 148.878333, "max_latency": 0.306853, "min_latency": 0.303392}

multitenant/fairness/store/concurrency-skew/admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.024872, "max_tput": 143.875000, "min_tput": 138.346667, "max_latency": 1.094262, "min_latency": 0.346674}
multitenant/fairness/store/concurrency-skew/no-admission/run_1/1.perf/stats.json
{ "max_tput_delta": 0.005439, "max_tput": 148.848333, "min_tput": 147.500000, "max_latency": 1.007741, "min_latency": 0.313597}
```

Release note: None

81660: kvserver: fix raft handling stats when no ready present r=erikgrinaker a=tbg

We were previously adding `time.Time{}.Sub(timeutil.Now())` to
the metrics some of the time which produces bogus results.

Noticed while working on #81516.

Release note: None


81666: ui: Fix accidental commit of `it.only` r=jocrl a=jocrl

This commit fixes an accidental commit of `it.only` to frontend tests.
It also adds a comment explaining why the test does not go further.

Release note: None

Co-authored-by: Tommy Reilly <treilly@cockroachlabs.com>
Co-authored-by: Tobias Grieger <tobias.b.grieger@gmail.com>
Co-authored-by: Josephine Lee <josephine@cockroachlabs.com>
@cucaroach cucaroach removed their assignment May 24, 2022
@jlinder jlinder removed the sync-me-3 label May 27, 2022

vy-ton commented Jun 27, 2022

Should this be closed now that admission control is enabled for KV pods? #79425 tracks work to enable admission control for SQL pods.

mgartner commented Jul 7, 2022

Yes, closing.

@mgartner mgartner closed this as completed Jul 7, 2022