diff --git a/sig-scalability/charter.md b/sig-scalability/charter.md
index ecf588c86ad..fabd7ab6972 100644
--- a/sig-scalability/charter.md
+++ b/sig-scalability/charter.md
@@ -16,9 +16,9 @@ Scalability and performance are horizontal aspects of the system - changes in
 a single place of Kubernetes may affect the whole system. As a result, to
 effectively ensure Kubernetes scales, we need a special cross-SIG privileges.
 
-- We can rollback any merged PR if it has been identified as a cause of an SLO
-  regression. The offending PR should only be merged again after proving to pass
-  tests at scale.
+- We can rollback any merged PR if it has been identified as a cause of any
+  [performance/scalability SLOs] regression. The offending PR should only be
+  merged again after proving to pass tests at scale.
 - We can pause the merge queue in case of a regression observed until a particular
   PR has been identified as cause of the regression and regression has been
   mitigated. The “Rules of engagement” of pausing merge-queue and rationale for
@@ -26,7 +26,8 @@ effectively ensure Kubernetes scales, we need a special cross-SIG privileges.
   TODO(wojtek-t, shyamjvs): Write it down and link here.
 - We require significant changes (in terms of impact, such as: update of etcd,
   update of Go version, major architectural changes, etc.) may only be merged:
-  - with an explicit approval from a SIG-scalability approver and
+  - with an explicit approval from a [SIG-scalability approver](#sig-scalability-approvers)
+    and
   - after having passed performance testing on biggest supported clusters
     (unless found unnecessary by scalability approver)
 - We can block a feature from transitioning to Beta status if (when turned on) it
@@ -42,6 +43,8 @@ For the record, by regression above we mean a regression identified by the set
 of release-blocking scalability/performance tests (as defined by
 sig-release-master-blocking group of test suites).
 
+[performance/scalability SLOs]: https://github.com/kubernetes/community/blob/master/sig-scalability/slos/slos.md
+
 ## SIG Values
 
 - We are NOT firefighters, we are fire-prevention specialists.
@@ -57,55 +60,18 @@ delegated to that SIG.
 
 SIG scalability subprojects are as follows.
 
-| Subproject | Description | Example Artifacts |
-| --- | --- | --- |
-| Kubernetes scalability | Defining what does it mean that “Kubernetes scales”. This includes defining (or approving) individual performance SLIs/SLOs, ensuring they are all oriented on user experience and consistent with each other. | API-machinery performance SLIs/SLOs |
-| Kubernetes performance validation | Ensuring that each official Kubernetes release satisfies all scalability and performance related requirements, as state in “Kubernetes scalability” definition. | 1.9 validation report |
-| Scalability testing frameworks | Designing and creating frameworks to make scalability and performance testing of Kubernetes easy and available for all contributors. Different frameworks may help in different aspect of scalability testing enabling making conscious tradeoffs, e.g. cost of accuracy or real life vs more generalized benchmarking scenarios. | Cluster loader |
-| Scalability and performanc tests | Ensuring that all tests necessary to validate Kubernetes scalability and performance exist (ideally by providing easy-to-use framework and working with SIGs to provide them) have the environment and resources to run on and are being executed according to calendar enabling release validation. | Scalability e2e tests |
-| Scalability governance | Establishing and documenting best practises on how to design and/or implement Kubernetes features in scalable and performant way. Educating contributors and ensuring those are widely used. | Regressions case study |
+| Subproject | Description | Example Artifacts | OWNERS |
+| --- | --- | --- | --- |
+| Kubernetes scalability | Defining what does it mean that “Kubernetes scales”. This includes defining (or approving) individual performance SLIs/SLOs, ensuring they are all oriented on user experience and consistent with each other. | [SLIs/SLOs] | [OWNERS](https://github.com/kubernetes/community/blob/master/sig-scalability/slos/OWNERS) |
+| Kubernetes performance validation | Ensuring that each official Kubernetes release satisfies all scalability and performance related requirements, as state in “Kubernetes scalability” definition | [1.9 validation report] | TODO |
+| Scalability testing frameworks | Designing and creating frameworks to make scalability and performance testing of Kubernetes easy and available for all contributors. Different frameworks may help in different aspect of scalability testing enabling making conscious tradeoffs, e.g. cost of accuracy or real life vs more generalized benchmarking scenarios. | [Cluster loader] | [OWNERS](https://github.com/kubernetes/perf-tests/blob/master/OWNERS) [OWNERS](https://github.com/kubernetes/kubernetes/blob/master/test/kubemark/OWNERS) |
+| Scalability and performance tests | Ensuring that all tests necessary to validate Kubernetes scalability and performance exist (ideally by providing easy-to-use framework and working with SIGs to provide them) have the environment and resources to run on and are being executed according to calendar enabling release validation. | [Scalability e2e tests] | [OWNERS](https://github.com/kubernetes/kubernetes/blob/master/test/e2e/scalability/OWNERS) |
+| Scalability governance | Establishing and documenting best practises on how to design and/or implement Kubernetes features in scalable and performant way. Educating contributors and ensuring those are widely used. | [Regressions case study] | [OWNERS](https://github.com/kubernetes/community/blob/master/sig-scalability/governance/OWNERS) |
 
 TODO: Figure out if we need subproject for finding bottlenecks, coordinating
 improvements and architectural changes, etc.
 
-[API-machinery performance SLIs/SLOs]: https://github.com/kubernetes/community/blob/master/sig-scalability/slis/apimachinery_slis.md
+[SLIs/SLOs]: https://github.com/kubernetes/community/blob/master/sig-scalability/slos/slos.md
 [1.9 validation report]: https://github.com/kubernetes/sig-release/blob/master/releases/release-1.9/scalability_validation_report.md
 [Cluster loader]: https://github.com/kubernetes/perf-tests/tree/master/clusterloader
 [Scalability e2e tests]: https://github.com/kubernetes/kubernetes/tree/master/test/e2e/scalability
@@ -121,7 +87,8 @@ until the role is filled.
 - Number: 2-3
 - Run operations and processes governing the SIG
 - A majority of chairs cannot be from a single company.
-- An initial set of chairs was established at the time the SIG was founded.
+- An initial set of chairs was established at the time the SIG was founded as:
+  Wojciech Tyczynski and Bob Wise.
 - Chairs may decide to step down and propose a replacement, who must be approved
   by all other chairs.
 - Chairs may select additional chairs by consensus.
@@ -132,10 +99,10 @@ until the role is filled.
 - Number: 2-3
 - Establish new subprojects and retire existing ones
 - Resolve cross-subprojects technical issues and decisions and escalations from
-  subprojects
+  subprojects.
 - Decision making must be by consensus.
 - An initial set of technical leads was set to long-standing group of SIG leads:
-  Wojciech Tyczynski and Bob Wise
+  Wojciech Tyczynski and Bob Wise.
 - Technical leads must have demonstrated deep understanding of the whole system
   that is sufficient to assess impact of different changes on Kubernetes scalability.
 - Technical leads must remain active in the role and are automatically removed
@@ -160,6 +127,17 @@ until the role is filled.
 - Owners may select additional subproject owners through a super-majority vote
   amongst subproject owners.
 
+### SIG Scalability approvers
+- Number: at least 3
+- Approve significant changes (in terms of potential impact, e.g. major architectural
+  changes, upgrades of etcd or Go version) from scalability perspective.
+- An initial set of approvers was set to:
+  - Bob Wise
+  - Clayton Coleman
+  - Jordan Liggitt
+  - Shyam Jeedigunta
+  - Wojciech Tyczynski
+
 ## Organizational management
 - Six months after this charter is first ratified, it must be reviewed and
   re-approved by the SIG in order to evaluate the assumptions made in its initial