keps/sig-node/2371-cri-pod-container-stats/README.md
44 additions & 84 deletions
@@ -619,6 +619,8 @@ Additional work may be required to evaluate other kubelet components (e.g. evict
 Ideally all components will rely on summary API thereby alleviating need for cAdvisor for container and pod level stats.
 This is also a requirement to be able to disable cAdvisor container metrics collection.
 
+To make clear to cluster admins when metrics are coming from CRI, rather than cadvisor, a new metric `kubelet_metrics_provider` will be used, with `provider` label either `cri` or `cadvisor`.
+
 #### cAdvisor
 
 Once CRI and Kubelet stats provider level changes are in place, we can evaluate disabling cAdvisor from collecting container and pod level stats.
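
The `kubelet_metrics_provider` metric added in the hunk above could be checked from a scrape of the kubelet's metrics output. The following is a minimal sketch, assuming the metric is exposed in the Prometheus text format as a gauge whose `provider` label is `cri` or `cadvisor`. The sample text, the HELP string, and the helper name `active_providers` are illustrative assumptions, not taken from the KEP.

```python
import re

# Hypothetical sample of kubelet metrics output. The exact exposition of
# kubelet_metrics_provider (type, help text, sample value) is an assumption.
SAMPLE = '''\
# HELP kubelet_metrics_provider Source of pod and container metrics.
# TYPE kubelet_metrics_provider gauge
kubelet_metrics_provider{provider="cri"} 1
'''

# Matches a sample line such as: kubelet_metrics_provider{provider="cri"} 1
_LINE = re.compile(r'^kubelet_metrics_provider\{([^}]*)\}\s+(\S+)\s*$')
_LABEL = re.compile(r'(\w+)="([^"]*)"')


def active_providers(metrics_text):
    """Return the set of `provider` label values whose sample value is non-zero."""
    providers = set()
    for line in metrics_text.splitlines():
        match = _LINE.match(line.strip())
        if not match:
            continue  # comments, blank lines, and unrelated metrics
        labels = dict(_LABEL.findall(match.group(1)))
        if "provider" in labels and float(match.group(2)) != 0:
            providers.add(labels["provider"])
    return providers


print(active_providers(SAMPLE))
```

A cluster admin could run a check like this against each node's scrape to see which nodes still report `cadvisor`.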
@@ -793,7 +795,7 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
 - A test using the CRI stats feature gate with enabled CRI implementations should be used with cri_stats_provider to ensure the stats reported are conformant.
 
 ### Graduation Criteria
-#### Alpha implementation
+#### Alpha
 
 - CRI should be extended to provide required stats for `/stats/summary`
 - Kubelet should be extended to provide the required stats from CRI implementation for `/stats/summary`.
@@ -803,7 +805,7 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
 - This will allow the CRI to broadcast `/metrics/cadvisor` through the Kubelet's HTTP server.
 - Conduct research to find the set of metrics from `/metrics/cadvisor` that compliant CRI implementations must expose.
 
-#### Alpha -> Beta Graduation
+#### Beta
 
 - Conformance tests for the fields in `/metrics/cadvisor` should be created.
 - Validate performance impact of this feature is within allowable margin (or non-existent, ideally).
@@ -812,10 +814,10 @@ We expect no non-infra related flakes in the last month as a GA graduation crite
 - Write migration documentation for entities relying on metrics from `/metrics/cadvisor`.
 - Windows stats and metrics will be added.
 
-#### Beta -> GA Graduation
+#### GA
 
 - The CRI stats provider in the Kubelet should be fully formed, and able to satisfy all the needs of downstream consumers
-- cAdvisor stats provider will likely be marked as deprecated (depending on dockershim deprecation).
+- cAdvisor stats provider support will be dropped
 - Feature gate removed and the CRI stats provider will no longer rely on cAdvisor for container/pod level metrics.
 
 ### Upgrade / Downgrade Strategy
@@ -860,26 +862,27 @@ you need any help or guidance.
 
 _This section must be completed when targeting alpha to a release._
 
-***How can this feature be enabled / disabled in a live cluster?**
+###### How can this feature be enabled / disabled in a live cluster?
+
 -[x] Feature gate (also fill in values in `kep.yaml`)
 - Feature gate name: PodAndContainerStatsFromCRI
 - Components depending on the feature gate: Kubelet
 
-***Does enabling the feature change any default behavior?**
+###### Does enabling the feature change any default behavior?
 Any change of default behavior may be surprising to users or break existing
 automations, so be extremely careful here.
 Enabling this behavior means some stats endpoints will not be filled:
 - some entries in `/metrics/cadvisor`
 - Accelerator and UserDefinedMetrics in `/stats/summary`
 
-***Can the feature be disabled once it has been enabled (i.e. can we roll back
+###### Can the feature be disabled once it has been enabled (i.e. can we roll back the enablement)?
 the enablement)?**
 Yes, assuming the Kubelet is restarted.
 
-***What happens if we reenable the feature if it was previously rolled back?**
+###### What happens if we reenable the feature if it was previously rolled back?
 There should be no problem with this.
 
-***Are there any tests for feature enablement/disablement?**
+###### Are there any tests for feature enablement/disablement?
 It will need to be (at least manually) tested against enabling/disabling on a live Kubelet.
 
 Note: enabling/disabling feature gate will require cAdvisor is restarted. The most graceful way to make this happen is require the Kubelet restarts to apply these changes.
@@ -888,22 +891,17 @@ Note: enabling/disabling feature gate will require cAdvisor is restarted. The mo
 
 _This section must be completed when targeting beta graduation to a release._
 
-***How can a rollout fail? Can it impact already running workloads?**
-Try to be as paranoid as possible - e.g., what if some components will restart
-mid-rollout?
+###### How can a rollout or rollback fail? Can it impact already running workloads?
 
 If the CRI implementation doesn't support the required metrics, and cAdvisor has container metrics collection turned off,
 it is possible the node comes up with no metrics about pods and containers. This should be mitigated by making sure that
 the kubelet probes the CRI implementation and enables cAdvisor metrics collection even if the feature gate is on.
 
-***What specific metrics should inform a rollback?**
+###### What specific metrics should inform a rollback?
 
 The lack of any metrics reported for pods and containers is the worst case scenario here, and would require either a rollback or for the feature gate to be disabled.
 
-***Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?**
-Describe manual testing that was done and the outcomes.
-Longer term, we may want to require automated upgrade/rollback tests, but we
-are missing a bunch of machinery and tooling and can't do that now.
+###### Were upgrade and rollback tested? Was the upgrade->downgrade->upgrade path tested?
 
 The source of the metrics is a private matter between the kubelet, CRI implementation and cAdvisor. Since cAdvisor
 is embedded in the kubelet, the two pieces that could move disjointly are kubelet and CRI implementation. The
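
The mitigation described under the rollout question above (the kubelet probes the CRI implementation and keeps cAdvisor collection enabled when the CRI cannot supply the metrics) can be read as a small decision rule. The following is a minimal sketch of that rule only, not the kubelet's implementation. `StatsPlan`, `plan_stats_collection`, and the boolean `cri_supports_stats` probe result are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatsPlan:
    """Hypothetical result of the kubelet's startup decision."""
    provider: str                     # "cri" or "cadvisor"
    cadvisor_container_metrics: bool  # does cAdvisor keep collecting pod/container metrics?


def plan_stats_collection(feature_gate_on: bool, cri_supports_stats: bool) -> StatsPlan:
    """Pick the stats source so a node never comes up without pod/container metrics.

    `cri_supports_stats` stands in for the result of probing the CRI
    implementation; how that probe works is not specified here.
    """
    if feature_gate_on and cri_supports_stats:
        # The CRI can serve the stats, so cAdvisor container collection can be off.
        return StatsPlan(provider="cri", cadvisor_container_metrics=False)
    # Gate off, or gate on but the CRI lacks support: fall back to cAdvisor.
    return StatsPlan(provider="cadvisor", cadvisor_container_metrics=True)


for gate in (False, True):
    for supported in (False, True):
        print(gate, supported, plan_stats_collection(gate, supported))
```

The point of the rule is that the feature gate alone never disables cAdvisor collection; the probe result has to agree.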
@@ -912,9 +910,7 @@ words, rolling back and upgrading should have no affect--if the upgrade broke me
 (and measures weren't taken to cause kubelet to fallback to cAdvisor), then a rollback (or toggling of the feature gate)
 would return the metrics from cAdvisor.
 
-***Is the rollout accompanied by any deprecations and/or removals of features, APIs,
-fields of API types, flags, etc.?**
-Even if applying deprecation policies, they may still surprise some users.
+###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
 
 A piece of work for Beta is moving the source of the contents of `/metrics/cadvisor`. If users toggle the feature gate,
 prometheus collectors will have to move the URL. However, it's an expressed intention of the implementation to have the CRI
@@ -925,60 +921,33 @@ report metrics previously reported by cAdvisor, so the contents should not chang
 
 _This section must be completed when targeting beta graduation to a release._
 
-***How can an operator determine if the feature is in use by workloads?**
-Ideally, this should be a metric. Operations against the Kubernetes API (e.g.,
-checking if there are objects with field X set) may be a last resort. Avoid
-logs or events for this purpose.
+###### How can an operator determine if the feature is in use by workloads?
 
 The source of the pod and container metrics previously reported to Prometheus by `/metrics/cadvisor` is the CRI implementation, not cAdvisor.
 Further, if the CRI implementation was using the old CRI stats provider, then the memory usage of the cgroup the kubelet and runtime
 were in should go down--as some duplicated work should be unduplicated.
 
-***What are the SLIs (Service Level Indicators) an operator can use to determine
-the health of the service?**
+###### How can someone using this feature know that it is working for their instance?
 -[x] Metrics
 - Metric name:
-- all pod and container level stats coming from cAdvisor `container_*`
+- `kubelet_metrics_provider`
 - Components exposing the metric:
--Previously cAdvisor, now CRI implementation.
+-kubelet
 -[ ] Other (treat as last resort)
 - Details:
 
-***What are the reasonable SLOs (Service Level Objectives) for the above SLIs?**
-At a high level, this usually will be in the form of "high percentile of SLI
-per day <= X". It's impossible to provide comprehensive guidance, but at the very
-high level (needs more precise definitions) those may be things like:
-- per-day percentage of API calls finishing with 5XX errors <= 1%
-- 99% percentile over day of absolute value from (job creation time minus expected
-job creation time) for cron job <= 10%
-- 99,9% of /health requests per day finish with 200 code
+###### What are the SLIs (Service Level Indicators) an operator can use to determine the health of the service?
 
 - Reduction of CPU and memory usage between kubelet and CRI (if previously using CRI stats provider).
 - Minimal (< 2%) of performance hit between CPU and memory between CRI and kubelet (if previously using cAdvisor stats provider).
 
-***Are there any missing metrics that would be useful to have to improve observability
-of this feature?**
-Describe the metrics themselves and the reasons why they weren't added (e.g., cost,
-implementation difficulties, etc.).
-
-### Dependencies
-
-_This section must be completed when targeting beta graduation to a release._
+###### Are there any missing metrics that would be useful to have to improve observability of this feature?
 
-***Does this feature depend on any specific services running in the cluster?**
-Think about both cluster-level services (e.g. metrics-server) as well
-as node-level agents (e.g. specific version of CRI). Focus on external or
-optional services that are needed. For example, if this feature depends on
-a cloud provider API, or upon an external software-defined storage or network
-control plane.
+To make clear to cluster admins when metrics are coming from CRI, rather than cadvisor, a new metric `kubelet_metrics_provider` will be used, with `provider` label either `cri` or `cadvisor`.
 
-For each of these, fill in the following—thinking about running existing user workloads
-and creating new ones, as well as about cluster-level services (e.g. DNS):
--[Dependency name]
-- Usage description:
-- Impact of its outage on the feature:
-- Impact of its degraded performance or high-error rates on the feature:
+### Dependencies
 
+###### Does this feature depend on any specific services running in the cluster?
 
 - CRI implementation
 - Usage description:
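
The SLI bullets in the hunk above allow a performance hit of under 2% when moving from the cAdvisor stats provider. The following is a minimal sketch of checking that margin from two measurements, assuming combined kubelet plus CRI usage is sampled before and after enabling the feature gate. The function names and the sample figures are illustrative assumptions, not measurements from the KEP.

```python
def relative_change(before: float, after: float) -> float:
    """Fractional change in usage; positive values mean usage went up."""
    if before <= 0:
        raise ValueError("baseline usage must be positive")
    return (after - before) / before


def within_margin(before: float, after: float, max_regression: float = 0.02) -> bool:
    """True when usage rose by less than `max_regression` (2% by default) or fell."""
    return relative_change(before, after) < max_regression


# Illustrative numbers only: combined kubelet + CRI CPU usage in cores.
print(within_margin(1.50, 1.52))  # rise of about 1.3%
print(within_margin(1.50, 1.56))  # rise of 4%
```

The same check applies to memory. A negative change corresponds to the first bullet, where usage is expected to drop for nodes that previously used the CRI stats provider.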
@@ -988,34 +957,22 @@ _This section must be completed when targeting beta graduation to a release._
 
 ### Scalability
 
-_For alpha, this section is encouraged: reviewers should consider these questions
-and attempt to answer them._
-
-_For beta, this section is required: reviewers must answer these questions._
-
-_For GA, this section is required: approvers should be able to confirm the
-previous answers based on experience in the field._
-
-***Will enabling / using this feature result in any new API calls?**
+###### Will enabling / using this feature result in any new API calls?
 It should not.
 
-***Will enabling / using this feature result in introducing new API types?**
-Describe them, providing:
-- There will be new CRI API types, described above. These are to be agreed upon by Kubelet and the CRI implementation.
+###### Will enabling / using this feature result in introducing new API types?
+
+- There will be new CRI API types, described above. These are to be agreed upon by Kubelet and the CRI implementation.
 
-***Will enabling / using this feature result in any new calls to the cloud
-provider?**
+###### Will enabling / using this feature result in any new calls to the cloud provider?
 - No.
-***Will enabling / using this feature result in increasing size or count of
-the existing API objects?**
-Describe them, providing:
-- There are no changes that affect objects stored in the database.
-- There are changes to the CRI API, which will have to be coordinated between CRI implementation and Kubelet.
+###### Will enabling / using this feature result in increasing size or count of the existing API objects?
+
+- There are no changes that affect objects stored in the database.
+- There are changes to the CRI API, which will have to be coordinated between CRI implementation and Kubelet.
 
-***Will enabling / using this feature result in increasing time taken by any
-operations covered by [existing SLIs/SLOs]?**
-Think about adding additional work or introducing new steps in between
-(e.g. need to do X to start a container), etc. Please describe the details.
+###### Will enabling / using this feature result in increasing time taken by any operations covered by existing SLIs/SLOs?
+
 - The process of collecting and reporting the metrics should not differ too much between cAdvisor and the CRI implementation:
 - At a high level, both need to watch the changes to the stats (from cgroups, disk and network stats)
 - Once collected, the CRI implementation will need to report them (both through the CRI and eventually through the prometheus endpoint).
@@ -1024,8 +981,8 @@ operations covered by [existing SLIs/SLOs]?**
 - This may come because cAdvisor's performance has been fine-tuned, and changing the location of work may lose some optimizations.
 - However, it is explicitly stated that a requirement for transition from Alpha->Beta is little to no performance degradation.
 - The existence of the feature gate will allow users to mitigate this potential blip in performance (by not opting-in).
-***Will enabling / using this feature result in non-negligible increase of
-resource usage (CPU, RAM, disk, IO, ...) in any components?**
+
+###### Will enabling / using this feature result in non-negligible increase of resource usage (CPU, RAM, disk, IO, ...) in any components?
 - It most likely will reduce resource utilization. Right now, there is duplicate work being done between CRI and cAdvisor.
 This will not happen anymore.
 - The CRI implementation may scrape the metrics less efficiently than cAdvisor currently does. This should be measured and evaluated as a requirement of Beta.
@@ -1049,12 +1006,12 @@ details). For now, we leave it here.
 
 _This section must be completed when targeting beta graduation to a release._
 
-***How does this feature react if the API server and/or etcd is unavailable?**
+###### How does this feature react if the API server and/or etcd is unavailable?
 - Should not change.
-***What are other known failure modes?**
+###### What are other known failure modes?
 - Kubelet should fall back to using cAdvisor if errors are detected with version skew. Nothing else should be affected.
 
-***What steps should be taken if SLOs are not being met to determine the problem?**
+###### What steps should be taken if SLOs are not being met to determine the problem?
@@ -1071,6 +1028,7 @@ _This section must be completed when targeting beta graduation to a release._
 2022-12-09: Retarget KEP to alpha in 1.26
 2023-05-19: KEP targeted at Beta in 1.28
 2023-05-19: KEP retargeted to Alpha in 1.29
+2025-10-07: KEP retargeted to Beta in 1.35
 
 ## Drawbacks
 
@@ -1087,3 +1045,5 @@ Greater complexity as opposed to adding these unstructured metrics directly into
 - However, this doesn't address the anti-pattern of having multiple parties confusingly responsible for a wide array of metrics and other issues described.
 - Have cAdvisor implement the summary API. A cAdvisor daemonset could be a drop-in replacement for the summary API.
 - Don't keep supporting the summary API. Replace it with a "better" format, like prometheus. Or help users migrate to equivalent APIs that container runtimes already expose for monitoring.