[Metrics][Kubernetes] Update proxy, scheduler and controller manager fields and dashboards #4948

constanca-m · 2023-01-09T11:49:53Z

What does this PR do?

This PR updates the dashboards and fields of proxy, scheduler and controller manager to match the ones mentioned in elastic/beats#34161.

Checklist

I have reviewed tips for building integrations and this pull request is aligned with them.
I have verified that all data streams collect metrics or logs.
I have added an entry to my package's changelog.yml file.
I have verified that Kibana version constraints are current according to guidelines.

How to test this PR locally

Download the changes of this PR.
Run command elastic-package build inside Kubernetes package.
Run elastic-package stack up.
Using Kibana, add Kubernetes integration.

Related issues

Screenshots

The dashboards are just like the ones on the screenshots of elastic/beats#34161.

The cluster overview dashboard was also updated to include a link to the proxy dashboard:

Problem

When using metricbeat-* index, fetching a metric like node_collector_zone_size{zone=""} 1 would result in a missing value on kubernetes.controllermanager.zone. The result would appear on the visualization like this:

However, when using the metrics-* index, there is no missing value anymore as the metric does not even exist:

An empty value like the label zone label would be ignored according to metricbeat code. However, the end results in metricbeat-* and metric-* indexes are different.

elasticmachine · 2023-01-09T15:50:26Z

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS

Expand to view the summary

Build stats

Start Time: 2023-01-27T09:11:58.008+0000
Duration: 28 min 55 sec

Test stats 🧪

Test	Results
Failed	0
Passed	91
Skipped	0
Total	91

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

/test : Re-trigger the build.

elasticmachine · 2023-01-09T15:50:38Z

🌐 Coverage report

Name	Metrics % (`covered/total`)	Diff
Packages	100.0% (`0/0`)	💚
Files	100.0% (`0/0`)	💚
Classes	100.0% (`0/0`)	💚
Methods	94.872% (`74/78`)	❕
Lines	100.0% (`0/0`)	💚
Conditionals	100.0% (`0/0`)	💚

tetianakravchenko

tests are failing with the error:

found 0 hits in metrics-kubernetes.state_persistentvolumeclaim-ep data stream: index_not_found_exception: no such index [metrics-kubernetes.state_persistentvolumeclaim-ep] Status=404

found 0 hits in metrics-kubernetes.state_persistentvolumeclaim-ep data stream: search_phase_execution_exception: all shards failed Status=503

doesn't seem to be related to your change, could you try to retrigger tests?
did you try to run tests locally?

tetianakravchenko · 2023-01-10T13:49:00Z

packages/kubernetes/data_stream/proxy/sample_event.json

+        }
+    },
+    "kubernetes": {
+        "proxy": {


is this sample representative? it does not contain much kubernetes specific fields?

You're right, I will update the sample for the 3 datasets on the next commit.

constanca-m · 2023-01-10T14:24:41Z

doesn't seem to be related to your change, could you try to retrigger tests? did you try to run tests locally?

Yes, I can try that, but I don't think it will work. From what I understand, the field kubernetes.namespace_uid is undefinied for those datasets like the error says. It is true, it is not present on the base-fields.yml of those. The other datatsets have that. maybe that is why there is no complaint.

constanca-m · 2023-01-10T14:25:29Z

/test

tetianakravchenko · 2023-01-10T14:27:42Z

packages/kubernetes/kibana/dashboard/kubernetes-5e649d60-9901-11e9-ba57-b7ab4e2d4b58.json

+                                                                "decimals": 1
+                                                            }
+                                                        },
+                                                        "formula": "differences(last_value(kubernetes.proxy.process.memory.resident.bytes))",


looking on https://user-images.githubusercontent.com/113898685/210374382-e14518ad-ac4c-4683-9abb-51a00aefeb3e.png I am not sure that this is a clear representation of memory usage

There is no metric for the limit of resident memory usage, so having the average isn't really interesting. I used the rate because it will be easier to detect changes, just like with all the other metrics

so having the average isn't really interesting.

why?

I used the rate because it will be easier to detect changes, just like with all the other metrics

I wouldn't say that for monitoring purpose changes like increase/decrease in 50KB provide any value, but it gives a falls impression that smth is wrong

It is very hard to detect spikes on average if we are taking into account many values, while the rate only cares about the last value, making it very easy. But we also have to think about the point of our visualization: is it to get an overview of the overall performance or to highlight problems that might require immediate action?

It is very hard to detect spikes on average if we are taking into account many values

but are such spikes as on the screenshot - https://user-images.githubusercontent.com/113898685/210374382-e14518ad-ac4c-4683-9abb-51a00aefeb3e.png are important for either overall performance or to highlight problems?

Spikes as the one on the screenshot are not important, but the node on the screenshot only has the essential pods + the metricbeat pod. But it is hard to make a case without giving examples. I will try to make it clear once I open the PR for elastic/observability-docs#2474. In the meantime, we can leave it as it is, or I can change to the average

I would suggest average(kubernetes.proxy.process.memory.resident.bytes)

What do you think?

Yes, I will update to that. I am still a little skeptical on choosing average over rate, but it will maybe become more clear once we settle on which problems we want to highlight.

gizas · 2023-01-11T14:22:57Z

packages/kubernetes/kibana/dashboard/kubernetes-bf9389f0-0c14-11ed-b760-5d1bccb47f56.json

                "type": "visualization",
-                "version": "8.4.0-SNAPSHOT"
+                "version": "8.5.1"


Suggested change

"version": "8.5.1"

"version": "8.6.0"

gizas · 2023-01-11T14:23:43Z

packages/kubernetes/manifest.yml

@@ -10,7 +10,7 @@ categories:
  - kubernetes
 release: ga
 conditions:
-  kibana.version: "^8.5.0"
+  kibana.version: "^8.7.0"


I tested this also with 8.6.0 and works fine.

(with 8.6.0)

See comment on line https://github.com/elastic/integrations/pull/4948/files#diff-293e076906b24a178d0e026ab430e5289dabc6f425f2113bcbcf6125d68c2524R170 which I guess should be done 8.6.0 ?

It only works because some fields existed already on the previous version of metricbeat. But that doesn't happen for all fields

packages/kubernetes/kibana/dashboard/kubernetes-f5ab5510-9c94-11e9-94fd-c91206cd5249.json

packages/kubernetes/kibana/dashboard/kubernetes-bf9389f0-0c14-11ed-b760-5d1bccb47f56.json

packages/kubernetes/kibana/dashboard/kubernetes-5e649d60-9901-11e9-ba57-b7ab4e2d4b58.json

gizas · 2023-01-11T15:26:26Z

Also for scheduler I see top visualisations have no data by default.
Do you think we should update the notes for this?

constanca-m · 2023-01-11T15:28:16Z

Also for scheduler I see top visualisations have no data by default. Do you think we should update the notes for this?

They have values @gizas on version 8.7.0. Partial intervals were dropped, so you have to wait a few seconds to see them

gizas · 2023-01-11T15:30:50Z

@constanca-m great work overall and very detailed !!!

You can update the sections of related stories with the one elastic/observability-docs#2474 you created (in order someone lets say that will see your PR after sometime to see continuation of work and to have the reference)

And also make sure that in your How to test section: You need to enable Proxy,Scheduler, Controller manager datasets in kubernetes policy.

drewdaemon

I see 8.7 in the manifest.yml.

Changing dashboards and visualizations using an unreleased version of Kibana is unsafe since the App Ex team is still actively making changes to the Kibana code and potentially the data models. In other words, you have no guarantee that your changes won't be broken by the time 8.7 is released.

If this PR can wait until 8.7 is about to be released, that's best. If there's some reason it really has to be merged now, please reach out.

tetianakravchenko · 2023-01-12T13:15:23Z

@andrewctate

Changing dashboards and visualizations using an unreleased version of Kibana is unsafe since the App Ex team is still actively making changes to the Kibana code and potentially the data models. In other words, you have no guarantee that your changes won't be broken by the time 8.7 is released.

main reason of using 8.7 here, is because integration package must be compatible with beats version - elastic/beats#34161, where some new fields were introduced, that we rely on in this PR. What then would be the best approach in this case?

constanca-m · 2023-01-12T14:02:24Z

main reason of using 8.7 here, is because integration package must be compatible with beats version - elastic/beats#34161, where some new fields were introduced, that we rely on in this PR. What then would be the best approach in this case?

I tried to backport the changes on metricbeat to 8.6 since the metrics as they are now are not correct. I am not sure if it worked, but if it did, I am planning on changing the condition to 8.6 instead of 8.7 @tetianakravchenko

tetianakravchenko · 2023-01-12T14:29:57Z

packages/kubernetes/kibana/dashboard/kubernetes-5e649d60-9901-11e9-ba57-b7ab4e2d4b58.json

+                        },
+                        "query": {
+                            "match_phrase": {
+                                "datastream.dataset": "kubernetes.proxy"


shouldn't it be data_stream.dataset ? 🤔

comparing to https://github.com/elastic/integrations/pull/4972/files

the same for other dashboards in this PR

Yes, thanks for pointing that out. I am going to test it now that the changes were backported on metricbeat and I will make the commit for version 8.6 accordingly

drewdaemon · 2023-01-12T15:05:26Z

@tetianakravchenko

main reason of using 8.7 here, is because integration package must be compatible with beats version - elastic/beats#34161, where some new fields were introduced, that we rely on in this PR. What then would be the best approach in this case?

I understand. Your reasons for using 8.7 are valid.

If @constanca-m is successful in making the changes to the dashboards on 8.6 instead, that is the most straightforward solution. If not, my suggestion is to wait on merging this PR until at least the day after 8.7 feature freeze (or even a few days before public release). That gives you a chance to verify that your dashboard changes still work before committing them to the main branch. Let me know if there's some reason this strategy won't work for you.

Signed-off-by: constanca-m <constanca.manteigas@elastic.co>

…s' into fix-kubernetes-deprecated-metrics

Change requested is fixed.

elasticmachine · 2023-01-30T07:33:22Z

Package kubernetes - 1.31.1 containing this change is available at https://epr.elastic.co/search?package=kubernetes

constanca-m requested review from a team as code owners January 9, 2023 11:49

constanca-m requested review from gsantoro, tetianakravchenko and gizas January 9, 2023 11:49

constanca-m self-assigned this Jan 9, 2023

constanca-m closed this Jan 9, 2023

constanca-m reopened this Jan 9, 2023

elastic deleted a comment from elasticmachine Jan 9, 2023

constanca-m closed this Jan 9, 2023

constanca-m reopened this Jan 9, 2023

elastic deleted a comment from elasticmachine Jan 9, 2023

gsantoro approved these changes Jan 10, 2023

View reviewed changes

tetianakravchenko reviewed Jan 10, 2023

View reviewed changes