[Performance] Kibana RSS / Memory usage seems higher in 8.2 vs previous releases (by 50% sometimes) #128061
Comments
I can copy in some additional research posted in Slack, if desired, but I will let Tyler, Mikhail, and the Response Ops team members cite their own latest findings. @kobelb @mshustov @danielmitterdorfer fyi
From Jonathan B (Jonathan Budzenski):
From Mikhail Shustov, Tyler Smalley, and Patrick Mueller:
@jbudz did you find anything when you did the revert you were trying?
You can find a couple of spikes on the RSS chart of the APM data collected on Kibana CI (around the end of January and around March 14).
No, that wasn't the root cause; the memory profile was still similar.
Reverting #126320 appears to fix this. Now: #128095. After reverting: #128096.
Pinging @elastic/kibana-core (Team:Core)
I added the blocker label for 8.2.0. Kibana's pretty close to OOM on a fresh cloud installation.
I checked the memory snapshot with #126320 reverted. As you can see, there are still quite a few RxJS objects being created. I noticed that this logic (which creates an observable) is called ~22,000 times on start: https://github.com/elastic/kibana/blob/main/src/core/server/status/plugins_status.ts#L132-L142 A minimal sketch of this pattern, and how memoization would avoid it, follows.
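For illustration only (this is not the actual Kibana code), here is a minimal TypeScript sketch of the kind of pattern at issue: rebuilding a derived observable chain on every lookup allocates a fresh set of RxJS objects each time, while memoizing the derived observable allocates once per plugin. The cache, type, and function names are hypothetical.

```ts
import { combineLatest, Observable } from 'rxjs';
import { map, shareReplay } from 'rxjs/operators';

type ServiceStatus = { level: 'available' | 'degraded' };

// Hypothetical cache keyed by plugin name; not Kibana's actual implementation.
const derivedStatusCache = new Map<string, Observable<ServiceStatus>>();

function getDerivedStatus$(
  plugin: string,
  dependencyStatuses$: Array<Observable<ServiceStatus>>
): Observable<ServiceStatus> {
  let derived$ = derivedStatusCache.get(plugin);
  if (derived$ === undefined) {
    // Build the operator chain once per plugin instead of once per call;
    // each rebuilt chain otherwise allocates fresh closures and subscriber state.
    derived$ = combineLatest(dependencyStatuses$).pipe(
      map((deps) =>
        deps.every((d) => d.level === 'available')
          ? { level: 'available' as const }
          : { level: 'degraded' as const }
      ),
      // Replay the latest computed status to late subscribers
      // instead of recomputing it for each of them.
      shareReplay({ bufferSize: 1, refCount: true })
    );
    derivedStatusCache.set(plugin, derived$);
  }
  return derived$;
}
```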
Maybe we can merge #128096 to check whether RSS is back to normal in the APM data gathered from CI: https://kibana-ci-apm.kb.us-central1.gcp.cloud.es.io:9243/goto/ac99b0f0-a93f-11ec-94a3-2d0a38b40710
I pulled #128096 out of draft and added core as a reviewer - whatever's easiest to help investigate. Feel free to merge.
The PR has been merged. I can see the drop in APM metrics, but it doesn't seem we are back to normal yet.
[Edited] - I misposted earlier, sorry for any confusion. I'm not sure how to test this pre-merge, but I will be watching afterwards.
@gsoldevila is working on improving the plugin status service's memory footprint. I will test his PR and repeat the heap snapshot analysis.
@kobelb Having #128324 merged, we can remove the blocker label. According to APM metrics collected at CI, memory usage is back down.
Among others, I'm sure, I'm waiting for a successful platform build to produce a new 8.2 snapshot; a nightly run of the kbn-alert-load tool will then pick up the change and show us the improvement (again). Very nice work!
We believe the root cause has been addressed in #128324, so we will go ahead and close this issue. @EricDavisX Feel free to reopen or open a new issue if you find there is something we have missed!
Kibana version: 8.2.0-SNAPSHOT (details below show the problem seems to have started around March 1, 2022)
Elasticsearch version:
same as Kibana
Original install method (e.g. download page, yum, from source, etc.):
The issue was first found by the kbn-alert-load rules performance tester, run by the Response Ops team. I can forward you the related Slack message, though I will try to capture the most relevant data here. I can also forward more details, such as the kbn-alert-load Jenkins job output; the test can easily be re-run on a later snapshot if we like.
Describe the bug:
The test starts up several cloud clusters and individually assesses their memory usage and other metrics after creating a variable number of rules, a variable percentage of which create alerts when they execute. A sketch of how such memory sampling might look follows.
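As a rough illustration of the measurement side, here is a minimal TypeScript sketch of sampling RSS from a running Kibana during such a run. It assumes Node 18+ (global fetch), Kibana's /api/stats endpoint, and a resident_set_size_in_bytes field under process.memory in its response; the URL and credentials are placeholders.

```ts
// Polls a running Kibana for its resident set size every 30 seconds.
const KIBANA_URL = process.env.KIBANA_URL ?? 'http://localhost:5601';
const AUTH =
  'Basic ' + Buffer.from('elastic:changeme').toString('base64'); // placeholder credentials

async function sampleRssMb(): Promise<number> {
  const res = await fetch(`${KIBANA_URL}/api/stats`, {
    headers: { Authorization: AUTH },
  });
  if (!res.ok) throw new Error(`stats request failed: ${res.status}`);
  const stats = await res.json();
  // Field path is an assumption about the stats API response shape.
  return stats.process.memory.resident_set_size_in_bytes / 1024 / 1024;
}

setInterval(() => {
  sampleRssMb()
    .then((rss) => console.log(`${new Date().toISOString()} RSS: ${rss.toFixed(1)} MB`))
    .catch((err) => console.error('sample failed:', err));
}, 30_000);
```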
Screenshots (if relevant):
Errors in browser console (if relevant):
n/a
Provide logs and/or server output (if relevant):
Here are the things from slack that we've already gathered:
Questions asked (and answered, mostly):
Is this repeatable, or was this a single run that could be off (I guess not)?
Yes, repeatable with the kbn-alert-load tool and locally
Is that also reproducible locally?
Not applicable to the tool's use, but yes, others see memory concerns locally.
From Brandon K: Anecdotally, I'm seeing the memory usage increase from 8.1 to 8.2 even when there are no alerting rules running. The following are heap usages when starting up a new Kibana instance and logging in:
8.1: 403 MB
8.2: 574 MB
For an RSS increase, the first thing that would be interesting to understand is whether it's an increase in native memory or heap memory; see the sketch below for one way to break that down.
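A quick way to make that distinction from inside the Node process itself is process.memoryUsage(): rss covers the whole resident set, heapTotal/heapUsed cover the V8 heap, and external/arrayBuffers cover native allocations bound to JS objects. A minimal sketch:

```ts
// Logs a breakdown of where the process memory lives.
const { rss, heapTotal, heapUsed, external, arrayBuffers } = process.memoryUsage();
const mb = (bytes: number) => `${(bytes / 1024 / 1024).toFixed(1)} MB`;

console.log(`rss:          ${mb(rss)}`);                          // total resident set size
console.log(`heapUsed:     ${mb(heapUsed)} of ${mb(heapTotal)}`); // V8 heap
console.log(`external:     ${mb(external)}`);                     // native memory tied to JS objects
console.log(`arrayBuffers: ${mb(arrayBuffers)}`);                 // subset of external
console.log(`outside heap: ${mb(rss - heapTotal)}`);
```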
Heap dumps from Brandon K are here:
https://drive.google.com/drive/folders/1n0jjJ_H3oEbViMYjd8eKKqqcwM2mbcOX?usp=sharing
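For anyone reproducing this, a minimal sketch of one way to generate comparable dumps; v8.writeHeapSnapshot() is part of Node's standard library (since Node 11.13):

```ts
// Writes a .heapsnapshot file that can be loaded and diffed in the Memory
// tab of Chrome DevTools (e.g. an 8.1 snapshot vs an 8.2 snapshot).
import { writeHeapSnapshot } from 'v8';

const file = writeHeapSnapshot(); // default filename: Heap-<timestamp>.heapsnapshot in cwd
console.log(`heap snapshot written to ${file}`);
```

Starting Node with --heapsnapshot-signal=SIGUSR2 also lets a dump be triggered externally via kill -USR2 <pid>, which avoids changing the code under test.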
NOTE: we suspected a Node version change, but it does not seem to be the Node version: 8.0 uses Node 16.13.2 (see https://github.com/elastic/kibana/blob/8.0/.node-version), as does main (https://github.com/elastic/kibana/blob/main/.node-version).
Patrick M offered up a script he had used to create many running rules locally, if helpful:
https://gist.github.com/pmuellr/f30331660ae032a0a7ccf2767aea3900
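That gist is the authoritative version; purely as a hedged sketch of the same idea, bulk-creating rules through Kibana's alerting REST API (POST /api/alerting/rule in 8.x) might look like the following. The rule type, params, credentials, and index name are assumptions to adapt locally; Node 18+ is assumed for global fetch.

```ts
// Bulk-creates index-threshold rules on a 1m schedule to generate load.
const KIBANA_URL = 'http://localhost:5601';
const HEADERS = {
  'Content-Type': 'application/json',
  'kbn-xsrf': 'true',
  Authorization: 'Basic ' + Buffer.from('elastic:changeme').toString('base64'), // placeholder
};

async function createRule(i: number): Promise<void> {
  const res = await fetch(`${KIBANA_URL}/api/alerting/rule`, {
    method: 'POST',
    headers: HEADERS,
    body: JSON.stringify({
      name: `load-test-rule-${i}`,
      rule_type_id: '.index-threshold', // built-in stack rule type
      consumer: 'alerts',
      schedule: { interval: '1m' },
      notify_when: 'onActiveAlert',
      actions: [],
      params: {
        // Assumed params for the index-threshold rule type; point at any
        // local index with a timestamp field.
        index: ['test-index'],
        timeField: '@timestamp',
        aggType: 'count',
        groupBy: 'all',
        threshold: [0],
        thresholdComparator: '>',
        timeWindowSize: 5,
        timeWindowUnit: 'm',
      },
    }),
  });
  if (!res.ok) throw new Error(`rule ${i} failed: ${res.status} ${await res.text()}`);
}

async function main() {
  for (let i = 0; i < 100; i++) await createRule(i);
  console.log('created 100 rules');
}

main().catch(console.error);
```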