Investigate Kibana Alerting amount of configuration pulls #161382

Open
philippkahr opened this issue Jul 6, 2023 · 4 comments
Labels
Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams)

Comments

@philippkahr
Contributor

Kibana: 8.8.1

Looking at the Node.js instrumentation of Kibana, I can see what happens behind the scenes when a simple Kibana ES Query alert runs.

I observe a total of 5 calls to Elasticsearch of the form GET /.kibana_8.8.1/_doc/strava%3Aconfig%3A8.8.1, which don't make much sense, especially as they always seem to be wrapped with has_privileges calls. Shouldn't that be handled by a single call? It's somewhat related to this discussion: #161229

Looking at all the spans involved, the 5x config pull might still seem insignificant, but that is only because this is an unused, idle cluster. Imagine a cluster that is under heavy load and has to serve the same data 5 times, accompanied by the has_privileges calls.

[Image: APM trace screenshots (Kibana service, transactions view)]

@botelastic botelastic bot added the needs-team Issues missing a team label label Jul 6, 2023
@nreese nreese added the Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) label Jul 7, 2023
@elasticmachine
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@philippkahr
Contributor Author

Instead of doing 5 GETs, could we merge those into a single MGET?
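
As a rough sketch of that idea, assuming the five lookups are independent document fetches against the same index: the IDs could be deduplicated and sent in one `_mget` request instead of five `GET /<index>/_doc/<id>` calls. `buildMgetRequest` and the `MgetRequest` shape are hypothetical names for illustration, not Kibana's saved objects client; only the `_mget` endpoint with an `ids` array in the body is taken from the Elasticsearch API.

```typescript
// Hypothetical request shape for illustration only.
interface MgetRequest {
  method: "POST";
  path: string;
  body: { ids: string[] };
}

// Hypothetical helper: collapses repeated document lookups into one _mget request.
function buildMgetRequest(index: string, requestedIds: string[]): MgetRequest {
  // Deduplicate while preserving first-seen order.
  const ids = Array.from(new Set(requestedIds));
  return {
    method: "POST",
    path: `/${encodeURIComponent(index)}/_mget`,
    body: { ids },
  };
}

// The five lookups observed in the trace all target the same config document.
const configId = "strava:config:8.8.1";
const fiveLookups = [configId, configId, configId, configId, configId];

const request = buildMgetRequest(".kibana_8.8.1", fiveLookups);
console.log(request.path, JSON.stringify(request.body));
```

Because all five lookups here target the same document, deduplication leaves a single ID, which suggests that caching the document for the duration of one rule run might matter more than batching.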

@pmuellr pmuellr moved this from Todo to Awaiting Triage in AppEx: ResponseOps - Execution & Connectors Aug 9, 2023
@pmuellr
Member

pmuellr commented Aug 9, 2023

It's not clear to me who's making these calls. I believe the [{space}:]config document is part of "Advanced Settings", which rules don't use directly. The ES Query rule type can use data views, which I'm guessing is how this leaked in.

I think we're going to have to step through the code in alerting, with some instrumentation on those ES GET calls (I guess somewhere in UI settings) to see if we can figure out where these are coming from.
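
One way to approach that kind of instrumentation, as a minimal sketch: wrap the function that issues the GET so every call records its arguments and the stack of whoever invoked it. `withCallTracking` and `fetchDoc` are hypothetical names for illustration; the real call site in the UI settings code is not identified in this thread.

```typescript
interface CallRecord {
  args: string;
  stack: string;
}

// Hypothetical wrapper: records arguments and the caller's stack for each invocation.
function withCallTracking<A extends unknown[], R>(
  fn: (...args: A) => R,
  records: CallRecord[]
): (...args: A) => R {
  return (...args: A): R => {
    records.push({
      args: JSON.stringify(args),
      stack: new Error().stack ?? "(no stack available)",
    });
    return fn(...args);
  };
}

// Stand-in for the code path that issues the Elasticsearch GET (an assumption).
function fetchDoc(index: string, id: string): string {
  return `GET /${index}/_doc/${encodeURIComponent(id)}`;
}

const records: CallRecord[] = [];
const trackedFetchDoc = withCallTracking(fetchDoc, records);

// Simulate the five identical lookups seen in the trace.
for (let i = 0; i < 5; i++) {
  trackedFetchDoc(".kibana_8.8.1", "strava:config:8.8.1");
}

// Group by arguments to see which lookups repeat.
const counts: Record<string, number> = {};
for (const record of records) {
  counts[record.args] = (counts[record.args] ?? 0) + 1;
}
console.log(counts);
```

Comparing the recorded stacks for identical arguments would show whether the repeated lookups come from one caller in a loop or from several independent callers.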

It's not clear to me that we realized this was a UI Settings thing - which seems very odd to me - my first thought was that this was related to all the other seemingly duplicate calls we make. These ones? Dunno.

So, I'm going to put this back into the triage bucket, for our next session. I think someone should do some time-boxed investigation to try to find out why these calls are being made - we can then try to figure out how to fix this as the next work item.

@ersin-erdal ersin-erdal moved this from Awaiting Triage to Todo in AppEx: ResponseOps - Execution & Connectors Aug 10, 2023
@mikecote
Contributor

Similar comment as #161229 (comment)

Optimizations have been made in the Kibana Alerting and Task Manager framework to call the has_privileges API less frequently. A downstream issue still exists whenever search source / data view services are used (#192170), and some calls are still made by rule types (e.g. security detection rules). Calls to that endpoint should be drastically reduced now.

It seems to be a pattern that a has_privileges check occurs right before the config SO is loaded. Perhaps we're in a sufficient state now to close this GitHub issue?
