-
Notifications
You must be signed in to change notification settings - Fork 8.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Monitoring] Only do a single date_histogram agg for get_nodes calls #43481
[Monitoring] Only do a single date_histogram agg for get_nodes calls #43481
Conversation
💔 Build Failed |
💚 Build Succeeded |
Pinging @elastic/stack-monitoring |
* This work stemmed from this issue: https://github.com/elastic/kibana/issues/43477 | ||
* | ||
* Historically, the `get_nodes` function created an aggregation with multiple sub `date_histogram` | ||
* aggregations for each metric aggregation. From a top down view, the entire aggregations look liked: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looked like
:)
This looks reasonably straightforward and the gains are pretty obvious. Are you thinking that down the road we'd continue by changing the rest of the code to avoid having to do this convert/unconvert strategy? |
I don't think it's that simple. Historically, the aggregation part of the query looks like (I intentionally removed a few of the aggregations to shorten it):
The code then takes out the named aggregation buckets, which makes each aggregation structure look like (again, removing redundant parts):
The key part of this is, under each named aggregation, the structure is identical (or rather consistent) in that it has like This consistent format is expected in other areas of the code. I'll use the Some other examples of shared code expecting this normalized structure:
Now, let's look how this changes with a top level That might look like (again, I removed redundant parts):
This won't work because our named aggregation doesn't actually have a sub aggregation anymore (it used to be
Even though we are supposed to have 6 named aggregations, we only see one This is the main issue. We need to convert the structure to maintain the named aggregation, but rather than having it in a higher aggregation, we need to prefix the metric aggregation named with the named aggregation, something like Of course, once we do that, we need to undo that structure, as the rest of the shared code (linked earlier) would not know to look for that, or even care about the prefix naming. I'm all for a better approach if anyone has any thoughts, but I didn't see a path forward outside of what I did in the PR. |
Thanks for the explanation. I thought about this and I can't see a better way given the way the application is structured right now. Another approach seems like it might be to back away from trying to fetch a set of metrics for a set of nodes in a single ES request and instead break that up into multiple smaller requests but I'm pretty wary about going down that road, especially given that it's hard to believe that wouldn't be dramatically less performant for this case. |
…_more_date_histograms
💚 Build Succeeded |
x-pack/legacy/plugins/monitoring/server/lib/elasticsearch/convert_metric_names.js
Outdated
Show resolved
Hide resolved
Awesome stuff @chrisronline! This fix alone will have a great impact on performance |
💚 Build Succeeded |
@cachedout I think this should be all ready for you to review again when you get the chance |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is good. 👍 on the thorough documentation and explanation in the comments.
…_more_date_histograms
💔 Build Failed |
retest |
💚 Build Succeeded |
…lastic#43481) * I think this is working now * Add a way to uncovert, and then fix tests * Remove unnecessary export
…lastic#43481) * I think this is working now * Add a way to uncovert, and then fix tests * Remove unnecessary export
As a related side-note, the ES team is tracking a severe performance regression in |
@cachedout Nice find! I'm not quite sure how this information relates to this PR though. Do you mind providing more more details about the connection? |
I only put it here because it was one of the places where we've been working with |
Resolves #43477
This PR changes the query executed when we fetch the nodes listing page from within the Stack Monitoring UI.
Currently, the query does multiple sub
date_histogram
aggregations, when only one is necessary. Unfortunately, this change is more complicated as parts of the code flow involved touch multiple shared areas of code, not easily changed to support this PR. As a result, we have a bit of shuffling and unshuffling happening to ensure we can perform a singledate_histogram
query, yet also return the same, expected structure to the rest of the monitoring code.