[Infra UI Meta Issue] Improve UI Field Selection for Metricbeat #40277
Comments
I filed issue #24709 in Kibana some time ago, which I think is related. It seems Kibana already has the capabilities for option 2 today, which I think would be the right approach.
A few possibilities come to mind, which are mostly independent:
Grouping: Maybe we could find a good middle ground by partitioning the shown list of fields into multiple sections? There could be a "recommended" or "commonly used" section at the top and "everything else" in a second section at the end. The partitioning could go even further by grouping the fields into sections that map to the ECS field sets (base, agent, network, geo, etc).
Async cardinality: We could also asynchronously calculate the cardinality of the fields when the user opens the menu. That allows them to immediately select something if they know what they are looking for, or to wait for additional details to load. On the other hand, the incremental addition/re-sorting might be confusing and it still causes some load on the cluster.
Batch cardinality calculation: There is a task manager available in Kibana that can be used to perform coordinated batch operations. We could just pre-calculate the cardinalities every N minutes and store the results in a saved object for constant-time retrieval. (That could also be useful for many other applications, so it might make for an interesting shared service.)
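To make the "Grouping" idea concrete, here is a minimal Python sketch, not Kibana code. It assumes a hand-maintained `RECOMMENDED` set (hypothetical contents) and approximates ECS field sets by the first dotted segment of each field name, with undotted fields placed in "base".

```python
# Hypothetical "commonly used" fields; this list is an assumption for illustration.
RECOMMENDED = {"host.name", "system.cpu.user.pct"}


def group_fields(fields, recommended=RECOMMENDED):
    """Partition field names into a 'recommended' section plus one section per prefix."""
    sections = {"recommended": []}
    for name in sorted(fields):
        if name in recommended:
            sections["recommended"].append(name)
            continue
        # Approximate the ECS field set by the top-level prefix; undotted fields go to "base".
        prefix = name.split(".", 1)[0] if "." in name else "base"
        sections.setdefault(prefix, []).append(name)
    return sections


fields = ["host.name", "agent.id", "@timestamp", "system.cpu.user.pct", "host.os.name"]
grouped = group_fields(fields)
```

The "recommended" section always comes first; the remaining sections appear in the order their first field sorts.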
Unless we find a way to query for the relevant metrics that returns quickly, I think it needs to happen in the background, along the lines of "Batch cardinality calculation". One option is to investigate whether we can use the new data frame transforms: https://www.elastic.co/guide/en/elasticsearch/reference/master/preview-data-frame-transform.html Another option is that Metricbeat simply emits a document for each metric it has collected in the last minute.
We could eventually roll this up, so you'd have one document per metric per day. Admittedly, it's not great.
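As a rough illustration of the roll-up idea, the sketch below collapses per-minute "metric seen" events into one entry per metric per day. The event shape (metric name plus epoch seconds) is an assumption made for this example; Metricbeat does not emit such documents today.

```python
from datetime import datetime, timezone


def roll_up_daily(events):
    """Collapse (metric, epoch_seconds) pairs into unique (metric, 'YYYY-MM-DD') pairs."""
    seen = set()
    for metric, ts in events:
        # Bucket each event by its UTC calendar day.
        day = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m-%d")
        seen.add((metric, day))
    return sorted(seen)


events = [
    ("system.cpu.user.pct", 0),
    ("system.cpu.user.pct", 60),
    ("system.memory.used.bytes", 86400),
]
daily = roll_up_daily(events)
```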
Here is my solution to this problem: #36843
@simianhacker This is great. Is this similar to what the "left bar" does, as discussed in #24709?
Let’s discuss on the PR?
@bleskes suggested we discuss if we can have the Field Stats API (doc_count) back, as described here: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/search-field-stats.html#_field_statistics_2
It would be nice to get something similar back in ES directly. In the past, Field Stats supported some constraints on indices. It would be nice to constrain on date range instead, especially since we are using ILM, which means one index can contain up to a month of data. If we only look at 1 day of data, we should only see the fields used during that day.
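Without a Field Stats replacement, a date-constrained existence check can be approximated with a count request that combines an `exists` filter and a `range` filter. The sketch below only builds the request bodies; the helper names are hypothetical, the `@timestamp` default is an assumption, and the batch size of 100 follows the figure mentioned in the issue description.

```python
def build_count_body(field, start, end, time_field="@timestamp"):
    """Body for a count request: documents in [start, end] where `field` has a value."""
    return {
        "query": {
            "bool": {
                "filter": [
                    {"exists": {"field": field}},
                    {"range": {time_field: {"gte": start, "lte": end}}},
                ]
            }
        }
    }


def chunk(items, size=100):
    """Split a list of fields into batches so requests can be sent `size` at a time."""
    return [items[i:i + size] for i in range(0, len(items), size)]


body = build_count_body("system.cpu.user.pct", "now-1d", "now")
batches = chunk([f"field_{i}" for i in range(250)])
```

A field whose count is zero for the selected range can then be hidden from the selector. As noted in the issue description, this approach proved expensive in initial attempts.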
This effort overlaps with the discussions we've been having about improving the usefulness of index patterns. The index pattern service is the natural place to store this extra information, instead of calculating it on every load. The service also lets us share it across all Kibana apps. Improvements to the index pattern service are being discussed here: #35481
This problem also applies to the second drop down: "graph by". We should only show the fields relevant for the metrics you've selected. For the "graph per" dropdown, when a user has already selected a metric, we could query for X number of documents that have those metrics and only show the labels/keyword fields available in those documents. The reason it works here, as opposed to in the main metrics selector, is that for the same metric there will be much less variability in the labels/keyword fields than if you look at the general population.
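A small sketch of that sampling idea follows, assuming the sampled documents have already been fetched as nested dicts. The function names are hypothetical, and treating every string-valued leaf as a label/keyword field is a simplification; a real implementation would consult the mapping.

```python
def flatten(doc, prefix=""):
    """Yield (dotted_path, value) pairs for every leaf in a nested document."""
    for key, value in doc.items():
        path = prefix + key
        if isinstance(value, dict):
            yield from flatten(value, path + ".")
        else:
            yield path, value


def group_by_candidates(docs, metric):
    """Collect string-valued fields from sampled documents that contain `metric`."""
    candidates = set()
    for doc in docs:
        flat = dict(flatten(doc))
        if metric not in flat:
            continue  # skip documents that do not carry the selected metric
        candidates.update(k for k, v in flat.items() if isinstance(v, str))
    return sorted(candidates)


docs = [
    {"system": {"cpu": {"user": {"pct": 0.5}}}, "host": {"name": "a"}, "cloud": {"region": "eu"}},
    {"system": {"memory": {"used": {"bytes": 10}}}, "host": {"name": "b"}, "service": {"type": "x"}},
]
candidates = group_by_candidates(docs, "system.cpu.user.pct")
```

Only fields from the first document are offered, because the second document does not contain the selected metric.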
Problem
The fields returned from the _field_caps API for Metricbeat indices include over 2000 fields, since every possible field is present in the index mapping. The current approach in the UI is to present the user with a "combo box" that allows them to narrow down the list by searching. This requires the user to have intimate knowledge of the Metricbeat fields. There is no Elasticsearch-native way to filter this list down to only the fields with actual data.
Possible Solutions
count requests (100 at a time) to check if the field exists in the current time range. Initial attempts have also proven to be expensive.
event.dataset or metricset.module as a required prefix. This would require an aggregation to be run on the data, but the potential cardinality of metricset.module is relatively low. We would also need to keep a whitelist of prefixes for things like host and cloud. The downside is that any field we don't recognize as having an "official" prefix would be filtered out; this would apply to user-defined fields.
Related Issues
https://github.com/elastic/dev/issues/1223
#36843
#38020
#39613
#40120
#41090