Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Infra UI] Only display available fields in Metric Explorer #36843

Closed
wants to merge 4 commits into from

Conversation

simianhacker
Copy link
Member

Summary

This PR fixes the Metric Explorer field drop downs by only displaying the available fields instead of every possible field in the index pattern. It accomplishes this by running a terms aggregation on event.dataset then using top_hits as a sub aggregation for pulling out a sample documents (for the last 5 minutes). The sample documents are then reduced to a list of sample fields. This list of sample fields is use to filter the actual results from the Field Capabilities endpoint. This introduces a new useFields hook to query the fields from the /api/infra/available_fields API.

Before

image

After

image

Checklist

Use strikethroughs to remove checklist items you don't feel are applicable to this PR.

For maintainers

CC: @makwarth @roncohen @jasonrhodes

@elasticmachine
Copy link
Contributor

Pinging @elastic/infrastructure-ui

@elasticmachine
Copy link
Contributor

💔 Build Failed

@elasticmachine
Copy link
Contributor

💚 Build Succeeded

query: {
range: {
[options.timeField]: {
gte: 'now-5m',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if we would pull in the data for the currently selected range instead? Would this become too slow for larger periods?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this will become too slow. After some thought last night I was planning on adding the time range to the query.

@exekias
Copy link
Contributor

exekias commented May 22, 2019

I'm not sure this will work for the general case. It is probably a good solution for many metricsets, but some others have dynamic fields. A few examples:

  • docker module reports info from several containers, with different label fields on them
  • prometheus collector retrieves metrics in many documents, all of them with different fields

For modules with dynamic fields, taking the last document is not enough, at you will miss many other fields that are present in the rest of docs.

Also for queries on bigger ranges, limiting this to the data reported in the last 5 mins may not be enough, how would this work if I query for the last 24h and some agent/module stopped reporting hours ago?

@roncohen
Copy link
Contributor

Thanks! As ruflin hinted at, I guess fields will disappear if they didn’t report for a while (5mins here)? For Prometheus, since “dataset” is not distinctly set for different sets of metrics, we could risk not showing all the existing fields?

@roncohen
Copy link
Contributor

Sorry, didn’t see @exekias response

@weltenwort
Copy link
Member

weltenwort commented May 22, 2019

Leaving aside the query aspects discussed above, I hope it isn't inappropriate for me to comment on the API:

I know that the departure from graphql is under discussion right now, but wouldn't it be simpler and more consistent to just amend the existing status field of the source with this? Maybe add the presence information to the indexFields field or add a corresponding dataFields field? Even if we decided to migrate to a non-graphql API soon, migrating from a consistent set of APIs would be easier and this potential change to the current graphql schema seems rather trivial.

Client-side the information would automatically become available via the source context hook.

@simianhacker
Copy link
Member Author

simianhacker commented May 22, 2019

Wow! Thanks @weltenwort @exekias @roncohen @ruflin for the quick feedback. I should have mentioned in the PR description that this is sort of a prototype for now, this is exactly the conversation I was hoping to get started.

@weltenwort I thought about changing the source endpoint but opted for this approach just to see if this would actually solve the problem. Modifying the source endpoint should be trivial enough once we prove out that the solution actually works. I'm also not sure if this is really a good idea for the Log viewer since it's designed to work with unstructured data. For the purposes of this discussion I limited the changes to just the Metrics Explorer but if it works out, we can come to a consensus on the final implementation.

@exekias This will absolutely NOT work for the general use case, for that we would need ES to add some metadata about each field and whether it actually exists in the index. But for the Metricbeat data this might be enough (Inventory View and Metrics Explorer).

@exekias For the prometheus and docker use cases, are those prefixed with a known pattern? One thought I had was to create a list of prefixes we match (but only if the event.dataset returns a document count) regardless of whether or not they show up in the sample documents. We might also what to have a fallback when nothing shows up in the sample documents then we just return all the fields in the mappings.

@ruflin
Copy link
Contributor

ruflin commented May 23, 2019

One way to keep this idea moving and not require it to be perfect from day one would be that we have the list you showed above and the last item in the list is: "show all fields" which then falls back to the old behaviour. So if a user misses a field, he can always opt in to see all the fields.

@elasticmachine
Copy link
Contributor

💔 Build Failed

@exekias
Copy link
Contributor

exekias commented May 27, 2019

@exekias For the prometheus and docker use cases, are those prefixed with a known pattern? One thought I had was to create a list of prefixes we match (but only if the event.dataset returns a document count) regardless of whether or not they show up in the sample documents. We might also what to have a fallback when nothing shows up in the sample documents then we just return all the fields in the mappings.

I think we can know what modules are reporting dynamic fields as of today, there are a few of them: http, jolokia, kubernetes, docker, etc.

Also docker and kubernetes data can be present in all documents regardless their event.dataset, as they are added as metadata, this is not important when looking for a metric to graph (as they mostly hold keywords), but they are if we use this same trick for the search/filter box.

@jasonrhodes jasonrhodes added the Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services label Jun 17, 2019
@elasticmachine
Copy link
Contributor

Pinging @elastic/infra-logs-ui

@elasticmachine
Copy link
Contributor

💔 Build Failed

@jasonrhodes
Copy link
Member

One way to keep this idea moving and not require it to be perfect from day one would be that we have the list you showed above and the last item in the list is: "show all fields" which then falls back to the old behaviour. So if a user misses a field, he can always opt in to see all the fields.

This sounds like a not terrible first iteration? Otherwise, we need to think through other ideas to trim that list down because "the aerospike problem" (as I'm now referring to this) is real ...

@tbragin
Copy link
Contributor

tbragin commented Jun 25, 2019

Great discussion in this Draft PR.

I'd like to see us continue to think about how to solve the metrics selection problem in Metrics Explorer (and more generally). I think valid concerns are raised around speed and accuracy of the approach. but like @jasonrhodes I'd be interested to explore the "Show all Fields" option.

We are also looking at a few smaller, more targeted, changes in these issues:

[Infra UI] Make metrics explorer have a better initial experience #39552

[Infra UI] Limit "group by" choices for Metrics Explorer UI based on metric selection #39613

@jasonrhodes
Copy link
Member

@simianhacker would you mind making a new issue that references this and describes the problem, along with a short summary of the notes from this discussion? The AC for that ticket can mention that we'll have a "show all fields" escape hatch. We can continue to discuss if that's the best/right approach in that issue if necessary.

I'll close this PR for now so that we can track that work from the new issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
discuss Feature:Metrics UI Metrics UI feature Team:Infra Monitoring UI - DEPRECATED DEPRECATED - Label for the Infra Monitoring UI team. Use Team:obs-ux-infra_services
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants