[Lens] Field existence via 500 sample is not intuitive #58330

timroes · 2020-02-24T12:09:07Z

We currently only sample the first 500 documents within the configured time range and filters (once #52826 is fixed). Those 500 documents might not be very representative of all the documents matching these filters, queries, and time range, and thus a lot of available fields might be hidden.
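
To illustrate the sampling problem described above, here is a minimal sketch in Python. It does not reproduce the actual Lens code; the helper `fields_in_sample` and the toy dataset are hypothetical and only show how a sparse field can fall outside the first 500 documents.

```python
from typing import Iterable


def fields_in_sample(docs: Iterable[dict], sample_size: int = 500) -> set[str]:
    """Return the union of field names found in the first `sample_size` docs."""
    found: set[str] = set()
    for index, doc in enumerate(docs):
        if index >= sample_size:
            break
        found.update(doc.keys())
    return found


# Hypothetical dataset: every document has "host", but "error.code" only
# appears in one document that comes after the first 500.
docs = [{"host": "a"} for _ in range(600)]
docs[550]["error.code"] = 42

sampled = fields_in_sample(docs)
all_fields = fields_in_sample(docs, sample_size=len(docs))
hidden = all_fields - sampled
print(hidden)  # {'error.code'}
```

The field has data, yet a field list built from the sample alone would hide it.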

We currently see a high amount of confusion among users around fields not appearing because of that. It's the most common question currently raised across all sources (forums, issues, twitter...).

We should discuss how we want to handle that in a less confusing way. I have a couple of suggestions for what we could do to improve this situation:

Increase the sample size. I don't think this will actually help us much. We'll just increase the query time to load the fields, and even if we go 10 times higher to 5000 documents, the sample might still be too small relative to the dataset to be meaningful. Gathering the true data with a proper terms aggregation is also just too expensive in general.
If a user searches for fields, we might also show them fields without data that match their search (at least if no other fields match), since I think a common attempt to solve that issue is for users to first search for the field.

I am not sure if we have better solutions, but I think given how often this issue pops up, we need to think about how we can create a better UX here.

Similar discussion: #40277

cc @cchaos

elasticmachine · 2020-02-24T12:09:09Z

Pinging @elastic/kibana-app (Team:KibanaApp)

cchaos · 2020-02-24T15:01:47Z

Have you thought about some sort of lazy loading where the fields list shows these first 500 documents, but then continues to query the rest adding as it gets more information?

timroes · 2020-02-24T15:59:30Z

A couple of other suggestions that came up:

Maybe rename the "Filter by field type" label that opens the popup to "Filter fields", since it's currently not clear from the label that the popup dialog contains the filter for showing fields without data.
Add a button or descriptive text at the end of the field list stating that more fields without data are hidden, with a "show more" button right there. That way, users who scroll through the full list without finding their field see actionable next steps at the end of the list.
As suggested by Caroline, we could do a bit more lazy loading of the field list. We could load the first 500 documents and show all the fields found in them. Once we have that information, we run a potentially slower exists query on all the fields that are still hidden and show those fields later. Since we only need to do that for fields not already known to have data from the first 500 documents, we might never need that second query for smaller or less sparse datasets, since they might already have all fields within the first 500 documents. My main concern about that approach is how we mix the newly loaded fields into the field list without disturbing the user's interaction: they might be working with the list right at that point, and changing the list on the fly (jumping content) always creates a horrible UX.
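
The two-phase idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: the helper names `fields_still_hidden` and `build_exists_search` are hypothetical, and the request body simply combines the Elasticsearch `exists` query with `size: 0` and `terminate_after: 1` so that the follow-up search can stop as soon as one matching document is found.

```python
def fields_still_hidden(mapped_fields: list[str], sampled_fields: set[str]) -> list[str]:
    """Fields from the mapping that the 500-document sample did not confirm."""
    return sorted(set(mapped_fields) - set(sampled_fields))


def build_exists_search(field: str, base_filters: list[dict]) -> dict:
    """Search body asking whether any document matching the filters has `field`."""
    return {
        "size": 0,             # no documents needed, only whether something matched
        "terminate_after": 1,  # stop collecting per shard after the first hit
        "query": {
            "bool": {
                "filter": [*base_filters, {"exists": {"field": field}}],
            }
        },
    }


# Hypothetical example: the sample confirmed two of four mapped fields.
mapped = ["host", "message", "error.code", "user.id"]
confirmed = {"host", "message"}
time_filter = {"range": {"@timestamp": {"gte": "now-15m"}}}

second_phase = [
    build_exists_search(field, [time_filter])
    for field in fields_still_hidden(mapped, confirmed)
]
```

For a dataset where the sample already confirms every mapped field, `fields_still_hidden` returns an empty list and no second-phase request is built.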

nreese · 2020-02-24T19:21:27Z

Why even pull documents at all? Why not just load the field list from the index pattern saved object? Then, lazy load field details so that when a user hovers over a field for details, a terms aggregation or whatever is used to fetch the field details in a separate request.
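
A rough sketch of the lazy-loading idea above, assuming a terms aggregation is used for the details. The class `FieldDetailsLoader`, the aggregation name `top_values`, and the injected `search` callable are hypothetical; the callable stands in for whatever client would actually send the request.

```python
from typing import Callable


def build_field_details_request(field: str, top_n: int = 10) -> dict:
    """Search body that fetches the most common values of a single field."""
    return {
        "size": 0,
        "aggs": {"top_values": {"terms": {"field": field, "size": top_n}}},
    }


class FieldDetailsLoader:
    """Fetches field details on first request (e.g. on hover) and caches them."""

    def __init__(self, search: Callable[[dict], dict]) -> None:
        self._search = search
        self._cache: dict[str, dict] = {}

    def details_for(self, field: str) -> dict:
        if field not in self._cache:
            self._cache[field] = self._search(build_field_details_request(field))
        return self._cache[field]


# Usage with a stand-in search function that records the requests it receives.
requests_sent: list[dict] = []


def fake_search(body: dict) -> dict:
    requests_sent.append(body)
    return {"aggregations": {"top_values": {"buckets": []}}}


loader = FieldDetailsLoader(fake_search)
loader.details_for("host.name")
loader.details_for("host.name")  # served from the cache, no second request
```

Nothing is requested for fields the user never inspects, which is the point of deferring the aggregation.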

wylieconlon · 2020-02-24T19:41:42Z

@nreese Because we have the beats problem: metricbeat has 3900 individual fields in the default configuration, and obviously not all of those are used. So we want to provide the best possible list of fields as quickly as possible on first load and as filters are added.

wylieconlon · 2020-02-24T19:58:41Z

To cross-link in some of the related discussion over time:

Discussion for KQL autocompletion: KQL autocomplete should show existing fields and not all fields #24709
Discussion for Infra: [Infra UI Meta Issue] Improve UI Field Selection for Metricbeat #40277
Unmerged solution for Infra: [Infra UI] Only display available fields in Metric Explorer #36843
Merged solution for Infra: [Infra UI] Limit Metric Explorer fields #43322

What would the ideal solution to this problem be? Would it require Elasticsearch support?

wylieconlon · 2020-02-24T20:57:05Z

I couldn't find a discussion in the ES repo, so I added one: elastic/elasticsearch#52730

AlonaNadler · 2020-02-25T01:17:41Z

Sampling more documents by default is not recommended. We make multiple queries anyway, so it wouldn't be good to increase the sample size.

The scenario I find the most unsettling is when the preview comes up empty but shows data once the field is dropped; it makes Lens seem unreliable. I understand it is due to sparse data; maybe we should try to optimize only this use case.

Regardless, I like @timroes's suggestion:

If a user searches for fields, we might also show them fields without data that match their search (at least if no other fields match), since I think a common attempt to solve that issue is for users to first search for the field.

nreese · 2020-02-25T01:42:45Z

It sounds like the problem is a result of a design decision to support a large field list that is sparsely populated. I would recommend not optimizing the experience for beats. I think the beats problem needs to be solved upstream of Lens. Lens can then focus on showing the fields available in the index pattern and lazy-load value details with aggregations, providing a view of the entire dataset for a time range as needed, instead of querying for the first 500 documents and showing values for that poorly chosen sample.

flash1293 · 2021-03-25T09:27:23Z

As discussed with the Elasticsearch team, it might be possible to replace the current approach with either a multisearch or a filters aggregation, kicking off a separate search for the existence of each field. If this is justifiable from a performance and resource usage perspective, it would be the preferred solution because it would provide 100% accurate results (instead of the potential false negatives of the current solution).

We are going to explore this approach by creating a POC and testing it against common production configurations.
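
The filters-aggregation variant mentioned above might look roughly like the sketch below. It is not the POC itself: the function names and the aggregation name `existence` are hypothetical, and the sample response is hand-written in the shape Elasticsearch returns for a named `filters` aggregation.

```python
from typing import Optional


def build_existence_request(fields: list[str], query: Optional[dict] = None) -> dict:
    """One search with one named `exists` filter bucket per candidate field."""
    return {
        "size": 0,
        "query": query or {"match_all": {}},
        "aggs": {
            "existence": {
                "filters": {
                    "filters": {
                        field: {"exists": {"field": field}} for field in fields
                    }
                }
            }
        },
    }


def fields_with_data(response: dict) -> list[str]:
    """Names of the fields whose bucket matched at least one document."""
    buckets = response["aggregations"]["existence"]["buckets"]
    return sorted(name for name, bucket in buckets.items() if bucket["doc_count"] > 0)


request = build_existence_request(["host.name", "error.code"])

# Hand-written example response for illustration only.
sample_response = {
    "aggregations": {
        "existence": {
            "buckets": {
                "host.name": {"doc_count": 1200},
                "error.code": {"doc_count": 0},
            }
        }
    }
}
print(fields_with_data(sample_response))  # ['host.name']
```

Because every bucket is counted over all matching documents, there are no false negatives, but the cost grows with the number of candidate fields, which is the performance question raised above.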

ghudgins · 2021-08-17T15:32:41Z

The POC didn't yield an implementation change, so we're keeping this open. We need to continue collaborating with the Elasticsearch team on field_caps.

flash1293 · 2022-08-31T08:17:12Z

Fixed by #112782: we don't use sampling anymore.
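
For readers unfamiliar with the filtered field caps approach named in #112782, here is a hedged sketch of what such a request could look like. It does not reproduce the merged Kibana code. It assumes the Elasticsearch field capabilities API with its `index_filter` body parameter, a timestamp field called `@timestamp`, and hypothetical helper names. Note that `index_filter` excludes indices whose shards cannot match the query; it does not check individual documents.

```python
def build_filtered_field_caps(index_pattern: str, gte: str, lte: str) -> dict:
    """Describe a field caps request restricted to indices that can match a time range."""
    return {
        "method": "POST",
        "path": f"/{index_pattern}/_field_caps",
        "params": {"fields": "*"},
        "body": {
            # Assumption: the time field is named "@timestamp".
            "index_filter": {"range": {"@timestamp": {"gte": gte, "lte": lte}}}
        },
    }


def field_names(response: dict) -> list[str]:
    """Field names reported by a field caps response."""
    return sorted(response.get("fields", {}))


caps_request = build_filtered_field_caps("metricbeat-*", "now-15m", "now")

# Hand-written example response for illustration only.
caps_response = {
    "indices": ["metricbeat-000001"],
    "fields": {
        "host.name": {"keyword": {"type": "keyword", "searchable": True, "aggregatable": True}},
    },
}
print(field_names(caps_response))  # ['host.name']
```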

timroes added the discuss, enhancement, Team:Visualizations, and Feature:Lens labels on Feb 24, 2020

This was referenced Feb 24, 2020

Lens is not showing fields from the index #58298

Closed

[Kibana Lens] Field not showing up despite having data #58125

Closed

rayafratkina mentioned this issue Feb 24, 2020

[Lens] Add tour component to time filter when there is no data #58276

Closed

wylieconlon mentioned this issue May 21, 2020

[Lens] Use accordion menus in field list to fix the "where are my fields" problem #67203

Closed

wylieconlon mentioned this issue Nov 12, 2020

[Lens] In-product explanation of how "existing fields" are chosen #83321

Closed

flash1293 mentioned this issue Jan 29, 2021

Lens showing empty field when there is data #89716

Closed

This was referenced Feb 23, 2021

[Lens] Existence fetch can take a long time #92493

Closed

[Meta][Lens] Editing experience #57706

Closed

timductive mentioned this issue Apr 5, 2021

[Canvas] Elements using Raw Documents datasource don't list all columns #24375

Closed

rayafratkina mentioned this issue Aug 31, 2021

Filtered Field Lists #101937

Closed

flash1293 mentioned this issue Sep 22, 2021

[Lens] Switch field list filtering to filtered field caps instead of sampling #112782

Closed

flash1293 closed this as completed Aug 31, 2022
