[Lens] Switch field list filtering to filtered field caps instead of sampling #112782

flash1293 · 2021-09-22T08:13:40Z

Blocked on #95558

The current approach of field list filtering in Lens is problematic because false negatives can occur (see #58330). To fix this in a forward-facing way, it should be replaced by querying the field caps API with the currently applied filters.

As this can negatively affect the behavior of index structures which might still be used by some people, there should be an advanced setting flag to switch back to sampling (similar to how legacy chart implementations are treated). There should be telemetry for this setting - if there is no significant usage, sampling can be removed completely.

ghudgins · 2021-11-02T12:48:10Z

can we verify if this implementation change will also effect the field dropdown (that has the same sections)? it would be really great if this changed both places as I have seen #58330 occur (false negatives) in the field list as well

flash1293 · 2021-11-02T15:40:17Z

@ghudgins yes, this is powered by the same underlying information

mattkime · 2021-11-24T16:15:52Z

Perhaps we could provide dataView.getFilteredFields(Filter[])

@ppisljar @sixstringcode any thoughts

flash1293 · 2021-11-24T16:39:04Z

@mattkime I think this makes a lot of sense (and would be a great fit for our architecture) - just one small note: This function would also need to take time range and current query as arguments (or there needs to be some way to convert them into filters, not sure whether that's easily possible at the moment)

mattkime · 2021-11-24T16:49:15Z

@flash1293 What format do you have the time range and current query in? Are you using the query bar for this? Looks like there's more flexibility than I initially thought - https://www.elastic.co/guide/en/elasticsearch/reference/master/search-field-caps.html#search-field-caps-api-request-body

At any rate, my goal is to properly pass through the query / filter. We should verify the details of how this should work.

sixstringcode · 2021-11-24T18:31:59Z

@mattkime we can move forward with the getFilteredFields change. Does this change apply to the field list in Discover also?

flash1293 · 2021-11-24T18:39:42Z

@mattkime I’m also completely fine with passing a raw Elasticsearch query in there - it’s not an issue to compile filter plus timerange plus query as part of the consumer and it would keep the API flexible. In Lens state, timerange is a data “TimeRange” object, query is a data “Query” object (the ones returned by the query bar ui component)

@sixstringcode Discover can’t use this in the current state because the field list is built by the actual loaded documents - this makes it always “correct” in the sense of the doc table and the field list being in sync (this is the main problem we want to solve with this api in Lens)

mattkime · 2021-11-24T19:40:26Z

I think the function signature would need to be a bit different from what I stated earlier. We'd add functionality to getFieldsForWildcard and create a helper function that takes the dataView and the query. Same final result.

Are we certain that all queries could be passed into the field caps call?

@flash1293 would passing the raw ES query have type checking?

flash1293 · 2021-11-25T07:10:09Z

Are we certain that all queries could be passed into the field caps call?

not sure what you mean, the Elasticsearch docs state a query object can be passed in (like the one used for search requests)

I don’t think we have proper type checking for ES query objects in Kibana. For consumers of the filter bar (like Lens), I think the following should be the easiest to use:
interface SearchContext {
query?: Query;
filter?: Filter[];
timeRange?: TimeRange
}

then the new function itself would put together the parts into a single Es query

ppisljar · 2021-11-25T10:09:18Z

searchsource doesn't know about timerange (its just a filter for it), dataviews seem a similar level of abstraction, so i would be ok if we just pass in filters and queries

flash1293 · 2021-11-25T10:11:19Z

Sounds fine to me as well

mattkime · 2021-11-27T05:14:33Z

@flash1293

I might be having trouble understanding the following statement from the docs -

query object Allows to filter indices if the provided query rewrites to match_none on every shard.

What if it doesn't rewrite to match_none?

flash1293 · 2021-11-27T11:20:02Z

@mattkime Someone from the ES team can provide a more thorough explanation, this is my understanding:
The filtered field caps api is not actually applying the query object like a regular search request would (e.g. comparing it to the Lucene indices in the data shards) instead, it’s only checking the shard meta data whether the query has any chance of returning data. For a lot of things, it’s not possible to tell (e.g. whether a fuzzy search string will match), but it’s possible in two key cases:

if a range filter on the time field is out of bounds for all data stored in the shard
If a constant keyword field has a different value than the one specified in the query (constant keyword means just a single value is allowed)

in these cases the query can be broken down to “won’t match anything” per individual shard purely based on shard meta data without looking at the actual data. If this happens for all shards of an index, the index’s fields won’t become part of the final field caps response - this is what we are looking for for Lens. If a field is in the full field list of a data view but not in the filtered field list for the currently applied filters/query/timerange, they are “empty” and hidden from the UI by default, otherwise they are shown as “available”.

Based on this, to answer your question directly - if it doesn’t rewrite to match_none, it’s possible the shard contains data that will be included in the current query, so Elasticsearch has to take the fields of the index into account for the response of the fields caps api (marking them “available” in Lens)

mattkime · 2021-12-06T05:10:28Z

@ppisljar Could you drive the filter / query portion of this interface? Or suggest someone else? Everything looks very straight forward but since I'm not familiar with that portion of the problem I might slow things down.

mattkime · 2021-12-06T20:16:16Z

I spoke with @ppisljar and we came up with -

dataViews.getFieldsForWildcard and dataViews.getFieldsForIndexPattern can be extended to take filter: BoolQuery; The filter would be created with buildEsQuery by the dataViews API consumer. At least thats my first pass at it - I'm avoiding requiring kbn-es-query as part of the data views plugin as it adds 30k - at least it did last time I checked.

buildEsQuery is here - packages/kbn-es-query/src/es_query/build_es_query.ts

Anyway, it seems some of these details might changes but they'd be inconsequential. Does this sound good @flash1293 ?

flash1293 · 2021-12-06T20:27:44Z

@mattkime sounds good to me, this will be fine for our use case.

flash1293 · 2022-01-11T09:42:05Z

This isn't blocked anymore - #121367 gives us the required API.

We can call getFieldsForIndexPattern with the provided filter and use the response to filter down the regular field list from the data view.

This should be implemented on the server as part of the existing existing_fields endpoint switching between this API and the existing sampling approach based on a newly introduced advanced setting switch.

flash1293 added enhancement New value added to drive a business result Team:Visualizations Visualization editors, elastic-charts and infrastructure Feature:Lens blocked labels Sep 22, 2021

This was referenced Sep 22, 2021

[Lens] Add telemetry for field list available/emptiness #101467

Closed

[Lens] Cache field existence requests #104060

Closed

mattkime mentioned this issue Dec 17, 2021

[data views] Allow passing filter through to field caps for contextual field list #121367

Merged

flash1293 removed the blocked label Jan 11, 2022

flash1293 self-assigned this Jan 12, 2022

flash1293 mentioned this issue Jan 13, 2022

[Lens] Filtered field list using field caps API #122915

Merged

flash1293 closed this as completed in #122915 Jan 27, 2022

flash1293 mentioned this issue Aug 31, 2022

[Lens] Field existence via 500 sample is not intuitive #58330

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[Lens] Switch field list filtering to filtered field caps instead of sampling #112782

[Lens] Switch field list filtering to filtered field caps instead of sampling #112782

flash1293 commented Sep 22, 2021

ghudgins commented Nov 2, 2021

flash1293 commented Nov 2, 2021

mattkime commented Nov 24, 2021

flash1293 commented Nov 24, 2021

mattkime commented Nov 24, 2021

sixstringcode commented Nov 24, 2021

flash1293 commented Nov 24, 2021 •

edited

Loading

mattkime commented Nov 24, 2021

flash1293 commented Nov 25, 2021

ppisljar commented Nov 25, 2021

flash1293 commented Nov 25, 2021

mattkime commented Nov 27, 2021

flash1293 commented Nov 27, 2021 •

edited

Loading

mattkime commented Dec 6, 2021

mattkime commented Dec 6, 2021

flash1293 commented Dec 6, 2021

flash1293 commented Jan 11, 2022

[Lens] Switch field list filtering to filtered field caps instead of sampling #112782

[Lens] Switch field list filtering to filtered field caps instead of sampling #112782

Comments

flash1293 commented Sep 22, 2021

ghudgins commented Nov 2, 2021

flash1293 commented Nov 2, 2021

mattkime commented Nov 24, 2021

flash1293 commented Nov 24, 2021

mattkime commented Nov 24, 2021

sixstringcode commented Nov 24, 2021

flash1293 commented Nov 24, 2021 • edited Loading

mattkime commented Nov 24, 2021

flash1293 commented Nov 25, 2021

ppisljar commented Nov 25, 2021

flash1293 commented Nov 25, 2021

mattkime commented Nov 27, 2021

flash1293 commented Nov 27, 2021 • edited Loading

mattkime commented Dec 6, 2021

mattkime commented Dec 6, 2021

flash1293 commented Dec 6, 2021

flash1293 commented Jan 11, 2022

flash1293 commented Nov 24, 2021 •

edited

Loading

flash1293 commented Nov 27, 2021 •

edited

Loading