Using partial results in Discover #76307

lizozom · 2020-08-31T16:54:12Z

In x-pack, the data.search service receives partial results from Elasticsearch's _async_search endpoint.
However, SearchSource does not expose this capability: it waits until the final result is received before returning it.

While using partial results in most visualizations requires significant work on expressions, we could still use partial results in Discover, Maps, TSVB and Timelion relatively easily by consuming the Observable returned from data.search.search and making the fetch$ method of SearchSource public.
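To make the difference concrete, here is a minimal sketch contrasting "wait for the final result" with "consume every emission". It does not use the real data.search or SearchSource APIs: the `Source` type is a dependency-free stand-in for an Observable, and the `isPartial` flag, `fakeSearch`, and both render helpers are assumptions made up for illustration.

```typescript
// Hypothetical response shape: the flag name is an assumption for this sketch.
interface SearchUpdate<T> {
  isPartial: boolean; // true while more data may still arrive
  rawResponse: T;
}

interface Observer<T> {
  next(value: T): void;
  complete(): void;
}

// Dependency-free stand-in for an Observable: calling it "subscribes".
type Source<T> = (observer: Observer<T>) => void;

// Simulates a search that emits two partial updates and then the final result.
const fakeSearch: Source<SearchUpdate<number[]>> = (observer) => {
  observer.next({ isPartial: true, rawResponse: [3] });
  observer.next({ isPartial: true, rawResponse: [3, 7] });
  observer.next({ isPartial: false, rawResponse: [3, 7, 2] });
  observer.complete();
};

// Behaviour described in the issue: wait for the final result, render once.
function renderFinalOnly<T>(source: Source<SearchUpdate<T>>, render: (raw: T) => void): void {
  let last: SearchUpdate<T> | undefined;
  source({
    next: (update) => {
      last = update;
    },
    complete: () => {
      if (last && !last.isPartial) render(last.rawResponse);
    },
  });
}

// Proposed behaviour: render every update, partial or final.
function renderEveryUpdate<T>(source: Source<SearchUpdate<T>>, render: (raw: T) => void): void {
  source({
    next: (update) => render(update.rawResponse),
    complete: () => {},
  });
}

const finalOnly: number[][] = [];
renderFinalOnly(fakeSearch, (raw) => finalOnly.push(raw));

const progressive: number[][] = [];
renderEveryUpdate(fakeSearch, (raw) => progressive.push(raw));
```

With the simulated search, `finalOnly` receives a single render call while `progressive` receives one per emission, which is the behaviour a consumer of the Observable would get.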

We could attempt a POC on Discover, to demonstrate these capabilities, once the Discover search query is split into two (#69134, #55975).

Note, however, that making this change would mean that msearch would not work for those solutions.

elasticmachine · 2020-08-31T16:54:14Z

Pinging @elastic/kibana-app-arch (Team:AppArch)

AlonaNadler · 2020-09-01T00:29:47Z

> we could still use partial results in discover

Wow, this could be amazing, especially in Discover!

lizozom · 2020-10-27T13:49:54Z

@timroes mentioned that, due to upcoming changes in Discover, this task will become irrelevant.
I'll let him elaborate, but in the meantime I'm closing the issue.

timroes · 2020-10-27T14:23:01Z

After splitting up the query in Discover we'll have 3 queries there:

1. Loading documents (without exact hit count, sorted by discover sort order)
2. Loading exact hit count (only loading exact count nothing else)
3. Loading data to display in the visualization (in case it's a time based index)
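As a rough picture of what the three requests might look like as plain Elasticsearch search bodies, consider the sketch below. The function name, the field name, the sample size, and the one-hour interval are all illustrative assumptions; Discover's actual requests may differ. The `size`, `sort`, `track_total_hits`, and `date_histogram` options are standard Elasticsearch search parameters.

```typescript
interface TimeRange {
  from: string;
  to: string;
}

// Hypothetical helper: builds one request body per query in the split.
function buildSplitRequests(timeField: string, range: TimeRange, sampleSize: number) {
  const query = { range: { [timeField]: { gte: range.from, lte: range.to } } };
  return {
    // 1. Documents: track_total_hits is left at its default, so past the
    //    default threshold the reported total is only a lower bound.
    documents: {
      size: sampleSize,
      sort: [{ [timeField]: { order: 'desc' } }],
      query,
    },
    // 2. Exact hit count: no documents, only the accurate total.
    hitCount: {
      size: 0,
      track_total_hits: true,
      query,
    },
    // 3. Chart data: no documents and no total, only the histogram buckets.
    histogram: {
      size: 0,
      track_total_hits: false,
      query,
      aggs: {
        over_time: { date_histogram: { field: timeField, fixed_interval: '1h' } },
      },
    },
  };
}

const requests = buildSplitRequests('@timestamp', { from: 'now-7d', to: 'now' }, 500);
```

Only the third body contains an aggregation, which is why it is the natural candidate for partial results.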

The last query is the one where it would make most sense to use partial data. During the refactoring to split those queries (and get rid of some cyclic dependencies) we'll use esaggs to load the data for the chart. Thus there is no separate task here other than loading partial data via expressions in general.

The second query could potentially use it, though since it is simply loading the count of matches, and we already show the estimated count from the first (fast) query, it would not create much more than a "counter animation" of the hits: you'd first see "> 50,000", then "51,230", then "55,000", instead of jumping from "> 50,000" to "55,000" directly. Also, from my current understanding of ES performance, that query wouldn't even take significantly longer, so it's not really a benefit worth complicating the architecture for (if we simply want a counter animation, we could simply randomly increase that number in JS :D).
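For reference, Elasticsearch reports the total as `hits.total` with a `value` and a `relation` of either `"eq"` (exact) or `"gte"` (lower bound). A minimal sketch of how the estimated and exact counts quoted above could be rendered follows; `formatHitCount` is a hypothetical helper, not an existing Discover function.

```typescript
// Shape of hits.total in an Elasticsearch search response.
interface TotalHits {
  value: number;
  relation: 'eq' | 'gte';
}

// Hypothetical helper: a lower bound gets a "> " prefix, an exact count does not.
function formatHitCount(total: TotalHits): string {
  const formatted = new Intl.NumberFormat('en-US').format(total.value);
  return total.relation === 'gte' ? `> ${formatted}` : formatted;
}

const estimated = formatHitCount({ value: 50000, relation: 'gte' });
const exact = formatHitCount({ value: 55000, relation: 'eq' });
```

Here `estimated` is what the first query can show and `exact` is what the dedicated count query replaces it with.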

The request to load the documents should (that's the whole purpose of splitting the queries) be very fast, since it simply loads documents. In the case where partial documents would benefit us (because we know they are loading at the end of the list), like in the time-based use case, this query would not depend on the size of the data you have in ES and would always return extremely fast, so there is no benefit in showing partial results here (we'd most likely never run into a case where it would be slow enough to work with partial results).

AlonaNadler · 2020-10-27T14:36:58Z

How about partial loading for the documents? That is the use case our users will need the most, and also the one where our competitors often come up. Often, when users need to run very long searches for bad actors, they might search over a long time range and can potentially wait hours for the query to return. In these searches, there is a lot of benefit in partial results.

timroes · 2020-10-29T11:16:06Z

For the document loading query (1 above) we could potentially enable it, though we need to find a way to keep the user's content from jumping, i.e. not automatically show incoming results, since (as far as I understand so far) we're not getting a guarantee that the partial results are coming in in the requested sort order (i.e. the 2nd partial result could fill in documents randomly among the ones previously loaded). Thus we could have a mechanism that informs the user that new data is there, and the user then clicks it to refresh. Automatically updating only makes sense as long as we can be sure we're not suddenly removing documents the user is interacting with at the moment, which would only work if we have a guarantee that partial documents arrive in the requested sort order step by step (which, from my last syncs with ES, is not the case). I'll reopen this so we can discuss further details on how we could implement such a behavior in Discover safely.
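A minimal sketch of such a "new data is available, click to refresh" mechanism is shown below. The `HeldResults` class and its method names are hypothetical and assume that each incoming result set replaces the previous one; the sketch only illustrates holding updates back until the user asks for them.

```typescript
// Hypothetical holder: shows the first result set right away, then keeps
// later sets back until the user explicitly refreshes.
class HeldResults<T> {
  private shown: T[] = [];
  private held: T[] | null = null;

  // Called for every incoming result set (each one replaces the previous set).
  receive(rows: T[]): void {
    if (this.shown.length === 0) {
      this.shown = rows; // nothing on screen yet, so nothing can jump
    } else {
      this.held = rows; // keep only the latest set until the user asks for it
    }
  }

  // True when the UI should show a "new results available" prompt.
  hasNewData(): boolean {
    return this.held !== null;
  }

  visibleRows(): readonly T[] {
    return this.shown;
  }

  // Called when the user clicks the prompt.
  refresh(): void {
    if (this.held !== null) {
      this.shown = this.held;
      this.held = null;
    }
  }
}

const table = new HeldResults<string>();
table.receive(['doc-a', 'doc-c']);
table.receive(['doc-a', 'doc-b', 'doc-c']); // doc-b lands in the middle
const beforeClick = table.visibleRows().length;
const promptShown = table.hasNewData();
table.refresh();
const afterClick = table.visibleRows().length;
```

Until `refresh()` is called, the rows the user is looking at stay untouched even though a newer set has arrived out of order.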

AlonaNadler · 2020-10-29T14:37:52Z

@timroes I agree with you that ideally we want the results in some sort order. However, I think it is worthwhile as a first step even if they don't show up in a specific order. We can then see how much users are bothered by the lack of order and add it in the future.
This capability will be useful in queries that run for a long time (a few minutes to a few hours), letting users get some of the results as they stream in.

lizozom · 2020-11-03T12:59:56Z

@timroes @AlonaNadler
~~Loading the documents themselves should be very fast, shouldn't it, as it's only a small chunk of documents each time?~~

Update

Just synced with @jimczi, async_search doesn't support partial results for ~~top hits~~ latest documents, only for aggregations. So doing partial results won't be possible. Anyway, once top hits are fetched separately from the aggs, they should return much much faster.

If you are ok with it, I do think that this issue can be closed.

AlonaNadler · 2020-11-03T16:04:35Z

I think the main advantage in having partial results in Discover is having it for the results of the raw documents (in the red square):

Mainly because it will allow users who have long queries to see intermediate results while they wait, instead of waiting for the query to complete.
Partial results for the histogram in Discover are nice to have, but the main feature for slow queries in Discover is to stream the documents and view the results while the query is still in progress.

@lizozom as far as I can tell (and @timroes knows better), Discover doesn't use top hits for the raw document results.

timroes · 2020-11-03T16:39:11Z

Clarified with Liza that "Top hits" in this case was indeed referring to the "top search results" we're using and not the "top hits" aggregation. Since Elasticsearch does not support partial results on those, and as Jim also confirmed this query will be super fast once we split it up, there is no place in Discover where partial results would still make sense. I am closing this.

If we feel this is a justified use case, please open an issue in the elasticsearch repository for adding partial results to search results (hits). If ES agrees to build them, we can reopen this issue for tracking again, but for the sake of keeping the number of issues manageable I'd close this for now, since there is currently no Kibana work in this.

AlonaNadler · 2020-11-03T17:16:45Z

@elastic-jb @lizozom can you open an Elasticsearch issue on what Kibana needs in order to support intermediate results in Discover, and link it here?

lizozom · 2020-11-04T13:51:08Z

Please take a look at the benchmarks I did on different types of queries.

They show that once the query is split, fetching the latest documents is going to be at least 10x faster than it is today, since what restricts the performance of Discover today is loading the aggs and the latest documents in the same query.

Therefore, as @jimczi and @giladgal mentioned before, there's no significant performance benefit in adding partial results support to fetching the latest documents, as long as those two things are fetched separately.

lizozom · 2020-12-09T17:29:22Z

Closing as there is significant work on partial results and splitting out the queries on Discover.
This issue is still irrelevant at the moment. Will reopen if it becomes relevant.

lizozom added Feature:Search Querying infrastructure in Kibana Team:AppArch labels Aug 31, 2020

Kerry350 mentioned this issue Sep 10, 2020

[Logs UI] [R&D] Determine scope of work for migrating to async search in the Logs UI stream view #76677

Closed

lizozom changed the title ~~Using partial results in Kibana~~ Using partial results in Discover Oct 26, 2020

lizozom closed this as completed Oct 27, 2020

timroes reopened this Oct 29, 2020

timroes added Feature:Discover Discover Application Team:Visualizations Visualization editors, elastic-charts and infrastructure labels Oct 29, 2020

timroes mentioned this issue Nov 2, 2020

Split Discover query into three #69134

Closed

timroes closed this as completed Nov 3, 2020

timroes reopened this Nov 3, 2020

timroes added blocked Feature:elasticsearch labels Nov 3, 2020

lizozom closed this as completed Dec 9, 2020
