[Meta] Search Sessions Roadmap #61738

Closed
25 of 32 tasks
lizozom opened this issue Mar 29, 2020 · 13 comments
Labels: Feature:Search Sessions, Feature:Search, impact:low, loe:small, Meta, Team:DataDiscovery, v8.0.0

Comments

@lizozom
Contributor

lizozom commented Mar 29, 2020

Part of https://github.com/elastic/dev/issues/1209

Search Sessions

A search session refers to running one or more ES searches in the background, in the context of a dashboard, or another application, allowing a user to come back and view the results later.

Proof of Concept

The POC PR includes a partial implementation of the Background Search feature, used for demo purposes. It also contains a sample plugin.

It should be reviewed and serve as a basis for the full implementation.

Tasks

(*) Means that the capability was implemented in the POC.

Improve search API

RFC

Client side BackgroundSession

#83640

Server side Background Session Service

  • (*) Implement server side background session service API (trackId / store / getId) [Search] Implement a background search service #61743
  • (*) Implement server side background session service monitoring loop that syncs tracked IDs to a saved object if a user decides to store the session.
  • Handle edge case of extending expiration of requests not generated by async search
  • Implement versioned updates (concurrency)
  • Don't allow adding new requests to a complete / error session
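A minimal in-memory sketch of the service tasks listed above, for illustration only. The method names trackId, store and getId come from the task list; everything else (the signatures, keying requests by a hash, the version counter, the complete method, and the in-memory Map instead of a saved object) is an assumption and not the actual Kibana implementation.

```typescript
type SessionStatus = "in_progress" | "complete" | "error";

interface SessionRecord {
  status: SessionStatus;
  stored: boolean; // true once the user decided to store the session
  version: number; // bumped on every successful update
  idMapping: Map<string, string>; // request hash -> async search ID (assumed shape)
}

// Hypothetical service: keeps sessions in memory instead of a saved object.
class InMemorySessionService {
  private sessions = new Map<string, SessionRecord>();

  private getOrCreate(sessionId: string): SessionRecord {
    let session = this.sessions.get(sessionId);
    if (!session) {
      session = { status: "in_progress", stored: false, version: 0, idMapping: new Map() };
      this.sessions.set(sessionId, session);
    }
    return session;
  }

  // Track a search ID. Rejects finished sessions and stale versions.
  trackId(sessionId: string, requestHash: string, searchId: string, expectedVersion?: number): number {
    const session = this.getOrCreate(sessionId);
    if (session.status !== "in_progress") {
      throw new Error(`Cannot add requests to a ${session.status} session`);
    }
    if (expectedVersion !== undefined && expectedVersion !== session.version) {
      throw new Error("Version conflict");
    }
    session.idMapping.set(requestHash, searchId);
    session.version += 1;
    return session.version;
  }

  // Mark the session as one the user wants to keep.
  store(sessionId: string): void {
    this.getOrCreate(sessionId).stored = true;
  }

  getId(sessionId: string, requestHash: string): string | undefined {
    return this.sessions.get(sessionId)?.idMapping.get(requestHash);
  }

  // Assumed helper: mark a session as finished so no new requests are accepted.
  complete(sessionId: string): void {
    this.getOrCreate(sessionId).status = "complete";
  }
}
```

Passing the version returned by the previous trackId call as expectedVersion is one simple way to detect concurrent updates; a real implementation would more likely rely on the versioning of the underlying saved object.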

Monitoring service

Management

UI

Improvements

Misc

Flaky tests

Docs

keep_alive time

The Elasticsearch _async_search endpoint stores, by default, the results of queries that ran longer than wait_for_completion_timeout.

The default keep_alive time is 5 days, and this covers both the query run time and the storage time. So if keep_alive is 5 days and the query ran for 4 days, the results are stored for 1 more day by default. If a query has not completed by the time keep_alive is reached, it is canceled.
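The arithmetic in this paragraph can be written as a small helper. This is only a sketch: the function name and the use of null for a canceled query are assumptions, and treating a run time exactly equal to keep_alive as canceled is a guess about the boundary case.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

// Returns how long results stay stored after the query completes,
// or null if the query would be canceled before completing.
function remainingRetentionMs(keepAliveMs: number, runTimeMs: number): number | null {
  if (runTimeMs >= keepAliveMs) {
    return null; // keep_alive reached before completion: the query is canceled
  }
  return keepAliveMs - runTimeMs;
}
```

With a keep_alive of 5 * DAY_MS and a run time of 4 * DAY_MS, the helper returns DAY_MS, which matches the 4 day example in the text.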

So while implementing this feature, it is important that Kibana doesn't store all queries for that time period. Storing queries for a long time is expensive in terms of both resources and cost. Instead, we should store the data for an initially short time, and then extend it, by sending a GET request to the same search ID, with a longer keep_alive.

Known limitations

  • Follow up searches - Some visualizations send additional searches based on the results of an initial search. Known examples are a pie chart with an Other bucket and a histogram based on a non-default time field.
    If we rely on the browser to track and store outgoing requests for a given background search, navigating away from the visualization before the secondary request is sent will result in a partial background search, which triggers an additional background search when opened.
    The solution seems to be supporting server side execution, but this may be complex and time consuming.
    • ES support: When can we expect ES to handle the two examples mentioned (pie charts, histogram) atomically?
    • Do Maps and TSVB have similar limitations?
  • TSVB does not use the existing search service. Moreover, it sends multiple requests from the server side. This might be resolved by making the search service available on the server side and using it from TSVB, or by having TSVB use the _async_search API and return the search IDs to the client somehow.
  • Timelion won't be supported.
  • Completion notifications should be available to the user, regardless of which application they are using at the moment. This requires implementing a generic push notifications service (with Web Workers?). May be implemented in Phase 2.
  • Web Sockets - can we use them on cloud somehow?
@elasticmachine
Contributor

Pinging @elastic/kibana-app-arch (Team:AppArch)

@lukasolson lukasolson added the Feature:Search Querying infrastructure in Kibana label Mar 30, 2020
@lizozom lizozom changed the title Background Search Roadmap [Meta] Background Search Roadmap Mar 31, 2020
@lizozom
Contributor Author

lizozom commented Apr 1, 2020

TL;DR As ES allows updating the keep_alive of data, we shall store all searches for a short period of time, and update their keep_alive if the user chooses to send to background.

Some design / UI concerns after a meeting with @mdefazio

Running a background search

Considering the workflow a user has to take to run a background search, there are a few alternatives:

Cancel and restart

  1. User runs a regular search
  2. After ~15 sec a notification appears, suggesting to run a background search
  3. If the user confirms, we cancel the existing search and run a new one in the background, that stores results.

👍 No significant changes to existing UI. High discoverability. Medium implementation complexity
👎 Can cause UI jitters (progress indicators moving backwards, completed visualizations restarting). Creates a lot of edge case handling. User has to wait 15 seconds to run in background.

Store everything

  1. User runs a regular search
  2. Internally we store all search requests
  3. After ~15 sec a notification appears, suggesting to run a background search
  4. If the user confirms, we only need to store the background search IDs

👍 Simple implementation.
👎 Unknown amount of additional data stored in ES for significant periods of time (24? 48 hours?). Hard to test the impact on life-sized clusters. User has to wait 15 seconds to run in background.

Multi option search button

  1. User runs a regular search or a background search in the search bar
  2. Only background searches are cached.
  3. If a user runs a regular search, after ~15 sec we can show them a notification that they can re-run the query in the background, guiding them to discover the feature.

👍 Simple implementation, high discoverability. No need to wait to run in background.
👎 Requires re-designing the search button in top nav.

Restoring a background search

A user should be able to reopen a complete background search from within the application.
To do that, I suggest having a Background Search fly-out, listing all completed searches for that application, sorted by completion time.

  1. What would trigger this fly-out? TopNavMenu? Other?
  2. Can this be enough for the 1st phase of this feature? Or do we also want to implement the app level notifications?
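The content of the proposed fly-out could be derived with a filter and a sort, as in the sketch below. The BackgroundSearchSummary shape and the function name are hypothetical, and showing the most recently completed search first is an assumption, since the text only says "sorted by completion time".

```typescript
// Hypothetical summary of a background search, as a fly-out might list it.
interface BackgroundSearchSummary {
  id: string;
  appId: string;
  status: "in_progress" | "complete" | "error";
  completedAt?: number; // epoch milliseconds, set only when complete
}

// Completed searches for one application, most recently completed first.
function listCompletedForApp(
  all: readonly BackgroundSearchSummary[],
  appId: string
): BackgroundSearchSummary[] {
  return all
    .filter((s) => s.appId === appId && s.status === "complete" && s.completedAt !== undefined)
    .sort((a, b) => (b.completedAt ?? 0) - (a.completedAt ?? 0));
}
```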

App level notifications

Is there planned support for this from the ES-UI team?

@AlonaNadler

Cancel and restart
User runs a regular search
After ~15 sec a notification appears, suggesting to run a background search
If the user confirms, we cancel the existing search and run a new one in the background, that stores results.

This is a new option to me. What's the purpose of this option? I understand the ability to cancel, but why the restart?

Unknown amount of additional data stored in ES for significant periods of time (24? 48 hours?). Hard to test the impact on life-sized clusters. User has to wait 15 seconds to run in background.

If I understand correctly this is not the results of the cached dashboard. Can we store it for the timeout time configured?

@tomcallahan

Relates elastic/dev#1209

@lizozom
Contributor Author

lizozom commented Apr 13, 2020

@lukasolson the default ES behavior is already to store results for 5 days
elastic/elasticsearch#49931

@lizozom
Contributor Author

lizozom commented Apr 19, 2020

Spoke with @jimczi about the ES implementation.

  1. The _async_search endpoint stores results of queries that ran longer than wait_for_completion_timeout by default.
  2. The default keep_alive time is 5 days!
  3. This includes both the query run time and store time. So if keep_alive is 5 days and the query ran for 4 days, it will be stored by default for another 1 day.
  4. If a query has not completed by the time keep_alive is reached, it is canceled.
  5. Kibana shouldn't store all queries for that time period. Storing queries for a long time is expensive in terms of both resources and cost. Instead, we should store the data for an initially short time, and then extend it.
  6. Extending the keep_alive time is done by sending a GET request to that search ID.
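Points 5 and 6 can be sketched as two request descriptors: submit with a short keep_alive, then extend it with a GET on the search ID. The keep_alive and wait_for_completion_timeout parameters and the _async_search paths belong to the Elasticsearch async search API; the helper names, the EsRequest type and the durations used as defaults (1m, 1s, 7d) are illustrative assumptions. The helpers only build the requests, they do not send them.

```typescript
// Hypothetical description of an HTTP request to Elasticsearch.
interface EsRequest {
  method: "GET" | "POST";
  path: string;
  body?: unknown;
}

// Submit a search with a deliberately short initial keep_alive (assumed default: 1m).
function buildSubmitRequest(
  index: string,
  query: unknown,
  initialKeepAlive: string = "1m",
  waitForCompletionTimeout: string = "1s"
): EsRequest {
  const params =
    `keep_alive=${encodeURIComponent(initialKeepAlive)}` +
    `&wait_for_completion_timeout=${encodeURIComponent(waitForCompletionTimeout)}`;
  return { method: "POST", path: `/${index}/_async_search?${params}`, body: { query } };
}

// Extend the retention of an existing async search, e.g. once the user
// chooses to send the search to the background (assumed default: 7d).
function buildExtendRequest(searchId: string, keepAlive: string = "7d"): EsRequest {
  return {
    method: "GET",
    path: `/_async_search/${encodeURIComponent(searchId)}?keep_alive=${encodeURIComponent(keepAlive)}`,
  };
}
```

Keeping the initial value short means searches the user never sends to the background expire quickly, and only the searches a user explicitly keeps pay the storage cost of a long keep_alive.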

Added questions to ES team at elastic/elasticsearch#49931 (comment)

@lizozom
Contributor Author

lizozom commented May 4, 2020

Spoke with @rudolf about platform concerns -

  1. The background service creates BackgroundSearch saved objects during a user session, but may update them asynchronously, when there is no active user. To do that, we can use the createInternalRepository to create a SavedObjectClient.
  2. In the same scenario, we can gain access to core.elasticsearch.legacy.client.callAsInternalUser to make calls to ES and update the expiration time from background.
  3. We should add some telemetry information to the feature: How big does idMapping get? How often do we query ES for saved objects? How many new background searches are created?
  4. To reset the session ID upon navigating away from any app, we can use the core.application.currentAppId$ observable. This is to make sure that under no circumstances an app re-uses another app's sessionId.
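Point 4 can be illustrated without pulling in RxJS. In the sketch below, AppIdSource is a hand-rolled stand-in for core.application.currentAppId$ and SessionHolder is a hypothetical holder for the current session ID; neither is a Kibana API, and the session ID format is made up.

```typescript
type AppIdListener = (appId: string | undefined) => void;

// Minimal stand-in for an observable of the current app ID (not RxJS).
class AppIdSource {
  private listeners: AppIdListener[] = [];

  subscribe(listener: AppIdListener): () => void {
    this.listeners.push(listener);
    // Return an unsubscribe function.
    return () => {
      this.listeners = this.listeners.filter((l) => l !== listener);
    };
  }

  emit(appId: string | undefined): void {
    for (const listener of this.listeners) {
      listener(appId);
    }
  }
}

// Hypothetical holder that drops the session ID whenever the app changes.
class SessionHolder {
  private sessionId: string | undefined = undefined;
  private lastAppId: string | undefined = undefined;
  private counter = 0;

  constructor(source: AppIdSource) {
    source.subscribe((appId) => {
      if (appId !== this.lastAppId) {
        this.lastAppId = appId;
        this.clear(); // never let the next app reuse this session
      }
    });
  }

  start(): string {
    this.counter += 1;
    this.sessionId = `session-${this.counter}`;
    return this.sessionId;
  }

  getSessionId(): string | undefined {
    return this.sessionId;
  }

  clear(): void {
    this.sessionId = undefined;
  }
}
```

Clearing on every app ID change, rather than relying on each app to clean up after itself, keeps the guarantee in one place.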

@lizozom lizozom changed the title [Meta] Background Search Roadmap [Meta] Search Sessions Roadmap Feb 9, 2021
@exalate-issue-sync exalate-issue-sync bot added impact:low Addressing this issue will have a low level of impact on the quality/strength of our product. loe:small Small Level of Effort labels Jun 2, 2021
@exalate-issue-sync exalate-issue-sync bot added loe:medium Medium Level of Effort and removed loe:small Small Level of Effort labels Sep 28, 2021
@VijayDoshi

@lizozom do you know if there is any work happening to address the multiple query serialization known issue? With "other" as the default in Lens - this will have an impact on customers using the Frozen tier who are using search sessions. Are we actively moving the search service to the server? cc: @sixstringcode

@sixstringcode

@VijayDoshi @lizozom is working on other projects now, @lukasolson and I are driving the planning ahead. The secondary query/other bucket problem is top of our queue for finishing up the Make It Slow effort.

@exalate-issue-sync exalate-issue-sync bot added loe:small Small Level of Effort and removed loe:medium Medium Level of Effort labels Jan 4, 2022
@ppisljar
Member

Thank you for contributing to this issue, however, we are closing this issue due to inactivity as part of a backlog grooming effort. If you believe this feature/bug should still be considered, please reopen with a comment.

@ppisljar ppisljar closed this as not planned Aug 11, 2022
@petrklapka petrklapka added Team:DataDiscovery Discover App Team (Document Explorer, Saved Search, Surrounding documents, Graph) and removed Team:AppServicesSv labels Nov 21, 2022
@elasticmachine
Contributor

Pinging @elastic/kibana-data-discovery (Team:DataDiscovery)
