Separate STATUS from RESULT in async search #72115

Conversation

mayya-sharipova
Contributor

Separate metadata from the actual response in async search

Getting the status of an async search can be costly if the response is
already stored. In that case, we need to read the whole document's source,
parse the async search response from it, and build a status response
from the parsed result. This can take a non-trivial amount of time if
the stored async search response is big.

This PR:

  • separates STATUS into its own stored binary field
  • makes EXPIRATION_TIME a stored field as well, so that the status
    reflects the most recent expiration time (keep_alive can be
    updated without updating STATUS or RESULT)
  • changes status retrieval from a GET request with _source to a
    GET request with stored_fields=[STATUS, EXPIRATION_TIME] and without
    _source, which allows faster retrieval of the status.
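
As a rough illustration of the last point, the sketch below builds the REST path for a GET that asks only for stored fields and disables `_source`. It is not the code from this PR: the class name, the field names `status` and `expiration_time`, the index name `.async-search`, and the document ID are all assumptions made for the example. The ID is also assumed to be URL-safe, so no encoding is applied.

```java
import java.util.List;

public class StatusRequestSketch {
    // Hypothetical stored-field names; the PR's actual constants may differ.
    static final String STATUS_FIELD = "status";
    static final String EXPIRATION_TIME_FIELD = "expiration_time";

    // Builds a Get API path that fetches only the given stored fields
    // and skips loading and parsing _source.
    static String statusRequestPath(String index, String docId, List<String> storedFields) {
        return "/" + index + "/_doc/" + docId
            + "?stored_fields=" + String.join(",", storedFields)
            + "&_source=false";
    }

    public static void main(String[] args) {
        // Index name and document ID are assumptions for illustration only.
        String path = statusRequestPath(
            ".async-search", "my-async-id", List.of(STATUS_FIELD, EXPIRATION_TIME_FIELD));
        System.out.println(path);
    }
}
```

With a request of this shape, only the two small stored fields are read for a status call, while the potentially large RESULT stays untouched.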

Relates to #62947
Closes #71223

@mayya-sharipova mayya-sharipova added the :Search/Search Search-related issues that do not fall into other categories label Apr 22, 2021
@elasticmachine elasticmachine added the Team:Search Meta label for search team label Apr 22, 2021
@elasticmachine
Collaborator

Pinging @elastic/es-search (Team:Search)

@mayya-sharipova mayya-sharipova added >enhancement v7.13.0 v8.0.0 and removed Team:Search Meta label for search team labels Apr 22, 2021
@mayya-sharipova
Contributor Author

mayya-sharipova commented Apr 22, 2021

The main changes are in AsyncTaskIndexService.java; all other changes are just adjustments to accommodate them.

@mayya-sharipova
Contributor Author

We've discussed this again. Considering that the _async_search index should NOT include big responses, and that for responses < 1 MB retrieving _source or a separate stored field doesn't make much difference, we've decided for now NOT to proceed with separating metadata from the actual response.

Thus, closing this PR.

Labels
>enhancement :Search/Search Search-related issues that do not fall into other categories v7.14.0 v8.0.0-alpha1
Development

Successfully merging this pull request may close these issues.

Separate metadata from the actual response in async search index
4 participants