Clarify async search REST parameters #54069

javanna · 2020-03-24T11:18:19Z

When playing with the new async search API I noticed a couple of inconsistencies and potential naming problems that I would like to discuss. Note that it's important to address these now as the API haven't been released yet and they are declared stable in our REST spec.

I asked @karmi for his input to validate my concerns and come up with some proposal. The following are the problems and the changes that we are proposing:

wait_for_completion: it indicates how long you are willing to block and wait for results when submitting an async search, effectively turning async search to sync.
wait_for_completion is used in other API but with type boolean, while it is exposed as a number, effectively a timeout, in submit async search. This introduces inconsistency in our REST API, and it will cause issues for some of the language clients.
Proposal: rename it to wait_for_results_timeout: this way we include the timeout terminology and we don't reuse the existing wait_for_completion. Also results better explains what it is that users are waiting for compared to completion.
keep_alive: it indicates how long the async search is available within the cluster. That means that when such timeout expires, the search will be stopped if still running or its results will be purged if it has already completed.
The keep_alive naming comes from http terminology where it has to do with connections, while here the semantics is around how long state will be available/stored in the cluster, which could lead to misinterpreting what the parameter does.
Proposal: rename it to keep_results_timeout: this way we move away from reusing http terminology, and we make it clear that it's also a timeout around how long results will be available. Maybe what is not super clear about this is that the counting starts when the async search is submitted, not when it is completed. Suggestions are welcome.
clean_on_completion: it indicates whether results should not be stored once they are returned within the above described timeout.
There is some double negation in its description that makes it hard to understand it. Also, the notion of completion can be confusing as it's not about whether the search was completed but whether the results were returned within the provided (currently wait_for_completion) timeout. By default, results are not stored when they are returned directly by submit async search. Being it a boolean it may make users think that they can disable storing results at all times, but storing results can not be disabled, rightly so, when submit async search did not return them within the timeout.
I considered removing this parameter, because when results have been returned, they could be stored externally. It turns out though that this parameter is useful to make testing deterministic and it makes sense to keep it.
Proposal: rename it to keep_results and make it an enum rather than a boolean with two possible values: auto (the default behaviour: store results unless submit async search returned them within keep_results_timeout) and always (store results for later retrieval even if they have been returned by submit async search within the provided timeout). I find that this better reflects the behaviour of the API and aligns well with the above proposed rename of keep_alive to keep_results_timeout as they are somehow related. Note that always does not mean forever, the results will always be cleaned when their validity expires.
The rename of wait_for_completion to wait_for_results_timeout should also be applied to the get async search API, but maybe we should consider whether this parameter is useful when retrieving results? Users that are calling get async search are taking advantage of the async nature of async search, hence while I see why one would block and wait when submitting, I don't see why one would block and wait when retrieving results. Is avoiding an additional call when the search is almost complete a good enough reason to expose this parameter?

The text was updated successfully, but these errors were encountered:

elasticmachine · 2020-03-24T11:18:21Z

Pinging @elastic/es-search (:Search/Search)

javanna · 2020-03-24T12:42:01Z

cc @elastic/es-clients

karmi · 2020-03-24T12:50:39Z

Thanks for opening the issue, @javanna! To amend the explanation, here's how the corresponding JSON would look like:

"wait_for_results_timeout":{
  "type":"time",
  "description":"...",
  "default": "1s"
},

"keep_results":{
  "type":"enum",
  "options":[
    "auto",
    "always",
  ],
  "description":"..."
},

"keep_results_timeout": {
  "type": "time",
  "description": "..."
},

I would also stress that it would be nice to have this fixed and merged before we release it in a regular, stable release, because renaming parameters after that will constitute breaking changes in language clients.

Mpdreamz · 2020-03-24T13:23:31Z

Thanks for raising this @javanna

keep_results_timeout is an improvement but timeout seems to indicate an abort action, keep_results_duration conveys more that it influences what happens on a successful attempt to store of the results.

👍 on all the other name changes much clearer!

javanna · 2020-03-24T13:40:31Z

keep_results_duration conveys more that it influences what happens on a successful attempt to store of the results

I see, this makes sense to me. As I said in the description I am not yet convinced on this rename, because the parameter now mentions results while it also affects how long the async search runs. It's more about how long the async search will be valid overall, no matter if it's running or completed, if that makes sense.

Mpdreamz · 2020-03-24T14:04:32Z

It does, maybe max_duration ? Last suggestion on my end before deferring to those who built async search 😄

javanna · 2020-03-24T14:09:38Z

FYI @lukasolson , please comment if you have suggestions ;)

karmi · 2020-03-24T15:04:40Z

(...) timeout seems to indicate an abort action (...)

The timeout is, in fact, an abort operation, see this description from the original issue:

when such timeout expires, the search will be stopped if still running or its results will be purged if it has already completed.

In other words, it is not a timeout for the current request, but for the operation happening in background — but still a timeout.

karmi · 2020-03-24T20:29:43Z

Is there any argument against the naming change? We're really close to feature freeze, and I consider the current parameter names really confusing to users.

lukasolson · 2020-03-24T20:42:53Z

No issues from my side with renaming.

jimczi · 2020-03-24T21:15:21Z

I like wait_for_completion, it's consistent with other APIs that provides similar feature (reindex, force_merge, ..) and I don't see why wait_for_results_timeout would be a better name.
I don't have strong opinions on clean_on_completion, it's mainly use in tests so I was hoping to remove at some point but it remained.
keep_alive can be confusing but it's also inlined with the behavior which is to keep the async search id around for a certain period.
All in one I would prefer that we ensure that documentation reflects the options accurately rather than naming which sounds subjective to me.

karmi · 2020-03-25T06:40:34Z

I like wait_for_completion, it's consistent with other APIs that provides similar feature (reindex, force_merge, ..) and I don't see why wait_for_results_timeout would be a better name.

The trouble with wait_for_completion is that in every other API, it's a "boolean" type. My concern is that it's confusing to suddenly change the type to "timeout" in this particular API. Therefore, it's not only inconsistent — due to the different value type —, but potentially confusing.

Another concern is that "completion" is not technically and semantically right here — there's no "completion" happening here, as eg. with "Delete By Query" API (and similar APIs).

clean_on_completion, it's mainly use in tests so I was hoping to remove at some point but it remained.

From my conversation with @javanna, I understand that it's mostly useful for tests at this point. But as soon as we expose it in the public API, it becomes a part of it, with all the guarantees on backwards compatibility (especially in ecosystems such as .NET and Go, which are averse to changing the public API).

The reason we have changed it to keep_results is that it keeps the "this is about results" theme in the naming, and also, since it's an enum, we might find other use cases for this particular parameter/feature. A boolean type is very limited in this sense, and this particular parameter looks suspiciously prone to get additional semantics later.

keep_alive can be confusing but it's also inlined with the behavior which is to keep the async search id around for a certain period.

As @javanna said in the original proposal, it seems like the "leaking" of a particular HTTP semantic into this parameter is confusing. For one, it's technically incorrect: this is not about "keep alive" in the HTTP sense, ie. keeping persisstent connections between client and server. It's more similar to something like a web session (keeping state on the server). That "state" are the results: see the keep_results parameter. And, contrary to what @Mpdreamz said above, it is controlling an abort operation. Therefore, we wanted to a) include the reference to "result" in the name, and b) to include the "timeout" in the name.

jimczi · 2020-03-25T08:41:40Z

The trouble with wait_for_completion is that in every other API, it's a "boolean" type. My concern is that it's confusing to suddenly change the type to "timeout" in this particular API. Therefore, it's not only inconsistent — due to the different value type —, but potentially confusing.

Thanks for explaining, although I find wait_for_results_timeout equally confusing since this API returns partial results too.

Another concern is that "completion" is not technically and semantically right here — there's no "completion" happening here, as eg. with "Delete By Query" API (and similar APIs).

What about the completion of the request ? Returning the final results of the request is the completion that this parameter refers to.

I'd vote for wait_for_completion_timeout since you're concerned by the fact that wait_for_completion is normally used as a boolean. What do you think ?

The reason we have changed it to keep_results is that it keeps the "this is about results" theme in the naming, and also, since it's an enum, we might find other use cases for this particular parameter/feature. A boolean type is very limited in this sense, and this particular parameter looks suspiciously prone to get additional semantics later.

Well "results" can be confusing too, the thing I worry about here is that we're saying that this parameter is used for tests and we don't see a real use case for it but we want an enum so that we can add more. I don't have strong feelings but I feel like it will be easier to remove if we keep a simple boolean no matter how it is named.

seems like the "leaking" of a particular HTTP semantic into this parameter is confusing. For one, it's technically incorrect: this is not about "keep alive" in the HTTP sense, ie. keeping persisstent connections between client and server. It's more similar to something like a web session (keeping state on the server). That "state" are the results: see the keep_results parameter. And, contrary to what @Mpdreamz said above, it is controlling an abort operation. Therefore, we wanted to a) include the reference to "result" in the name, and b) to include the "timeout" in the name.

I agree with @Mpdreamz , this is not a timeout but a duration. I am not sure about the HTTP semantic either, the idea is borrowed but we still have a state that we want to keep alive for a certain period. Again no strong feelings here but I do think that documentation is the key no matter what naming we choose at the end.

karmi · 2020-03-25T08:53:21Z

I'd vote for wait_for_completion_timeout since you're concerned by the fact that wait_for_completion is normally used as a boolean. What do you think ?

Yes, that's an option we have evaluated with @javanna when we talked about it, but given the concerns about the semantics of "completion", we went with the "results"-related naming, as it seemed more descriptive.

But of course, wait_for_completion_timeout is better than wait_for_completion for the reason you mention.

javanna · 2020-03-25T12:30:19Z

Ok I can see how results does not tell whether they are partial or final, while completion implies final more clearly. Let's go with the following:

rename wait_for_completion to wait_for_completion_timeout
rename clean_on_completion to keep_on_completion to address the double negation. It is likely that we will turn the default behaviour to always keep results, in which case the flag could be removed as it would no longer be needed for testing either. I wish that we could mark this parameter experimental, but I don't think we can do that in the REST spec today.
leave keep_alive unchanged

Let me know if you have concerns about this.

This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search. Also it renames clean_on_completion to keep_on_completion and turns around its behaviour. Closes elastic#54069

This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search. Also it renames clean_on_completion to keep_on_completion and turns around its behaviour. Closes #54069

javanna added >enhancement :Search/Search Search-related issues that do not fall into other categories team-discuss labels Mar 24, 2020

javanna mentioned this issue Mar 25, 2020

Async search: rename REST parameters #54198

Merged

javanna closed this as completed in #54198 Mar 26, 2020

lukasolson mentioned this issue Mar 26, 2020

[Search service] Update _async_search parameter names elastic/kibana#61498

Closed

codebrain mentioned this issue Mar 27, 2020

Update specifications for 7.7.0 elastic/elasticsearch-net#4513

Merged

lizozom mentioned this issue Mar 29, 2020

[Meta] Search Sessions Roadmap elastic/kibana#61738

Closed

32 tasks

javanna mentioned this issue Apr 24, 2020

[DOCS] Async search response ID clarification #55636

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clarify async search REST parameters #54069

Clarify async search REST parameters #54069

javanna commented Mar 24, 2020

elasticmachine commented Mar 24, 2020

javanna commented Mar 24, 2020

karmi commented Mar 24, 2020

Mpdreamz commented Mar 24, 2020

javanna commented Mar 24, 2020

Mpdreamz commented Mar 24, 2020

javanna commented Mar 24, 2020

karmi commented Mar 24, 2020

karmi commented Mar 24, 2020

lukasolson commented Mar 24, 2020

jimczi commented Mar 24, 2020

karmi commented Mar 25, 2020

jimczi commented Mar 25, 2020

karmi commented Mar 25, 2020

javanna commented Mar 25, 2020

Clarify async search REST parameters #54069

Clarify async search REST parameters #54069

Comments

javanna commented Mar 24, 2020

elasticmachine commented Mar 24, 2020

javanna commented Mar 24, 2020

karmi commented Mar 24, 2020

Mpdreamz commented Mar 24, 2020

javanna commented Mar 24, 2020

Mpdreamz commented Mar 24, 2020

javanna commented Mar 24, 2020

karmi commented Mar 24, 2020

karmi commented Mar 24, 2020

lukasolson commented Mar 24, 2020

jimczi commented Mar 24, 2020

karmi commented Mar 25, 2020

jimczi commented Mar 25, 2020

karmi commented Mar 25, 2020

javanna commented Mar 25, 2020