-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Clarify async search REST parameters #54069
Comments
Pinging @elastic/es-search (:Search/Search) |
cc @elastic/es-clients |
Thanks for opening the issue, @javanna! To amend the explanation, here's how the corresponding JSON would look like: "wait_for_results_timeout":{
"type":"time",
"description":"...",
"default": "1s"
},
"keep_results":{
"type":"enum",
"options":[
"auto",
"always",
],
"description":"..."
},
"keep_results_timeout": {
"type": "time",
"description": "..."
}, I would also stress that it would be nice to have this fixed and merged before we release it in a regular, stable release, because renaming parameters after that will constitute breaking changes in language clients. |
Thanks for raising this @javanna
👍 on all the other name changes much clearer! |
I see, this makes sense to me. As I said in the description I am not yet convinced on this rename, because the parameter now mentions results while it also affects how long the async search runs. It's more about how long the async search will be valid overall, no matter if it's running or completed, if that makes sense. |
It does, maybe |
FYI @lukasolson , please comment if you have suggestions ;) |
The timeout is, in fact, an abort operation, see this description from the original issue:
In other words, it is not a timeout for the current request, but for the operation happening in background — but still a timeout. |
Is there any argument against the naming change? We're really close to feature freeze, and I consider the current parameter names really confusing to users. |
No issues from my side with renaming. |
I like |
The trouble with Another concern is that "completion" is not technically and semantically right here — there's no "completion" happening here, as eg. with "Delete By Query" API (and similar APIs).
From my conversation with @javanna, I understand that it's mostly useful for tests at this point. But as soon as we expose it in the public API, it becomes a part of it, with all the guarantees on backwards compatibility (especially in ecosystems such as .NET and Go, which are averse to changing the public API). The reason we have changed it to
As @javanna said in the original proposal, it seems like the "leaking" of a particular HTTP semantic into this parameter is confusing. For one, it's technically incorrect: this is not about "keep alive" in the HTTP sense, ie. keeping persisstent connections between client and server. It's more similar to something like a web session (keeping state on the server). That "state" are the results: see the |
Thanks for explaining, although I find
What about the completion of the request ? Returning the final results of the request is the completion that this parameter refers to. I'd vote for
Well "results" can be confusing too, the thing I worry about here is that we're saying that this parameter is used for tests and we don't see a real use case for it but we want an enum so that we can add more. I don't have strong feelings but I feel like it will be easier to remove if we keep a simple boolean no matter how it is named.
I agree with @Mpdreamz , this is not a timeout but a duration. I am not sure about the HTTP semantic either, the idea is borrowed but we still have a state that we want to keep alive for a certain period. Again no strong feelings here but I do think that documentation is the key no matter what naming we choose at the end. |
Yes, that's an option we have evaluated with @javanna when we talked about it, but given the concerns about the semantics of "completion", we went with the "results"-related naming, as it seemed more descriptive. But of course, |
Ok I can see how
Let me know if you have concerns about this. |
This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search. Also it renames clean_on_completion to keep_on_completion and turns around its behaviour. Closes elastic#54069
This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search. Also it renames clean_on_completion to keep_on_completion and turns around its behaviour. Closes #54069
This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search. Also it renames clean_on_completion to keep_on_completion and turns around its behaviour. Closes #54069
This commit renames wait_for_completion to wait_for_completion_timeout in submit async search and get async search. Also it renames clean_on_completion to keep_on_completion and turns around its behaviour. Closes #54069
When playing with the new async search API I noticed a couple of inconsistencies and potential naming problems that I would like to discuss. Note that it's important to address these now as the API haven't been released yet and they are declared stable in our REST spec.
I asked @karmi for his input to validate my concerns and come up with some proposal. The following are the problems and the changes that we are proposing:
wait_for_completion
: it indicates how long you are willing to block and wait for results when submitting an async search, effectively turning async search to sync.wait_for_completion
is used in other API but with typeboolean
, while it is exposed as anumber
, effectively a timeout, in submit async search. This introduces inconsistency in our REST API, and it will cause issues for some of the language clients.Proposal: rename it to
wait_for_results_timeout
: this way we include the timeout terminology and we don't reuse the existingwait_for_completion
. Alsoresults
better explains what it is that users are waiting for compared tocompletion
.keep_alive
: it indicates how long the async search is available within the cluster. That means that when such timeout expires, the search will be stopped if still running or its results will be purged if it has already completed.The
keep_alive
naming comes from http terminology where it has to do with connections, while here the semantics is around how long state will be available/stored in the cluster, which could lead to misinterpreting what the parameter does.Proposal: rename it to
keep_results_timeout
: this way we move away from reusing http terminology, and we make it clear that it's also a timeout around how long results will be available. Maybe what is not super clear about this is that the counting starts when the async search is submitted, not when it is completed. Suggestions are welcome.clean_on_completion
: it indicates whether results should not be stored once they are returned within the above described timeout.There is some double negation in its description that makes it hard to understand it. Also, the notion of
completion
can be confusing as it's not about whether the search was completed but whether the results were returned within the provided (currentlywait_for_completion
) timeout. By default, results are not stored when they are returned directly by submit async search. Being it aboolean
it may make users think that they can disable storing results at all times, but storing results can not be disabled, rightly so, when submit async search did not return them within the timeout.I considered removing this parameter, because when results have been returned, they could be stored externally. It turns out though that this parameter is useful to make testing deterministic and it makes sense to keep it.
Proposal: rename it to
keep_results
and make it anenum
rather than aboolean
with two possible values:auto
(the default behaviour: store results unless submit async search returned them withinkeep_results_timeout
) andalways
(store results for later retrieval even if they have been returned by submit async search within the provided timeout). I find that this better reflects the behaviour of the API and aligns well with the above proposed rename ofkeep_alive
tokeep_results_timeout
as they are somehow related. Note thatalways
does not mean forever, the results will always be cleaned when their validity expires.The rename of
wait_for_completion
towait_for_results_timeout
should also be applied to the get async search API, but maybe we should consider whether this parameter is useful when retrieving results? Users that are calling get async search are taking advantage of the async nature of async search, hence while I see why one would block and wait when submitting, I don't see why one would block and wait when retrieving results. Is avoiding an additional call when the search is almost complete a good enough reason to expose this parameter?The text was updated successfully, but these errors were encountered: