Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Link child requests to the root task that initiated the action #48422

Closed
wants to merge 2 commits into from

Conversation

jimczi
Copy link
Contributor

@jimczi jimczi commented Oct 23, 2019

This commit modifies the hierarchy of tasks to link any child requests executed
by a transport action to the root task that initiated the request. For instance today
a shard search is linked to the main search task that coordinate the entire request.
However if the main search task was created by a multi search action, multiple roundtrips
are needed to link the multi search to the shard searches that it creates (by recursively checking the parent of the tasks).
This is not convenient when cancelling a task since it requires to send the cancellation for the root task first, then to all nodes to cancel the child task and recursively until there is no more child tasks. Currently we only handle one level so multi_search cancellation cannot reach the grand-children (the shard search).

The proposal in this change is to always link child tasks to their root task (the task that initiated the original action). While the change is simple it might be considered as breaking since we loose the association between children and grand-children so it would not be possible to cancel an individual search without cancelling the entire msearch action.
I think it's an acceptable trade-off mainly because other solutions that would not break this behavior would require much more changes.
We could for instance keep the full list of parents for a specific task but that would require to change all the call to execute a task locally.

Another solution would be to add more roundtrips to task cancellations but that's also a big change that would complicate cancellation significantly.

I didn't change the documentation around cancellation yet since this change might be controversial. Let's discuss it here and we can add documentation if we reach a consensus.

This commit modifies the hierarchy of tasks to link any child requests executed
by a transport action to the root task that initiated the request. For instance today
a shard search is linked to the main search task that coordinate the entire request.
However if the main search task was created by a multi search action, multiple roundtrips
are needed to link the multi search to the shard searches that it creates (by recursively checking the parent of the tasks).
This is not convenient when cancelling a task since it requires to send the cancellation for the root task first, then to all
nodes to cancel the child task and recursively until there is no more child tasks. Currently we only handle one level so multi
_search cancellation cannot reach the grand-children (the shard search).
The proposal in this change is to always link child tasks to their root task (the task that initiated the original action).
While the change is simple it might be considered as breaking since we loose the association between children and grand-children
so it would not be possible to cancel an individual search without cancelling the entire msearch action. I think it's an acceptable
trade-off mainly because other solutions that would not break this behavior would require much more changes. We could for instance
keep the full list of parents for a specific task but that would require to change all the call to execute a task locally.
Another solution would be to add more roundtrips to task cancellations but that's also a big change that would complicate
cancellation significantly.
@jimczi jimczi added >enhancement >breaking :Search/Search Search-related issues that do not fall into other categories :Distributed Coordination/Task Management Issues for anything around the Tasks API - both persistent and node level. v8.0.0 labels Oct 23, 2019
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-search (:Search/Search)

@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Task Management)

Copy link
Contributor

@ywelsch ywelsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This just goes to show that the tasks API is not the right tool for what you would like to build here. More discussion about this here: #47378

@jimczi
Copy link
Contributor Author

jimczi commented Oct 29, 2019

This just goes to show that the tasks API is not the right tool for what you would like to build here.

We discussed another route offline, for reindex there are discussions around a new mechanism to allow cancelling only the root task. For msearch we should use a header and add the ability to cancel a task from its x-opacque-id or any other header that could be propagated to its children.
I am closing this pr since we want to keep the hierarchy between tasks and we'll discuss the alternatives in new issues/prs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
>breaking :Distributed Coordination/Task Management Issues for anything around the Tasks API - both persistent and node level. >enhancement :Search/Search Search-related issues that do not fall into other categories v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants