[ML] Complete the Data Frame task on stop #41752

Merged
merged 10 commits into from
May 10, 2019


davidkyle
Member

A call to stop() causes the indexer to stop shortly afterwards, when the next search returns. The persistent task can only be completed once the indexer has stopped; otherwise we have a zombie indexer spinning away but not accessible to the task manager. If the wait_for_completion flag is true, the stop request waits for the persistent task to complete.

I added an onStop event, which is fired when the indexer stops. Note that this is different from onFinish, which is triggered when the indexer completes the current phase.

Delete now simply checks whether the task is active and, if it is not, deletes the data frame transform config.
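
The lifecycle described above can be illustrated with a minimal sketch. It is not the code from this PR: SketchIndexer, SketchTask, and all of their methods are hypothetical names chosen for illustration, and a CountDownLatch stands in for completion of the persistent task.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical stand-in for the indexer; not the Elasticsearch class.
class SketchIndexer {
    enum State { STARTED, STOPPING, STOPPED }

    private final AtomicReference<State> state = new AtomicReference<>(State.STARTED);
    private final Runnable onStop;

    SketchIndexer(Runnable onStop) {
        this.onStop = onStop;
    }

    // Only requests a stop; the indexer halts when the in-flight search returns.
    void stop() {
        state.compareAndSet(State.STARTED, State.STOPPING);
    }

    // Called when a search response arrives.
    // Returns true if another search should be issued.
    boolean onSearchResponse() {
        if (state.compareAndSet(State.STOPPING, State.STOPPED)) {
            onStop.run(); // fired exactly once, when the indexer has really stopped
            return false;
        }
        return state.get() == State.STARTED;
    }

    State getState() {
        return state.get();
    }
}

// Hypothetical stand-in for the persistent task that owns the indexer.
class SketchTask {
    private final CountDownLatch completed = new CountDownLatch(1);
    private final SketchIndexer indexer = new SketchIndexer(this::markAsCompleted);

    // The task is completed only from the indexer's onStop callback.
    private void markAsCompleted() {
        completed.countDown();
    }

    SketchIndexer indexer() {
        return indexer;
    }

    // A delete request would check this before removing the config.
    boolean isActive() {
        return completed.getCount() > 0;
    }

    // Returns true if the task had completed by the time the call returned.
    boolean stop(boolean waitForCompletion, long timeoutMillis) {
        indexer.stop();
        if (!waitForCompletion) {
            return !isActive();
        }
        try {
            return completed.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }
}
```

In this sketch, stop(false, 0) returns immediately while the task is still active, and the task only completes once onSearchResponse() observes the stop request. That ordering is the point of the change: the task is never completed while the indexer could still be running.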

@elasticmachine
Collaborator

Pinging @elastic/ml-core

@benwtrent benwtrent self-requested a review May 2, 2019 15:08
Member

@benwtrent benwtrent left a comment

Since we are removing the persistent task from cluster state when stop is called, there is no way to start it again from where the cursor is currently located.

Are there plans to check the IndexerState when onSaveState is called and determine whether an indexed snapshot of the current cursor position and checkpoint should be made?

I think something like:

if (indexerState.equals(IndexerState.DELETING)) { ...take snapshot...}

would be a good idea. That way when start is called again, the previous cursor position can be loaded up from the index.
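
A rough sketch of that suggestion, under stated assumptions: CursorSnapshotSketch, SketchState, Snapshot, and loadOnStart are hypothetical names, an in-memory map stands in for the index, and which indexer state should trigger the snapshot is itself an assumption (the sketch uses a STOPPING state of its own rather than any real IndexerState constant).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration of "snapshot on stop, reload on start".
class CursorSnapshotSketch {
    // Assumed states; not the real IndexerState enum.
    enum SketchState { STARTED, STOPPING }

    static final class Snapshot {
        final Map<String, Object> cursorPosition;
        final long checkpoint;

        Snapshot(Map<String, Object> cursorPosition, long checkpoint) {
            // Defensive copy so later cursor updates do not alter the snapshot.
            this.cursorPosition = new HashMap<>(cursorPosition);
            this.checkpoint = checkpoint;
        }
    }

    // Stands in for an index keyed by transform id.
    private final Map<String, Snapshot> store = new HashMap<>();

    // Persist the cursor only when the indexer is on its way to stopping.
    void onSaveState(String transformId, SketchState state,
                     Map<String, Object> position, long checkpoint) {
        if (state == SketchState.STOPPING) {
            store.put(transformId, new Snapshot(position, checkpoint));
        }
    }

    // On start, resume from the stored position if one exists.
    Optional<Snapshot> loadOnStart(String transformId) {
        return Optional.ofNullable(store.get(transformId));
    }
}
```

If no snapshot is found on start, a caller would presumably begin from the beginning, which matches the behaviour the comment above is concerned about.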

@davidkyle
Member Author

There was a failure in test cleanup after the multi-node tests:
ElasticsearchStatusException[Cannot delete data frame [data-frame-transform-crud] as the task is running. Stop the task first]

I made the stop action delegate to the master node for coordinating responses, which should fix the problem.

Member

@benwtrent benwtrent left a comment

We should consider taking snapshots of the current cursor position and checkpoint so that transforms can be stopped, and then started again from the same spot.

@@ -76,7 +76,8 @@
     @After
     public void cleanUpTransforms() throws IOException {
         for (String transformId : transformsToClean) {
-            highLevelClient().dataFrame().stopDataFrameTransform(new StopDataFrameTransformRequest(transformId), RequestOptions.DEFAULT);
+            highLevelClient().dataFrame().stopDataFrameTransform(
+                    new StopDataFrameTransformRequest(transformId, Boolean.TRUE, TimeValue.timeValueSeconds(20)), RequestOptions.DEFAULT);
Member

Boolean.TRUE should probably just be true.

@davidkyle
Member Author

run elasticsearch-ci/1


@davidkyle davidkyle merged commit 9d94d57 into elastic:master May 10, 2019
@davidkyle davidkyle deleted the stop-removes-task branch May 10, 2019 08:07
davidkyle added a commit to davidkyle/elasticsearch that referenced this pull request May 10, 2019
Wait for indexer to stop then complete the persistent task on stop.
If the wait_for_completion is true the request will not return until stopped.
davidkyle added a commit that referenced this pull request May 21, 2019
Wait for indexer to stop then complete the persistent task on stop.
If the wait_for_completion is true the request will not return until stopped.
gurkankaymak pushed a commit to gurkankaymak/elasticsearch that referenced this pull request May 27, 2019
Wait for indexer to stop then complete the persistent task on stop.
If the wait_for_completion is true the request will not return until stopped.