Wait for green after opening job in NetworkDisruptionIT #50232

Merged 1 commit, Dec 16, 2019

@davidkyle davidkyle commented Dec 16, 2019

The test was failing to relocate a job to a new node after a network disruption because the .ml-state index did not have active shards:

[o.e.p.PersistentTasksClusterService] ignoring task job-relocation-job because assignment is the same node: [null], explanation: [Not opening job [relocation-job], because not all primary shards are active for the following indices [.ml-state]]

The .ml-state index is created when the first job is opened, but the node was then removed from the cluster before the index had time to replicate. Waiting for a green cluster state before triggering the disruption should ensure the replicas are present and fix the test.
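
For readers unfamiliar with the pattern, here is a minimal, self-contained sketch of "wait for green before disrupting". It does not use the Elasticsearch test framework: `GreenWait`, `Health`, and `waitForGreen` are hypothetical names invented for illustration, and the health supplier stands in for a real cluster health check. In Elasticsearch integration tests a helper such as `ensureGreen` on `ESIntegTestCase` typically serves this purpose; the exact change made in this PR is not shown in this conversation.

```java
import java.util.Iterator;
import java.util.List;
import java.util.function.Supplier;

// Hypothetical illustration only; none of these names come from Elasticsearch.
final class GreenWait {

    // Simplified cluster health: YELLOW means primaries are active but some
    // replicas are not; GREEN means all primaries and replicas are active.
    enum Health { RED, YELLOW, GREEN }

    // Polls the supplier until it reports GREEN or the attempts run out.
    // Returns true if GREEN was observed, false otherwise.
    static boolean waitForGreen(Supplier<Health> health, int maxAttempts, long sleepMillis) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (health.get() == Health.GREEN) {
                return true;
            }
            try {
                Thread.sleep(sleepMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulated health reports: the index starts unreplicated, then turns green.
        Iterator<Health> reports =
            List.of(Health.RED, Health.YELLOW, Health.GREEN).iterator();

        if (!waitForGreen(reports::next, 5, 10L)) {
            throw new IllegalStateException("cluster never became green");
        }
        // Only now would the test isolate a node: replicas exist elsewhere,
        // so losing one node should not leave primary shards inactive.
        System.out.println("green: safe to start the network disruption");
    }
}
```

The design point is ordering: the disruption is triggered only after the replicas are confirmed present, so that removing a node should not leave the index without an active copy.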

I hope this closes #49908 but I'll leave the issue open and trace logging enabled for a week in case it reoccurs.

@davidkyle added the >test and :ml labels Dec 16, 2019
@elasticmachine

Pinging @elastic/ml-core (:ml)


@droberts195 droberts195 left a comment

LGTM

@davidkyle davidkyle merged commit 736e9f9 into elastic:master Dec 16, 2019
SivagurunathanV pushed a commit to SivagurunathanV/elasticsearch that referenced this pull request Jan 23, 2020
@davidkyle davidkyle deleted the fix-disruption-test branch June 2, 2020 08:59
Labels: :ml, >test, v7.6.0, v8.0.0-alpha1
Linked issue (may be closed by this pull request): [CI] NetworkDisruptionIT.testJobRelocation failing