Retries get_work on null response #1345

Merged (1 commit) on Nov 19, 2015

Conversation

daveFNbuck
Contributor

Sometimes the scheduler gives a null response to get_work. The worker then
tries to treat the response as a dict, which raises an error and kills the worker.
This change retries up to 3 times upon receiving a null response, allowing
the worker to stay alive and continue working.
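
For illustration, the retry idea could be sketched roughly as below. This is not the diff from this PR: `get_work_with_retries`, `FlakyScheduler`, `max_attempts`, and `wait_seconds` are hypothetical names, and the sketch assumes "up to 3 times" means 3 attempts in total.

```python
import time


def get_work_with_retries(get_work, max_attempts=3, wait_seconds=0.0):
    """Call get_work() until it returns something other than None.

    get_work is any zero-argument callable standing in for the scheduler
    request (hypothetical helper, not Luigi's actual API).
    """
    for attempt in range(1, max_attempts + 1):
        response = get_work()
        if response is not None:
            return response
        # Null response: optionally pause before the next attempt.
        if attempt < max_attempts and wait_seconds > 0:
            time.sleep(wait_seconds)
    # Every attempt came back null; fail with an explicit message instead of
    # letting a later dict lookup on None raise a confusing error.
    raise RuntimeError(
        "get_work returned a null response %d times in a row" % max_attempts
    )


class FlakyScheduler:
    """Toy scheduler that returns None a fixed number of times, then a dict."""

    def __init__(self, null_responses):
        self.null_responses = null_responses
        self.calls = 0

    def get_work(self):
        self.calls += 1
        if self.calls <= self.null_responses:
            return None
        # Field names here are made up for the example.
        return {"task_id": "example_task", "n_pending_tasks": 1}
```

With `FlakyScheduler(2)`, the third call returns a dict and the worker carries on. With three or more consecutive null responses, the sketch raises instead of retrying forever.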

Note: this has a merge conflict with #1341 because both modify get_work, but it's easily resolved.

@daveFNbuck
Contributor Author

Note: this has been happening to me about 4 times a day since recently moving my scheduler to a different server from my workers.

@erikbern
Contributor

LGTM but why does the scheduler return a null result? That sounds like the underlying issue

@daveFNbuck
Contributor Author

I have no idea how that happens or whether it's a network issue or a scheduler issue. I don't think my logs go back far enough to check.

@erikbern
Contributor

erikbern commented Nov 6, 2015

oops seems like there's a merge conflict here. any chance you can rebase?

@daveFNbuck
Contributor Author

Done. Thanks for all the merges today!

erikbern added a commit that referenced this pull request Nov 19, 2015
@erikbern erikbern merged commit ef3a2b1 into spotify:master Nov 19, 2015
@daveFNbuck daveFNbuck deleted the recover_from_bad_get_work branch June 1, 2017 23:59