Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Backport of core: allow pausing and un-pausing of leader broker routine into release/1.3.x #13614

Conversation

hc-github-team-nomad-core
Copy link
Contributor

Backport

This PR is auto-generated from #13045 to be assessed for backporting due to the inclusion of the label backport/1.3.x.

The below text is copied from the body of the original PR.


Test failure: percy-ui and unrelated
Related to #11638
Commits: split to make review easier

This changes allows operators to control whether or not the eval broker is running on the leader server. This is useful in outage situations where administrators wish to stop any work being available to the scheduler workers. It is also requisite work for future work which will allow deletion of evaluations which have not been processed.

Alongside pausing the eval broker, the blocked evals process is also disabled at the same time. This sub-process pushes evaluations into the eval broker and is also restored from the state store using the same process as the eval broker. It therefore makes sense, considering the end goal, to disable both processes for state consistency.

To ensure operators pausing/un-pausing the broker doesn't conflict with leadership transitions a mutex is used to control access to the eval broker and blocked evals processes. This is used along with a leadership check when changing the broker status which requires taking into account operator configuration.

The issue mentions disabling the reapFailedEvaluations process, however, I don't believe this is required for this change as disabling the eval broker also flushes all stored evaluation data. It therefore seemed safer to leave this alone and not require additional coordination.

The new operator scheduler commands allow inspecting and modifying the scheduler config. This is useful as it doesn't require you to supply the full payload object or remember the exact curl command to run.

@github-actions
Copy link

github-actions bot commented Nov 4, 2022

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Nov 4, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants