Setting reserved_ports results in jobs being blocked #8421
Comments
Any chance someone can look into this? @shoenig 🙏 |
Hey @evandam @scalp42 I spent some time today trying to reproduce this with a few different combinations of reserved ports and service/batch jobs, but I'm unable to hit the described scenario. If you have a jobfile and config I can drop in and run to reproduce this, that would be great, though I know how hard and time consuming it can be to get to that. Would you be able to share the full output of the evaluation from the API and/or the list of evals for the job? Thanks! |
Hi @nickethier I just tried adding Here's the output of There's a ton of evaluations in that output that I believe are all the same error, but dropping them all in just in case it helps. Running
Removing the |
Yeah, I encountered something similar: the job can't update and is blocked (queued). |
@evandam In my testing, if a currently running job already holds a port listed in reserved_ports, setting reserved_ports and restarting the Nomad node causes job updates to be blocked. After stopping and starting (rescheduling) all the jobs that had taken ports in reserved_ports, things work correctly. @nickethier I haven't investigated the code; does this guess sound reasonable? |
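To check the hypothesis above on a given node, it may help to compare the ports held by running jobs against the reserved_ports value. The following is a minimal sketch in Python, not Nomad code: it assumes reserved_ports is a comma-separated list of ports and hyphenated ranges, and the function names, job names, and port numbers are all hypothetical.

```python
from __future__ import annotations


def parse_port_spec(spec: str) -> set[int]:
    """Expand a spec such as "22,80,8500-8600" into a set of port numbers."""
    ports: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate empty entries such as a trailing comma
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start, end = int(start_text), int(end_text)
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            ports.update(range(start, end + 1))  # ranges are inclusive
        else:
            ports.add(int(part))
    return ports


def find_conflicts(reserved_spec: str, ports_in_use: dict[str, list[int]]) -> dict[str, list[int]]:
    """Return, per job, the ports it holds that also appear in the reserved spec.

    ports_in_use is a hypothetical mapping of job name to the ports its
    allocations currently hold; gathering it is left to the reader.
    """
    reserved = parse_port_spec(reserved_spec)
    conflicts: dict[str, list[int]] = {}
    for job, ports in ports_in_use.items():
        overlap = sorted(set(ports) & reserved)
        if overlap:
            conflicts[job] = overlap
    return conflicts
```

Any job reported by find_conflicts would be a candidate for the stop and start described above. An empty result would suggest the blocked evaluations have a different cause, as in the original report.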
@Sea-Flying in my case the job that was already running did not use a port in |
We believe that this is closed by #16401, which will ship in Nomad 1.5.1 (with backports) |
Nomad version
Issue
Similar to #1046, we're seeing jobs stuck in a pending state when reserved_ports is set.
/etc/nomad/client.json
Running a job results in it being stuck in a pending state, creating evals with errors like so:
Here is the job HCL being used:
prod-platform-core-rake.hcl
I'm not sure if this is related to #4605 since Nomad does not seem to pick up the network interfaces on our EC2 instances.