Troubleshooting blocked evaluation #19446

Closed
suikast42 opened this issue Dec 12, 2023 · 4 comments

suikast42 commented Dec 12, 2023

I changed the CPU settings of my VM. After that, some allocations can't be placed.

image

The evaluation response is:

{
  "priority": 50,
  "type": "service",
  "triggeredBy": "queued-allocs",
  "status": "canceled",
  "statusDescription": "existing blocked evaluation exists for job \"observability\"",
  "failedTGAllocs": [
    {
      "Name": "mimir",
      "CoalescedFailures": 0,
      "NodesEvaluated": 1,
      "NodesExhausted": 0,
      "NodesAvailable": {
        "nomadder1": 1
      },
      "ClassFiltered": null,
      "ConstraintFiltered": null,
      "ClassExhausted": null,
      "DimensionExhausted": null,
      "QuotaExhausted": null,
      "Scores": null
    }
  ],
  "previousEval": "13dc8866-7b53-a94a-4d35-868913816f80",
  "nextEval": null,
  "blockedEval": null,
  "modifyIndex": 2518,
  "modifyTime": "2023-12-12T10:09:42.439Z",
  "createIndex": 2512,
  "createTime": "2023-12-12T10:09:35.334Z",
  "waitUntil": null,
  "namespace": "default",
  "plainJobId": "observability",
  "relatedEvals": [
    "13dc8866-7b53-a94a-4d35-868913816f80"
  ],
  "job": "[\"observability\",\"default\"]",
  "node": null
}

Nomad logs

Explore-logs-2023-12-12 11_11_32.txt

But how can I find the exact cause of the placement failure?

Information such as "not enough RAM" or "not enough CPU" would be very helpful.


tgross commented Dec 12, 2023

Hi @suikast42! Nomad tries to keep only 1 evaluation per job in each stage of the eval broker, so that the scheduler doesn't have to run lots of no-op evaluations. (See Load Shedding in the Nomad Eval Broker if you want to learn more about this.)

So the evaluation you've got here was canceled, not blocked, because there was another blocked eval already:

"statusDescription": "existing blocked evaluation exists for job \"observability\"",

What you'll want to do is run nomad eval list -job $jobid to find the blocked eval. Then you can run nomad eval status on that evaluation to find out what it's blocked on.
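
Once the blocked evaluation has been found, its placement metrics can be read in the same way as the response posted above. The following is a minimal sketch, not part of Nomad: `explain_eval` and `REASON_FIELDS` are hypothetical names, and the code assumes an evaluation dict shaped like the JSON in this issue (a `failedTGAllocs` list whose entries carry `Name`, `NodesEvaluated`, `DimensionExhausted`, and so on). JSON obtained from the CLI or HTTP API may use different key casing and would need adapting.

```python
# Hypothetical helper: turns an evaluation dict (shaped like the JSON posted
# in this issue) into a few human-readable lines about failed placements.

# Metric fields that map "what was exhausted or filtered" to a node count.
REASON_FIELDS = (
    ("resource exhausted", "DimensionExhausted"),
    ("class exhausted", "ClassExhausted"),
    ("constraint filtered", "ConstraintFiltered"),
    ("class filtered", "ClassFiltered"),
)


def explain_eval(ev):
    """Return a list of summary lines for one evaluation dict."""
    desc = ev.get("statusDescription")
    lines = [f"status: {ev.get('status')}" + (f" ({desc})" if desc else "")]
    for tg in ev.get("failedTGAllocs") or []:
        lines.append(
            f"task group {tg['Name']}: {tg['NodesEvaluated']} node(s) evaluated, "
            f"{tg['NodesExhausted']} exhausted"
        )
        details = []
        for label, key in REASON_FIELDS:
            # These fields are null when nothing was recorded for them.
            for what, count in (tg.get(key) or {}).items():
                details.append(f"  {label}: {what} ({count} node(s))")
        lines.extend(details or ["  no exhaustion or filter details recorded"])
    return lines


if __name__ == "__main__":
    # Illustrative input only: a made-up blocked evaluation, not data from this issue.
    example = {
        "status": "blocked",
        "failedTGAllocs": [
            {
                "Name": "mimir",
                "NodesEvaluated": 1,
                "NodesExhausted": 1,
                "DimensionExhausted": {"cpu": 1},
            }
        ],
    }
    print("\n".join(explain_eval(example)))
```

Applied to the canceled evaluation posted in this issue, the helper reports no exhaustion or filter details, because every metric field is null. That matches the explanation above: the details to look at are on the blocked evaluation, not on the canceled one.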

tgross changed the title from "Troubleshooting BLOCKED evaluation" to "Troubleshooting blocked evaluation" on Dec 12, 2023

jrasell commented Jan 9, 2024

Hi @suikast42, I am going to close this issue as we have not heard back, and the comment from Tim seems accurate in describing the steps to take to understand the blocked eval. Thanks.

jrasell closed this as not planned on Jan 9, 2024

suikast42 commented Jan 9, 2024

Thanks. I can't reproduce this anymore, but I will look into it again if it recurs.


github-actions bot commented Jan 2, 2025

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked this issue as resolved and limited conversation to collaborators on Jan 2, 2025