A nomad jobspec provided as JSON results in a panic when the jobspec does not contain an ID property #17418

Closed
rcousens opened this issue Jan 13, 2023 · 2 comments · Fixed by #17689
Labels: stage/accepted (confirmed, and intend to work on; no timeline commitment), theme/api (HTTP API and SDK issues), theme/crash, type/bug

rcousens commented Jan 13, 2023

locals {
  NOMAD_JOB = {
    Job = {
      Name = "test"
      ....
    }
  }
}

resource "nomad_job" "nomad-spring" {
  jobspec = jsonencode(local.NOMAD_JOB)
  json    = true
}

Debug Output

Gist with debug output

Panic Output

See Gist with debug output

Expected Behavior

Nomad should have rejected the jobspec as invalid and reported the missing ID field.

Actual Behavior

Provider crashed

Steps to Reproduce

  1. Define an otherwise valid jobspec without an ID property
  2. Try to apply it

Important Factoids

The crash ultimately originates in the Nomad API client, though I'm not sure it is the API client's responsibility to validate that the job has a non-nil ID property.

job.ID is a pointer that is dereferenced and path-escaped without a nil check, so a spec that lacks an "ID" property crashes with a nil pointer dereference.
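To illustrate the mechanism, here is a minimal sketch of how a JSON body without an "ID" key ends up as a nil pointer after decoding. The `job` struct, `decodeJob`, and `hasID` are hypothetical stand-ins written for this example; they are not the real Nomad API types or helpers.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// job is a hypothetical, trimmed-down stand-in for the API client's job type.
// Only the pointer-typed ID field matters for this illustration.
type job struct {
	ID   *string
	Name *string
}

// decodeJob parses a JSON job body into the stand-in struct.
func decodeJob(raw string) (*job, error) {
	var j job
	if err := json.Unmarshal([]byte(raw), &j); err != nil {
		return nil, err
	}
	return &j, nil
}

// hasID reports whether the decoded job carries an ID that is safe to dereference.
func hasID(j *job) bool {
	return j != nil && j.ID != nil
}

func main() {
	j, err := decodeJob(`{"Name": "test"}`)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	// The "ID" key is absent, so the pointer stays nil; *j.ID would panic here.
	fmt.Println("ID present:", hasID(j))
}
```

Decoding succeeds even though the key is missing, which is why the problem only surfaces later, when the pointer is dereferenced.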

References

Source of crash in provider

lgfa29 (Contributor) commented Jun 3, 2023

Thanks for the report @rcousens!

This is actually a bug in the Nomad API client, where a nil check is missing before we get here:

wm, err := j.client.put("/v1/job/"+url.PathEscape(*job.ID)+"/plan", req, &resp, q)
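A guard of the following shape would turn the panic into an ordinary error. This is only a sketch under assumptions: `Job`, `planPath`, and `errMissingJobID` are hypothetical names made up for this example, and this is not necessarily how the fix was implemented in Nomad.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
)

// Job is a hypothetical stand-in for the API client's job type.
type Job struct {
	ID *string
}

// errMissingJobID is an illustrative error value, not one defined by Nomad.
var errMissingJobID = errors.New("job is missing an ID")

// planPath builds the plan endpoint path, refusing to dereference a nil job or ID.
func planPath(job *Job) (string, error) {
	if job == nil || job.ID == nil {
		return "", errMissingJobID
	}
	return "/v1/job/" + url.PathEscape(*job.ID) + "/plan", nil
}

func main() {
	id := "my job"
	for _, j := range []*Job{{ID: &id}, {}} {
		p, err := planPath(j)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println(p)
	}
}
```

With the check in place, a caller such as the CLI or the Terraform provider would receive an error it can report instead of crashing.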

I was able to reproduce this by using a sample JSON file like this:

JSON job
{
    "Job": {
        "Affinities": null,
        "AllAtOnce": false,
        "Constraints": null,
        "ConsulNamespace": "",
        "ConsulToken": "",
        "CreateIndex": 14,
        "Datacenters": [
            "dc1"
        ],
        "DispatchIdempotencyToken": "",
        "Dispatched": false,
        "JobModifyIndex": 14,
        "Meta": null,
        "Migrate": null,
        "ModifyIndex": 14,
        "Multiregion": null,
        "Name": "foo",
        "Namespace": "default",
        "NodePool": "default",
        "NomadTokenID": "",
        "ParameterizedJob": null,
        "ParentID": "",
        "Payload": null,
        "Periodic": null,
        "Priority": 50,
        "Region": "global",
        "Reschedule": null,
        "Spreads": null,
        "Stable": false,
        "Status": "pending",
        "StatusDescription": "",
        "Stop": false,
        "SubmitTime": 1685759953739812000,
        "TaskGroups": [
            {
                "Affinities": null,
                "Constraints": [
                    {
                        "LTarget": "${attr.consul.version}",
                        "Operand": "semver",
                        "RTarget": ">= 1.7.0"
                    }
                ],
                "Consul": {
                    "Namespace": ""
                },
                "Count": 1,
                "EphemeralDisk": {
                    "Migrate": false,
                    "SizeMB": 300,
                    "Sticky": false
                },
                "MaxClientDisconnect": null,
                "Meta": null,
                "Migrate": {
                    "HealthCheck": "checks",
                    "HealthyDeadline": 300000000000,
                    "MaxParallel": 1,
                    "MinHealthyTime": 10000000000
                },
                "Name": "foo",
                "Networks": null,
                "ReschedulePolicy": {
                    "Attempts": 0,
                    "Delay": 30000000000,
                    "DelayFunction": "exponential",
                    "Interval": 0,
                    "MaxDelay": 3600000000000,
                    "Unlimited": true
                },
                "RestartPolicy": {
                    "Attempts": 2,
                    "Delay": 15000000000,
                    "Interval": 1800000000000,
                    "Mode": "fail"
                },
                "Scaling": null,
                "Services": [
                    {
                        "Address": "192.168.1.2",
                        "AddressMode": "auto",
                        "CanaryMeta": null,
                        "CanaryTags": null,
                        "CheckRestart": null,
                        "Checks": null,
                        "Connect": null,
                        "EnableTagOverride": false,
                        "Meta": null,
                        "Name": "test",
                        "OnUpdate": "require_healthy",
                        "PortLabel": "mysql",
                        "Provider": "consul",
                        "TaggedAddresses": null,
                        "Tags": null,
                        "TaskName": ""
                    }
                ],
                "ShutdownDelay": null,
                "Spreads": null,
                "StopAfterClientDisconnect": null,
                "Tasks": [
                    {
                        "Affinities": null,
                        "Artifacts": null,
                        "Config": {
                            "args": [
                                "1"
                            ],
                            "command": "/bin/sleep"
                        },
                        "Constraints": null,
                        "DispatchPayload": null,
                        "Driver": "raw_exec",
                        "Env": null,
                        "Identity": null,
                        "KillSignal": "",
                        "KillTimeout": 5000000000,
                        "Kind": "",
                        "Leader": false,
                        "Lifecycle": null,
                        "LogConfig": {
                            "Disabled": false,
                            "Enabled": null,
                            "MaxFileSizeMB": 10,
                            "MaxFiles": 3
                        },
                        "Meta": null,
                        "Name": "foo",
                        "Resources": {
                            "CPU": 20,
                            "Cores": 0,
                            "Devices": null,
                            "DiskMB": 0,
                            "IOPS": 0,
                            "MemoryMB": 10,
                            "MemoryMaxMB": 0,
                            "Networks": null
                        },
                        "RestartPolicy": {
                            "Attempts": 2,
                            "Delay": 15000000000,
                            "Interval": 1800000000000,
                            "Mode": "fail"
                        },
                        "ScalingPolicies": null,
                        "Services": null,
                        "ShutdownDelay": 0,
                        "Templates": null,
                        "User": "",
                        "Vault": null,
                        "VolumeMounts": null
                    }
                ],
                "Update": {
                    "AutoPromote": false,
                    "AutoRevert": false,
                    "Canary": 0,
                    "HealthCheck": "checks",
                    "HealthyDeadline": 300000000000,
                    "MaxParallel": 1,
                    "MinHealthyTime": 10000000000,
                    "ProgressDeadline": 600000000000,
                    "Stagger": 30000000000
                },
                "Volumes": null
            }
        ],
        "Type": "service",
        "Update": {
            "AutoPromote": false,
            "AutoRevert": false,
            "Canary": 0,
            "HealthCheck": "",
            "HealthyDeadline": 0,
            "MaxParallel": 1,
            "MinHealthyTime": 0,
            "ProgressDeadline": 0,
            "Stagger": 30000000000
        },
        "VaultNamespace": "",
        "VaultToken": "",
        "Version": 0
    }
}

Running nomad job plan -json job.json against this file results in this panic:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x0 pc=0x1009e1f68]

goroutine 1 [running]:
github.com/hashicorp/nomad/api.(*Jobs).PlanOpts(0x14000d1fc58, 0x14000643860, 0x14000d1fbbe, 0x101ce2235?)
        github.com/hashicorp/nomad/api@v0.0.0-20221006174558-2aa7e66bdb52/jobs.go:438 +0x98
github.com/hashicorp/nomad/command.(*JobPlanCommand).Run(0x1400040e600, {0x1400010e080, 0x2, 0x2})
        github.com/hashicorp/nomad/command/job_plan.go:250 +0x718
github.com/mitchellh/cli.(*CLI).Run(0x14000bee000)
        github.com/mitchellh/cli@v1.1.5/cli.go:262 +0x4a8
main.Run({0x1400010e060, 0x4, 0x4})
        github.com/hashicorp/nomad/main.go:107 +0x29c
main.main()
        github.com/hashicorp/nomad/main.go:77 +0x50

Since this is an issue with Nomad I will move this to the hashicorp/nomad repo.

Thanks again for the report!

github-actions bot commented Jan 9, 2025

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jan 9, 2025