Connect Gateways: Resources are not respected/supported #10899
Hi @chuckyz, and thanks for the report. I have not been able to reproduce this locally yet. When running on main I had to modify the job slightly, and when the job registered, the resources of the sidecar task were detailed correctly as set within the jobspec. I was unable to get the job running with 1.0.2 at this point. If you have any additional reproduction steps that could help, that would be appreciated. I'll mark the issue as needing further investigation.
We use this pretty extensively IIRC, and I have a vague recollection of this being addressed in a subsequent patch.
Heh, good memory @idrennanvmware, but I think #9854 was about a different precondition. I'm not sure yet what's going on here. I tweaked the jobspec (out of laziness; shouldn't be related) to make it work, and it seems to run fine with the expected resources.

@chuckyz can I ask, where are you getting the reported allocated resources metrics from?
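One common place to read the allocated CPU/memory for the injected gateway task is nomad alloc status; a rough sketch (the allocation ID placeholder and the checks below are illustrative, not taken from this report):

$ nomad job status ingress-grpc     # list allocations and find the allocation ID
$ nomad alloc status <alloc-id>     # the "Task Resources" section shows allocated CPU and memory per task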
Hi @chuckyz. Like Seth, I was able to set the resources as expected. Honoring resources for Consul Connect proxies was added in #9639, which shipped in Nomad 1.0.2. Since it's a server-side change, all servers must be running 1.0.2 or later. Here is my attempt at replication:

$ nomad job run -detach ./job.hcl
Job registration successful
Evaluation ID: e360b99c-fcb1-d1cc-e023-8b342fbc8cd7
$ nomad job inspect ingress-grpc | jq '.Job.TaskGroups[0].Tasks[0].Resources'
{
"CPU": 2000,
"Devices": null,
"DiskMB": 0,
"IOPS": 0,
"MemoryMB": 2048,
"Networks": null
}
$ nomad version
Nomad v1.0.2 (4c1d4fc6a5823ebc8c3e748daec7b4fda3f11037)
$ nomad server members
Name Address Port Status Leader Protocol Build Datacenter Region
notnoop-C02X1N38JG5H.global 127.0.0.1 4648 alive true 2 1.0.2 dc1 global

Where ./job.hcl is:

job "ingress-grpc" {
  datacenters = ["dc1"]
  type        = "service"

  meta {
    GATEWAY_NAME = "ingress-gateway-grpc"
    DEPLOY_TIME  = "2021-07-12T13:31:02-07:00"
  }

  group "ingress-group" {
    network {
      mode = "bridge"

      port "inbound" {
        static = 7000
        to     = 7001
      }

      port "envoy_prom" { to = 9102 }
    }

    service {
      name = "${NOMAD_META_GATEWAY_NAME}"
      port = "7000"
      tags = ["http"]

      meta {
        envoy_metrics_port = "${NOMAD_HOST_PORT_envoy_prom}"
      }

      connect {
        sidecar_task {
          resources {
            cpu    = 2000
            memory = 2048
          }
        }

        gateway {
          proxy {
          }

          ingress {
            listener {
              port     = 7000
              protocol = "tcp"

              service {
                name = "none"
                #hosts = ["none"]
              }
            }
          }
        }
      }
    }
  }
}

I have also confirmed that I get 250 CPU / 128 MB RAM when the server is running 1.0.1:

$ ./nomad job run -detach ./job.hcl
Job registration successful
Evaluation ID: 256be59a-0d24-f859-dccb-218bafcc62b2
$ ./nomad job inspect ingress-grpc | jq '.Job.TaskGroups[0].Tasks[0].Resources'
{
"CPU": 250,
"Devices": null,
"DiskMB": 0,
"IOPS": 0,
"MemoryMB": 128,
"Networks": null
}
$ ./nomad version
Nomad v1.0.1 (c9c68aa55a7275f22d2338f2df53e67ebfcb9238)
$ ./nomad server members
Name Address Port Status Leader Protocol Build Datacenter Region
notnoop-C02X1N38JG5H.global 127.0.0.1 4648 alive true 2 1.0.1 dc1 global
We're running 1.0.2+ent; I wonder if maybe it was missed from there somehow? We'll upgrade to 1.1.2+ent soonish and confirm whether this is fixed or not.
This is quite puzzling indeed. I just verified the behavior on 1.0.2+ent and also tested the resulting allocation and Docker container.
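One way to spot-check the running container is to read the limits the Docker driver actually applied; a minimal sketch (the grep pattern and container ID are placeholders, and this is not the exact verification used above):

$ docker ps --format '{{.ID}} {{.Names}}' | grep ingress
$ docker inspect <container-id> --format 'CpuShares={{.HostConfig.CpuShares}} Memory={{.HostConfig.Memory}}'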
In playing around with this I've realized there's a difference between having an absent block versus an explicitly empty one; the job ends up parsed differently in the two cases.
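As a rough illustration of that absent-versus-empty distinction (the specific block and parsed output compared above aren't preserved here, so sidecar_task is used purely as an assumed example):

connect {
  # "empty" variant: the block is written out but left empty; deleting the next
  # line entirely gives the "absent" variant, and the two do not parse the same.
  sidecar_task {}

  gateway {
    ingress {
      listener {
        port     = 7000
        protocol = "tcp"
        service { name = "none" }
      }
    }
  }
}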
I am also seeing the same issue: resources added under sidecar_task to configure an ingress gateway job are not honored. Job file:

The deployed job shows the SidecarTask value as null, and nomad job plan also doesn't show any changes.

We are running Nomad v1.1.5.
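A quick way to confirm what the servers actually registered, mirroring the jq checks earlier in the thread (the job name is a placeholder, and the field names assume the 1.x job API):

$ nomad job inspect <job-name> | jq '.Job.TaskGroups[0].Services[0].Connect.SidecarTask'
$ nomad job inspect <job-name> | jq '.Job.TaskGroups[0].Tasks[0].Resources'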
@gulavanir, are you submitting jobs with -hcl1?

Indeed, if I submit that job with -hcl1 the sidecar_task resources are not applied. If I fix that job up and submit it without -hcl1, the resources are set as expected.
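A sketch of that comparison, assuming the -hcl1 flag (which forces the older HCL1 parser) is the relevant difference, and reusing the inspect check from earlier in the thread:

# Submit with the HCL1 parser, then check what the server registered.
$ nomad job run -hcl1 -detach ./job.hcl
$ nomad job inspect ingress-grpc | jq '.Job.TaskGroups[0].Tasks[0].Resources'

# Submit the same job with the default HCL2 parser and compare.
$ nomad job run -detach ./job.hcl
$ nomad job inspect ingress-grpc | jq '.Job.TaskGroups[0].Tasks[0].Resources'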
Ahh @chuckyz, are you also using -hcl1?
@shoenig we enforce the HCL1 spec because we have templates with variables that are not HCL2 compliant. See reference here: #9838. So without an escape character (IIRC a few others suggested something similar) we're a bit between a rock and a hard place. Am I misreading the referenced GitHub issue, and is this actually now supported?
@idrennanvmware you haven't missed anything; we aren't going to drop HCL1 support until there is a usable path to HCL2, which, as you note, doesn't exist yet. But there are bugs in the hand-written HCL1 parser, and this particular one is right here, where we make an outdated assumption that if the sidecar_service block isn't set, the sidecar_task block doesn't need to be parsed.
The HCL1 parser did not respect connect.sidecar_task.resources if the connect.sidecar_service block was not set (an optimization that no longer makes sense with connect gateways). Fixes #10899
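In other words, a minimal sketch of the two jobspec shapes that fix note describes (this is not the parser code itself): under HCL1, sidecar_task resources were only honored when a sidecar_service block was also present, which a gateway-only job never has.

# Honored by the HCL1 parser: sidecar_task next to a sidecar_service block.
connect {
  sidecar_service {}
  sidecar_task {
    resources {
      cpu    = 2000
      memory = 2048
    }
  }
}

# Silently dropped by the HCL1 parser before the fix: sidecar_task next to a
# gateway block, with no sidecar_service present.
connect {
  sidecar_task {
    resources {
      cpu    = 2000
      memory = 2048
    }
  }
  gateway {
    ingress {
      listener {
        port     = 7000
        protocol = "tcp"
        service { name = "none" }
      }
    }
  }
}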
I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
Nomad version
1.0.2
Operating system and Environment details
Ubuntu 18.04
Issue
The resources{} block under connect for gateway tasks is not respected.

Reproduction steps
Run the job below.
Expected Result
This launches a task named ingress-gateway-grpc with 2000 MHz of CPU and 2048 MB of memory.

Actual Result

This launches a task named ingress-gateway-grpc with 250 MHz of CPU and 128 MB of memory.

Job file (if appropriate)
We've tried to put resources{} under gateway but it doesn't seem to work.

Notes:
sidecar_task seems to get its resources from nomad/api/services.go (Line 218 in 712ad49), whereas all gateway tasks seem to get their config from nomad/api/services.go (Line 379 in 712ad49), which doesn't include any *Resources calls.

Looking into nomad/api/services.go (Line 265 in 712ad49), I also looked through the envoy_bootstrap_hook.go file and was unable to find the actual task definition for the gateways.
@shoenig do you know where in the code we should be looking? I'm more than happy to submit a PR passing through resources to the config. I think it should probably go under gateway { ingress { resources {} } } or gateway { proxy { resources {} } }.