Add new workloads #25106
@@ -103,6 +103,12 @@ variable "aws_kms_alias" {
# provide a list of builds to override the values of nomad_sha, nomad_version,
# or nomad_local_binary. Most of the time you can ignore these variables!

variable "nomad_local_binary_server_unique" {
  description = "A nomad local binary paths to deploy to servers, to override nomad_local_binary"
I'm not sure I understand what this variable is supposed to do (what does "unique" mean here?), and the description matches nomad_local_binary_server exactly. It seems like we're overriding nomad_local_binary_server but I can't tell why. Shouldn't Enos set nomad_local_binary_server to make sure we're using the Linux binaries for the server instead?
Because nomad_local_binary_server is an array, we would need to pass the same path three times. Enos does not accept functions, so the list would have to come from the previous step, or we would have to add an extra module to build it.
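The constraint described here can be sketched in Terraform. This is a minimal illustration and not the PR's actual code: only nomad_local_binary_server and nomad_local_binary_server_unique come from the thread, while server_count, the defaults, and the server_binaries local are assumptions. The idea is that the module, rather than the Enos scenario, repeats the single path once per server.

```hcl
# Assumed inputs: server_count and all defaults are hypothetical.
variable "server_count" {
  type    = number
  default = 3
}

variable "nomad_local_binary_server" {
  description = "Per-server list of local binary paths"
  type        = list(string)
  default     = []
}

variable "nomad_local_binary_server_unique" {
  description = "Single local binary path to deploy to every server"
  type        = string
  default     = ""
}

locals {
  # Expand the single path into one entry per server when it is set;
  # otherwise fall back to the explicit per-server list.
  server_binaries = (
    var.nomad_local_binary_server_unique != ""
    ? [for i in range(var.server_count) : var.nomad_local_binary_server_unique]
    : var.nomad_local_binary_server
  )
}
```

With this shape, the scenario only has to pass one string from the previous step, which avoids calling a function inside the Enos step variables.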
Ah, ok... honestly I'm not sure that array has ever been used to have different values, so I wonder if it would be worth refactoring to just have a single override. But we can do that in a separate PR if you want.
I'm guessing at some point someone wanted to test compatibility of different versions?
Yeah. I think I wrote that? 😊 But we really got away from running E2E manually, so there's probably a bunch of tiny design decisions that could be improved there.
workloads = {
  service_raw_exec = { job_spec = "jobs/raw-exec-service.nomad.hcl", alloc_count = 3, type = "service" }
  service_docker = { job_spec = "jobs/docker-service.nomad.hcl", alloc_count = 3, type = "service" }
  system_docker = { job_spec = "jobs/docker-system.nomad.hcl", alloc_count = 0, type = "system" }
The alloc count for system jobs doesn't really matter; it's set to 0 to emphasize that.
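A rough sketch of why alloc_count is ignored for system jobs: a system job places one alloc per eligible client node, so the expected total depends on the node count and not on the configured value. The local names system_job_count and service_batch_allocs appear in the PR's output expression, but their definitions below are assumptions, as are client_node_count and expected_allocs.

```hcl
variable "workloads" {
  type = map(object({
    job_spec    = string
    alloc_count = number
    type        = string
  }))
}

# Hypothetical input standing in for the node count the module reads at runtime.
variable "client_node_count" {
  type = number
}

locals {
  # System jobs are counted per job, not per alloc_count.
  system_job_count = length([for w in var.workloads : w if w.type == "system"])

  # concat with [0] keeps sum() valid when there are no service or batch workloads.
  service_batch_allocs = sum(concat([0], [
    for w in var.workloads : w.alloc_count if w.type != "system"
  ]))

  expected_allocs = local.system_job_count * var.client_node_count + local.service_batch_allocs
}
```

For the three workloads shown above and an assumed 4 client nodes, this gives 1 * 4 + (3 + 3) = 10 expected allocs.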
output "new_allocs_count" {
  description = "The number of allocs that should be running in the cluster"
  value = local.system_job_count * chomp(enos_local_exec.get_nodes.stdout) + local.service_batch_allocs
}
What's this output for? It doesn't match allocs_count
I'm trying to make this module aware of existing allocs so that it outputs all the running allocs, new and old, and the output can be used directly in the next Enos step. Enos does not accept functions in step.variables, so this output also helps me debug.
Makes sense! Let's update the descriptions to make that clear. Right now this is the same as allocs_count.
nomad_local_binary = step.copy_initial_binary.binary_path[matrix.os]
nomad_local_binary_server = step.copy_initial_binary.binary_path[local.server_os]
We should rebase this PR on #25172 once that's been merged.
config {
  image = "alpine:latest"
  command = "sh"
  args = ["-c", "while true; do sleep 30000; done"]
Batch workloads are a little tricky -- we probably want to make sure that they can complete and not get rescheduled by client/server restarts, rather than having them wait forever. But that'll require some new assertion logic, so let's come back to that.
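One possible shape for a batch workload that completes, offered as a sketch and not as the follow-up the reviewer has in mind: the task sleeps for a bounded time and then exits successfully, so a later assertion could check that the alloc reached a completed state and was not rescheduled. The job, group, and task names and the 300-second duration are placeholders.

```hcl
# Hypothetical job: names and the sleep duration are placeholders.
job "batch-docker-finite" {
  type = "batch"

  group "batch" {
    task "sleep" {
      driver = "docker"

      config {
        image   = "alpine:latest"
        command = "sh"
        # Exit successfully after a bounded wait instead of looping forever.
        args = ["-c", "sleep 300; exit 0"]
      }
    }
  }
}
```

The duration would need to be long enough to span the client and server restarts in the upgrade test, which is a tuning decision this sketch does not settle.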
image = "alpine:latest"
command = "sh"
args = ["-c", "while true; do sleep 30000; done"]
For service/system jobs, we should probably use a workload where we can assert more about it than "is the alloc running?". busybox httpd would let us run a network service, so that we're exercising things like restoring CNI. Ok to leave for this PR, but let's come back to this too.
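As a sketch of the busybox httpd idea, assuming the Docker driver and bridge networking (which goes through CNI on the client): the job below serves a single static file over HTTP, so a test could request it and assert on the response. The job, group, and task names, the image tag, and port 8080 are all assumptions for illustration.

```hcl
# Hypothetical job: names, image tag, and port are placeholders.
job "httpd-service" {
  type = "service"

  group "web" {
    network {
      # Bridge mode sets up the alloc network namespace via CNI.
      mode = "bridge"
      port "http" {
        to = 8080
      }
    }

    task "httpd" {
      driver = "docker"

      config {
        image   = "busybox:1"
        command = "httpd"
        # -f: stay in the foreground, -p: listen port, -h: document root.
        args  = ["-f", "-p", "8080", "-h", "/local"]
        ports = ["http"]
      }

      # Render a small file into the task's local directory to serve.
      template {
        data        = "ok\n"
        destination = "local/index.html"
      }
    }
  }
}
```

An assertion against this workload could fetch index.html after each restart, which checks more than whether the alloc is running.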
Co-authored-by: Tim Gross <tgross@hashicorp.com>
Description
This PR adds new system and batch workloads to the upgrade tests.