AWS Batch compute environment needs recreating after launch template change #15535
Comments
@microbioticajon Thanks for raising this issue. |
Hi @ewbankkit, That was it! The compute environment was relying on the default launch template but Terraform was unable to detect the change unless While obvious now I think about it, a hint in the docs might help others who get stuck with the same issue. I have reapplied the plan with a modified launch template but unfortunately I'm now getting the following related error:
I'm not sure how to get around this - it looks like TF now recognises that the compute environment needs to be replaced but AWS won't let it while there are still queues associated with it. Many thanks, |
I am seeing the same error even after manually destroying batch environments in the console. Any ideas of how to reset the (remote) state without re-initializing the project from scratch?
EDIT: fixed it with an explicit |
It looks like using Otherwise I have to look up the current launch template version and increment it manually every time I deploy, just to get a new compute environment made correctly? I mean, the way I understand it, any time Terraform makes any change to a launch template, it should just remake any associated compute environments. Even if you use $Default or $Latest, Batch only takes a snapshot of them at the time of compute environment creation; it won't dynamically recognize changes to $Latest or $Default over time. https://docs.aws.amazon.com/batch/latest/userguide/create-compute-environment.html
|
I think the only reliable solution to this in my situation, is for my deployment to mark the compute environment as tainted every time in order to force re-creation. |
@bhayden53 I have been struggling with this for about a (painful) year but just noticed a small improvement from using $Latest. You can instead use aws_launch_template.this.latest_version which simply replaces $Latest with the latest version number. This allows terraform to recognize that the CE needs to be replaced. I honestly don't understand what AWS thinks $Latest (and $Default) actually do in compute environments currently. It seems completely broken to me. However, the issue that @vspinu raises I see often and do not understand the root cause. It seems to me like a bug in the provider, specifically that it doesn't know that the queue must be deleted before the compute environment can be replaced. I suspect this is just a limitation of the AWS API and must be adapted to in the provider. @ewbankkit if a small reproducible example would help I can provide one. It would be fantastic if we can find a solution. |
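For readers landing here, the `latest_version` workaround described above might look roughly like the sketch below. All variable names, resource labels and sizing values are hypothetical and not taken from this thread. The name prefix plus `create_before_destroy` is a commonly suggested pattern for the "queues still associated" error, but treat it as an assumption: nobody in this thread has confirmed it.

```hcl
# Hypothetical inputs; none of these names come from the thread.
variable "batch_service_role_arn" { type = string }
variable "ecs_instance_profile_arn" { type = string }
variable "subnet_ids" { type = list(string) }
variable "security_group_ids" { type = list(string) }

resource "aws_launch_template" "this" {
  name_prefix = "batch-"

  # Any change to the template creates a new version,
  # which increments the latest_version attribute.
  ebs_optimized = true
}

resource "aws_batch_compute_environment" "this" {
  # A prefix (rather than a fixed name) lets a replacement be created
  # while the old environment still exists.
  compute_environment_name_prefix = "example-"
  type                            = "MANAGED"
  service_role                    = var.batch_service_role_arn

  compute_resources {
    type               = "EC2"
    instance_role      = var.ecs_instance_profile_arn
    instance_type      = ["optimal"]
    min_vcpus          = 0
    max_vcpus          = 16
    subnets            = var.subnet_ids
    security_group_ids = var.security_group_ids

    launch_template {
      launch_template_id = aws_launch_template.this.id

      # A concrete version number instead of "$Latest", so Terraform
      # sees a diff whenever the launch template changes.
      version = aws_launch_template.this.latest_version
    }
  }

  lifecycle {
    # Assumption: creating the new environment first gives dependent job
    # queues something to point at before the old one is destroyed.
    create_before_destroy = true
  }
}
```

The point of the sketch is only that `version` is wired to an attribute Terraform tracks, so a launch template change shows up in the plan for the compute environment.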
AWS Support has told me that it intentionally takes a snapshot of the $Latest or $Default version at the time of CE creation. Definitely not what any reasonable user would expect it to do. I think I also got the "there is an issue in our internal tracker and I have added your voice to it" response as well. Thanks for the other workaround. |
This looks like it might be related to #30438. Since this issue was first opened, Batch behavior has changed and now allows updates to compute environment launch templates if "the service role is set to AWSServiceRoleForBatch (the default) and that the allocation strategy is BEST_FIT_PROGRESSIVE or SPOT_CAPACITY_OPTIMIZED. BEST_FIT isn't supported." See https://docs.aws.amazon.com/batch/latest/userguide/updating-compute-environments.html |
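A minimal sketch of a compute environment that meets the two conditions quoted above, assuming a provider version recent enough to allow `service_role` to be omitted and to apply launch template changes as an update. Variable names and sizing values are hypothetical, and whether a given provider version updates in place or still replaces the resource is not something this thread confirms.

```hcl
# Hypothetical inputs; none of these names come from the thread.
variable "ecs_instance_profile_arn" { type = string }
variable "subnet_ids" { type = list(string) }
variable "security_group_ids" { type = list(string) }

resource "aws_launch_template" "this" {
  name_prefix   = "batch-"
  ebs_optimized = true
}

resource "aws_batch_compute_environment" "updatable" {
  compute_environment_name_prefix = "example-"
  type                            = "MANAGED"

  # service_role is deliberately omitted. Assumption: Batch then falls back
  # to the AWSServiceRoleForBatch service-linked role, as the quoted docs require.

  compute_resources {
    type = "EC2"

    # BEST_FIT is not supported for updates according to the quoted docs.
    allocation_strategy = "BEST_FIT_PROGRESSIVE"

    instance_role      = var.ecs_instance_profile_arn
    instance_type      = ["optimal"]
    min_vcpus          = 0
    max_vcpus          = 16
    subnets            = var.subnet_ids
    security_group_ids = var.security_group_ids

    launch_template {
      launch_template_id = aws_launch_template.this.id
      version            = aws_launch_template.this.latest_version
    }
  }
}
```

Running `terraform plan` after changing the launch template shows whether the provider version in use plans an in-place update or a replacement.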
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. |
Hi Guys,
I'm having trouble applying changes to my AWS Batch configuration. As part of my Batch cluster I use a custom Launch Template for the instances in the compute environment. However, when I make a change to the Launch Template, the Batch compute environment remains unmodified.
Terraform version
v0.13.3
Affected Resource(s)
Expected Behaviour
According to the AWS Batch docs, if the Launch Template is updated with a new version, the entire compute environment needs to be destroyed and rebuilt:
https://docs.aws.amazon.com/batch/latest/userguide/launch-templates.html
Actual behaviour
aws_batch_compute_environment remains unchanged
As a result, the only way to apply changes to the Launch Template is to manually destroy the compute environment before applying the plan or taint the resources through the command line.
I performed a quick search and I cannot find a way to trigger a forced re-create on a resource within the plan itself.
Any fixes, help or work-arounds would be greatly appreciated.
Note:
My current launch template has resulted in an invalid compute environment which cannot be deleted even when tainted, which is why I need to update the launch template. See: #8549