Deployment Fails with "ResourceConflictException" in Lambda #833
Comments
I was about to make the same post. I even tried a fresh app from the README, it deploys the first time but after that I can't do any more deployments. I can't figure out why it would suddenly stop working,
|
It's possible that destroying the stack each time will let you deploy, but that means several minutes where there is no website. @t1bb4r Did you try the temporary fix of putting |
@RickCogley That worked for me, thanks a lot! |
Sure thing @t1bb4r. I tried this with a couple more sites and I'm getting the same error consistently, with different up setups. |
In the article that Ben found, it mentions that lambda permissions can be added to a service role being used by CloudFormation (see "Updating CloudFormation’s service role" section on https://aws.amazon.com/blogs/compute/coming-soon-expansion-of-aws-lambda-states-to-all-functions/). I know that |
We are also experiencing this issue |
Hey guys, sorry for the delay, taking a look at this. I read the announcement post but I'm a bit confused about how it would influence Up; the recommended policy for running Up (https://apex.sh/docs/up/credentials/#iam_policy_for_up_cli) already has `lambda:Get*`. It sounds a bit like simply updating the SDK will work. I'll try that today and update here (and push a release if it's fine). |
I'm not having any luck reproducing it actually, I'm still able to deploy my apps with 1.7.0-pro and I tried doing a few fresh application stacks as well. Are you guys seeing any particular pattern or is it across all of your apps? |
I'm seeing it across all existing up apps ... if I create a new stack (destroy and recreate an existing one) it will work. The way that I've hacked around this is running: Before an Which was defined in the article |
Thanks for looking into it @tj. I had tried it on a few sites which were built on AWS "sub-organizations" underneath our master account. (not sure what they are really called) All of those failed with the error, and each of their IAM users does have the right permissions, it appears. I just tried it on one on our master account, and it succeeded. So I tried another one on our master, and that failed. FYI |
Not sure if it makes any difference, but the apps I am deploying are just static sites, either hand coded HTML files and a few assets in a "html" folder, or, Hugo generated into its usual "public" folder. |
We started experiencing this issue today as well. This workaround allowed us to do deployments though:
|
I was reading in the docs that it actually recommends:
So it seems like they actually anticipate being in a stuck state which is a bit odd, it’s like they’re admitting it’s broken. Do you guys use it in a VPC? Mine aren’t in a VPC, that could explain why I’m not really seeing it. There might not be anything I can really do there, I wish any new deploy would simply override the previous, but it looks like that’s not really how they wrote the system. |
hi @tj as for us, no, we're not using it in a VPC. |
I created an app 5 days ago from the README and was experiencing this issue. It's now working. No changes to the lambda description, AWS account, up version or app code, and it's just working. I deployed a few times (5 days ago this was a 100% failure):
The only conclusion that I can make is that AWS made some changes to cause this, but then fixed it again. Is anyone still experiencing this issue right now? |
I've got a hugo site that consistently works, and a static site in an "html" folder that consistently fails. Just re-confirmed that neither site has the The only real difference between the settings is that the (succeeding) hugo site has setup and build steps whereas the (failing) html site is just a literal file copy. There was an "endpoint:regional" setting in the In AWS console, lambda page for the failing static site:
This comment claudiajs/claudia#226 (comment) mentions that they are using terraform and updated a version ... It's a hail mary (as is the above sequence of voodoo majick testing) but @tj, as you mentioned maybe a recompile would actually help? Who knows... |
Ok, found something else @tj: this forum post https://forums.aws.amazon.com/thread.jspa?messageID=995863&tstart=0 says you "need to put a check for the function state in between the update_function_code and the publish version calls. Make sure the state is active before proceeding https://docs.aws.amazon.com/lambda/latest/dg/functions-states.html" And, someone else mentions: "I also noticed that the ci/cd tool is using an old version of the AWS SDK (1.11.834), and if I deploy the code using AWS CLI (2.2.37) it works. Could this be related?" |
@RickCogley ahhh interesting, that sounds like a reasonable fix. I guess there's always room for a race condition after doing the request for the status as well since it's not atomic, but if we can assume it's deploying in a CI or just one person at a time it should be ok. I guess in that case we'd just have to keep polling until it's done, which sounds like it can be several minutes according to the docs. I'll try and get that in on Monday, I still couldn't reproduce that state but I'll make sure they deploy normally and hopefully that'll fix it in your cases |
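For anyone following along, the "check the state before proceeding" idea could look roughly like the sketch below. It is only an illustration, not Up's actual code (Up is not written in Python). The names `wait_until_active` and `FunctionNotReady` are made up here, and `get_status` is assumed to be any callable returning a dict with the `State` and `LastUpdateStatus` fields described in the Lambda function states docs (with boto3, a call to `get_function_configuration` returns these).

```python
import time
from typing import Callable


class FunctionNotReady(Exception):
    """Hypothetical error: the function did not settle before the deadline."""


def wait_until_active(
    get_status: Callable[[], dict],
    timeout: float = 300.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Poll until State is Active and no update is still in progress.

    Assumption: get_status returns a dict shaped like the response of
    boto3's client.get_function_configuration(FunctionName=...).
    """
    deadline = clock() + timeout
    while True:
        status = get_status()
        state = status.get("State")
        update = status.get("LastUpdateStatus")
        if state == "Failed":
            raise FunctionNotReady(f"function is in Failed state: {status}")
        if state == "Active" and update != "InProgress":
            return  # safe to issue the next update or publish call
        if clock() >= deadline:
            raise FunctionNotReady(f"timed out after {timeout}s: {status}")
        sleep(interval)
```

As noted above, this is not atomic: another deploy could start between the last poll and the next call, so it only helps when one deploy runs at a time.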
we are also seeing this issue. setting |
Thanks @tj ! |
yikes so I guess you need to poll/wait before UpdateFunctionCode, UpdateFunctionConfiguration, and PublishVersion by the looks of it haha.. good old AWS, making things slow and difficult. I'll have to add some reasonable limit for now when it comes to the wait so it doesn't hang forever, but ideally it's configurable |
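A rough sketch of that ordering, with a wait before each mutating call and a configurable cap so it cannot hang forever. This is an assumption about the shape of the fix, not the code that shipped in Up. `run_steps_with_waits`, `is_ready`, and `max_wait` are hypothetical names; in practice the steps would be the UpdateFunctionCode, UpdateFunctionConfiguration, and PublishVersion calls, and `is_ready` would check the function state.

```python
import time
from typing import Callable, Sequence


def run_steps_with_waits(
    steps: Sequence[Callable[[], None]],
    is_ready: Callable[[], bool],
    max_wait: float = 300.0,
    interval: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> None:
    """Run each step only once is_ready() reports the function has settled.

    max_wait is the per-step limit in seconds (hypothetical default).
    """
    for step in steps:
        deadline = clock() + max_wait
        # Wait BEFORE the step: the previous step may have left the
        # function in a pending or in-progress state.
        while not is_ready():
            if clock() >= deadline:
                raise TimeoutError(f"function not ready after {max_wait}s")
            sleep(interval)
        step()
```

Injecting `is_ready`, `sleep`, and `clock` keeps the loop testable without touching AWS.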
re-opening until you guys can confirm the fix since I can't reproduce it. It'll take about 20m to get the releases built/uploaded. I guess the worst-case is some of them are actually getting stuck in that pending state |
Ok if you |
Confirmed I get the latest version and it works on the site that was failing. Thanks! Edit: I mean I got the latest version automatically when deploying via GH actions. Also, running |
@tj trying to run
|
ok, ran
Hth |
awesome thanks guys! I'll close for now 😄 |
Hi! |
Did you upgrade per the above? |
Yeah! |
Hi guys, same thing here: everything is on the latest version and I get the same error, even after putting the flags in the optional fields. |
Prerequisites
up upgrade
-v, --verbose flag.
Description
Please see:
https://apex-dev.slack.com/archives/C65P0GAV8/p1631749067003000
... and:
https://aws.amazon.com/blogs/compute/coming-soon-expansion-of-aws-lambda-states-to-all-functions/
I have this issue too, but it was first reported by Ben Nichols on the Slack #up channel.
Whether via CLI `up staging` and `up production`, or, in my case, via GitHub Actions, you get an error like: ... and the deployment fails.
Steps to Reproduce
Make a visible change in one of your branches and do `up staging` or `up production` as appropriate, or `git push` to the branch and have your CI run it. Either way, you get an error like the above.
As Ben Nichols mentioned, you can add `aws:states:opt-out` as the lambda description, to bypass the problem, but it's reportedly going to stop working as of 1st Oct 2021.
This feels like something other up users are suddenly going to experience, so it's my hope that someone can figure out how to change the code to fix this problem urgently.
Slack
Join us on Slack https://chat.apex.sh/