When multiple Lambda functions share code, build a single asset shared between them #989
Comments
We can also employ server-side copies (upload once then copy out)
@rix0rrr for my use case that would be bad since it would just make lots of extra copies and run up my S3 bill. Playing with the CDK has already created 27GB of stuff in my CDK assets bucket. FWIW, I implemented option 1 and it seems to work great in my case, but I think it would fail if I didn't use my first lambda in my app.
I understand. But the reason we have multiple copies of the same file for multiple assets is that they are in different prefixes. We have to allow for them evolving individually (maybe they're the same by accident), and the consuming resources need read permissions on the entire prefix so that jobs started on an old asset don't suddenly lose permission to read their file if the "current" version changes. Not applicable to Lambda so much, but definitely to CodeBuild, for example.
I see the issue with AssetCode. It's being silly by creating the same Asset object multiple times, and every Asset object triggers an upload. The proper solution would be to make AssetCode a construct, but in order not to break API compatibility I'm going to cache the Asset object like you've been doing locally. I will trigger a server-side copy if the same Asset data is uploaded multiple times to different Asset objects, but I'm not going to deduplicate inside the S3 bucket (in order to not run afoul of permission issues as mentioned before). I'm sorry about your S3 bill, but I think the more general solution to this will be some kind of garbage collection facility (as opposed to frugally uploading, which helps in the short term but not in the long run).
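To make the "upload once, then copy on the server side" idea concrete, here is a minimal sketch. It is not the CDK toolkit's actual code: `FakeBucket`, `FrugalUploader`, and `toyHash` are hypothetical names, the bucket is an in-memory stand-in for S3, and the toy hash stands in for a real content digest. Each asset keeps its own prefix, so nothing is deduplicated inside the bucket; only the client-side transfer is saved.

```typescript
// Hypothetical in-memory stand-in for an S3 bucket (assumption, for illustration only).
class FakeBucket {
  readonly objects: Record<string, string> = {};
  uploads = 0; // transfers from the client (these cost bandwidth)
  copies = 0;  // server-side copies (no client bandwidth)

  upload(key: string, data: string): void {
    this.objects[key] = data;
    this.uploads++;
  }

  copy(fromKey: string, toKey: string): void {
    if (!(fromKey in this.objects)) {
      throw new Error(`no such object: ${fromKey}`);
    }
    this.objects[toKey] = this.objects[fromKey];
    this.copies++;
  }
}

// Toy hash standing in for a real content digest; not suitable for production use.
function toyHash(data: string): string {
  let h = 7;
  for (let i = 0; i < data.length; i++) {
    h = (h * 31 + data.charCodeAt(i)) >>> 0;
  }
  return h.toString(16);
}

class FrugalUploader {
  // content hash -> key of the first object uploaded with that content
  private readonly firstKeyByHash: Record<string, string> = {};

  constructor(private readonly bucket: FakeBucket) {}

  // Stores `data` under its own prefix; uploads only if the content is new.
  put(prefix: string, data: string): string {
    const hash = toyHash(data);
    const key = `${prefix}/${hash}.zip`;
    const existing: string | undefined = this.firstKeyByHash[hash];
    if (existing === undefined) {
      this.bucket.upload(key, data);
      this.firstKeyByHash[hash] = key;
    } else if (existing !== key) {
      this.bucket.copy(existing, key);
    }
    return key;
  }
}

const bucket = new FakeBucket();
const uploader = new FrugalUploader(bucket);
const zip = "pretend this is a 50 MB zip";
const keyA = uploader.put("assets/FunctionA", zip);
const keyB = uploader.put("assets/FunctionB", zip);
```

In this sketch the second `put` costs one server-side copy instead of a second upload, and both prefixes still hold their own object, which matches the permission concern about prefixes raised above. Storage use is unchanged, so this does nothing for the S3 bill.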
This change implements two asset bandwidth conservation measures:

- If a lambda.AssetCode object is reused for multiple Lambdas, the same underlying Asset object will be reused (which leads to the asset data only being uploaded once).
- If nonetheless multiple Asset objects are created for the same source data, the data will only be uploaded once and subsequently copied on the server-side to avoid the additional data transfer.

Fixes #989.
Bug Fixes
========

* **aws-autoscaling:** allow minSize to be set to 0 ([#1015](#1015)) ([67f7fa1](67f7fa1))
* **aws-codebuild:** correctly pass the timeout property to CFN when creating a Project. ([#1071](#1071)) ([b1322bb](b1322bb))
* **aws-codebuild:** correctly set S3 path when using it as artifact. ([#1072](#1072)) ([f32cba9](f32cba9))
* **aws-kms:** add output value when exporting an encryption key ([#1036](#1036)) ([cb490be](cb490be))
* Switch from `js-yaml` to `yaml` ([#1092](#1092)) ([0b132b5](0b132b5))

Features
========

* don't upload the same asset multiple times ([#1011](#1011)) ([35937b6](35937b6)), closes [#989](#989)
* **app-delivery:** CI/CD for CDK Stacks ([#1022](#1022)) ([f2fe4e9](f2fe4e9))
* add a new construct library for ECS ([#1058](#1058)) ([ae03ddb](ae03ddb))
* **applets:** integrate into toolkit ([#1039](#1039)) ([fdabe95](fdabe95)), closes [#849](#849) [#342](#342) [#291](#291)
* **aws-codecommit:** use CloudWatch Events instead of polling by default in the CodePipeline Action. ([#1026](#1026)) ([d09d30c](d09d30c))
* **aws-dynamodb:** allow specifying partition/sort keys in props ([#1054](#1054)) ([ec87331](ec87331)), closes [#1051](#1051)
* **aws-ec2:** AmazonLinuxImage supports AL2 ([#1081](#1081)) ([97b57a5](97b57a5)), closes [#1062](#1062)
* **aws-lambda:** high level API for event sources ([#1063](#1063)) ([1be3442](1be3442))
* **aws-sqs:** improvements to IAM grants API ([#1052](#1052)) ([6f2475e](6f2475e))
* **codepipeline/cfn:** Use fewer statements for pipeline permissions ([#1009](#1009)) ([8f4c2ab](8f4c2ab))
* **pkglint:** Make sure .snk files are ignored ([#1049](#1049)) ([53c8d76](53c8d76)), closes [#643](#643)
* **toolkit:** deployment ui improvements ([#1067](#1067)) ([c832eaf](c832eaf))
* Update to CloudFormation resource specification v2.11.0

BREAKING CHANGES
========

* The ec2.Connections object has been changed to be able to manage multiple security groups. The relevant property has been changed from `securityGroup` to `securityGroups` (an array of security group objects).
* **aws-codecommit:** this modifies the default behavior of the CodeCommit Action. It also changes the internal API contract between the aws-codepipeline-api module and the CodePipeline Actions in the service packages.
* **applets:** The applet schema has changed to allow definition of multiple applets in the same file. The schema now looks like this:

      applets:
        MyApplet:
          type: ./my-applet-file
          properties:
            property1: value
            ...

  By starting an applet specifier with npm://, applet modules can directly be referenced in NPM. You can include a version specifier (@1.2.3) to reference specific versions.
* **aws-sqs:** `queue.grantReceiveMessages` has been removed. It is unlikely that this would be sufficient to interact with a queue. Alternatively you can use `queue.grantConsumeMessages` or `queue.grant('sqs:ReceiveMessage')` if there's a need to only grant this action.
What's the workaround here? Just pass in the same lambda.AssetCode() object ref to different lambdas?
Use case:
In order to build lambdas that use the Python science stack, we need to build a giant (50+MB) zip file with all the dependencies. We then add all of our Python functions to that zip and specify the same zip as the code for each lambda with a different handler.
The problem:
Each of the lambda functions that I create has a unique asset object (even if I create a single, shared lambda.Code object). Each of these asset objects creates an S3 object. So if I have ten lambda functions, I will upload ten separate asset objects at 50+ MB each.
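A toy model of the behaviour described above, just to put numbers on it. `NaiveAsset` and `NaiveAssetCode` are hypothetical classes that only loosely mirror the CDK ones, and the zip path is made up; the only assumption carried over from the issue is that every bind creates a fresh asset object and every asset object means one upload.

```typescript
// Hypothetical model (assumption): each asset object triggers its own upload.
let uploadedBytes = 0;

class NaiveAsset {
  constructor(readonly path: string, sizeBytes: number) {
    uploadedBytes += sizeBytes;
  }
}

class NaiveAssetCode {
  constructor(readonly path: string, readonly sizeBytes: number) {}

  // A new asset is created on every bind, even for a shared code object.
  bind(): NaiveAsset {
    return new NaiveAsset(this.path, this.sizeBytes);
  }
}

const MB = 1024 * 1024;
// One shared code object (hypothetical path), bound by ten functions.
const sharedZip = new NaiveAssetCode("./build/science-stack.zip", 50 * MB);
for (let i = 0; i < 10; i++) {
  sharedZip.bind();
}
const uploadedMB = uploadedBytes / MB; // 10 binds x 50 MB = 500 MB
```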
Potential solutions:
1. `lambda.AssetCode`: make the object unique by checking to see if `asset` is already set before creating a new one in `bind()`. This means that users would need to explicitly create the shared Code object and reuse it in each lambda definition.
2. `lambda.AssetCode`: add a map to `lambda.AssetCode` and pull the same asset into multiple `AssetCode` objects. In this scenario, users could simply reference the same file/directory.
3. `assets.Asset` class: in this example, we would cache the path and make sure that we only do the upload once.

I think the last solution is the only real solution, because the other two solutions don't handle the case where you create two lambdas but only wire the second one into your app, which is likely to happen in various scenarios.