[pipelines] Add stage to strip assets out of cloud assembly before deploying to CloudFormation #9917
Comments
I'm also seeing this error and am getting very close to hitting the limit: 253MB with 36 lambdas, 1 docker container, and two application stages, Staging and Prod. |
@MamishIo any progress here? |
Ran into this too while deploying to multiple regions, worked for 2 regions, got the limit on 3. |
@MamishIo is there any workaround for this? |
There is no easy workaround as of yet. |
What could work is producing 2 cloud artifacts from the synth step (one with the assets, one without) and then using property overrides to switch between them for the different actions. |
@rix0rrr Is there any timeline when this might be fixed? We're not able to use pipelines for a multi-region setup because of this. |
There is no timeline as of yet. |
Another workaround you could try is postprocessing the |
We stage assets into the Cloud Assembly directory. If there are multiple nested Cloud Assemblies, the same asset will be staged multiple times. This leads to an N-fold increase in size of the Cloud Assembly when used in combination with CDK Pipelines (where N is the number of stages deployed), and may even lead the Cloud Assembly to exceed CodePipeline's maximum artifact size of 250MB.

Add the concept of an `assetOutdir` next to a regular Cloud Assembly `outdir`, so that multiple Cloud Assemblies can share an asset directory. As an initial implementation, the `assetOutdir` of nested Cloud Assemblies is just the regular `outdir` of the root Assembly.

We are playing a bit fast and loose with the semantics of file paths across our code base; many properties just say "the path of X" without making clear whether it's absolute or relative, and if it's relative what it's relative to (`cwd()`? Or the Cloud Assembly directory?). Turns out that especially in dealing with assets, the answer is "can be anything" and things just happen to work out based on who is providing the path and who is consuming it.

In order to limit the scope of the changes I needed to make I kept modifications to the `AssetStaging` class:

* `stagedPath` now consistently returns an absolute path.
* `relativeStagedPath()` returns a path relative to the Cloud Assembly or an absolute path, as appropriate.

Related changes in this PR:

- Refactor the *copying* vs. *bundling* logic in `AssetStaging`. I found the current maze of `if`s and member variable changes too hard to follow to convince myself the new code would be doing the right thing, so I refactored it to reduce the branching factor.
- Switch the tests of `aws-ecr-assets` over to Jest using `nodeunitShim`.

Fixes #10877, fixes #9627, fixes #9917.
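To make the N-fold growth concrete, here is a minimal arithmetic sketch. The template and asset sizes are made-up numbers, and `assembly_size` is a hypothetical helper, not part of the CDK; it only contrasts per-stage asset copies with a shared asset directory, using the 256MB figure quoted elsewhere in this thread.

```python
# Hypothetical sizes, chosen only to illustrate the arithmetic.
MB = 1024 * 1024
CODEPIPELINE_LIMIT = 256 * MB  # deploy-action input limit quoted in this thread


def assembly_size(template_bytes: int, asset_bytes: int, stages: int,
                  shared_assets: bool) -> int:
    """Estimate the total Cloud Assembly size for `stages` nested assemblies."""
    # Every stage always carries its own templates.
    total = template_bytes * stages
    # Without a shared asset directory, every stage stages its own asset copy.
    total += asset_bytes if shared_assets else asset_bytes * stages
    return total


duplicated = assembly_size(1 * MB, 100 * MB, stages=3, shared_assets=False)
shared = assembly_size(1 * MB, 100 * MB, stages=3, shared_assets=True)

print(duplicated // MB, duplicated > CODEPIPELINE_LIMIT)
print(shared // MB, shared > CODEPIPELINE_LIMIT)
```

With these assumed numbers, two stages still fit under the limit and the third pushes the duplicated layout over it, which matches the "worked for 2 regions, got the limit on 3" report above.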
|
I'm not sure if this issue is completely fixed by #11008, as it is reported. #11008 only fixes issues of assets being needlessly duplicated—it doesn't do anything to solve the issues with assets needlessly moving forward in the pipeline and potentially hitting size limits. I'm currently encountering this issue, as my assets include several large Docker images with large build context dependencies. As a result, the CloudAssembly artifact hits 4.6GB in size by the time it goes forward into the CFN deployment stage. |
Also faced this issue recently. Even after some optimization, I'm uncomfortably close to the 256MB limit. |
@rix0rrr Any chance we could get this re-opened? See my comment above. |
This is continuously causing us pain when working through deploying a bunch of static assets via https://docs.aws.amazon.com/cdk/api/latest/docs/aws-s3-deployment-readme.html. |
This is currently causing us pain. The Master build won't go through on our pipeline. We have the following error staring at us: Action execution failed Please help. |
This is a real pain and it breaks our pipelines. Any chance the 256 MB limit can be increased? |
Any update on this? |
Does anyone have any working work-arounds for this - and @aws team, is there anything specific that could be worked on to aid in this? |
It would be great to at least have a viable workaround for this until a fix is put into place. It's causing a lot of pain. |
You can try something like this, it's quite hacky but it works for me.
Don't forget to add a S3::PutObject Permission to the ServiceRole. |
We hit this issue today too. @aws team it would sure be nice if someone took the time to add a clearer guide on how to work around this given that it doesn't sound like a fix is on the radar soon. thank you |
Works around the 256MB input artifact size limit for CFN deploy actions in CodePipeline, which can be exceeded due to asset files. Related to aws/aws-cdk#9917.
This might not be applicable for most people since my project is a bit weird (e.g. Java instead of TS, legacy CDK pipeline lib, CodeBuild synth via HtyCorp/serverbot2-core@d439729 The pipeline is spitting out 260MB assemblies now but deploying without any problems! Hope that helps someone even if it's not a great general solution. |
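Unless I'm mistaken, all assets are already published past the

```python
from aws_cdk.pipelines import CodeBuildStep

# `pipeline` is an existing aws_cdk.pipelines.CodePipeline instance.
strip_assets_step = CodeBuildStep(
    'StripAssetsFromAssembly',
    input=pipeline.cloud_assembly_file_set,
    commands=[
        # Derive the artifact's S3 location from the ARN in CODEBUILD_SOURCE_VERSION.
        'S3_PATH=${CODEBUILD_SOURCE_VERSION#"arn:aws:s3:::"}',
        'ZIP_ARCHIVE=$(basename $S3_PATH)',
        # Drop the asset files, re-zip what is left, and overwrite the artifact in place.
        'rm -rfv asset.*',
        'zip -r -q -A $ZIP_ARCHIVE *',
        'aws s3 cp $ZIP_ARCHIVE s3://$S3_PATH',
    ],
)
pipeline.add_wave('BeforeDeploy', pre=[strip_assets_step])
# Add your stages...
pipeline.build_pipeline()
# The step's CodeBuild project is only available once the pipeline has been built.
pipeline.pipeline.artifact_bucket.grant_write(strip_assets_step.project)
```
|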
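The two shell parameter expansions in that step are easy to misread, so here is a rough Python equivalent. It assumes, as the step itself does, that `CODEBUILD_SOURCE_VERSION` holds the S3 ARN of the input artifact; `split_artifact_arn` and the example ARN are hypothetical names used only for illustration.

```python
import posixpath
from typing import Tuple

ARN_PREFIX = "arn:aws:s3:::"


def split_artifact_arn(source_version: str) -> Tuple[str, str]:
    """Mirror the S3_PATH and ZIP_ARCHIVE assignments from the shell commands.

    Returns (s3_path, zip_archive): the ARN without its "arn:aws:s3:::" prefix,
    and the last path component of the artifact key.
    """
    # ${CODEBUILD_SOURCE_VERSION#"arn:aws:s3:::"} strips the prefix only if present.
    if source_version.startswith(ARN_PREFIX):
        s3_path = source_version[len(ARN_PREFIX):]
    else:
        s3_path = source_version
    # $(basename $S3_PATH) keeps only the object's file name.
    zip_archive = posixpath.basename(s3_path)
    return s3_path, zip_archive


# Hypothetical artifact ARN, for illustration only.
s3_path, zip_archive = split_artifact_arn(
    "arn:aws:s3:::my-artifact-bucket/MyPipeline/SynthOutp/abc123.zip"
)
print(s3_path)
print(zip_archive)
```

Because the stripped archive is uploaded back to `s3://$S3_PATH`, the step overwrites the original artifact object rather than producing a new one.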
@tobni the strip_assets_step is working correctly for me and shows the artifact SynthOutput is 1.1 MB however the subsequent stages in the wave still get an input artifact SynthOutput that's 200MB+. Is there a missing step to get them to use the output from strip_assets_step? Edit: |
I finally got @tobni's code to work with cross region replication, which uses a different randomly named bucket for every region!
And you need the following permissions
|
@tobni and @jonathan-kosgei thanks a lot guys for the help.
|
Tagging @rix0rrr and @MamishIo following advice from the Comment Visibility Warning. I ran into this issue today. I believe the current situation is that people have found a bit of an icky workaround in adding extra CodeBuildSteps to clean out the assets in the SynthOutput (See above comments) but it would be great to not have to do this. Based on what others have said it seems like the SynthOutput doesn't need to be passed at all in the first place and this could be removed? Doing so would render this workaround unneeded. |
We hit this issue this week and had to put together a workaround from the answers here. Adding to the comments from @jonathan-kosgei to add a version of the

```python
import aws_cdk as cdk
from aws_cdk.pipelines import CodeBuildStep

# `pipeline` is an existing aws_cdk.pipelines.CodePipeline instance.
strip_assets_step = CodeBuildStep(
    'StripAssetsFromAssembly',
    input=pipeline.cloud_assembly_file_set,
    commands=[
        # Collect the replication bucket names from the synthesized cross-region support stacks.
        "cross_region_replication_buckets=$(grep BucketName cross-region-stack-* | awk -F 'BucketName' '{print $2}' | tr -d ': ' | tr -d '\"' | tr -d ',')",
        'S3_PATH=${CODEBUILD_SOURCE_VERSION#"arn:aws:s3:::"}',
        'ZIP_ARCHIVE=$(basename $S3_PATH)',
        # Strip the assets, re-zip, and overwrite the artifact in the main bucket.
        'rm -rf asset.*',
        'zip -r -q -A $ZIP_ARCHIVE *',
        'aws s3 cp $ZIP_ARCHIVE s3://$S3_PATH',
        # Copy the stripped archive to the same key in every replication bucket.
        'object_location=${S3_PATH#*/}',
        'for bucket in $cross_region_replication_buckets; do aws s3 cp $ZIP_ARCHIVE s3://$bucket/$object_location; done'
    ],
)
```

You can also access the replication names dynamically from

```python
pipeline.build_pipeline()

# Collect the ARNs of every replication bucket and of the objects inside them.
cross_region_support = pipeline.pipeline.cross_region_support
replication_bucket_arns = [
    cross_region_support[key].replication_bucket.bucket_arn
    for key in cross_region_support.keys()]
replication_bucket_objects = [arn + '/*' for arn in replication_bucket_arns]
replication_resources = replication_bucket_arns + replication_bucket_objects

pipeline.pipeline.artifact_bucket.grant_write(strip_assets_step.project)
# Allow the step to write to the replication buckets.
strip_assets_step.project.add_to_role_policy(
    cdk.aws_iam.PolicyStatement(
        effect=cdk.aws_iam.Effect.ALLOW,
        resources=replication_resources,
        actions=["s3:*"],
    )
)
# Allow the step to generate data keys for the encrypted artifact buckets.
strip_assets_step.project.add_to_role_policy(
    cdk.aws_iam.PolicyStatement(
        effect=cdk.aws_iam.Effect.ALLOW,
        resources=["*"],
        actions=["kms:GenerateDataKey"]
    )
)
```
|
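As a sketch of what the `object_location` expansion and the `for bucket in ...` loop compute, the following plain-Python function derives the upload target in each replication bucket. `replication_targets` and the bucket names are hypothetical; nothing here calls AWS.

```python
from typing import List


def replication_targets(s3_path: str, replication_buckets: List[str]) -> List[str]:
    """Return the s3:// URI of the stripped archive in each replication bucket.

    `s3_path` is "<main-bucket>/<object-key>", as produced by the S3_PATH
    assignment in the shell commands.
    """
    # ${S3_PATH#*/} drops everything up to and including the first "/",
    # i.e. the main bucket name, leaving only the object key.
    _bucket, separator, object_location = s3_path.partition("/")
    if not separator:
        # Without a "/" the shell expansion leaves the value unchanged.
        object_location = s3_path
    return [f"s3://{bucket}/{object_location}" for bucket in replication_buckets]


# Hypothetical bucket names, for illustration only.
targets = replication_targets(
    "main-bucket/MyPipeline/SynthOutp/abc123.zip",
    ["replica-eu-west-1", "replica-us-west-2"],
)
for target in targets:
    print(target)
```

The object key stays identical across buckets; only the bucket name changes, which is why the step needs write access to every replication bucket.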
@rix0rrr it seems the common workaround is to wipe out the assets. Is this a suggested workaround? |
Throwing in my TypeScript solution for cross-region buckets based on the above:

```typescript
const { crossRegionSupport, artifactBucket } = pipeline.pipeline
// The main artifact bucket plus one replication bucket per additional region.
const artifactBuckets = [
  artifactBucket,
  ...Object.values(crossRegionSupport).map((crs) => crs.replicationBucket),
]
for (const bucket of artifactBuckets) {
  bucket.grantReadWrite(stripAssetsStep.project)
}
```
|
How about this mad solution: Create an object lambda access point on the bucket. Lambda would filter the artifacts on the fly, and remove unnecessary files. The only thing I am unsure how to achieve is to tell the steps below to use the access point, instead of the bucket directly. I am guessing this would be possible to do at the CDK core level, but I'm not sure if it would be possible to do as a "workaround". |
I am also leaving my Java solution based on @tobni 's implementation. Thanks a lot!
|
this is annoying and honestly feels like something that cdk/pipelines should handle.
Add this to your |
For those using Java, here is what is working for me in my multi-region CDK pipeline
Thanks to all that have contributed to this issue with examples in different languages! |
Adding to this, we ran into this today. Any updates on a solution or recommended work around that doesn't involve adding a step? |
We also ran into this issue yesterday. Is there any solution covering Python without adding a new step? |
To avoid the CodePipeline artifact size limit in CloudFormation deploy actions, the pipeline should generate an intermediate artifact which is the cloud assembly but with asset files removed, and use this as the input for the deploy actions.
Use Case
Regardless of the source provider used, CFN deploy actions have an input artifact size limit of 256MB. The CDK pipeline uses the initial cloud assembly, containing all asset files, all the way through to the CFN action inputs, even though the stacks don't require them (as far as I understand the asset system, all assets are published and linked to CFN parameters by this point).
For builds that produce large/multiple assets totalling over 256MB, this causes CodePipeline limit errors in the deployment stages. Assemblies up to 1GB or 5GB (depending on the source provider) could be produced with this change.
Specific example: monorepos used to build many related services that are all deployed as separate containers/functions/etc.
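To check whether a given build is affected, a small script can compare the assembly size with and without its assets against the limit. This is a minimal sketch that assumes assets sit at the top level of the assembly directory under names starting with `asset.`, the same pattern the `rm -rf asset.*` workaround in the comments relies on; `assembly_sizes` and `exceeds_limit` are hypothetical helpers.

```python
import os
from typing import Tuple

CFN_ACTION_LIMIT = 256 * 1024 * 1024  # limit stated in this issue


def assembly_sizes(assembly_dir: str) -> Tuple[int, int]:
    """Return (total_bytes, bytes_without_assets) for a cloud assembly directory."""
    total = 0
    without_assets = 0
    for root, _dirs, files in os.walk(assembly_dir):
        rel = os.path.relpath(root, assembly_dir)
        for name in files:
            size = os.path.getsize(os.path.join(root, name))
            total += size
            # The top-level entry this file belongs to: the file itself when it
            # sits in the assembly root, otherwise its first directory component.
            top_level = name if rel == "." else rel.split(os.sep)[0]
            if not top_level.startswith("asset."):
                without_assets += size
    return total, without_assets


def exceeds_limit(size_bytes: int) -> bool:
    """True if an artifact of this uncompressed size is over the limit."""
    return size_bytes > CFN_ACTION_LIMIT
```

Calling `assembly_sizes("cdk.out")` after a synth returns both numbers. The artifact is zipped before upload, so the uncompressed totals are only an upper bound on what CodePipeline sees.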
Proposed Solution
Add an extra pipeline stage after asset publishing and before application stage deployment, which runs a CodeBuild action to load the cloud assembly, strip out asset files, and generate a new artifact containing only the CFN templates and any data necessary for CFN. The CFN actions should use this new artifact as their input.
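The stripping itself could look roughly like the following sketch, which writes a new zip containing everything except top-level `asset.*` entries. It is an illustration under assumptions, not the pipeline's implementation: `write_stripped_assembly` is a hypothetical helper, and it assumes that no template refers to files inside the removed asset directories.

```python
import os
import zipfile
from typing import List


def write_stripped_assembly(assembly_dir: str, zip_path: str) -> List[str]:
    """Zip `assembly_dir` into `zip_path`, skipping top-level asset.* entries.

    `zip_path` should live outside `assembly_dir` so the archive does not
    include itself. Returns the archive member names that were written.
    """
    written = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, dirs, files in os.walk(assembly_dir):
            if root == assembly_dir:
                # Prune asset directories in place so os.walk never descends
                # into them, and skip asset files in the assembly root.
                dirs[:] = [d for d in dirs if not d.startswith("asset.")]
                files = [f for f in files if not f.startswith("asset.")]
            for name in sorted(files):
                full_path = os.path.join(root, name)
                member = os.path.relpath(full_path, assembly_dir).replace(os.sep, "/")
                archive.write(full_path, member)
                written.append(member)
    return written
```

Writing a new archive, rather than overwriting the original as the comment workarounds do, keeps the full assembly available to the asset publishing actions.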
Other
This is currently exacerbated by the lack of a de-dupe option when deploying multiple application stages using the same assets - I believe feature request #9627 will reduce code size substantially.
Overall code size can be reduced by using Lambda layers, but this adds build and deploy complexity compared to using standalone code assets.
[] 👋 I may be able to implement this feature request
[]⚠️ This feature might incur a breaking change
(⚠️ assumes I haven't overlooked some need for the CFN actions to have direct access to asset files)
This is a 🚀 Feature Request