Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(lambda-python): asset hash is non-deterministic #12984

Merged
merged 14 commits into from
Feb 24, 2021

Conversation

eladb
Copy link
Contributor

@eladb eladb commented Feb 11, 2021

When a Python handler uses external dependencies, the hash calculated on the output is non-determinstic due to the fact that it includes timestamps. To resolve that, change the asset hash strategy to SOURCE which means that the bundle will only be re-created if one of the source files changes.

Additionally, the Python image hash, which is also included in the asset hash (to ensure that if the build image changes, the bundle is invalidated) included the absolute path for the Dockerfile. This caused the image hash itself to change every time the image was built on a different system. To fix this, we stage the dependency files & dockerfile into a temp directory and build the image from there.

Fixes #12770
Fixes #12684


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license

When a Python handler uses external dependencies, the hash calculated on the output is non-determinstic due to the fact that it includes timestamps. To resolve that, change the asset hash strategy to `SOURCE` which means that the bundle will only be re-created if one of the source files changes.

Fixes #12770
Fixes #12684
@gitpod-io
Copy link

gitpod-io bot commented Feb 11, 2021

@eladb eladb requested a review from nija-at February 11, 2021 14:17
@eladb eladb requested a review from iliapolo February 11, 2021 14:17
@mergify mergify bot added the contribution/core This is a PR that came from AWS. label Feb 11, 2021
Copy link
Contributor

@iliapolo iliapolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This means effectively forcing users to modify their requirements in order to push an updated version of a dependency. Maybe make this configurable?

@eladb
Copy link
Contributor Author

eladb commented Feb 11, 2021

This means effectively forcing users to modify their requirements in order to push an updated version of a dependency. Maybe make this configurable?

What do you mean by configurable? What kind of API would you imagine? deployOnEveryUpdate? This conflicts with one of the foundational tenets we have in the CDK which says that only when the source changes your app changes. We have multiple places in our tooling (CLI/CDK pipelines) which rely on that tenet.

Technically, they don't need to modify requirements.txt, just to change something in their source. So they can just add a comment somewhere and bump a random value in order for their bundle to be recreated.

Let me know how do you propose to model this.

@iliapolo iliapolo added the pr/do-not-merge This PR should not be merged at this time. label Feb 11, 2021
Copy link
Contributor

@iliapolo iliapolo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This conflicts with one of the foundational tenets we have in the CDK which says that only when the source changes your app changes.

Yeah, and I'm all for it. But we do have this option of using OUTPUT, which presumably addresses changes in dependencies (?). I'm just wondering if we should address the use case of users explicitly not locking their dependencies, expecting them to update on build, which is what happens now.

As far as API, yeah something like deployOnEveryUpdate, or maybe even straight up expose hashType for advanced users. I can try to come up with something less awkward if you choose to pursue it. But i'm also good with using SOURCE for now and possibly extending later.

Added do-not-merge.

@fsmanuel
Copy link

@eladb thanks for working on this!
@iliapolo I can partially understand your concerns. As I tried to explain in #12770 I'm worried about the current side effects.

I think the best solution would be to allow to set assetHashType via PythonFunctionProps and PythonLayerVersionProps. We can stick to the current behavior but make it possible to change it to SOURCE if desired.

@eladb
Copy link
Contributor Author

eladb commented Feb 22, 2021

I'll add a way to control this, but I think the default should be SOURCE

Elad Ben-Israel added 8 commits February 22, 2021 13:46
@eladb eladb removed the pr/do-not-merge This PR should not be merged at this time. label Feb 24, 2021
@aws-cdk-automation
Copy link
Collaborator

AWS CodeBuild CI Report

  • CodeBuild project: AutoBuildProject89A8053A-LhjRyN9kxr8o
  • Commit ID: c81557b
  • Result: SUCCEEDED
  • Build Logs (available for 30 days)

Powered by github-codebuild-logs, available on the AWS Serverless Application Repository

@mergify
Copy link
Contributor

mergify bot commented Feb 24, 2021

Thank you for contributing! Your pull request will be updated from master and then merged automatically (do not update manually, and be sure to allow changes to be pushed to your fork).

@mergify mergify bot merged commit 37debc0 into master Feb 24, 2021
@mergify mergify bot deleted the benisrae/python-function-use-source-hash branch February 24, 2021 10:13
TRANTANKHOA added a commit to TRANTANKHOA/cdk-ecr-deployment that referenced this pull request Apr 30, 2021
set `assetHashType: AssetHashType.SOURCE`, see aws/aws-cdk#12984
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/aws-lambda-python contribution/core This is a PR that came from AWS.
Projects
None yet
4 participants