Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve build performance of NodejsFunction on Mac (@aws-cdk/aws-lambda-nodejs package) #8544

Closed
1 of 2 tasks
Dzhuneyt opened this issue Jun 14, 2020 · 11 comments · Fixed by #8766
Closed
1 of 2 tasks
Assignees
Labels
@aws-cdk/aws-lambda-nodejs effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. in-progress This issue is being actively worked on. os/mac Related specifically to MacOS behavior p1

Comments

@Dzhuneyt
Copy link
Contributor

Using volumes in Docker on Mac specifically is a huge pain point due to performance issues. CPU often spikes to above 100% utilization when there is heavy read or write operations inside the container (as it happens with the parcel-builder Docker container that the CDK construct NodejsFunction uses internally.

This has been very well documented in https://docs.docker.com/docker-for-mac/osxfs-caching/ and the workarounds are also described.

I suggest we force Docker volume mounts within this construct to start using Docker volume in "delegated" mode, to improve build performance (speed) on Mac based deployment environments.

I think a good starting point is:
https://github.com/aws/aws-cdk/blob/master/packages/%40aws-cdk/aws-lambda-nodejs/lib/builder.ts#L118

Use Case

Developing and deploying CDK stacks with many lambdas on a Mac machine.

Proposed Solution

Replace -v sourcevolume:targetvolume with -v sourcevolume:targetvolume:delegated

Other

  • 👋 I may be able to implement this feature request
  • ⚠️ This feature might incur a breaking change

This is a 🚀 Feature Request

@Dzhuneyt Dzhuneyt added feature-request A feature should be added or improved. needs-triage This issue or PR still needs to be triaged. labels Jun 14, 2020
@jogold
Copy link
Contributor

jogold commented Jun 15, 2020

@Dzhuneyt would this apply to other OS as well? Can you propose the change in core (lambda-nodejs is going to be refactored to use the bundling in core)? Should be an option in DockerVolume:

/**
* A Docker volume
*/
export interface DockerVolume {

cc @eladb

@Dzhuneyt
Copy link
Contributor Author

Other OSs don't suffer from this performance issue and the ":cached" or ":delegated" flag is a no-op for them.

@Dzhuneyt
Copy link
Contributor Author

@jogold could you point me to another GitHub issue or a place where I can learn more about this upcoming refactoring? Regardless, I think this fix should go forward with the current implementation until such refactoring happens (because it's tiny and the benefit for developers that use MacOS as a dev env of choice are pretty obvious).

@jogold
Copy link
Contributor

jogold commented Jun 15, 2020

@jogold could you point me to another GitHub issue or a place where I can learn more about this upcoming refactoring?

Not opened yet, will try to start the discussion in an issue/PR this week or next week.

@moltar
Copy link
Contributor

moltar commented Jun 16, 2020

Is there a reason why it runs inside a Docker container anyways? Seems to be a bit heavy handed. I think there should be an option to run it locally.

@jogold
Copy link
Contributor

jogold commented Jun 16, 2020

could you point me to another GitHub issue or a place where I can learn more about this upcoming refactoring?

@Dzhuneyt see #8576

Is there a reason why it runs inside a Docker container anyways?

See Why we use Docker? in #8460
See also aws/aws-cdk-rfcs#7

@SomayaB SomayaB added os/mac Related specifically to MacOS behavior and removed needs-triage This issue or PR still needs to be triaged. labels Jun 17, 2020
@Dzhuneyt
Copy link
Contributor Author

@jogold I plan on working on the implementation in "core" at some point. Does this mean this PR is unnecessary? Personally, I still see value in it being pushed live until such refactoring comes into effect, but the decision is with you.

@Dzhuneyt
Copy link
Contributor Author

@Dzhuneyt would this apply to other OS as well? Can you propose the change in core (lambda-nodejs is going to be refactored to use the bundling in core)? Should be an option in DockerVolume:

/**
* A Docker volume
*/
export interface DockerVolume {

cc @eladb

Has any work started on this part "lambda-nodejs is going to be refactored to use the bundling in core"?

@jogold
Copy link
Contributor

jogold commented Jun 22, 2020

@Dzhuneyt #8576 has been merged, lambda-nodejs now uses the bundling from core.

@eladb eladb added effort/medium Medium work item – several days of effort p1 labels Jun 22, 2020
@SomayaB SomayaB added the in-progress This issue is being actively worked on. label Jun 25, 2020
@jogold
Copy link
Contributor

jogold commented Jun 26, 2020

@Dzhuneyt do you still want to propose this change? Let met know if you need some help.

@Dzhuneyt
Copy link
Contributor Author

Yes, I will try to implement this. It's my first major contribution to AWS CDK, so I might need some guidance. I'll write here if i need any help.

@mergify mergify bot closed this as completed in #8766 Jul 1, 2020
mergify bot pushed a commit that referenced this issue Jul 1, 2020
First time contributor to AWS CDK here. Let me know if I've missed something.

As a developer who uses macOS for development purposes on a daily basis and AWS CDK for an enterprise project, I am really impacted by the official and [well known Docker volumes performance issue](https://docs.docker.com/docker-for-mac/osxfs-caching/). Since the project that we are working on, relies heavily on Lambdas that are primarly written in TypeScript, the mass deployment of these Lambdas using AWS CDK CLI and the (recently introduced) NodejsFunction construct takes a significant amount of time and causes a CPU overload.

This is mainly due to the fact that the construct internally does the following things:
1. Starting a docker container with Parcel
2. Mounting the root of the closest folder that contains .git (this is the root of the whole CDK app for us, that includes the CDK stacks, the Lambdas and any other utility functions and node_modules folders, since we opted for the "monorepo" structural pattern)
3. Runs the parcel bundler inside the container, which emits a single .js file as an asset that will serve as the Lambda's "code"
4. Deploys the .js file using the standard lambda.Function construct (NodejsFunction is just a wrapper of lambda.Function with @aws-cdk/core bundling and Parcel involved).

For our project, this means CDK deployment spawns dozens of containers (one per Lambda), that all mount (using Docker volumes), the whole project. The frequent IO operations by Parcel within the container cause a massive CPU spike in all of these containers, causing each Lambda compilation to take 1-2 seconds. This latency is quickly magnified as the number of Lambdas grow within the project.

The latency and CPU spike is mainly a side effect of how Docker and volumes work in macOS for which you can read more on the above link. This is a common pain point for all developers who use these tools, even outside the AWS CDK world. Here are some examples and further reading, including further reading on how flags like ":cached" or ":delegated" help:

- docker/for-mac#1759 (comment)
- https://stackoverflow.com/questions/58277794/diagnosing-high-cpu-usage-on-docker-for-mac/58293240#58293240
- https://stackoverflow.com/questions/51694789/macbook-pro-2018-high-temperature-and-cpu-usage-with-docker/51698665#51698665
- https://stackoverflow.com/questions/60878918/docker-hyperkit-process-cpu-usage-going-crazy-how-to-keep-it-under-control#comment107712547_60878918
- https://stackoverflow.com/questions/55951014/docker-in-macos-is-very-slow/55953023#55953023

I've decided to use ":delegated" here since the bundling mechanism generates files inside the container that need replication on the host machine with small tolerable delay, not the other way around (where ":cached" would have been useful). The addition of the flag is non-conditional based on the current OS, because the existence of the flag is a NO-OP in all other operating systems (tested) and is only taken into account by Docker when the volume driver is detected to be osxfs.

Closes #8544
----

*By submitting this pull request, I confirm that my contribution is made under the terms of the Apache-2.0 license*
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
@aws-cdk/aws-lambda-nodejs effort/medium Medium work item – several days of effort feature-request A feature should be added or improved. in-progress This issue is being actively worked on. os/mac Related specifically to MacOS behavior p1
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants