Imports of large Python library (aws-cdk-lib) extremely slow #3389
Comments
Hi, original reporter of aws/aws-cdk#19000 here. 🙂 I wonder if it's something wrong with my environment. I didn't originally try out CDK TypeScript and now wanted to check how the experience is there. To my surprise, it's quite slow... In a directory initialized with $ time cdk ls --debug
CdkTsWorkshopStack
cdk ls --debug 9.44s user 1.52s system 86% cpu 12.634 total Any suggestions to what I should look into? |
Did another test, running in Docker (to exclude my Macbook from the equation), and the results are quite different:
Setup TypeScript:
$ docker run -it --rm --workdir /cdk_workshop node:16-bullseye bash
# inside container:
$ npm install -g cdk@2.12.0
$ cdk init sample-app --language typescript
Test:
$ time cdk ls
CdkWorkshopStack
real 0m6.648s
user 0m7.601s
sys 0m1.378s
Setup Python:
$ docker run --rm -it --workdir /cdk_workshop python:3.9-bullseye bash
# inside container:
$ curl -fsSL https://deb.nodesource.com/setup_16.x | bash - && apt-get install -y nodejs
$ npm install -g cdk@2.12.0
$ cdk init sample-app --language python
$ source .venv/bin/activate
$ pip install -r requirements.txt
Test:
$ time cdk ls
cdk-workshop
real 0m6.195s
user 0m4.823s
sys 0m1.832s
$ time python -c "import aws_cdk as cdk"
real 0m6.495s
user 0m5.019s
sys 0m1.954s
Edit: updated with Python import timing result |
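For anyone who wants to repeat the import timing from inside Python rather than with the shell's time, here is a minimal sketch. The helper name time_import is made up for illustration; it only uses the standard library and measures a single import in the current process.

```python
import importlib
import time


def time_import(module_name: str) -> float:
    """Return the wall-clock seconds spent importing ``module_name``.

    Only the first import in a process is meaningful: later imports are
    served from sys.modules and return almost instantly.
    """
    start = time.perf_counter()
    importlib.import_module(module_name)
    return time.perf_counter() - start
```

In a fresh interpreter with the CDK installed, print(time_import("aws_cdk")) should give a number comparable to the real value above. CPython's own python -X importtime -c "import aws_cdk" prints a per-module breakdown, although it only covers the Python side and not the time spent in the node child process.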
Removing the It seems like this report suggests that: importing Note that importing This perhaps has to do more with I'm going to try to come up with a "somewhat scientific" measurement protocol here (across all supported languages), to try and better qualify where the problem might originate from. |
Okay so far I do see there is a significant delta between the various "minimal apps":
I can see affected languages would be: C#, Python, and Java. Although this has a sample-size of 1. Interestingly, Go has lower user + system times, but higher wall time than JavaScript (this is somewhat surprising -- where do the extra 3 seconds go?) |
Focusing on Python... using
That is ~70% of the runtime accounted by reading from the child process' output stream... That might be time spent waiting for the node process to return data... but I guess the python-side profile doesn't quite tell the full picture on this. |
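The profiler invocation used above did not survive in this thread, so the following is only a sketch of one way to get a comparable Python-side profile with the standard cProfile and pstats modules. The function name profile_import is an assumption for illustration.

```python
import cProfile
import importlib
import io
import pstats


def profile_import(module_name: str, top: int = 10) -> str:
    """Profile a single import and return the ``top`` entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        importlib.import_module(module_name)
    finally:
        profiler.disable()

    # Render the report into a string instead of printing to stdout.
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()
```

Running print(profile_import("aws_cdk")) in a fresh interpreter should show where the Python side spends its time. As noted above, time blocked on the child process' output stream shows up as a read on the Python side, so this profile alone cannot say what the node process was doing.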
I've actually gotten conflicting data on the |
On a different machine – 2016 MacBook Pro 13 (Intel i7), quite a big step down from the 2019 MacBook Pro 16 that I initially did the testing on, it actually behaves faster:
$ time npx cdk@2 ls
cdk-19000
npx cdk@2 ls 6.90s user 2.84s system 106% cpu 9.164 total
$ time python3 -c 'import aws_cdk'
python3 -c 'import aws_cdk' 5.40s user 2.53s system 104% cpu 7.556 total
$ npx cdk@2 --version
2.13.0 (build b0b744d)
~7 sec is an improvement over ~17 sec, so if I could get the same performance on my main machine... As mentioned, I get better performance when running in a docker container – I'm contemplating setting up my dev environment for CDK to run in Docker... |
As a workaround, I think this is something I'll be able to use:
version: '3'
services:
  cdk:
    build: .
    working_dir: /cdk
    volumes:
      - ./:/cdk
    entrypoint: ["bash", "-c", "trap exit INT TERM; while :; do sleep 1 & wait; done;"]

FROM python:3.9-bullseye
RUN curl -fsSL https://deb.nodesource.com/setup_16.x | bash - && apt-get install -y nodejs && \
    npm install -g cdk@2.12.0
Then these steps:
|
Regarding Docker Compose (not strictly on-topic, but I think this might be helpful for others coming here): The docker-compose approach didn't work for me in the end. For |
@kbakk did you ever find a resolution for this? On my win10 machine, CDK commands execute painfully slowly (over 60s for a
> time python -c "import aws_cdk as cdk"
real 0m50.914s
user 0m0.015s
sys 0m0.062s
I see similar slowness running
> time cdk synth -q
real 0m47.562s
user 0m0.153s
sys 0m0.339s
Interestingly, I get much better performance when using an Ubuntu 20.04 WSL instance from the same machine:
> time python -c "import aws_cdk as cdk"
real 0m10.985s
user 0m9.548s
sys 0m1.731s
When I profiled app.py, I also saw a significant portion of time being taken up in |
No, still slow. I'm currently running these commands on a Linux VM (hosted in Google Cloud, can't award AWS for this 😄). |
might be related #3365 |
I've just installed cdk on fedora35 and am using python. "cdk ls" in a new project is taking >6s. The node process which is spawned by "import aws_cdk as cdk" is taking most of the time, and I notice that it's populating 7000+ files each time in a new directory /tmp/jsii-kernel-rAnDoM under subdirs: node_modules/{aws-cdk-lib,constructs}. I can watch it taking seconds to do this. (The directory is usually destroyed at the end of each command.) I don't know much about node. I tried installing aws-cdk-lib globally with npm (having originally installed aws-cdk as per the docs) but the behaviour is unchanged. Everything works, but just with this overhead on each command. |
Whenever I see Windows be slower than WSL, the first thing that comes to mind is Windows Defender. Have you excluded the containing folder from Defender to count that out? |
@RomainMuller has there been any progress from your side (or anyone else at AWS) on this? Is there anything that can be done to ensure this gets attention? |
Hello, we observe the same issue. Simple python file with the content |
We spent some time debugging the CDK load process. It is somewhat unwieldy due to two major issues/design decisions that contribute to the problem. I will outline it here for the general public, as I assume that AWS people decided on these tradeoffs knowingly. The first issue is with the JSII distribution itself. Unlike python modules, where the source code is already present in the python library directory, the JSII package comes as a tarball containing the typescript files. On each run of the python program, the tarball is extracted to a temporary directory and executed from the temp dir. This is further slowed down by the fact that the file is gzipped, so CPU time must be expended to decompress. Finally, the decompression and unpacking is performed in javascript, which is much slower than the native 'tar' command. We see no user benefit in having the tarball decompressed on each program run. Instead, we worked around this by unpacking the files once and re-using the same path for each application run. Getting rid of the repeated unpacking already saves almost half of the load time -- we were able to get to about 4 seconds instead of 7. The other issue is that aws_cdk is now packaged as a myriad of modules, one for each AWS service (EC2, SSM, ...), all together. When the JSII initializes, it loads and interprets all the code in one go, as opposed to lazy loading of what is actually required. This would be advantageous if the majority of the modules were actually required, but in a realistic project only a handful of the 200+ modules are used. This results in a terrible waste as we are actually loading thousands of callable methods that are never called. Unfortunately, working around that isn't very easy as it requires editing the huge (50MB+) .jsii manifest which we couldn't do by hand. However, we suspect that this factor is responsible for the majority of the rest of the load times. 
I'd like to stress that I find it not acceptable to add 7+ seconds to any program's load time without a very good justification. It strikes me as odd that this was not noticed or given weight during internal testing. I do like to work with CDK and I am open to discussing possible solutions, rather sooner than later. |
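To make the "lazy loading of what is actually required" idea concrete, here is a small sketch using the standard library's importlib.util.LazyLoader. It illustrates lazy loading in plain Python in general; it is not how the jsii-generated bindings work, and whether the technique could be applied to them is an open question. The helper name lazy_import is made up.

```python
import importlib.util
import sys


def lazy_import(name: str):
    """Return a module whose body only executes on first attribute access."""
    if name in sys.modules:
        # Already loaded (eagerly or lazily): nothing left to defer.
        return sys.modules[name]

    spec = importlib.util.find_spec(name)
    if spec is None or spec.loader is None:
        raise ImportError(f"cannot find module {name!r}")

    # Wrap the real loader so execution is postponed until the module is used.
    loader = importlib.util.LazyLoader(spec.loader)
    spec.loader = loader
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    loader.exec_module(module)  # With LazyLoader this only sets up the deferral.
    return module
```

With something like this, a module that is imported but never touched costs close to nothing at startup, which is the property the comment above is asking for with respect to the 200+ service modules.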
This is odd though, I have only seen complaints about this from Python developers at this stage... The bundling/loading is done in the exact same way in all languages... so why is Python the only one experiencing the slowness? Or am I missing signals in other languages, too? |
.NET is also painfully slow, to the point where we dropped it in favor of TypeScript. Even TypeScript isn't exactly snappy, though it's better than Python and .NET by a margin. |
I have dug deeper on this subject and found out that (sample size = 1):
From this perspective:
Specifically on some of @du291 claims (by the way, thank you for the detailed writeup):
My experimentations demonstrated that the
I initially did not believe the difference would be big enough to be material. When we chose this mechanism about 4/5 years ago, we had run some benchmarks and the npm library performed similar to the
Unpacking on every run was deemed safer (guarantees the files on-disk have not been tampered with, removes some race condition risks, etc...). It also consumes less disk space at rest.
Given the research above that confirms your findings, I guess we can easily improve the situation a lot by doing something similar... I don't know how you got around to do your deed here, but if it makes sense & you're able and willing to file a PR with what you have... this might give us a headstart.
This is correct. I would note though that in this particular area, using TypeScript gives you no edge. The way
It's obviously hard to disagree with this statement. In an ideal world, the load performance would be identical (in the same ballpark) regardless of your language of choice (after all, the premise of jsii is that you should be free to choose your language independently of other considerations), and I will treat any excessive "jsii tax" as a defect. Immediate next steps here are:
Longer term (these are non-trivial and risk incurring breaking changes):
|
Adds an experimental (hence opt-in) feature that caches the contents of loaded libraries in a directory that persists between executions, in order to spare the time it takes to extract the tarballs. When this feature is enabled, packages present in the cache will be used as-is (i.e: they are not checked for tampering) instead of being extracted from the tarball. The cache is keyed on: - The hash of the tarball - The name of the library - The version of the library Objects in the cache will expire if they are not used for 30 days, and are subsequently removed from disk (this avoids a cache growing extremely large over time). In order to enable the feature, the following environment variables are used: - `JSII_RUNTIME_PACKAGE_CACHE` must be set to `enabled` in order for the package cache to be active at all; - `JSII_RUNTIME_PACKAGE_CACHE_ROOT` can be used to change which directory is used as a cache root. It defaults to: * On MacOS: `$HOME/Library/Caches/com.amazonaws.jsii` * On Linux: `$HOME/.cache/aws/jsii/package-cache` * On Windows: `%LOCALAPPDATA%\AWS\jsii\package-cache` * On other platforms: `$TMP/aws-jsii-package-cache` - `JSII_RUNTIME_PACKAGE_CACHE_TTL` can be used to change the default time entries will remain in cache before expiring if they are not used. This defaults to 30 days, and the value is expressed in days. Set to `0` to immediately expire all the cache's content. When troubleshooting load performance, it is possible to obtain timing data for some critical parts of the library load process within the jsii kernel by setting `JSII_DEBUG_TIMING` environment variable. Related to #3389
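The cache described in this change can be pictured roughly as follows. This is a simplified Python sketch of the idea only (the real implementation lives in the jsii kernel and is not shown in this thread); the function names, the .extracted marker file, and the directory layout are assumptions for illustration. It keys entries on library name, version and tarball hash, and expires entries that have not been used within the TTL.

```python
import hashlib
import shutil
import tarfile
import time
from pathlib import Path

DEFAULT_TTL_DAYS = 30  # mirrors the default described above


def cache_dir_for(root: Path, tarball: Path, name: str, version: str) -> Path:
    """Derive the cache location from library name, version and tarball hash."""
    digest = hashlib.sha256(tarball.read_bytes()).hexdigest()
    return root / name / version / digest


def extract_cached(root: Path, tarball: Path, name: str, version: str) -> Path:
    """Extract the tarball once; later calls reuse the directory as-is."""
    target = cache_dir_for(root, tarball, name, version)
    marker = target / ".extracted"  # hypothetical marker recording last use
    if not marker.exists():
        target.mkdir(parents=True, exist_ok=True)
        # Only extract archives you trust; cached content is not re-verified.
        with tarfile.open(tarball, "r:*") as archive:
            archive.extractall(target)
    marker.touch()  # refresh the last-used timestamp
    return target


def prune(root: Path, ttl_days: float = DEFAULT_TTL_DAYS) -> list:
    """Remove entries whose marker has not been touched within ``ttl_days``."""
    cutoff = time.time() - ttl_days * 86400
    removed = []
    for marker in list(root.rglob(".extracted")):
        if marker.stat().st_mtime <= cutoff:
            shutil.rmtree(marker.parent)
            removed.append(marker.parent)
    return removed
```

The trade-off is the one stated in the description: a cache hit skips decompression entirely, but the files on disk are trusted without being checked against the tarball again.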
Hi @RomainMuller , thank you for the reply and action plan. It's very appreciated! Please find some responses below...
This is my measurement... I guess every platform is different. One way to reach a compromise can be lz4...
Yes, I agree, most of the problems are heavily magnified by the library size. I think jsii has hit a scaling problem-- the discussion should be, do we want to ship all these 200+ modules together? Then we need to massively optimize it... Or do we want people to hand pick the 10 they use, like in CDK1, and then none of that matters... I only speak for myself, I didn't mind having 10 imports and pip requirements.
I haven't really thought about tampering here ... I guess all the pip packages in python library (including the CDK python files) are susceptible to tampering as well? But what is the risk?
See patch below... it's not in any shape to be pulled into the repo, but gives you the idea. Seeing that you already implemented a cache, it might be moot. The original idea was to keep the unpacked files with the pip package... (that would take care of old versions, etc). When we tried that, we ran into some issues regarding the directory structure that is expected (there needs to be "node_modules", and under it "constructs" and "aws-cdk-lib") and, not wanting to do symlink shenanigans for the sake of simple measurement, we settled on having a persistent unpacked copy in /tmp (which has the same structure as the on-the-fly copy that would normally appear under a random name there). The first hunk covers the check of "already-loaded assembly" which possibly incorrectly assumes that from existence of a directory, rather than checking the actual Then, we remove the need for The hunk at 5170 fixates the load directory and removes the hook to delete it. Finally the last two at 5145 and 5348 have to do with a different thing. We attempted to reduce the load times by commenting out unneeded modules in This patch is against the webpack version, because we didn't (still don't) have much insight into what's going on so we used quite a lot of reverse engineering on the installed pip module -- YMMV.
|
To be fair, part of the issue is the CDK is a super package at both the package and the code level. The CDK could still be a "super" package (e.g., only one pip/npm install to get all the things you need), but not be tightly linked together like it is today (import what you need only). The download size would still be large, but JSII compression and the proposed lambda layer changes make that more tolerable as a one-time cost. |
Loading other languages is also slow, but Python loading is ~3x the others. And it's not a difference of 50ms vs 150ms. It's 3s vs 9s. |
Hi, I reached this issue after encountering slow import times for cdk based libraries (we're using I've started this discussion to try to understand why it isn't working. So if anyone comes to this issue and is having the same problem, follow the discussion there. |
I've also noticed the same issue on my Windows machine, import time is slightly faster running on Ubuntu WSL2. My current workaround for this issue is to avoid installing the entire
This brings my |
@matthewpick Yes, but this way you are using CDKv1 which is on maintenance mode (only receiving critical bug fixes and security patches) since June 1, 2022 and will stop receiving any support on June 1, 2023. |
@ermanno Thanks for the clarification. You are correct, I recently started using CDK and wasn't aware that module-based import/installation was only a thing in v1, which is unfortunate. I guess I'll just put up with the ~30s import time for the time being. Definitely slows down incremental development of new infrastructure via |
Note that caching has been implemented, and allows for much quicker subsequent runs. |
This is great for development, but my use case is CI/CD where it's often a cold start or just one execution of |
#4181 will provide improvements to the JavaScript side of the load problem. There is still an apparent issue where loading any part of the Python generated code results in ALL of it being loaded, which can be slow due to the sheer amount of code this amounts to. There may be avenues to generate code differently to avoid this particular behavior, but I'm not entirely sure how and this requires further research to avoid causing breaking changes.
Your CI/CD can persist the cache location and re-use it in between runs. Generally speaking, CI/CD also can afford to be a little slower as there's normally no human patiently waiting for it to be done before they can do some work... As far as I'm aware, we're talking about seconds here (maybe enough to make a couple of minutes), not hours... |
I would be fine if it was seconds. Removing the part that imports all modules from Needless to say, I speculate there's a very great runtime optimization waiting to be implemented here. Similar to how Python's |
We've implemented caching in CI and it's working reasonably well. It's a little ugly, but it works — we send an archive of the local cache (on main branch only) to S3, and pull it on branch runs. |
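For others wanting to do something similar, the archive-and-restore part can be sketched in a few lines of Python. This is not the setup described above, whose details are not shown in the thread; the function names are made up, and the upload to and download from S3 is deliberately left out.

```python
import tarfile
from pathlib import Path


def pack_cache(cache_root: Path, archive: Path) -> Path:
    """Bundle the package cache directory into a gzipped tarball.

    ``archive`` must live outside ``cache_root``.
    """
    with tarfile.open(archive, "w:gz") as tar:
        # Store paths relative to the cache root so it can be restored anywhere.
        tar.add(cache_root, arcname=".")
    return archive


def restore_cache(archive: Path, cache_root: Path) -> Path:
    """Unpack a previously packed cache before running any CDK command."""
    cache_root.mkdir(parents=True, exist_ok=True)
    # Only restore archives produced by your own pipeline.
    with tarfile.open(archive, "r:gz") as tar:
        tar.extractall(cache_root)
    return cache_root
```

The cache root would be whatever JSII_RUNTIME_PACKAGE_CACHE_ROOT points at (or the platform default), with pack_cache run on the main branch and restore_cache at the start of branch runs.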
We have since fixed the caching issue as well-- albeit the first run after any update seems to choke the memory of the instance it's running on. Had to upgrade the instance type to 8 GB to get any reliable runs. Kinda ridiculous that 8 GB is the minimum needed but I guess that's the kind of applications jsii and its derivatives are... |
Hey, can anyone please give an update on this issue? We are still getting this performance delay of ~30 seconds in our test suites and it's really unbearable. Important note: we're using the Windows platform, not Mac. Please assist! |
anyone? |
This issue is preventing AWS ParallelCluster from migrating to AWS CDK v2. Could you please raise the priority? |
🐛 Bug Report
Affected Languages
TypeScript or Javascript
Python
Java
.NET (C#, F#, ...)
Go
General Information
What is the problem?
Originally reported as aws/aws-cdk#19000
I noticed that simple commands such as cdk ls would take very long (sometimes above 20 seconds, rarely much less) to complete. This in a project created using cdk init sample-app --language python.
Reproduction Steps
What did you expect to happen?
A simple import should not take longer than 1 second (ideally even less, but I understand there are constraints that make that difficult here).
What actually happened?
As seen above, it takes ~17 seconds. This is quite consistent, however first or second time after initing the project, it can take 20-25 seconds.
CDK CLI Version
2.12.0 (build c9786db)
Framework Version
Node.js Version
v16.14.0
OS
MacOS 11.6.2 (20G314) - Intel Macbook
Language
Python
Language Version
Python (3.8.9, 3.9.10)
Other information
To better understand where the time went, I debugged the issue with some print statements:
The offending method seems to be (from Python's point of view)
jsii/_kernel/providers/process.py: _NodeProcess.send
I understand that this communicates with a node process, so I assume it to be something going on there that takes very long.
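The print statements themselves are not reproduced here, but the same kind of measurement can be sketched with a small timing wrapper. The decorator name timed and its report parameter are made up for illustration; the commented-out lines show how it might be applied to the method named above, assuming jsii is installed and the module layout matches that path.

```python
import functools
import time


def timed(func, report=print):
    """Wrap ``func`` so every call reports how long it took."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed = time.perf_counter() - start
            report(f"{func.__qualname__} took {elapsed:.3f}s")

    return wrapper


# Hypothetical application (requires jsii; path taken from the report above):
# from jsii._kernel.providers import process
# process._NodeProcess.send = timed(process._NodeProcess.send)
```

Patching the method before creating the CDK app would print one line per request sent to the node process, which makes it easy to see whether a single call (such as the initial library load) dominates.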