(core): proliferation of cdk.out$hash directories in $TMPDIR #27356

Open
diranged opened this issue Sep 29, 2023 · 7 comments
Labels
@aws-cdk/core Related to core CDK functionality bug This issue is a bug. p2 package/tools Related to AWS CDK Tools or CLI

Comments

@diranged

Describe the bug

On my laptop, I've noticed recently that my $TMPDIR is filling up... this feels like a new behavior, as we've been developing with CDK for almost a year now, and only recently did this start happening to me. It's tough to pinpoint when, but I think it has to do with the cdk.out directory being renamed to cdk.out$hash at some point. In the last 2 days, I've accumulated over 170GB of temp data:

$ sudo du -sch $TMPDIR   
170G	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T/
170G	total

When I dig into it, it's all CDK data:

$ sudo du -sch $TMPDIR/cdk*
 16K	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T//cdk-custom-resource00f1xy
 16K	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T//cdk-custom-resource00gzE3
 12K	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T//cdk-custom-resource03avRC
...
 12K	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T//cdk-custom-resourcezzLzLS
 20K	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T//cdk-test-app-05EjOV
 96K	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T//cdk-test-app-0CZJXf
...
 28K	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T//cdk-test-app-DOfnAs
 41M	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T//cdk-test-app-DTQJUe
 41M	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T//cdk-test-app-DTr6AZ
...
212K	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T//cdk.outWD3zzZ
 41M	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T//cdk.outWEJF5m
 41M	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T//cdk.outWEZIMn
 41M	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T//cdk.outWEpfOE
...
  0B	/var/folders/dm/b5by_qw91nd0ctdjbvggzgr40000gq/T//cdk8s.outdir.zxI76J
169G	total

I have over 9,000 individual test directories:

$ ls -la $TMPDIR | grep cdk | wc
    9617   86553  715166
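
For anyone who wants the same numbers broken down per directory family, here is a minimal sketch using only Node.js built-ins. The prefix list is copied from the `du` output above and is an assumption about what CDK leaves behind, not an exhaustive or official list; `inventory` and `dirSize` are hypothetical helper names. Sizes are apparent file sizes, so they will not match `du` exactly.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Name prefixes taken from the du listing in this issue (assumed, not exhaustive).
const CDK_PREFIXES = ["cdk.out", "cdk-test-app-", "cdk-custom-resource", "cdk8s.outdir."];

/** Recursively sums regular-file sizes under `dir`; symlinks are not followed. */
function dirSize(dir: string): number {
  let total = 0;
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      total += dirSize(full);
    } else if (entry.isFile()) {
      total += fs.statSync(full).size;
    }
  }
  return total;
}

/** Counts matching top-level directories, and their total bytes, per prefix. */
function inventory(tmpRoot: string = os.tmpdir()): Map<string, { count: number; bytes: number }> {
  const result = new Map<string, { count: number; bytes: number }>();
  for (const entry of fs.readdirSync(tmpRoot, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const prefix = CDK_PREFIXES.find((p) => entry.name.startsWith(p));
    if (prefix === undefined) continue;
    const row = result.get(prefix) ?? { count: 0, bytes: 0 };
    row.count += 1;
    row.bytes += dirSize(path.join(tmpRoot, entry.name));
    result.set(prefix, row);
  }
  return result;
}
```

Calling `inventory()` with no argument scans the directory reported by `os.tmpdir()`, which on macOS and Linux follows `$TMPDIR` when it is set.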

This feels similar to #2869, but not exactly the same.

Expected Behavior

I expect the TMPDIR data to be cleaned up after each run. I think this was never needed back when the output dir was $TMPDIR/cdk.out, but now it's $TMPDIR/cdk.out$rand, and that is causing this buildup of junk.
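
To make "cleaned up after each run" concrete, here is a minimal sketch of the pattern in plain Node.js. It is not CDK's implementation; `withTempDir` is a hypothetical helper that creates a randomly named directory and removes it when the callback finishes, whether it returns or throws.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

/**
 * Runs `fn` with a freshly created temporary directory and removes the
 * directory afterwards, whether `fn` returns or throws.
 * Synchronous callbacks only: a returned Promise would outlive the cleanup.
 */
function withTempDir<T>(prefix: string, fn: (dir: string) => T): T {
  // mkdtempSync appends six random characters to the given prefix.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), prefix));
  try {
    return fn(dir);
  } finally {
    fs.rmSync(dir, { recursive: true, force: true });
  }
}
```

In a unit test, the directory handed to the callback could be passed as the `outdir` property when constructing the CDK `App`, so that synthesis output lands somewhere the test itself deletes.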

Current Behavior

Buildup of leftover junk tmp data dirs.

Reproduction Steps

Just run your tests over and over again

Possible Solution

No response

Additional Information/Context

No response

CDK CLI Version

2.93.0

Framework Version

No response

Node.js Version

18

OS

OSX

Language

TypeScript

Language Version

No response

Other information

No response

@diranged diranged added bug This issue is a bug. needs-triage This issue or PR still needs to be triaged. labels Sep 29, 2023
@github-actions github-actions bot added the @aws-cdk/core Related to core CDK functionality label Sep 29, 2023
@indrora
Contributor

indrora commented Oct 2, 2023

I suspect that it's expected that $TMPDIR doesn't survive reboots (that is, it's a ramdisk rather than real disk space).

@indrora indrora added p2 package/tools Related to AWS CDK Tools or CLI and removed needs-triage This issue or PR still needs to be triaged. labels Oct 2, 2023
@diranged
Author

diranged commented Oct 2, 2023

@indrora On a CI/CD system, that makes sense... but on a local development environment I think it's more of a problem. It also likely slows down the build process quite a bit, since assets that don't need to be rebuilt get rebuilt. Thoughts?

@misirlou-tg

I have seen files & directories accumulate in my temp folder as well when running cdk synth or cdk deploy. The first time I noticed it, there were hundreds of folders totaling over 30GB.

I narrowed it down to two file name patterns and one directory name pattern. I look for all of them and remove the directories where they live.

The following shows the temp dirs/files left behind in a "clean" temp dir after running cdk synth once on a small project:

C:\Users\build\AppData\Local\Temp>dir Amazon.CDK.Asset.AwsCliV1.aws-cdk-asset-awscli* /s/b
C:\Users\build\AppData\Local\Temp\d2f5yn3d.hy0\Amazon.CDK.Asset.AwsCliV1.aws-cdk-asset-awscli-v1-2.2.177.tgz

C:\Users\build\AppData\Local\Temp>dir jsii-runtime.js /s/b
C:\Users\build\AppData\Local\Temp\iqkq15ry.yu4\bin\jsii-runtime.js

C:\Users\build\AppData\Local\Temp>dir cdk-custom-resource* /s/b
C:\Users\build\AppData\Local\Temp\cdk-custom-resource5ypanG

C:\Users\build\AppData\Local\Temp>rd /s/q d2f5yn3d.hy0

C:\Users\build\AppData\Local\Temp>rd /s/q iqkq15ry.yu4

C:\Users\build\AppData\Local\Temp>rd /s/q cdk-custom-resource5ypanG
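
The manual steps above could be scripted roughly as follows. This is a sketch, not a vetted tool: the three patterns are the ones listed in this comment, the function names are hypothetical, and deleting a directory that a running cdk or jsii process is still using may break that process. It defaults to a dry run for that reason.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

/** True if `dir` contains one of the two marker files described above. */
function hasMarker(dir: string): boolean {
  if (fs.existsSync(path.join(dir, "bin", "jsii-runtime.js"))) return true;
  return fs
    .readdirSync(dir)
    .some((name) => name.startsWith("Amazon.CDK.Asset.AwsCliV1.aws-cdk-asset-awscli"));
}

/** Returns top-level directories under `tmpRoot` that match any of the three patterns. */
function findLeftovers(tmpRoot: string = os.tmpdir()): string[] {
  const hits: string[] = [];
  for (const entry of fs.readdirSync(tmpRoot, { withFileTypes: true })) {
    if (!entry.isDirectory()) continue;
    const full = path.join(tmpRoot, entry.name);
    try {
      if (entry.name.startsWith("cdk-custom-resource") || hasMarker(full)) {
        hits.push(full);
      }
    } catch {
      // Unreadable directory (permissions, races): leave it alone.
    }
  }
  return hits;
}

/** Lists the leftovers and, unless `dryRun` is true (the default), deletes them. */
function removeLeftovers(tmpRoot?: string, dryRun = true): string[] {
  const hits = findLeftovers(tmpRoot);
  if (!dryRun) {
    for (const dir of hits) {
      fs.rmSync(dir, { recursive: true, force: true });
    }
  }
  return hits;
}
```

Running `removeLeftovers()` first and reviewing the returned paths before calling `removeLeftovers(undefined, false)` keeps the destructive step explicit.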

@whereisaaron

If you work with AWS CDK locally you soon have 1000s of abandoned folders in /tmp consuming 100GB+ of disk space!! Surely it is the application's responsibility to clean up its temporary files? If cleanup is not readily available, could the temp folder in the meantime have a predictable prefix or extension, so that manual cleanup is easier? Right now it is generating random 8.3 folder names like this is an MS-DOS application 😅

e.g.
yt5tu2fe.qm3 --> awscdk-yt5tu2fe.qm3
or
yt5tu2fe.qm3 --> yt5tu2feqm3.awscdk

...
2.4M    yt5tu2fe.qm3
2.4M    yts00joi.3if
58M     ytwfqwhv.rrg
67M     yvrstndb.o3p
2.4M    yybbwobp.f51
2.4M    yyhruseu.r45
62M     yznbuveb.bil
2.4M    yzqrjcxr.zrf
58M     z0amgf5y.ind
2.4M    z0hclc20.gsb
2.4M    z0lm0avk.dfc
2.4M    z1i2ayve.uhb
2.4M    z2yd5sca.i2a
58M     z3fpl4e1.4jb
2.4M    z4pdtzwy.cvg
58M     z4svshro.lxn
62M     z52k5uta.t02
2.4M    z5vxbrxb.crv
2.4M    zbasg4kv.bi1
2.4M    zbckroax.vt2
2.4M    zbdlchch.foh
58M     zdmkgg3o.jns
67M     zehsulxf.u5a
2.4M    zenzce51.jh5
2.4M    zgmmavsg.fil
62M     zgszwowm.wtt
2.4M    zgvdgafl.2is
2.4M    zgweaxic.pcf
58M     zhwt1tx3.oks
2.4M    zjkgwme5.3q2
67M     zjmsrsuv.ibq
58M     zjvjduaa.3fa
2.4M    zk055vcr.fdw
2.4M    zk4wx0g2.q1i
58M     zkbviiad.52w
2.4M    zkim13u0.rxs
2.4M    zl111zyt.hxq
2.4M    zl1r3svb.syb
2.4M    zlqm3vyz.xwy
58M     zltd2fn0.zrb
2.4M    zmnxnaje.3ge
2.4M    znmfotlv.kky
58M     znpogu23.201
2.4M    zobow0ac.esw
62M     zodnn2ux.mpv
2.4M    zr2f2mq3.5n5
58M     zraj3qur.3xi
...
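
To illustrate why a predictable prefix helps, here is a small sketch in plain Node.js. The `awscdk-` prefix is the hypothetical one proposed above, not something CDK or jsii uses today, and both function names are made up for the example. With a known prefix, manual cleanup reduces to a single name filter.

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Hypothetical prefix, borrowed from the "awscdk-" suggestion above.
const SCRATCH_PREFIX = "awscdk-";

/** Creates a directory named <prefix> plus six random characters under `tmpRoot`. */
function makeScratchDir(tmpRoot: string = os.tmpdir()): string {
  return fs.mkdtempSync(path.join(tmpRoot, SCRATCH_PREFIX));
}

/** Deletes every top-level directory carrying the prefix; returns how many were removed. */
function purgeScratchDirs(tmpRoot: string = os.tmpdir()): number {
  let removed = 0;
  for (const entry of fs.readdirSync(tmpRoot, { withFileTypes: true })) {
    if (entry.isDirectory() && entry.name.startsWith(SCRATCH_PREFIX)) {
      fs.rmSync(path.join(tmpRoot, entry.name), { recursive: true, force: true });
      removed += 1;
    }
  }
  return removed;
}
```

Directories with opaque names such as `yt5tu2fe.qm3` are untouched by `purgeScratchDirs`, which is exactly the problem with the current naming: nothing in the name says who owns them.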

@indrora
Contributor

indrora commented Jun 7, 2024

n.b. I no longer work for Amazon, the following is purely my own opinion as a member of the community.

The CDK is generally designed to be used on Linux; the vast majority of Linux distributions mount /tmp as a tmpfs file system in memory. Currently, Debian is the odd one out (I believe Ubuntu moved to tmpfs as well about a decade ago), as it generally targets “low resource” systems by default.

macOS is an inscrutable black box in this regard, in that it relies on launchd and cron to clean up unloved files that have not been touched or opened in more than 3 days, depending on the version of macOS that you are running.

That said, it is generally considered (in my experience) good practice, as you have noted, to include some prefix or to use some sub-path of /tmp to make manual cleanup easier for an individual.

I believe a possible workaround (though I have not tested this) is to use the TMPDIR environment variable. I will defer to current maintainers on that specific issue however.
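
A sketch of what that workaround might look like, with the same caveat that it is untested against CDK itself: run the command with `TMPDIR` pointing at a private scratch directory and delete that directory afterwards. `runWithPrivateTmp` is a hypothetical helper. Node's `os.tmpdir()` honours `TMPDIR` on macOS and Linux (on Windows it reads `TEMP` and `TMP` instead), but whether every CDK and jsii component writes only through it is an assumption.

```typescript
import { spawnSync } from "node:child_process";
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

/**
 * Runs a command with TMPDIR pointing at a private scratch directory,
 * then deletes that directory, even if spawning the command fails.
 */
function runWithPrivateTmp(
  command: string,
  args: string[]
): { status: number | null; stdout: string } {
  const scratch = fs.mkdtempSync(path.join(os.tmpdir(), "cdk-run-"));
  try {
    const result = spawnSync(command, args, {
      env: { ...process.env, TMPDIR: scratch },
      encoding: "utf8",
    });
    // stdout can be null at runtime when the command could not be spawned.
    return { status: result.status, stdout: result.stdout ?? "" };
  } finally {
    fs.rmSync(scratch, { recursive: true, force: true });
  }
}
```

For example, `runWithPrivateTmp("npx", ["cdk", "synth"])` would run a synth whose temporary output, to the extent it respects `TMPDIR`, disappears when the call returns.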

@lewisdiamond
Member

Whether /tmp is tmpfs or not is rather irrelevant. If a piece of software is creating a random temporary directory (meaning it is not reused across multiple calls, for example to optimize builds), it should clean it up before exiting and only leak when unrecoverable errors occur. tmpfs isn't free: it's RAM. I'd say it's way worse to do this in tmpfs. Without swap, this will end up consuming significant amounts of RAM; with swap, it will just delay that and prevent other things from getting swapped out.

@indrora
Contributor

indrora commented Nov 19, 2024

N.B. I no longer work for Amazon as previously stated and this is my opinion as a community member.

The standard says "/tmp is a wasteland trash dump, don't expect what you put there to be there next time you come back" in slightly more formal words. No guidance is given on hygiene, but the charitable read is "Some programs will leave garbage there, and how that garbage is dealt with is up to the system administrator."

On the other side, some people feel that it should be cleaned up, such as systemd-nspawn.

My comment here? If it bugs you so much, pull requests are appreciated. Feel free to submit a PR that adds the desired cleanup, ostensibly behind a flag or other toggle.

5 participants