Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: use graceful-fs all over instead of monkey patching fs #9443

Merged
merged 13 commits into from
Apr 28, 2020

Conversation

SimenB
Copy link
Member

@SimenB SimenB commented Jan 22, 2020

Summary

From graceful-fs's readme:

This should only ever be done at the top-level application layer, in order to delay on EMFILE errors from any fs-using dependencies. You should not do this in a library, because it can cause unexpected delays in other parts of the program.

This might cause errors if some dependency we use do a lot of fs operations, but if so we should send them a PR using graceful-fs themselves. And users hitting it can choose to manually gracefulify themselves if they want to.

Ref #8331.

Test plan

Green CI.

@SimenB
Copy link
Member Author

SimenB commented Jan 22, 2020

CI OOMs - that's a bad sign...

@SimenB SimenB force-pushed the graceful-fs-directly branch 2 times, most recently from d614c6f to f36fad7 Compare January 22, 2020 14:44
@SimenB
Copy link
Member Author

SimenB commented Jan 22, 2020

Seems to leak when running in band, running in workers and heap size seems stable. I don't have time to dig into this now. Seems to happen on master as well, so not sure why it explodes on CI now and didn't before...

@kirillgroshkov
Copy link

Does it make sense to maybe consider merging this as a fix: #8331
?

@SimenB SimenB force-pushed the graceful-fs-directly branch 2 times, most recently from d44b0d4 to 84bf591 Compare February 24, 2020 09:59
@SimenB
Copy link
Member Author

SimenB commented Feb 24, 2020

Seems to fix the memory issues on CI, so graceful-fs definitely leaks. Should put together a reproduction for them

@SimenB
Copy link
Member Author

SimenB commented Mar 29, 2020

I've opened a PR at graceful-fs that seemingly fixes this: isaacs/node-graceful-fs#184

@SimenB SimenB force-pushed the graceful-fs-directly branch from 84bf591 to 8dd5817 Compare March 29, 2020 09:37
@codecov-io
Copy link

codecov-io commented Mar 29, 2020

Codecov Report

Merging #9443 into master will not change coverage by %.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #9443   +/-   ##
=======================================
  Coverage   64.92%   64.92%           
=======================================
  Files         288      288           
  Lines       12199    12199           
  Branches     3024     3024           
=======================================
  Hits         7920     7920           
+ Misses       3639     3638    -1     
- Partials      640      641    +1     
Impacted Files Coverage Δ
packages/babel-jest/src/index.ts 50.00% <ø> (ø)
packages/jest-cli/src/init/index.ts 82.60% <ø> (ø)
packages/jest-config/src/index.ts 11.59% <ø> (ø)
packages/jest-config/src/normalize.ts 76.66% <ø> (ø)
...ges/jest-config/src/readConfigFileAndSetRootDir.ts 0.00% <ø> (ø)
packages/jest-config/src/resolveConfigPath.ts 92.00% <ø> (ø)
packages/jest-haste-map/src/crawlers/node.ts 86.66% <ø> (ø)
packages/jest-haste-map/src/index.ts 81.26% <ø> (ø)
packages/jest-haste-map/src/lib/FSEventsWatcher.ts 12.72% <ø> (ø)
packages/jest-haste-map/src/lib/WatchmanWatcher.js 0.00% <ø> (ø)
... and 13 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 2a4b073...64ac79a. Read the comment docs.

@@ -24,10 +24,6 @@ export async function run(
cliArgv?: Config.Argv,
cliInfo?: Array<string>,
): Promise<void> {
const realFs = require('fs');
const fs = require('graceful-fs');
fs.gracefulify(realFs);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the fix, buried in fs -> graceful-fs import changes

@ron23
Copy link

ron23 commented Mar 31, 2020

Data point: this solution makes the leak worse in our repo.
Without the fix, the heap size at the end of small subset of tests was ~500MB.
With the fix ~750MB.

@SimenB
Copy link
Member Author

SimenB commented Mar 31, 2020

@ron23 did you include the patch for graceful-fs? It's the last commit.

The patch is the fix for the memory leaks, the changes in this PR without the patch is expected to make things worse since we load a new (leaking) graceful-fs each time (similar to what happens when people load graceful-fs in their tests).

@ron23
Copy link

ron23 commented Mar 31, 2020

@ron23 did you include the patch for graceful-fs? It's the last commit.

Oops I have not, didn't realize I need to.
Anyways, I just tried that (via yarn link on the main jest folder and on jest-runtime and jest-cli) and I got the exact same results.
Am I missing something?
Just to be sure it's running with the right code, I added logging on top of your graceful-fs.js changes which I do see while running the tests so it should be linking correctly.
I did notice the initial test takes way longer to boot (almost a minute) and that the heap size is about 50MB bigger.

EDIT: my leak might be coming from different place as well, so don't take that as if your PR ain't working. If only I knew where the leak is coming from...

@SimenB
Copy link
Member Author

SimenB commented Mar 31, 2020

For testing I recommend not to link anything (there's so many packages and inter-dependencies it's easy to miss some). This is easier:

  1. checkout graceful-fs-directly branch
  2. run yarn to ensure patch is applied and all code is built
  3. uninstall jest from you project
  4. install patch-package in your project, and copy over the patches directory to get the fix
  5. ensure you only have a single version of graceful-fs in your tree
  6. from your project run yarn patch-package to apply the patch, then run ../folder-with-cloned-jest/jest.

@SimenB SimenB force-pushed the graceful-fs-directly branch from 64ac79a to d8cd798 Compare April 13, 2020 12:52
@SimenB SimenB force-pushed the graceful-fs-directly branch from 8293c0c to 37413b1 Compare April 27, 2020 08:36
@SimenB
Copy link
Member Author

SimenB commented Apr 27, 2020

Upstream PR has been merged, just need to wait for a release and we can merge this

@SimenB SimenB force-pushed the graceful-fs-directly branch from 19ae6d6 to a2a7802 Compare April 28, 2020 17:51
This reverts commit d19bdb5.
@SimenB SimenB merged commit 7e37b0f into jestjs:master Apr 28, 2020
@SimenB SimenB deleted the graceful-fs-directly branch April 28, 2020 18:23
iliapolo added a commit to aws/aws-cdk that referenced this pull request Apr 29, 2020
Couple of things here:

**require.cache**

In version `v25.5.0`, jest introduced an [implementation](jestjs/jest#9841) of their own to the `require.cache` object. It seems that it doesn't handle caching `json` modules properly, which causes our code to fail because `mod.paths` is `undefined` when we are querying for `json` files that were required (for example `package.json` or `cloud-assembly.schema.json`).

https://github.com/aws/aws-cdk/blob/07fe642e38118e24837b492fa182737fc41bb429/packages/%40aws-cdk/core/lib/private/runtime-info.ts#L66  

```console
TypeError: Cannot read property 'map' of undefined
```

The "fix" was to add a null check to prevent failure when looking up modules who don't have the `paths` property defined. 

Note that this was only observed in test environments using `jest > v25.5.0`, not during actual runtime of `cdk synth`. This is because `jest` manipulates the built-in `require` module of `nodejs`.

**graceful-fs**

In version `v25.5.0`, jest [added](jestjs/jest#9443) a dependency on the `graceful-fs` package. The version they depend on (`4.2.4`) differs from the version that we bring (`4.2.3`). This caused the `graceful-fs` dependency that comes from `jest` no to be deduped by node at installation. In turn, this caused multiple copies of the library to be installed on disk. 

The `graceful-fs` package monkey patches the `process.chdir` and `process.cwd` functions with a cached version for better performance.

For reasons not completely clear to us, the existence of multiple copies of the module, caused `process.cwd()` to return incorrect cached results, essentially always returning the directory that the process was started from, without consideration to calls to `process.chdir`. This broke `decdk` (and possibly many more) tests which rely on `process.chdir`.  

Fixes #7657
iliapolo added a commit to aws/aws-cdk that referenced this pull request Apr 29, 2020
Couple of things here:

**require.cache**

In version `v25.5.0`, jest introduced an [implementation](jestjs/jest#9841) of their own to the `require.cache` object. It seems that it doesn't handle caching `json` modules properly, which causes our code to fail because `mod.paths` is `undefined` when we are querying for `json` files that were required (for example `package.json` or `cloud-assembly.schema.json`).

https://github.com/aws/aws-cdk/blob/07fe642e38118e24837b492fa182737fc41bb429/packages/%40aws-cdk/core/lib/private/runtime-info.ts#L66

```console
TypeError: Cannot read property 'map' of undefined
```

The "fix" was to add a null check to prevent failure when looking up modules who don't have the `paths` property defined.

Note that this was only observed in test environments using `jest > v25.5.0`, not during actual runtime of `cdk synth`. This is because `jest` manipulates the built-in `require` module of `nodejs`.

**graceful-fs**

In version `v25.5.0`, jest [added](jestjs/jest#9443) a dependency on the `graceful-fs` package. The version they depend on (`4.2.4`) differs from the version that we bring (`4.2.3`). This caused the `graceful-fs` dependency that comes from `jest` no to be deduped by node at installation. In turn, this caused multiple copies of the library to be installed on disk.

The `graceful-fs` package monkey patches the `process.chdir` and `process.cwd` functions with a cached version for better performance.

For reasons not completely clear to us, the existence of multiple copies of the module, caused `process.cwd()` to return incorrect cached results, essentially always returning the directory that the process was started from, without consideration to calls to `process.chdir`. This broke `decdk` (and possibly many more) tests which rely on `process.chdir`.

Fixes #7657
@github-actions
Copy link

This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.
Please note this issue tracker is not a help forum. We recommend using StackOverflow or our discord channel for questions.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 11, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants