Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

jspm bundle test.js -wid takes 100% of CPU #1775

Closed
Vanuan opened this issue May 3, 2016 · 20 comments
Closed

jspm bundle test.js -wid takes 100% of CPU #1775

Vanuan opened this issue May 3, 2016 · 20 comments

Comments

@Vanuan
Copy link

Vanuan commented May 3, 2016

I'm following this guide:
http://jspm.io/0.17-beta-guide/development-bundling.html

The problem is that after this line:

Watchman:  Watchman was not found in PATH.  See https://facebook.github.io/watchman/docs/install.html for installation instructions

jspm is starting to hit 100% of CPU. If I change the file it takes 5 seconds to notice the change and another 10 seconds to actually build it. So those bundles is not a speed up but rather a slowdown of development experience.

@Vanuan
Copy link
Author

Vanuan commented May 3, 2016

strace shows a lot of such syscalls:

inotify_add_watch(11, "/src/node_modules/jspm/node_modules/systemjs-builder/node_modules/babel-core/node_modules/babel-register/node_modules/home-or-tmp/node_modules",
IN_MODIFY|IN_ATTRIB|IN_MOVED_FROM|IN_MOVED_TO|IN_CREATE|IN_DELETE|IN_DELETE_SELF|IN_MOVE_SELF) = 1472

@Vanuan
Copy link
Author

Vanuan commented May 3, 2016

Should it watch for node_modules and jspm_packages files? Is there a way to add them to watch exceptions?

@Vanuan
Copy link
Author

Vanuan commented May 4, 2016

After several minutes it's dropped to 0% cpu usage. But after each save is goes up again.
It looks like every change creates a new watcher: https://github.com/jspm/jspm-cli/blob/0.17/lib/bundle.js#L279

So there are 2 issues:

  1. initial watch is too slow, we should limit a number of files under watch
  2. we should reuse watcher and not create a new one on each save

@guybedford
Copy link
Member

This is a completely valid point, but would involve restructuring to check if the watched module list required for the new bundle matches the previously watched module list of the old bundle, and then only rewatching in the case where it doesn't.

PR work along these lines is incredibly welcome!!

@Vanuan
Copy link
Author

Vanuan commented May 4, 2016

At least we can skip the node_modules folder as you probably won't be installing new modules frequently.
And for now we can just skip the rewatching everything part and address the point "2".

@Vanuan
Copy link
Author

Vanuan commented May 4, 2016

I've created amasad/sane#66

In either case it seems to me that we should change to *.js glob pattern and use some other directory for auto-generated files. Or it should at least be configurable.

@amasad
Copy link

amasad commented May 5, 2016

Globs are not meant to be single file names -- this results in o(mn) by just setting up the watch. We end up testing every file in the directory against every file in the glob using the glob library (which expects a glob pattern therefore is more expensive than a string match).

We can potentially add a function to have a user-defined filter function that returns true/false whether to watch a file or not. In that function you can use array.indexOf or regexp or something faster than looping.

Alternatively, something that doesn't require changes is that you watch all *.js files and then just filter the events you get from sane.

@guybedford
Copy link
Member

@amasad do you think Watchman would provide the same performance then with or without a glob filter of the files list on the folder? The idea was simply that SystemJS knows the exact files it needs to watch as it has the full tree, and it doesn't need to watch folders, so there must surely be some optimization that can be done given this constraint?

@amasad
Copy link

amasad commented May 5, 2016

It's actually more expensive to watch individual files. Watchman would be faster because that code will execute natively -- but it's still o(nm).
Most watchers would subscribe to events on directories and not individual files so it's not really an optimization -- it's expensive.

I suggest you start the watcher on the directory with a permissive glob but then filter your events according to your file list. Maybe construct a map with the filenames that way it's cheap.

@amasad
Copy link

amasad commented May 5, 2016

I should say the reason it's expensive to setup a watch per file -- in addition tk the time spent testing patterns -- is because the way it works is that node/watchman subscribes to a system event stream -- say fsevents on OS x -- on a given file. These subscriptions are expensive as you imagine the fs events daemon have to cycle every one of them on every file change. Also fs events is not that great and that's why watchman is better than the node watcher because it does all these optimizations to avoid setting up too many subscriptions. So watchman might just take your request to wat h individual files and just setup a watch on the directory and then filter the event stream for you. Sane doesn't do that do its better if that filter happens in your code -- that way you support sane in node, watchman, or polling modes

@slanted
Copy link

slanted commented May 5, 2016

I've got a watchman implementation that doesn't spike cpu at all (I also noticed what you mentioned @Vanuan I use a variable so it only watches once and it works great), and also a chokidar implementation that filters files like you mentioned. My team is currently trying it, but I don't really want to give anything until they've been using it for another couple of weeks. If you really want to try my alpha code, let me know and I'll shoot you something. Incidentally my code also detects locally linked directories and watches those as well. My CPU never goes above 25% on a macbook pro. Again, give me another week for flush out any potential bugs. But we've been using it for a couple of weeks now, and sofar so good.

@slanted
Copy link

slanted commented May 5, 2016

@guybedford - on chokidar I watch the full set of files like you mentioned for change events, and then I watch the directories for adds/deletes of just .js, .css, .json, and .html files with a separate watcher. So one watcher for changes and another for adds/deletes. Once a new file is added or deleted I add it to the change listener. Chokidar has a better interface for this than the sane watcher imo. Correct me if I'm wrong @amasad.

I'm going to update the watchman impl I have to do the same thing because I bet that its more efficient - but again I haven't run into any issues with CPU. If anyone wants to test it on windows to make sure I didn't mess up the paths, let me know.

@amasad
Copy link

amasad commented May 5, 2016

This is really a backwards way of doing things. Sorry I don't have time to keep explaining myself -- go see how React Native or ember CLI uses sane.

@slanted
Copy link

slanted commented May 6, 2016

On chokidar It's faster, what can I say. I'm gonna take the file glob out for sane and just watch the directories and do some performance tests. What you say about watching makes sense. I'll report back shortly.

@slanted
Copy link

slanted commented May 6, 2016

Ok here are my results. I'm on a Macbook Pro 2.5 Ghz Intel I7, 16 gigs of DDR3.

In each case I start the process with:

jspm build src/main.js bundle.js --skip-rollup --global-name libs -wid

Use Case 1

  • Watchman is installed and turned on.
  • Watching 'src' and 7 other locally linked directories.
  • I use the file glob of all files as is done in the current implementation, but I filter down the file glob for each directory, so only the files are watched that belong to that directory.
  • I start a watcher in src and all linked directories with the filtered file glob (8 watchers).
  • After starting the watchers, I don't start new watchers when running the build/bundle functions.

Here are the CPU spikes after changes (in %):
23.7, 9,25.3,8.9, 7.8, 20,25,6.2,21.3,6.8,15,39,9.6,22.5,5.7

Use Case 2

  • Watchman is installed and turned on
  • Watching 'src' and 7 other locally linked directories.
  • I start a watcher in src and all linked directories with a glob of *.js, and *.css as in the default sane example (8 watchers) as suggested above.

Again, here are the CPU spikes after changes (in %):
58.2, 13.7,22.8,5.8,11.8,4.9,24,24.7,22.3, 5.7,45.8, 11.5,14.3, 7.5, 30,12.0, 5.5

Use Case 3

  • Same as use case 2, but I added *.json, and *.html to the glob to see if it would make a difference.

CPU spike readings after changes (didnt do as many that time):
66,14,20, 5.8, 5.7,5.3,20, 18,10,24,22,5.4

All of the changes were done in the same 2 files in a linked dependency directory and all I do is add a word and save, delete a different word and save, etc.

I'm not seeing much of a change, but maybe I need a larger dataset. All in all I think the performance is about the same. Each CPU spike lasts about 2-3 seconds. I wait about 4-5 seconds before doing another edit, but I've done rapid edits and it doesn't spike the CPU any higher or make the spike last longer. I admit I have a pretty beefy laptop, but nothing special. If you like I can run the tests using the chokidar implementation and report that, and also run the examples when turning off watchman with sane.

FYI, I did see spikes to about 100-150% (4 cores) in the beta implementation (I'm running beta 13) when watching. So I think this is an improvement.

I have to say I like the code much better with @amasad's suggestion (Use case 2 and 3).

@guybedford
Copy link
Member

If I remove the glob I definitely see a slowdown in watchman from a nearly instantaneous watcher to a few seconds, which seems to indicate that Watchman is able to do globbing optimizations at a lower level than we can.

The solution I've gone with here in 1802e28 given this is to only provide a glob when using Watchman, and otherwise to not provide it at all.

Thanks everyone for your considerations here, further feedback is welcome and work will still be needed for symlink support and configuration watching, so help is useful too.

@Vanuan
Copy link
Author

Vanuan commented May 6, 2016

But it will still recreate a new watcher for each file change. After initial 3 seconds it'd be the same performance if you don't recreate it each time.

@Vanuan
Copy link
Author

Vanuan commented May 6, 2016

You might not see a difference on i7 and SSD and a lot of RAM, but not every developer's machine is as performant as yours

@slanted
Copy link

slanted commented May 6, 2016

@Vanuan yea I was worried about this. If you don't mind, what are you running on? I definitely think that we shouldn't be recreating a watcher for every file change, but I thought I read something that it doesn't matter. In my little implementation I only watch once. I'm so sorry I don't have something for this yet, but obviously I still need to test my code on other systems & OS. I re-read your initial post - so you're not using watchman? Why don't you install it and try this again? I've been running all of my examples with watchman turned on and its pretty good. If that's not an option I could shoot you a chokidar version you could test for me. :)

@Vanuan
Copy link
Author

Vanuan commented May 6, 2016

There's no precompiled version of watchman for Alpine Linux. I want a lightweight development environment, without gcc and stuff like that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants