Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delegate the generation of the dependency tree to an external module? #68

Closed
mrjoelkemp opened this issue Nov 2, 2015 · 21 comments
Closed
Assignees
Milestone

Comments

@mrjoelkemp
Copy link
Contributor

Big fan of madge! Thanks for the great work.

Semantically, it seems like madge is a visualization layer of an app's dependency tree. The generation of the tree is also done within madge though.

I wrote https://github.com/mrjoelkemp/node-dependency-tree as a way to generate the dependency tree of any JS codebase (amd, cjs, es6). It outputs a nested JSON structure (or a list of visited files) representing the tree of a file. Is there any interest in using that to power madge?

  1. It would allow madge to avoid the need to maintain the various syntaxes – reducing the size and surface area of the codebase.
  2. node-dependency-tree is actively maintained and I'm open to any of the needs of madge moving forward
  3. Madge would gain the ability to traverse SASS/Stylus (and soon Less) codebases
  4. Webpack loading support is planned shortly – which madge would get with a version bump

Thanks for your time.

@pahen
Copy link
Owner

pahen commented Nov 4, 2015

Hi and thanks!

This is something I've wanted to do for a long time since madge has grown to do a lot more than it was intended to do. As you say, I built it mainly for visualizing and find circular dependencies. It sounds very appealing to move out the generation of the dependency tree to a standalone module and I'm open to use your module if it solves the problem :-)

Do you have the insight you need to madge to make a PR? Or do you want me to have a look at it?

@mrjoelkemp
Copy link
Contributor Author

Do you have the insight you need to madge to make a PR? Or do you want me to have a look at it?

I don't have the entire context around the inner workings of madge, so you'd know best as to the architectural changes necessary for the move and any feature regressions that would be caused by the effort.

I'll have a look this weekend. In the meantime, I'm more than happy to help answer questions about dependency-tree or discuss what's needed in this thread.

I built it mainly for visualizing and find circular dependencies.

dependency-tree has the information that would allow us to deduce circular dependencies.

If module A depends on B and B depends on A, the tree would look something like:

{
  A: { 
    B: {}
  },
  B: {
    A: {}
  }
}

where we can cross-reference a module's dependency with that dependency's dependency list to see if the current module is a member. That should be a constant-time lookup.

@pahen
Copy link
Owner

pahen commented Nov 5, 2015

I'll look into it as soon as possible. I'm away the entire weekend, but hopefully I'll get some free time next week.

@pahen
Copy link
Owner

pahen commented Jun 26, 2016

I'll take look at this soon :-) I'm planning to do some refactoring of Madge and would love to fix this then.

@mrjoelkemp
Copy link
Contributor Author

I built a proof of concept to see how it would work: https://github.com/dependents/node-tree-pic. One big thing is that dependency-tree needs a flag to stop traversing into node_modules.

mrjoelkemp added a commit to dependents/node-dependency-tree that referenced this issue Jul 4, 2016
mrjoelkemp added a commit to dependents/node-dependency-tree that referenced this issue Jul 4, 2016
@pahen
Copy link
Owner

pahen commented Jul 8, 2016

I've tried out node-dependency-tree now @mrjoelkemp and it seems very promising :) You have made an amazing effort here! Good job! This will solve a lot of things I wanted to fix in madge for a long time. So the plan is to remove all parsing from madge now and stop scanning file/folders. This will result in alot of breaking changes in both API and CLI but since I haven't released v1.0 yet it is a good time to do it now :) I'll let you know when I have a branch ready for testing.

@mrjoelkemp
Copy link
Contributor Author

Thanks @pahen! :)

Happy to help add any functionality or fixes at the dependency-tree layer and below.

Also happy to discuss any of the breaking changes to see if there's a path forward. I remember thinking that https://github.com/mrjoelkemp/node-dependents was similar to one of madge's features that dependency-tree wouldn't provide. There might be a middle ground.

Super excited to see this integration. Thanks for your effort!

@pahen pahen self-assigned this Aug 5, 2016
@pahen pahen modified the milestone: 1.0 Aug 8, 2016
@pahen
Copy link
Owner

pahen commented Aug 11, 2016

I'm at the final stages of 1.0 and has began to do some testing and have some concerns about the performance. I expected it to be a little slower since we do much more AST stuff now but it's a pretty big difference. Here's the numbers on 3 different projects. Do you have any thoughts on this @mrjoelkemp? Any plans for improvements in dependency-tree on this front?

CommonJS (https://github.com/eslint/eslint)

0.6.0 time madge --format cjs ~/Code/eslint/lib
real 0m0.845s
user 0m0.754s
sys 0m0.123s

v1 time ./bin/cli.js ~/Code/eslint/bin/eslint.js
real 0m7.650s
user 0m7.988s
sys 0m0.186s

AMD (https://github.com/adobe/brackets)

0.6.0 time madge --format amd ~/Code/brackets/src
real 0m26.352s
user 0m25.926s
sys 0m0.298s

v1 time ./bin/cli.js ~/Code/brackets/src/brackets.js
real 1m52.582s
user 1m51.041s
sys 0m1.991s

ES6 (https://github.com/kentcdodds/es6-todomvc)

0.6.0 time madge --format es6 ~/Code/es6-todomvc/src/js
real 0m0.489s
user 0m0.393s
sys 0m0.072s

v1 time ./bin/cli.js ~/Code/es6-todomvc/src/js/app.js
real 0m1.159s
user 0m0.986s
sys 0m0.155s

@mrjoelkemp
Copy link
Contributor Author

Hey @pahen! Congrats on finishing the first release candidate of 1.0! 🎉

Could you also include the time that it takes for dependency tree alone to run on those files? This way we can separate dependency-tree's runtime from the application code changes in madge v1. There's a dependency-tree binary that you can run from the cli directly. It doesn't have support for a filter function, but you could hack that into your local node_modules version of dependency-tree.

I'd be curious to see how perf changes if we replace the module type sniff in precinct with the regex checks that you were doing before to determine if a file was amd, cjs, or es6.

  • Precinct reuses the ast from that sniff for the extraction of the dependencies, but it still requires two complete passes of every ast. We're still talking about a linear time operation (about the number of nodes in a file's AST), but the cost definitely adds up over a large number of files.
  • I have plans for dependency tree to support generating the tree of a directory. There's room to parallelize the extraction of the dependencies (the precinct and filing-cabinet operations) and then construct the tree from the list of extracted paths for file. Currently, it's all a serial process. This doesn't seem applicable in the single-file tree generation case though.

Would you be able to replace the module type sniff with a regex (hacking it within your node_modules is fine for this experiment) and report the performance change? I could try working on that tonight or tomorrow night if you'd like.

If the above experiment isn't fruitful, I'll have to do a deep dive in profiling all of the operations of dependency-tree to see what's causing the slowdown.

@pahen
Copy link
Owner

pahen commented Aug 11, 2016

Thanks! :)

Here's the time it takes for dependency-tree cli to run directly on the eslint project with node_modules excluded:

time ./bin/cli.js --directory ~/Code/eslint ~/Code/eslint/bin/eslint.js

real 0m7.867s
user 0m8.022s
sys 0m0.218s

So it seems madge isn't adding much overhead.

I'll do some deep dive and see if I can find out what taking most of the time. But it would be nice if you could take a dive too :)

@pahen
Copy link
Owner

pahen commented Aug 11, 2016

Hardcoding type to commonjs at https://github.com/mrjoelkemp/node-filing-cabinet/blob/master/index.js#L101 brought the time down to 2.3s so getModuleType.sync is very expensive. And hardcoding typeto commonjs at https://github.com/mrjoelkemp/node-precinct/blob/master/index.js#L23 too brougth the time down to 1.6s

@pahen
Copy link
Owner

pahen commented Aug 11, 2016

This is very helpful when profiling if you haven't tried it yet @mrjoelkemp

https://medium.com/@paul_irish/debugging-node-js-nightlies-with-chrome-devtools-7c4a1b95ae27#.yhdqido7a

@pahen
Copy link
Owner

pahen commented Aug 11, 2016

It's quite interesting that it takes 550ms before Node.js has completed the startup process. I believe that has to do with the amount of modules in use.

@pahen
Copy link
Owner

pahen commented Aug 12, 2016

One fix to dramatically improve performance dependents/module-definition#23 🎉

@pahen
Copy link
Owner

pahen commented Aug 12, 2016

One more issue found. See dependents/node-dependency-tree#43

@mrjoelkemp
Copy link
Contributor Author

Thanks again for your great work in tracking down these optimizations. Now that dependents/node-dependency-tree#43 has been fixed, can you please post the new perf stats to see how we did?

@pahen
Copy link
Owner

pahen commented Aug 16, 2016

You're welcome! Always satisfying to improve performance :) Here's the new stats:

CommonJS (https://github.com/eslint/eslint)
time madge --format cjs ~/Code/eslint/lib = 0m0.845s (madge 0.6.0)
time ./bin/cli.js ~/Code/eslint/bin/eslint.js = 0m7.650s (madge v1)
time ./bin/cli.js ~/Code/eslint/bin/eslint.js = 0m2.982s (madge v1 + module-definition fix)
time ./bin/cli.js ~/Code/eslint/bin/eslint.js = 0m1.856s (madge v1 + dependency-tree 5.5.0)

4x faster!

AMD (https://github.com/adobe/brackets)
time madge --format amd ~/Code/brackets/src = 0m26.352s (madge 0.6.0)
time ./bin/cli.js ~/Code/brackets/src/brackets.js = 1m52.582s (madge v1)
time ./bin/cli.js ~/Code/brackets/src/brackets.js = 1m15.433s (madge v1 + module-definition fix)
time ./bin/cli.js ~/Code/brackets/src/brackets.js = 0m54.920s (madge v1 + dependency-tree 5.5.0)

2x faster!

ES6 (https://github.com/kentcdodds/es6-todomvc)
time madge --format es6 ~/Code/es6-todomvc/src/js = 0m0.489s (madge 0.6.0)
time ./bin/cli.js ~/Code/es6-todomvc/src/js/app.js = 0m1.159s (madge v1)
time ./bin/cli.js ~/Code/es6-todomvc/src/js/app.js = 0m0.805s (madge v1 + module-definition fix)
time ./bin/cli.js ~/Code/es6-todomvc/src/js/app.js = 0m0.765s (madge v1 + dependency-tree 5.5.0)

1,5x faster!

@mrjoelkemp
Copy link
Contributor Author

So awesome! Thanks for the stats! :)

@pahen pahen mentioned this issue Aug 16, 2016
@pahen pahen closed this as completed in #96 Aug 19, 2016
@mrjoelkemp
Copy link
Contributor Author

@pahen The simple perf improvement in dependents/node-source-walk#17 shaves off a few seconds from dependency-tree's runtime:

FROM: 

real  0m7.480s
user  0m5.878s
sys 0m0.277s

TO:

real  0m4.769s
user  0m4.816s
sys 0m0.187s

I cut a patch on node-source-walk, so all users should see this with a fresh npm install. 👯

@pahen
Copy link
Owner

pahen commented Aug 24, 2016

That's awesome Joel :) Another great find in the hunt for blazing speed! 🎉 Did you found that slow iteration by profiling?

@mrjoelkemp
Copy link
Contributor Author

Thanks! Yes, I used node-nightly and even node's --prof flag to identify the hot spot.

Sadly, some of the babylon plugins like object spread are adding more than a half second in overall runtime. There's room to optimize those plugins, I assume.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants