stack takes too long to find out that there is no work to do #1235
Comments
We have a project currently underway to profile and optimize this case. Ping @drwebb. |
Yes, at least one of the calls in this case has been eliminated by a recent commit I made. Eliminating the other would mean changing the behavior of how a call to the withCabalLoader function works.
|
Keeping track of the progress: What took 2.6s with v0.1.7, takes 1.8s with v1.0.2 on the same machine:
I haven't had a look at the circumstances when the Hackage index is decoded, but I think that we should try to eliminate these calls when they aren't strictly needed. |
@sjakobi Thanks for keeping track, I agree that the decodes are worth looking into optimizing. I'm not sure at the moment which ones can be skipped. I'll mark it on my TODO to look for some low-hanging fruit in these functions this week. |
I believe #1892 will speed this up by 30% (0.3 seconds on my machine, total no-op |
1.2s on my machine! A solid 33% improvement from 1.8s before!
|
Here are some timings with stack-1.1.2, mostly because I recently discovered the awesome This rebuilds stack at the revision of the timings above:
Debug output:
My current development version is slightly faster, probably mostly due to
|
Thanks for keeping an eye on this! The 'no work' build of our project takes a little over 5 seconds on my (relatively beefy) machine, which is definitely noticeable. Are you interested in any logs/output of that? |
Oh!
Yes! The debug log of such a run would be interesting to me. Depending on size you could possibly put it in a Gist. The |
It's not that big, I'll put it at the bottom.
The This is on a mac (OS X 10.10.5). Let me know if there's anything else that would help.
|
Thanks! This 4s hole in the log is quite curious: :)
I guess we'll have to improve the logging there. |
Yes, I tried manually running the ghc-pkg command, but that only takes about 0.9s, so it's not the full 4s. |
@hesselink: In #2351 I've added a few more logging statements which should give us some insight into that gap in the logs. If you want, you can install the current HEAD version of stack with I don't feel any time pressure on this though, so you can also simply wait for the next release. Based on my own logs, my hypothesis is that those 4 seconds are spent parsing the cabal files in your 134 packages. This currently happens twice on each run of stack but should be fixed when #2326 is merged. I think we should also consider parsing these cabal files concurrently. |
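To make the two ideas in the comment above concrete (parse each cabal file only once per run, and parse the files concurrently), here is a minimal sketch in Python rather than in stack's own Haskell. It is not stack's implementation: `parse_cabal_name` is a hypothetical stand-in for a real cabal parser that only reads the `name:` field, and `parse_all_once` and `max_workers` are names invented for this illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def parse_cabal_name(path):
    """Hypothetical stand-in for a real cabal parser: return the `name:` field."""
    for line in Path(path).read_text().splitlines():
        key, _, value = line.partition(":")
        if key.strip().lower() == "name":
            return value.strip()
    return None


def parse_all_once(paths, max_workers=8):
    """Parse each distinct path exactly once, concurrently; return {path: name}."""
    # Deduplicate first so a file requested twice is still parsed only once.
    unique = sorted(set(map(str, paths)))
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map yields results in the same order as `unique`.
        return dict(zip(unique, pool.map(parse_cabal_name, unique)))
```

Deduplicating before dispatch is what removes the "parsed twice" cost; the thread pool only overlaps the file reads. In Python the parsing itself would not speed up much under threads, so whether concurrency pays off in stack would have to be measured there.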
Addressed in #2352. |
@hesselink: Now that v1.2.0 is out, could you post another debug log of a no-op build of your large project? |
Sure! There doesn't seem to be much improvement, but perhaps there's more details in the logs now. Since I'm on a different machine, I've done a build with 1.1.2 and 1.2.0:
|
Thanks for the logs! Most of the time is spent in this bit:
Not sure what exactly takes so much time there but we'll find out! |
I'm seeing similar timings locally, when I build the amazonka project which contains 78 packages:
|
Right,
Profiling would probably tell me more but currently I can't get past #2655. |
Here's the profiling summary for an amazonka "null build":
Apparently nearly half of the time and allocations is spent on calls to In addition to looking for optimizations in some of the functions above we could still consider processing packages concurrently, maybe a bit smarter than what I tried last time. |
One thing I noticed is that most of the
Maybe there's some potential for savings here. |
Yes, there was some! :) |
Here's a new profile of a null build of the amazonka project:
I'm a bit suspicious about the |
Seems likely that there are indeed some savings to be had by modifying functions like I'd suggest modifying it in the way you suggest, use |
I have tried to find a faster implementation for a bit but actually failed so far to make a noticeable difference. Instead I believe we can speed up stack by making (much) fewer calls to I've been looking at the Here's a snippet that shows the stat calls that happen during the execution of a single call to
So, during each such call to For amazonka, these calls add up to ~35% of the total execution time of a null build. I wonder why we don't simply create a list of the files in the package directory before the call and find the one that matches |
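The suggestion above (list the package directory once, then match against that listing instead of probing the filesystem for every candidate) can be sketched as follows. This is a Python illustration of the idea only, not stack's code; `find_by_stat`, `find_by_listing`, and the candidate file names are assumptions made up for the example.

```python
import os


def find_by_stat(directory, wanted_names):
    """Probe the filesystem once per candidate name (one stat call each)."""
    for name in wanted_names:
        candidate = os.path.join(directory, name)
        if os.path.exists(candidate):
            return candidate
    return None


def find_by_listing(directory, wanted_names):
    """Read the directory once, then match candidates against the in-memory set."""
    present = set(os.listdir(directory))
    for name in wanted_names:
        if name in present:
            return os.path.join(directory, name)
    return None
```

With many candidate names per lookup, the second form replaces a stat call per candidate with a single directory read. One caveat: the set lookup is case-sensitive, while `os.path.exists` follows the filesystem's rules, so the two can disagree on case-insensitive filesystems such as the macOS default.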
I suspect the reasoning was that things ought to be reasonably fast, and now we are discovering we'd like to optimize them. Feel free to try to eliminate the overhead even if it means switching to |
Also seeing this with a project using several github projects as extra-deps:
With no changes (so a "null" build) I have: time stack -v build real 0m8.604s Finished in 0ms: getPackageFiles /mnt/share/src/legmedcordova/.stack-work/downloaded/TOoHcgJ62C7-/ghcjs-dom/ghcjs-dom.cabal So just under 7 seconds (80%) is in getPackageFiles. stack and strace log attached. I am wondering if the build time could be lowered by promoting the GitHub checkouts to "normal" packages or if it would not matter? |
@jacknojo A few misc thoughts:
|
I am going to close this issue, given the passage of time. |
Steps to reproduce:
Expected:
Stack quickly realizes that there is no work to do and exits.
Actual:
This takes about 2.6s on my machine. Among other things stack decodes ~/.stack/indices/Hackage/00-index.cache three times which accounts for about 1.6s.
Is all this work really necessary to discover that there is no building to be done?
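One generic way to avoid decoding the same file several times in a single run is to memoize the decode step. The sketch below shows that pattern in Python; it is not how stack handles its index cache. `load_index` is a hypothetical name, and JSON is used only as a stand-in format, since the real 00-index.cache is not a JSON file.

```python
import functools
import json


@functools.lru_cache(maxsize=None)
def load_index(path):
    """Decode the file at `path` once per process; later calls reuse the result."""
    with open(path, "r", encoding="utf-8") as handle:
        # Assumption: JSON stands in for the real cache format.
        # Callers share one decoded object, so they should treat it as read-only.
        return json.load(handle)
```

Three calls with the same path then cost one decode and two cache hits, which `load_index.cache_info()` reports.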