build multiple packages in parallel #440
Comments
(Imported comment by SamAnklesaria on 2009-01-10) Partial, hypothetical implementation, lacking suppressed output and command-line flags. |
(Imported comment by refold on 2011-03-29) Relevant mailing list thread: http://thread.gmane.org/gmane.comp.lang.haskell.cabal.devel/7473 |
(Imported comment by refold on 2011-06-10) Current status (for those interested): building multiple packages in parallel has been implemented, but the patches have not yet been merged into the mainline; I'm now working on parallelising 'cabal build'. |
(Imported comment by refold on 2011-10-16) Implementation |
(Imported comment by refold on 2011-11-05) Attached are my patches that parallelise cabal-install's 'install' command. Sorry for sending them as a single large bundle - ideally I would like The patch series logically consists of three parts (in chronological order):
Implements the basic parallel framework as described here. Changes
Implements output serialisation - since we don't want the console
During this stage I was concentrated on testing and fixing bugs and My patches are also available in a separate Darcs repository. |
(Imported comment by refold on 2011-11-05) I've updated my parallel patches (see attachment). Patches apply cleanly to the current mainline. The parallel code path now always uses the external setup method (via Setup.hs), so the required changes to the Cabal lib are minimised. There are still some traces of output serialisation, though.

Some numbers:

$ time cabal install -j 1 alex happy
real 1m19.236s
user 1m1.330s
sys 0m10.510s

$ time cabal install -j 4 alex happy
real 0m52.106s
user 1m10.680s
sys 0m15.030s

$ time cabal install -j 1 yesod
real 19m14.913s
user 15m59.420s
sys 1m25.650s

$ time cabal install -j 4 yesod
real 14m8.599s
user 21m36.530s
sys 4m5.650s

I also tested the Nov 2011 version of the code (tries to use the internal setup method, requires pervasive changes to Cabal lib):

$ time cabal install -j 4 alex happy
real 0m45.503s
user 1m4.040s
sys 0m10.100s

$ time cabal install -j 4 yesod
real 10m41.840s
user 17m6.560s
sys 1m33.040s

Compiling and linking all these Setup.hs files does add some noticeable overhead. If these patches get accepted, I'll start working on improving the UI. |
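As a rough reading of these timings, the wall-clock ("real") speedup can be worked out as below. This is only an arithmetic sketch: the helper names `seconds` and `speedup` are made up for illustration, and comparing the internal-setup runs against the `-j 1` baseline of the updated patches is an assumption, since no separate `-j 1` run of the older code is given.

```python
def seconds(minutes: int, secs: float) -> float:
    """Convert the 'XmY.ZZZs' format printed by `time` into seconds."""
    return minutes * 60 + secs


def speedup(serial: float, parallel: float) -> float:
    """Ratio of serial to parallel wall-clock time."""
    return serial / parallel


# "real" times copied from the measurements in the comment above.
alex_happy_j1 = seconds(1, 19.236)
alex_happy_j4 = seconds(0, 52.106)
yesod_j1 = seconds(19, 14.913)
yesod_j4 = seconds(14, 8.599)

# Internal setup method, -j 4 (baseline assumed to be the -j 1 runs above).
alex_happy_j4_internal = seconds(0, 45.503)
yesod_j4_internal = seconds(10, 41.840)

for label, serial, parallel in [
    ("alex+happy, external setup", alex_happy_j1, alex_happy_j4),
    ("yesod, external setup", yesod_j1, yesod_j4),
    ("alex+happy, internal setup", alex_happy_j1, alex_happy_j4_internal),
    ("yesod, internal setup", yesod_j1, yesod_j4_internal),
]:
    print(f"{label}: {speedup(serial, parallel):.2f}x")
```

Under these assumptions the external setup method gives roughly 1.5x (alex, happy) and 1.4x (yesod) with four jobs, against roughly 1.7x and 1.8x for the internal setup method, which is consistent with the remark about Setup.hs overhead.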
(Imported comment by refold on 2012-04-02) Parallel patches were moved to GitHub:

git clone git://github.com/23Skidoo/cabal.git cabal-parallel-install
cd cabal-parallel-install
git checkout parallel-install |
Is it also planned to build profiling, shared and static libs in parallel? |
@thielema These are currently not built in parallel; I'll look at it after the patches are merged. |
This is now mostly done. Great work, @23Skidoo! What remains is to reduce the output to a much more condensed form (as shown in the ticket description) and to log each package's build output to a file that can be shown on build failure. |
Now that the patches implementing build logging and better output are in, I think we should close this issue. Improvements to the parallel code (dynamic status indicator, parallel building of shared/profiling/... versions, module-level parallelism) should be dealt with as separate tickets. |
@23Skidoo Fine by me. Could you please open a new ticket for the final UI improvements? |
@tibbe Can you close this ticket? |
(Imported from Trac #447, reported by @dcoutts on 2009-01-10)
The latest version of the Gentoo Portage tool is rather slick. It can do parallel builds and it displays a nice summary on the command line, e.g.:
Note how they solve the problem of how to display what is going on when there are multiple builds happening. The answer is not to display it at all! This would have to go hand-in-hand with logging all builds so that we can still diagnose failures.
Note the final line, which gets updated to display the current number of jobs running, the number completed, etc. It also shows the load average.
The job scheduler has two parameters: one is a maximum number of jobs (or unlimited) and the other is a load average. It will only launch new jobs if the load average is less than the given maximum. That allows it to interact reasonably well with builds that use make -j internally. In the example above I set the load average to be just slightly more than the number of CPUs I've got.
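The scheduling rule described above can be sketched roughly as follows. This is only an illustration, not Portage's or cabal-install's actual code: the names `can_launch`, `max_jobs`, `max_load` and `current_load` are hypothetical, and `os.getloadavg()` (available on Unix only) is assumed as the source of the load average.

```python
import os
from typing import Optional


def can_launch(running_jobs: int,
               load_average: float,
               max_jobs: Optional[int] = None,
               max_load: Optional[float] = None) -> bool:
    """Decide whether the scheduler may start one more build job.

    max_jobs=None means "unlimited"; max_load=None disables the load check.
    """
    # Respect the cap on simultaneously running jobs, if there is one.
    if max_jobs is not None and running_jobs >= max_jobs:
        return False
    # Only launch while the load average is below the given maximum.
    if max_load is not None and load_average >= max_load:
        return False
    return True


def current_load() -> float:
    """One-minute load average (Unix only; an assumed data source)."""
    return os.getloadavg()[0]
```

On a four-CPU machine one might call `can_launch(running, current_load(), max_jobs=4, max_load=4.5)`, with the load limit slightly above the CPU count. Because the load average also reflects work done by nested `make -j` processes, the check throttles new jobs without the scheduler needing to know about them.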
It looks to me like it serialises some bits, like installing, since saturating the disk with multiple parallel installs is generally of no benefit; indeed, it can be slower. Downloads also seem to be serialised, again because there is probably little benefit in making multiple connections to the same server.
Anyway, the point is, cabal-install ought to be able to do all this. Some bits we can do now. We already have a graph representation of the install plan and we recalculate when a package fails to install.
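To make the role of the install-plan graph concrete, here is a minimal sketch of how a parallel scheduler could use it. It is not cabal-install's actual data structure: the plan is assumed to be a plain mapping from package name to the set of packages it depends on, and the function names `ready_packages` and `prune_failed` are hypothetical.

```python
from typing import Dict, Set


def ready_packages(deps: Dict[str, Set[str]],
                   done: Set[str],
                   in_progress: Set[str]) -> Set[str]:
    """Packages not yet started whose dependencies are all installed."""
    return {pkg for pkg, needs in deps.items()
            if pkg not in done
            and pkg not in in_progress
            and needs <= done}


def prune_failed(deps: Dict[str, Set[str]], failed: str) -> Dict[str, Set[str]]:
    """Recalculate the plan after a failure: drop the failed package and
    everything that depends on it, directly or transitively."""
    doomed = {failed}
    changed = True
    while changed:
        changed = False
        for pkg, needs in deps.items():
            if pkg not in doomed and needs & doomed:
                doomed.add(pkg)
                changed = True
    return {pkg: needs for pkg, needs in deps.items() if pkg not in doomed}
```

Every package returned by `ready_packages` can be built concurrently, since none of them depends on another unfinished package. After a failure, `prune_failed` leaves the unaffected parts of the plan intact so the remaining builds can continue.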
We will need an improved download API, probably involving sending requests off to a dedicated download thread (which would serialise them).
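One possible shape for such a download thread is sketched below, purely as an assumption about how the idea could work. The function `start_download_thread`, the `(url, event)` request format and the `fetch` callback are all hypothetical and do not correspond to any existing cabal-install API.

```python
import queue
import threading
from typing import Callable


def start_download_thread(fetch: Callable[[str], None]) -> queue.Queue:
    """Start a single worker that performs downloads one at a time.

    Callers put (url, event) pairs on the returned queue and wait on the
    event; putting None shuts the worker down.
    """
    requests: queue.Queue = queue.Queue()

    def worker() -> None:
        while True:
            item = requests.get()
            if item is None:  # shutdown sentinel
                break
            url, done = item
            try:
                fetch(url)
            except Exception:
                # A real implementation would report the error to the requester.
                pass
            finally:
                done.set()  # wake the requester whether or not the fetch succeeded

    threading.Thread(target=worker, daemon=True).start()
    return requests
```

Build jobs running in parallel would each enqueue a request and block on their own event. Since only one worker drains the queue, downloads happen strictly in the order requested, and no more than one connection is open at a time.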