Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extra package database support #990

Merged
merged 3 commits into from
Sep 30, 2015

Conversation

vigoo
Copy link
Contributor

@vigoo vigoo commented Sep 14, 2015

This pull-request adds support for creating and using reusable extra package database with stack.

This test repository demonstrates how it would work:

  • stack build in lib1 builds the library into the local package database and nothing else
  • stack build in lib2 builds its dependencies (ansi-wl-pprint and everything it needs) and lib2 itself to its local package database.
  • stack build in app builds the executable using both lib1 and lib2 from the previously built package databases.

The change adds two new options to the stack yaml:

  • extra-package-dbs is a list of relative or absolute paths to extra package databases, which will be added on top of the global package database, before snapshot and local
  • With force-install-local we still use the snapshots to determine the package versions to be installed, but they are not installed to the shared snapshot package database but always to the current local package database. This is required to make the built package databases redistributable.

NOTE Currently this does not work. But it worked before I rebased it, so I'm opening the PR to get some help in fixing it before we can merge. To reproduce the issue, build this branch, and run the test script from the above linked repository, giving it the built stack executable as its parameter.

@borsboom
Copy link
Contributor

Thanks for the PR. Just to make sure I'm not missing any background, is there an issue where this idea was discussed? I have some misgivings about the feature, because it breaks one of the fundamental purposes of stack: reproducible builds. Since stack doesn't control what's in the extra package DBs, it means the build result depends on what happens to be on the developer's system already. Perhaps it would help if you can describe the problem that this is solving. @snoyberg, I'd value your opinion as well.

@snoyberg
Copy link
Contributor

@borsboom Check out this email thread:

https://groups.google.com/d/msg/haskell-stack/bCzgxqcbKKA/D5DuMSCTFwAJ

The idea here is that we're not making the situation any worse than we already have it from allowing packages in the global database, but we additionally gain the advantage of allowing easy sharing of binary package databases between systems.

@snoyberg snoyberg self-assigned this Sep 16, 2015
@snoyberg
Copy link
Contributor

When running this on my system, I see the following package databases:

--package-db=clear
--package-db=global
--package-db=/home/vagrant/Desktop/stack-sandbox-chain-test/lib1/.stack-work/install/x86_64-linux/lts-2.22/7.8.4/pkgdb/
--package-db=/home/vagrant/.stack/snapshots/x86_64-linux/lts-2.22/7.8.4/pkgdb/
--package-db=/home/vagrant/Desktop/stack-sandbox-chain-test/lib2/.stack-work/install/x86_64-linux/lts-2.22/7.8.4/pkgdb/

At the very least, the order of these databases is incorrect; the snapshot database ended up in the middle of the lib1 and lib2 databases, and the local database does not appear anywhere in that list. It should be ordered:

--package-db=clear
--package-db=global
--package-db=additional-extra-package-databases
--package-db=snapshot
--package-db=local

@vigoo
Copy link
Contributor Author

vigoo commented Sep 16, 2015

I think in the above case the script is building lib2 which means that
lib2's local is the local, so the order is correct. Isn't it?

On Wed, Sep 16, 2015, 05:45 Michael Snoyman notifications@github.com
wrote:

When running this on my system, I see the following package databases:

--package-db=clear
--package-db=global
--package-db=/home/vagrant/Desktop/stack-sandbox-chain-test/lib1/.stack-work/install/x86_64-linux/lts-2.22/7.8.4/pkgdb/
--package-db=/home/vagrant/.stack/snapshots/x86_64-linux/lts-2.22/7.8.4/pkgdb/
--package-db=/home/vagrant/Desktop/stack-sandbox-chain-test/lib2/.stack-work/install/x86_64-linux/lts-2.22/7.8.4/pkgdb/

At the very least, the order of these databases is incorrect; the snapshot
database ended up in the middle of the lib1 and lib2 databases, and the
local database does not appear anywhere in that list. It should be ordered:

--package-db=clear
--package-db=global
--package-db=additional-extra-package-databases
--package-db=snapshot
--package-db=local


Reply to this email directly or view it on GitHub
#990 (comment)
.

@vigoo
Copy link
Contributor Author

vigoo commented Sep 16, 2015

So there are 3 stack runs, and I think the correct package-db order for each is:

  • lib1: global -> snapshot -> local(lib1)
  • lib2: global -> local(lib1) -> snapshot -> local(lib2)
  • app: global -> local(lib1) -> local(lib2) -> snapshot -> local(app)

@snoyberg
Copy link
Contributor

The problem is that you're reinstalling global packages - which are depended on by the Cabal library - in the local database. This means that (at least for GHC 7.8, which your example is using) the local version shadows the global version, and the dependencies of Cabal are unavailable. Concretely, I'm seeing on my system that Cabal depends on bytestring-0.10.4.0-d6f1d17d717e8652498cab8269a0acd5, but bytestring-0.10.4.0-9a9bbd95e0709e7cafd40d5934840c75 is in the local database.

@vigoo
Copy link
Contributor Author

vigoo commented Sep 17, 2015

I saw that too but wasn't sure it's the real problem. I'm a but confused here. Why does the MiniBuildPlan coming from the snapshot resolver (here: https://github.com/commercialhaskell/stack/blob/master/src/Stack/Build/Source.hs#L179) contain these global packages at all? Is there an easy way to filter them out?

@snoyberg
Copy link
Contributor

Different GHC installations may place different packages in the global
database, so we never hardcode which packages should be considered global.
Instead, the installed list tells us which database packages are installed
in, and we use that to determine which packages are already installed and
therefore don't need to be reinstalled.

It seems like this PR is attempting to do more than we originally
discussed, specifically change the behavior of where packages are
installed. I'm not too thrilled with the implicit nature of this, though it
could be that I'm misunderstanding how it works.

On Thu, Sep 17, 2015, 12:13 PM Daniel Vigovszky notifications@github.com
wrote:

I saw that too but wasn't sure it's the real problem. I'm a but confused
here. Why does the MiniBuildPlan coming from the snapshot resolver (here:
https://github.com/commercialhaskell/stack/blob/master/src/Stack/Build/Source.hs#L179)
contain these global packages at all? Is there an easy way to filter them
out?


Reply to this email directly or view it on GitHub
#990 (comment)
.

@vigoo
Copy link
Contributor Author

vigoo commented Sep 17, 2015

I did not want to make it implicit, that's why added a separate option for it (force-install-local).
Without this my solution only works if all the dependent packages are added as extra dependencies and no snapshot is used. This is a limitation I can work with, but while I was adding the extra package database support it seemed relatively easy to keep the advantage of having the snapshot by forcing the installed packages to be installed to the local database.

If you think it's too complicated and/or against the way stack should work, I can remove the force-install-local parts from the PR. The disadvantage is then that I have to maintain in all my stack.yamls a consistent set of dependent libraries with explicitly specified versions. I could generate these from higher level (the gradle plugin in this case), by fetching the snapshot config from stackage to get the versions. What do you think?

@snoyberg
Copy link
Contributor

I wasn't aware that force-install-local was added and a separate feature. That's probably fine.

To be honest though, I'm still not convinced that this is the best way to achieve your goal of cached binary builds, but I have no objection to having the multiple package database support in stack.

@vigoo
Copy link
Contributor Author

vigoo commented Sep 17, 2015

I tried out what I described above, and it works. vigoo/stack-sandbox-chain-test@667cb75
I think the simplest and cleanest is to go this way, so I'll clean up this pull request soon and make sure the tests are passing.

@vigoo vigoo force-pushed the extra-pkg-db-support branch 2 times, most recently from cf585b6 to 1311564 Compare September 17, 2015 13:01
@vigoo vigoo force-pushed the extra-pkg-db-support branch from 1311564 to ecc89f8 Compare September 28, 2015 18:37
@vigoo
Copy link
Contributor Author

vigoo commented Sep 28, 2015

I rebased this again; is there anything else I can do to make this merged?

@borsboom
Copy link
Contributor

I'm OK with this, I think it'll probably come in handy even though I still think the example use case is a bad idea. @snoyberg, can you make the final determination and merge if you agree?

wc <- getWhichCompiler
(lhs1, dps) <- ghcPkgDump menv wc (fmap snd mdb)
$ conduitDumpPackage =$ sink
(lhs1, dps) <- ghcPkgDump menv wc (extraDBs ++ (fmap snd (maybeToList mdb)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be necessary to pass in the extraDBs I think, the packages in those databases should be inherited automatically from the passed in lhs0. Do things break without them?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you don't pass the extra databases to ghc-pkg it will report the packages depending on any of them in the actual database as broken

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stack itself does the dependencies check and ignores ghc-pkg's opinion on the matter. You can see that we currently do the databases one at a time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I'm checking what happens if I'm not passing them

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems to work without it, I'll test a bit more and clean it up soon

@snoyberg
Copy link
Contributor

Mostly looks OK, I added some inline comments.

@vigoo
Copy link
Contributor Author

vigoo commented Sep 29, 2015

Commented issues fixed

snoyberg added a commit that referenced this pull request Sep 30, 2015
@snoyberg snoyberg merged commit 783232c into commercialhaskell:master Sep 30, 2015
@snoyberg
Copy link
Contributor

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants