Feature discussion - Stack System Root #2205

Open
da-x opened this issue May 28, 2016 · 11 comments

Comments

@da-x
Contributor

da-x commented May 28, 2016

I'm opening this issue to track and discuss a new feature I'm implementing for Stack, which I describe here.

Your comments are appreciated.

There is a working implementation of the feature here.

Intro

This change adds a new, optional, configurable path. It is set via the STACK_SYSTEM_ROOT environment variable, and can be overridden by a stack-system-root Stack config option or a --stack-system-root command-line argument.
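
A minimal sketch of how the three sources could be reconciled. The text above only says that the config option and the command-line argument override the environment variable; the relative order of those two (command line first) is an assumption, and `stack_system_root` is a hypothetical helper, not part of Stack.

```python
from typing import Mapping, Optional


def stack_system_root(cli_arg: Optional[str],
                      config: Mapping[str, str],
                      env: Mapping[str, str]) -> Optional[str]:
    """Pick the Stack System Root; None means the optional feature is off."""
    # Assumed precedence: --stack-system-root, then the stack-system-root
    # config option, then the STACK_SYSTEM_ROOT environment variable.
    candidates = (
        cli_arg,
        config.get("stack-system-root"),
        env.get("STACK_SYSTEM_ROOT"),
    )
    for candidate in candidates:
        if candidate:  # skip both missing and empty values
            return candidate
    return None
```

In real use, `env` would be `os.environ` and `config` the parsed global config; both are passed in here so the precedence is easy to see.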

The Stack System Root path is an additional directory with the same structure as ~/.stack, but it is treated as a read-only cache for anything that is stored under ~/.stack. When Stack runs, it first checks whether what it needs (downloads, installed GHC, indices, etc.) is available under the Stack System Root, and only afterward tries to download, build, or install it into ~/.stack.
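
The lookup order described above can be sketched roughly as follows. This is an illustration only: `find_cached` is a hypothetical name, and the real implementation may check things at a different granularity than single paths.

```python
from pathlib import Path
from typing import Callable, Optional


def find_cached(relative: str,
                system_root: Optional[Path],
                user_root: Path,
                exists: Callable[[Path], bool] = Path.exists) -> Optional[Path]:
    """Return an existing copy of `relative`, preferring the read-only system root."""
    # The system root is optional, so fall back to the user root alone.
    roots = [user_root] if system_root is None else [system_root, user_root]
    for root in roots:
        candidate = root / relative
        if exists(candidate):
            return candidate
    # Nothing cached anywhere: the caller would now download, build, or
    # install into user_root, never into system_root.
    return None
```

The `exists` parameter is there only so the search order can be exercised without touching the filesystem.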

Any existing STACK_ROOT directory can serve as a Stack System Root for another user, who needs only read access to it.

This allows for:

  • Reduced network access for each user.
  • Savings in home directory size (and in first-time compilation time).
  • Faster compilation, because compiled packages shared for each snapshot can be taken from the Stack System Root instead of being recompiled in the home directory.
  • As a result, system-wide caching of all of Stack's network accesses for all users on the system who depend on the same snapshot.
  • Further down the road, easier DEB and RPM packaging.

Example

As bob, we can build some package, let's say cpio-conduit, but set our STACK_ROOT to be shared with another user:

    bob@localhost $ export STACK_ROOT=/opt/shared/bob/stack-system-root
    bob@localhost $ cd cpio-conduit
    bob@localhost $ stack setup
    bob@localhost $ stack build

Then, as alice, we start empty, and set STACK_SYSTEM_ROOT to that same path, which is read-only for us.

    alice@localhost $ rm -rf ~/.stack
    alice@localhost $ export STACK_SYSTEM_ROOT=/opt/shared/bob/stack-system-root

If we build the same package using the same Stack snapshot, then everything should be taken from bob's already populated STACK_ROOT, even though it is read-only for us. There would not even be any network access. If we depend on anything that bob does not have in the shared STACK_ROOT, then our own writable STACK_ROOT is populated as usual.

    alice@localhost $ cd cpio-conduit
    alice@localhost $ stack build
    [1 of 1] Compiling Main             ( /tmp/stack5146/Setup.hs, /tmp/stack5146/Setup.o )
    Linking /home/alice/.stack/setup-exe-cache/x86_64-linux/tmp-setup-Simple-Cabal-1.22.5.0-ghc-7.10.3 ...
    cpio-conduit-0.7.0: configure
    Configuring cpio-conduit-0.7.0...
    cpio-conduit-0.7.0: build
    Preprocessing library cpio-conduit-0.7.0...
    [1 of 1] Compiling Data.CPIO        ( src/Data/CPIO.hs, .stack-work/dist/x86_64-linux/Cabal-1.22.5.0/build/Data/CPIO.o )
    In-place registering cpio-conduit-0.7.0...
    cpio-conduit-0.7.0: copy/register
    Installing library in
    /home/alice/cpio-conduit/.stack-work/install/x86_64-linux/lts-5.8/7.10.3/lib/x86_64-linux-ghc-7.10.3/cpio-conduit-0.7.0-6ocIFacK3GqJwqA79J3bQu
    Registering cpio-conduit-0.7.0...
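
The split between what alice reuses and what she would have to build herself can be sketched as a set partition. `plan_build` is a hypothetical helper, and `some-extra-dep-1.0` below is a made-up package id used only to show the fallback case.

```python
from typing import Iterable, List, Tuple


def plan_build(needed: Iterable[str],
               shared: Iterable[str],
               local: Iterable[str]) -> Tuple[List[str], List[str], List[str]]:
    """Split needed package ids into (reuse from shared root, reuse from own root, build)."""
    needed_set = set(needed)
    from_shared = needed_set & set(shared)                # read-only root wins
    from_local = (needed_set - from_shared) & set(local)  # then our own root
    to_build = needed_set - from_shared - from_local      # built into our own root
    return sorted(from_shared), sorted(from_local), sorted(to_build)


reuse_shared, reuse_local, build = plan_build(
    needed=["conduit-1.2.6.4", "text-1.2.2.0", "some-extra-dep-1.0"],
    shared=["conduit-1.2.6.4", "text-1.2.2.0", "zlib-0.6.1.1"],
    local=[],
)
```

Here only the hypothetical extra dependency ends up in `build`; in the session above that list would be empty, which is why alice's own root stays almost untouched.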

After this build, alice's own .stack is nearly empty (just some empty directories and an empty package.cache):

    alice@localhost $ find /home/alice/.stack
    /home/alice/.stack
    /home/alice/.stack/config.yaml
    /home/alice/.stack/snapshots
    /home/alice/.stack/snapshots/x86_64-linux
    /home/alice/.stack/snapshots/x86_64-linux/lts-5.8
    /home/alice/.stack/snapshots/x86_64-linux/lts-5.8/7.10.3
    /home/alice/.stack/snapshots/x86_64-linux/lts-5.8/7.10.3/pkgdb
    /home/alice/.stack/snapshots/x86_64-linux/lts-5.8/7.10.3/pkgdb/package.cache
    /home/alice/.stack/setup-exe-cache
    /home/alice/.stack/setup-exe-cache/x86_64-linux
    /home/alice/.stack/setup-exe-cache/x86_64-linux/setup-Simple-Cabal-1.22.5.0-ghc-7.10.3

Our ghc-pkg looks like:

    alice@localhost $ stack exec -- ghc-pkg list
    /opt/shared/bob/stack-system-root/programs/x86_64-linux/ghc-7.10.3/lib/ghc-7.10.3/package.conf.d
     Cabal-1.22.5.0
     array-0.5.1.0
     base-4.8.2.0
     bin-package-db-0.0.0.0
     binary-0.7.5.0
     bytestring-0.10.6.0
     containers-0.5.6.2
     deepseq-1.4.1.1
     directory-1.2.2.0
     filepath-1.4.0.0
     ghc-7.10.3
     ghc-prim-0.4.0.0
     haskeline-0.7.2.1
     hoopl-3.10.0.2
     hpc-0.6.0.2
     integer-gmp-1.0.0.0
     pretty-1.1.2.0
     process-1.2.3.0
     rts-1.0
     template-haskell-2.10.0.0
     terminfo-0.4.0.1
     time-1.5.0.1
     transformers-0.4.2.0
     unix-2.7.1.0
     xhtml-3000.2.1
    /opt/shared/bob/stack-system-root/snapshots/x86_64-linux/lts-5.8/7.10.3/pkgdb
     async-2.1.0
     attoparsec-0.13.0.1
     base16-bytestring-0.1.1.6
     blaze-builder-0.4.0.1
     conduit-1.2.6.4
     conduit-extra-1.1.11
     exceptions-0.8.2.1
     hashable-1.2.4.0
     lifted-base-0.2.3.6
     mmorph-1.0.6
     monad-control-1.0.0.5
     mtl-2.2.1
     network-2.6.2.1
     primitive-0.6.1.0
     random-1.1
     resourcet-1.1.7.3
     scientific-0.3.4.6
     stm-2.4.4.1
     streaming-commons-0.1.15.2
     text-1.2.2.0
     transformers-base-0.4.4
     transformers-compat-0.4.0.4
     vector-0.11.0.0
     zlib-0.6.1.1
    /home/alice/.stack/snapshots/x86_64-linux/lts-5.8/7.10.3/pkgdb
    /home/alice/cpio-conduit/.stack-work/install/x86_64-linux/lts-5.8/7.10.3/pkgdb
     cpio-conduit-0.7.0
@mgsloan
Contributor

mgsloan commented May 29, 2016

Sounds like an excellent addition to me! Pinging @borsboom

It reminds me a little bit of the extra-package-dbs mechanism, which allows you to add additional package dbs between the global db and snapshot. Not sure if it's a simplification, but it may be worth reusing that mechanism / unifying it with that mechanism. See the discussion here of how that feature works.

@da-x
Contributor Author

da-x commented May 29, 2016

It just occurred to me that, instead of introducing a new environment variable for this feature, perhaps we can make the existing STACK_ROOT... stackable! We would do this with a colon-separated list, e.g. STACK_ROOT=/path/a:/path/b:/path/c, where /path/a is the read-write top Stack root and /path/b and so forth are the read-only Stack roots. What do you think?
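
A rough sketch of how such a value could be split, assuming the first entry is the read-write root and the remaining entries are read-only roots in lookup order. `split_stack_root` is a hypothetical name. The colon separator follows the proposal; on Windows, where paths themselves contain colons, a different separator would presumably be needed.

```python
from typing import List, Tuple


def split_stack_root(value: str, sep: str = ":") -> Tuple[str, List[str]]:
    """Split a stacked STACK_ROOT into (read-write root, read-only roots)."""
    # Drop empty entries such as those produced by a trailing separator.
    entries = [entry for entry in value.split(sep) if entry]
    if not entries:
        raise ValueError("STACK_ROOT is empty")
    return entries[0], entries[1:]


read_write, read_only = split_stack_root("/path/a:/path/b:/path/c")
```

A plain single-path STACK_ROOT still works under this scheme: it yields that path as the read-write root and an empty read-only list.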

@sinelaw

sinelaw commented May 29, 2016

This could be used to make Haskell Platform and Stack compatible. Users could install Stack after installing HP and get all the HP-installed packages for free. We just need HP to set up the packages in a structure that stack --stack-system-root can understand.

@mgsloan
Contributor

mgsloan commented May 29, 2016

@da-x Particularly considering that stackable STACK_ROOTs point, I think we need to think rather carefully about this. I'm in favor of potentially having other spots for programs. However, having another spot for snapshot DBs gets rather dicey.

Let's say you build package A in RootA against a package B dependency in RootB. Now, you can only really reasonably use RootA when you are also using RootB, so it may as well have that dependency specified in the configuration of RootA. Also, if RootB is changed in destructive ways, it can break RootA's packages.

The best thing I can come up with to make this less prone to user error is the following implementation strategy:

Have some command like stack root parent <dir>. It would check if <dir> seems to be a reasonable $STACK_ROOT, and add a file $STACK_ROOT/parent-root if it exists. This file would contain <dir> along with the hash of <dir>. (that way, it's human readable but not easily modifiable)
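
One possible shape for that file, as a sketch only. The comment above does not say what exactly is hashed or how the file is laid out, so this assumes two lines (the path, then the SHA-256 of the path string); `render_parent_root` and `parse_parent_root` are hypothetical names.

```python
import hashlib


def _digest(directory: str) -> str:
    """Hex SHA-256 of the path string (assumed meaning of 'the hash of <dir>')."""
    return hashlib.sha256(directory.encode("utf-8")).hexdigest()


def render_parent_root(directory: str) -> str:
    """Contents of $STACK_ROOT/parent-root: the path, then its hash."""
    return f"{directory}\n{_digest(directory)}\n"


def parse_parent_root(text: str) -> str:
    """Return the parent directory, rejecting files whose path and hash disagree."""
    lines = text.splitlines()
    if len(lines) != 2:
        raise ValueError("expected exactly two lines: path and hash")
    directory, digest = lines
    if digest != _digest(directory):
        raise ValueError("parent-root hash does not match its path")
    return directory
```

The file stays readable, but editing the path by hand makes the stored hash stale, so the next read fails instead of silently switching parents.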

If this command is run and parent-root already exists, then perhaps have an option --delete-all-packages, which will delete all the package dbs and change the root.

This implementation can support stacks! However, you'd need to check for cycles :)
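
The cycle check could look roughly like this. It is a sketch under the assumption that each root has at most one parent; `parent_of` stands in for reading each root's parent-root file, and `root_chain` is a hypothetical name.

```python
from typing import Dict, List, Optional


def root_chain(start: str, parent_of: Dict[str, Optional[str]]) -> List[str]:
    """Follow parent roots from `start`, nearest first, failing on a cycle."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = start
    while current is not None:
        if current in seen:
            path = " -> ".join(chain + [current])
            raise ValueError(f"cycle in parent roots: {path}")
        seen.add(current)
        chain.append(current)
        # A root with no entry (or None) has no parent, which ends the chain.
        current = parent_of.get(current)
    return chain
```

The returned list doubles as the lookup order for a stack of roots, so the same walk serves both validation and resolution.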

@mgsloan
Contributor

mgsloan commented May 29, 2016

@sinelaw Is that for new Haskell Platform or old? With old Haskell Platform it does reuse the packages if they are the correct versions, as they are in the global DB.

@da-x
Contributor Author

da-x commented May 29, 2016

I think that for the general use case, it should be safe to assume that, by convention, nothing gets removed from the read-only root on which your read-write root depends. The same goes for modifications that are not simply additions. On the bright side, a new version of a package or GHC is always an addition, so we are good there.

At worst case, if that assumption does get violated, at least for removals, you should only need to do stack build to let your read-write root recover the missing pieces locally. In the example you gave, we can have a global Stack YAML option, other-stack-roots, similar to the stack-system-root I initially proposed. This helps avoid fiddling with environment variables and keeps that dependency neatly specified.

On Linux, in the long term, I imagine that the read-only roots can come from Debs or RPMs and reside in /usr/lib/haskell-stack/root. We could also introduce some optional Stack awareness to apt and dnf, which could be used to bring in pre-built versions of any Stack root component, saving considerable build time and external network access on build servers.

BTW, do we want to mappend the global stack YAMLs of all the dependent roots, finding them recursively via the list in STACK_ROOT and/or other-stack-roots?
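
To make the question concrete, here is one way such a merge could behave. This is only a sketch of a possible policy, not Stack's actual config monoid: the nearest root wins for scalar values, lists are concatenated nearest first, and nested mappings are merged recursively. `merge_two` and `merge_configs` are hypothetical names, and the keys are used purely as examples.

```python
from functools import reduce
from typing import Any, Dict, Iterable


def merge_two(near: Dict[str, Any], far: Dict[str, Any]) -> Dict[str, Any]:
    """Combine two configs, giving priority to `near` (the closer root)."""
    merged = dict(far)
    for key, value in near.items():
        other = merged.get(key)
        if isinstance(value, dict) and isinstance(other, dict):
            merged[key] = merge_two(value, other)  # merge nested mappings
        elif isinstance(value, list) and isinstance(other, list):
            merged[key] = value + other            # keep both, nearest first
        else:
            merged[key] = value                    # nearest root wins
    return merged


def merge_configs(configs: Iterable[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge configs given nearest root first; an empty list yields {}."""
    return reduce(merge_two, configs, {})


own = {"system-ghc": False, "extra-include-dirs": ["/home/alice/include"]}
shared = {"system-ghc": True, "extra-include-dirs": ["/opt/include"], "jobs": 4}
merged = merge_configs([own, shared])
```

With this policy `merged` keeps alice's `system-ghc` setting, sees both include directories, and inherits `jobs` from the shared root.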

@mgsloan
Contributor

mgsloan commented May 29, 2016

I've thought a bit more on the stack root parent idea. It might make more sense to have a parent-stack-root: field in the config. However, this field should not be settable by project stack.yaml files. This'd be unprecedented - we don't currently have any fields that aren't settable in project files. I'd also still want something to prevent it from being changed without proper actions or sufficient warning.

> At worst case, if that assumption does get violated, at least for removals, you should only need to do stack build to let your read-write root recover the missing pieces locally.

True, it should be able to recover in a sane manner.

> BTW, do we want to mappend the global stack YAMLs of all the dependent roots, finding them recursively via the list in STACK_ROOT and/or other-stack-roots?

I suppose that would indeed make sense. I'm a little worried about it affecting stack's traceability - how easy it is for the user to figure out why something is happening. However, we already have another similar mechanism - extensible-snapshots - #863

So, I'm in favor of it. At some point we're going to need some kind of command that tells you what the accumulated config is and why. Till then, I think this is enough of an "expert-mode" feature that doing something like mappending them together is reasonable.

@carlpaten
Contributor

I would love for this to happen. See this Stack Overflow submission:

> Our school computers have Stack installed, but it's hard to use because user directories have very limited space. I'm wondering if there's a way to have a system-wide .stack folder, instead of having it in user directories.

@alexeymuranov

This is probably a naïve question, but why can't Stack work in this respect like Python's pip? I mean, if I want to install a Python package globally, I use sudo pip; if I want to install it in the user's home directory, I use pip --user. The user has access to both the global packages and their own.

Alternatively, can't stack be based on Nix, with a common global shared storage?

@colonelpanic8

Is there any progress on this?

@da-x
Contributor Author

da-x commented Jun 14, 2018

Sorry, but I don't have any plans to continue implementing this. Anyone else is invited to resume and pick this up.
