Feature discussion - Stack System Root #2205

da-x · 2016-05-28T06:03:15Z

I'm opening this issue to track and discuss a new feature I'm implementing for Stack, which I describe here.

Your comments are appreciated.

There is a working implementation of the feature here.

Intro

This change adds a new, optional, configurable path that is set via the STACK_SYSTEM_ROOT environment variable and can be overridden by a stack-system-root Stack config option or a --stack-system-root command line argument.

The Stack System Root path is an additional directory with the same structure as ~/.stack, but it is treated as a read-only cache for anything that is stored under ~/.stack. When Stack runs, it first checks whether what it needs (downloads, installed GHC, indices, etc.) is available under the Stack System Root, and only afterward tries to download, build, or install it to ~/.stack.

Any existing STACK_ROOT directory can serve as a Stack System Root for another user, who needs only read access to it.
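The lookup order described above can be sketched in shell. This is only an illustration of the idea, not code from the implementation: the function name find_in_roots is hypothetical, and the sketch assumes the user root defaults to ~/.stack when STACK_ROOT is unset.

```shell
# Hypothetical helper: print the first location that holds the given
# relative path, checking the read-only system root before the user root.
find_in_roots() {
    rel=$1
    # Assumption: the writable root is STACK_ROOT, or ~/.stack if unset.
    user_root=${STACK_ROOT:-$HOME/.stack}
    for root in "${STACK_SYSTEM_ROOT:-}" "$user_root"; do
        # Skip the system root when STACK_SYSTEM_ROOT is not set.
        [ -n "$root" ] || continue
        if [ -e "$root/$rel" ]; then
            printf '%s\n' "$root/$rel"
            return 0
        fi
    done
    # Not cached anywhere: this is the point where Stack would download,
    # build, or install into the writable user root.
    return 1
}
```

For example, `find_in_roots programs/x86_64-linux/ghc-7.10.3` would print the copy under the system root if one exists there, and fall back to the user's own root otherwise.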

This allows for:

Reduced network access for each user.
Smaller home directories (and faster first-time compilation).
Faster compilation, because shared compiled packages for each snapshot can be taken from the Stack System Root instead of being recompiled in the home directory.
In effect, a system-wide cache of Stack's network accesses for all users on the system that depend on the same snapshot.
Further down the road, easier DEB and RPM packaging.

Example

As bob, we can build some package, let's say cpio-conduit, but set our STACK_ROOT to be shared with another user:

    bob@localhost $ export STACK_ROOT=/opt/shared/bob/stack-system-root
    bob@localhost $ cd cpio-conduit
    bob@localhost $ stack setup
    bob@localhost $ stack build

Then, as alice, we start empty, and set STACK_SYSTEM_ROOT to that same path, which is read-only for us.

    alice@localhost $ rm -rf ~/.stack
    alice@localhost $ export STACK_SYSTEM_ROOT=/opt/shared/bob/stack-system-root

If we build the same package using the same Stack snapshot, then in this case everything should be taken from bob's already populated STACK_ROOT, even though it is read-only for us. No network access is needed at all. If we depend on anything that bob does not have in the shared STACK_ROOT, then our own writable STACK_ROOT is populated as usual.

    alice@localhost $ cd cpio-conduit
    alice@localhost $ stack build
    [1 of 1] Compiling Main             ( /tmp/stack5146/Setup.hs, /tmp/stack5146/Setup.o )
    Linking /home/alice/.stack/setup-exe-cache/x86_64-linux/tmp-setup-Simple-Cabal-1.22.5.0-ghc-7.10.3 ...
    cpio-conduit-0.7.0: configure
    Configuring cpio-conduit-0.7.0...
    cpio-conduit-0.7.0: build
    Preprocessing library cpio-conduit-0.7.0...
    [1 of 1] Compiling Data.CPIO        ( src/Data/CPIO.hs, .stack-work/dist/x86_64-linux/Cabal-1.22.5.0/build/Data/CPIO.o )
    In-place registering cpio-conduit-0.7.0...
    cpio-conduit-0.7.0: copy/register
    Installing library in
    /home/alice/cpio-conduit/.stack-work/install/x86_64-linux/lts-5.8/7.10.3/lib/x86_64-linux-ghc-7.10.3/cpio-conduit-0.7.0-6ocIFacK3GqJwqA79J3bQu
    Registering cpio-conduit-0.7.0...

After this build, alice's own .stack is nearly empty (just some empty directories and an empty package.cache):

    alice@localhost $ find /home/alice/.stack
    /home/alice/.stack
    /home/alice/.stack/config.yaml
    /home/alice/.stack/snapshots
    /home/alice/.stack/snapshots/x86_64-linux
    /home/alice/.stack/snapshots/x86_64-linux/lts-5.8
    /home/alice/.stack/snapshots/x86_64-linux/lts-5.8/7.10.3
    /home/alice/.stack/snapshots/x86_64-linux/lts-5.8/7.10.3/pkgdb
    /home/alice/.stack/snapshots/x86_64-linux/lts-5.8/7.10.3/pkgdb/package.cache
    /home/alice/.stack/setup-exe-cache
    /home/alice/.stack/setup-exe-cache/x86_64-linux
    /home/alice/.stack/setup-exe-cache/x86_64-linux/setup-Simple-Cabal-1.22.5.0-ghc-7.10.3

Our ghc-pkg looks like:

    alice@localhost $ stack exec -- ghc-pkg list
    /opt/shared/bob/stack-system-root/programs/x86_64-linux/ghc-7.10.3/lib/ghc-7.10.3/package.conf.d
     Cabal-1.22.5.0
     array-0.5.1.0
     base-4.8.2.0
     bin-package-db-0.0.0.0
     binary-0.7.5.0
     bytestring-0.10.6.0
     containers-0.5.6.2
     deepseq-1.4.1.1
     directory-1.2.2.0
     filepath-1.4.0.0
     ghc-7.10.3
     ghc-prim-0.4.0.0
     haskeline-0.7.2.1
     hoopl-3.10.0.2
     hpc-0.6.0.2
     integer-gmp-1.0.0.0
     pretty-1.1.2.0
     process-1.2.3.0
     rts-1.0
     template-haskell-2.10.0.0
     terminfo-0.4.0.1
     time-1.5.0.1
     transformers-0.4.2.0
     unix-2.7.1.0
     xhtml-3000.2.1
    /opt/shared/bob/stack-system-root/snapshots/x86_64-linux/lts-5.8/7.10.3/pkgdb
     async-2.1.0
     attoparsec-0.13.0.1
     base16-bytestring-0.1.1.6
     blaze-builder-0.4.0.1
     conduit-1.2.6.4
     conduit-extra-1.1.11
     exceptions-0.8.2.1
     hashable-1.2.4.0
     lifted-base-0.2.3.6
     mmorph-1.0.6
     monad-control-1.0.0.5
     mtl-2.2.1
     network-2.6.2.1
     primitive-0.6.1.0
     random-1.1
     resourcet-1.1.7.3
     scientific-0.3.4.6
     stm-2.4.4.1
     streaming-commons-0.1.15.2
     text-1.2.2.0
     transformers-base-0.4.4
     transformers-compat-0.4.0.4
     vector-0.11.0.0
     zlib-0.6.1.1
    /home/alice/.stack/snapshots/x86_64-linux/lts-5.8/7.10.3/pkgdb
    /home/alice/cpio-conduit/.stack-work/install/x86_64-linux/lts-5.8/7.10.3/pkgdb
     cpio-conduit-0.7.0


mgsloan · 2016-05-29T04:39:05Z

Sounds like an excellent addition to me! Pinging @borsboom

It reminds me a little bit of the extra-package-dbs mechanism, which allows you to add additional package dbs between the global db and the snapshot db. Not sure if it's a simplification, but it may be worth reusing that mechanism or unifying this feature with it. See the discussion here of how that feature works.

da-x · 2016-05-29T04:58:00Z

It just occurred to me that for this feature, instead of introducing a new environment variable, perhaps we can make the existing STACK_ROOT.. stackable! We would do this by introducing a colon separator, e.g. STACK_ROOT=/path/a:/path/b:/path/c, where /path/a is the read-write top Stack root and /path/b and so forth are the read-only Stack roots. What do you think?
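As a rough sketch of how such a colon-separated value could be taken apart, the following shell function separates the first (read-write) entry from the remaining (read-only) entries. The function and variable names are hypothetical, and the sketch assumes that no root path itself contains a colon.

```shell
# Hypothetical: split a colon-separated STACK_ROOT value into the
# writable root (first entry) and the read-only roots (the rest).
# Results are left in the variables writable_root and readonly_roots.
split_stack_roots() {
    roots=$1
    # Everything before the first colon is the read-write root.
    writable_root=${roots%%:*}
    case $roots in
        # Everything after the first colon stays colon-separated.
        *:*) readonly_roots=${roots#*:} ;;
        # A single path means there are no read-only roots.
        *)   readonly_roots= ;;
    esac
}
```

Called as `split_stack_roots "/path/a:/path/b:/path/c"`, this leaves /path/a in writable_root and /path/b:/path/c in readonly_roots. A plain single-path STACK_ROOT keeps its current meaning, which is what makes the scheme backward compatible.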

sinelaw · 2016-05-29T07:24:15Z

This could be used to make Haskell Platform and Stack compatible. Users could install stack after installing HP and get all the HP-installed packages for free. We just need HP to set up the packages in a structure that stack --stack-system-root can understand.

mgsloan · 2016-05-29T09:30:02Z

@da-x Particularly considering that stackable STACK_ROOTs point, I think we need to think rather carefully about this. I'm in favor of potentially having other spots for programs. However, having another spot for snapshot DBs gets rather dicey.

Let's say you build package A in RootA against a package B dependency in RootB. Now you can only reasonably use RootA when you are also using RootB, so that dependency may as well be specified in the configuration of RootA. Also, if RootB is changed in destructive ways, it can break RootA's packages.

The best thing I can come up with to make this less prone to user error is the following implementation strategy:

Have some command like stack root parent <dir>. It would check whether <dir> looks like a reasonable $STACK_ROOT and, if so, add a file $STACK_ROOT/parent-root. This file would contain <dir> along with the hash of <dir>. (That way, it's human readable but not easily modifiable.)

If this command is run and parent-root already exists, then perhaps have an option --delete-all-packages, which will delete all the package dbs and change the root.

This implementation can support stacks! However, you'd need to check for cycles :)
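The cycle check mentioned above could look roughly like the following shell sketch. It is not part of any implementation: the function name root_chain is hypothetical, the sketch assumes the first line of parent-root holds the parent directory, it ignores the hash, and it assumes root paths contain no colons.

```shell
# Hypothetical: follow parent-root files starting at the root given in $1
# and print each root on its own line. Returns 1 if a root is reached a
# second time, which means the parent links form a cycle.
root_chain() {
    current=$1
    seen=:
    while [ -n "$current" ]; do
        case $seen in
            *":$current:"*)
                echo "cycle detected at $current" >&2
                return 1 ;;
        esac
        seen="$seen$current:"
        printf '%s\n' "$current"
        if [ -f "$current/parent-root" ]; then
            # Assumption: the first line of parent-root is the parent path.
            IFS= read -r current < "$current/parent-root" || true
        else
            # No parent-root file: this root is the end of the chain.
            current=
        fi
    done
}
```

The output order (child first, then each parent) is only one possible choice; the lookup order between roots is a separate design decision that this sketch does not settle.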

mgsloan · 2016-05-29T09:52:24Z

@sinelaw Is that for new Haskell Platform or old? With old Haskell Platform it does reuse the packages if they are the correct versions, as they are in the global DB.

da-x · 2016-05-29T10:22:27Z

I think that for the general use case, it should be safe to assume that, by convention, nothing gets removed from the read-only root on which your read-write root depends. The same goes for modifications that are not simply additions. On the bright side, a new version of a package or GHC is always an addition, so we are good there.

At worst case, if that assumption does get violated, at least for removals, you should only need to do stack build to let your read-write root recover the missing pieces locally. In the example you gave, we can have a global Stack YAML option, other-stack-roots, similar to the stack-system-root option I initially proposed. This avoids fiddling with environment variables and keeps that dependency neatly specified.

On Linux, in the long term, I imagine that the read-only roots could come from DEBs or RPMs and reside in /usr/lib/haskell-stack/root. We could also introduce some optional Stack awareness to apt and dnf, which could be used to bring in pre-built components of any Stack root, saving considerable build time and external network access on build servers.

BTW, do we want to mappend the global stack YAMLs of all the dependent roots, finding them recursively via the list in STACK_ROOT and/or other-stack-roots?

mgsloan · 2016-05-29T11:39:52Z

I've thought a bit more on the stack root parent idea. It might make more sense to have a parent-stack-root: field in the config. However, this field should not be settable by project stack.yaml files. This'd be unprecedented - we don't currently have any fields that aren't settable in project files. I'd also still want something to prevent it from being changed without proper actions or sufficient warning.

> At worst case, if that assumption does get violated, at least for removals, you should only need to do stack build to let your read-write root recover the missing pieces locally.

True, it should be able to recover in a sane manner.

> BTW, do we want to mappend the global stack YAMLs of all the dependent roots, finding them recursively via the list in STACK_ROOT and/or other-stack-roots?

I suppose that would indeed make sense. I'm a little worried about it affecting stack's traceability - how easy it is for the user to figure out why something is happening. However, we already have another similar mechanism - extensible-snapshots - #863

So, I'm in favor of it. At some point we're going to need some kind of command that tells you what the accumulated config is and why. Till then, I think this is enough of an "expert-mode" feature that doing something like mappending them together is reasonable.

carlpaten · 2016-12-02T16:17:57Z

I would love for this to happen. See this Stack Overflow submission:

> Our school computers have Stack installed, but it's hard to use because user directories have very limited space. I'm wondering if there's a way to have a system-wide .stack folder, instead of having it in user directories.

alexeymuranov · 2017-02-25T19:28:32Z

This is probably a naïve question, but why can't stack work in this respect like Python's pip? I mean, if I want to install a Python package globally, I use sudo pip; if I want to install it in the user's home directory, I use pip --user. The user has access to both the global packages and their own.

Alternatively, can't stack be based on Nix, with a common global shared storage?

colonelpanic8 · 2018-06-13T03:08:53Z

Is there any progress on this?

da-x · 2018-06-14T07:06:39Z

Sorry, but I don't have any plans to continue implementing this. Anyone else is invited to resume and pick this up.

mgsloan added type: enhancement type: discuss labels May 31, 2016

mgsloan added this to the P2: Should milestone May 31, 2016

mihaimaruseac mentioned this issue Jan 23, 2019

Moving a stack directory tree #4530

Closed
