-
Notifications
You must be signed in to change notification settings - Fork 843
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make locking opt-in rather than default #950
Comments
I don't have a strong opinion about defaults as long as the option is there. As a first step, we haven't tried having it default-on but with an option to turn it off when there's a problem. I think we definitely at least need the option to turn it off as an escape hatch. Leaving it on by default does have the benefit of forcing us to fix problems. I'm back in the country afer ICFP now and can help look into some of those problems. Obviously, many tools persist with "concurrent behavior = undefined" for long periods of time (cabal did), but I think it would be nice to at least issue a warning or something, rather than letting the |
I also have the impression it is better to let the lock on by default so we have to fix bugs rather than ignore them. To improve user experience when stack fails to acquire the lock for too long, stack could print a message like
|
What bad things will happen if the lock is disabled? I assume it was implemented in the first place to avoid bad things like database corruption or something. It seems like it would be bad to disable it by default if horrible things can happen. |
I don't believe there's any case of horrible things happening. AFAICT, the worst case scenario is that the GHC package database gets overwritten, a package that was registered by process A is implicitly unregistered by process B, and then process A's build will fail because dependencies are missing. The solutions to that could be:
Even the second option still leaves some possibilities for a failed build, but I don't think there's anything that can go wrong that would prevent a second run of the same command from succeeding. |
Those failure modes do sound pretty benign. I thought the issue was more along the lines of breaking existing packages like in the bad-old-days. I suppose the simplest scenario is:
Even if A_1 and A_2 are the same version (as they should be, in the snapshot, not necessarily in concurrent builds in the same local project), they still get nondeterministic package IDs in there somewhere, right (which can make A_1 invalid for C_2)? I think @snoyberg's point was that one of the two above jobs will fail, and just need to be rerun. That's a benign failure. (But strictly speaking aren't there also more complicated setups where both fail?) Also, is there a way for individual package registrations to be non-atomic? Do they touch more than one file? Finally, with dynamic libraries, could these races and bad interleavings break already built binaries? |
I started playing with letting concurrent stack processes run together. I ran into the following error while doing a $ stack build
stack-0.1.5.0: build
Preprocessing library stack-0.1.5.0...
[39 of 70] Compiling Stack.PackageIndex ( src/Stack/PackageIndex.hs, .stack-work/dist/x86_64-linux/Cabal-1.22.4.0/build/Stack/PackageIndex.o )
ghc: panic! (the 'impossible' happened)
(GHC version 7.10.2 for x86_64-unknown-linux):
Loading temp shared object failed: /tmp/ghc120832_0/libghc_47.so: undefined symbol: stackzu4xMM19FZZwCV6PT9u8bj3To_StackziTypesziCompiler_zdwparseCompilerVersion_info
Please report this as a GHC bug: http://www.haskell.org/ghc/reportabug
-- While building package stack-0.1.5.0 using:
/u/rrnewton/.stack/setup-exe-cache/setup-Simple-Cabal-1.22.4.0-x86_64-linux-ghc-7.10.2 --builddir=.stack-work/dist/x86_64-linux/Cabal-1.22.4.0/ build lib:stack exe:stack --ghc-options -ddump-hi -ddump-to-file
Process exited with code: ExitFailure 1 |
I propose we make the locking support optional rather than on by default, as it's prevented many workflows from happening so far to be a maintenance burden and sometimes still does. It doesn't serve the regular desktop user very well IMHO, but it's a valid use-case that fits the CI/automation needs. I propose we have a flag or environment variable like
Or something like that to enable it.
Pinging @snoyberg @rrnewton.
The text was updated successfully, but these errors were encountered: