Multiple writes at the same time on windows machine causes permission denied error #2617

Closed
hanjoosten opened this issue Sep 21, 2016 · 30 comments

Comments

@hanjoosten

On my Windows 7 machine, I tried to build my project with LTS-7.0. This failed for some reason. Eventually, I removed ALL previously installed haskell / ghc / cabal / stack stuff from my machine, making sure I could do a totally clean installation.

When I install my project, I get the following error message when one of the many packages is being registered:

ghc-pkg.EXE: C:\sr\snapshots\480beed7\pkgdb\package.cache: you don't have permission to modify this file

Diagnose

There are two things that catch my eye:

  1. I wonder why anything is being written to a directory at C:\sr . In my case, I have admin rights on my machine, but in a lot of companies, this isn't the default.
  2. When I resubmit the command stack install, more packages get built. However, some time later the error can occur again. I guess that there are multiple processes trying to write to C:\sr\snapshots\480beed7\pkgdb\package.cache at the same time. This results in a write error in the latter process.

Steps to reproduce

  1. Clone a large Haskell program like the one I have these problems with
  2. Build it on a multicore Windows machine, using stack:
    1. stack setup
    2. stack install

Expected

I would expect the build to succeed by only calling stack install once.

Actual

During installation, I ran into the following error message:

Progress: 30/126
--  While building package asn1-parse-0.9.4 using:
      C:\sr\setup-exe-cache\x86_64-windows\setup-Simple-Cabal-1.24.0.0-ghc-8.0.1.exe --builddir=.stack-work\dist\b7fec021 register
    Process exited with code: ExitFailure 1
    Logs have been written to: D:\data\hjo20125\Git\Ampersand\.stack-work\logs\asn1-parse-0.9.4.log

    Configuring asn1-parse-0.9.4...
    Building asn1-parse-0.9.4...
    Preprocessing library asn1-parse-0.9.4...
    [1 of 1] Compiling Data.ASN1.Parse  ( Data\ASN1\Parse.hs, .stack-work\dist\b7fec021\build\Data\ASN1\Parse.o )

    Data\ASN1\Parse.hs:30:1: warning: [-Wunused-imports]
        The import of `Control.Applicative' is redundant
          except perhaps to import instances from `Control.Applicative'
        To import instances alone, use: import Control.Applicative()
    Installing library in
    C:\sr\snapshots\480beed7\lib\x86_64-windows-ghc-8.0.1\asn1-parse-0.9.4-472aFk4Rjki87YeAg9g5Ph
    Registering asn1-parse-0.9.4...
    setup-Simple-Cabal-1.24.0.0-ghc-8.0.1.exe:
    'C:\Users\hjo20125\AppData\Local\Programs\stack\x86_64-windows\ghc-8.0.1\bin\ghc-pkg.EXE'
    exited with an error:
    asn1-parse-0.9.4: Warning: haddock-interfaces:
    C:\sr\snapshots\480beed7\doc\asn1-parse-0.9.4\asn1-parse.haddock doesn't exist
    or isn't a file
    ghc-pkg.EXE: C:\sr\snapshots\480beed7\pkgdb\package.cache: you don't have
    permission to modify this file

D:\data\hjo20125\Git\Ampersand>

Because there seems to be a write conflict, I tried re-running stack install each time this error occurred. Every time, some more packages are registered, and eventually my application is built successfully.
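The re-run workaround described above can be scripted. The following is a minimal sketch for a POSIX-style shell (for example Git Bash on Windows, which is an assumption about the environment). The `retry` helper and the attempt limit of 5 are hypothetical and not part of stack.

```shell
# retry MAX CMD [ARGS...]
# Run CMD until it exits successfully, giving up after MAX attempts.
# Hypothetical helper; not part of stack.
retry() {
    retry_max=$1
    shift
    retry_attempt=1
    while [ "$retry_attempt" -le "$retry_max" ]; do
        if "$@"; then
            return 0
        fi
        echo "attempt $retry_attempt of $retry_max failed" >&2
        retry_attempt=$((retry_attempt + 1))
    done
    return 1
}

# Example use (requires stack on PATH):
#   retry 5 stack install
```

Packages registered by an earlier attempt stay registered, so each later attempt should have less left to build.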

Stack version

$ stack --version
Version 1.2.0, Git revision 123819b7d65df2ad7fe63fb5eb39a98536acb5f3 (4055 commits) x86_64 hpack-0.14.0

Method of installation

  • Official binary, downloaded from stackage.org or fpcomplete's package repository
@hanjoosten
Author

For what it is worth: The result can be observed from Appveyor too.

@mgsloan
Contributor

mgsloan commented Sep 26, 2016

> I wonder why anything is being written to a directory at C:\sr . In my case, I have admin rights on my machine, but in a lot of companies, this isn't the default.

This is customizable via the STACK_ROOT environment variable. This is described in the docs.
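As a small illustration of the STACK_ROOT setting mentioned above, assuming a POSIX-style shell (in cmd.exe the variable would be set with `set` instead). The directory name is hypothetical; substitute any location the current user can write to.

```shell
# Hypothetical location; any directory writable by the current user works.
STACK_ROOT="$HOME/stackroot"
export STACK_ROOT

# stack commands started from this shell now inherit the variable.
echo "STACK_ROOT is $STACK_ROOT"
```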

> When I resubmit the command stack install, more packages get built. However, some time later the error can occur again. I guess that there are multiple processes trying to write to C:\sr\snapshots\480beed7\pkgdb\package.cache at the same time. This results in a write error in the latter process.

We have not seen such problems with concurrent writes to that file. Perhaps it is only an issue on Windows? It seems worth investigating. At least repeated builds eventually succeed.

I assume the problem persists even if you delete the stack root and start fresh?

@mgsloan mgsloan added this to the Support milestone Sep 26, 2016
@hanjoosten
Author

I had the variable STACK_ROOT defined before I did the fresh install. I assume the installer does not check whether the variable already exists before setting it. Before the installation, it pointed to D:\stackroot\. After the installation, it pointed to c:\sr.

The Ampersand project uses 100+ packages (transitive). Other team members with Windows machines experienced the same problem. Fortunately the workaround is to just issue the command stack install a couple of times. Every time some progress is made.

When I delete stackroot and start fresh, the issue persists indeed. Having to re-issue the command 2 to 3 times seems standard.

@hanjoosten
Author

If there is anything I can do to help (testing?), just let me know.

@mgsloan
Contributor

mgsloan commented Sep 26, 2016

I've split off an issue for the installer issue - #2638

We don't have other issues like this from Windows users, so my theory is that your code base (or particular dependencies) does something unusual which is triggering this condition.

If you could construct a case that reproduces, that would be very helpful. One approach might be to trim down your project to just a cabal file + stack file, rename it, and post it up.

@hanjoosten
Author

Yesterday I installed ghc-mod, because I want to try Atom as an editor. So I issued the command stack install ghc-mod . There too, I ran into this same 'permission denied' problem. Therefore, I do not think that this is something particular to my code base. I do have an 8-core machine with 32 GB of memory. During compilation, I see quite a few ghc processes busy at the same time.

@mgsloan
Contributor

mgsloan commented Sep 27, 2016

Cool! I will try next time I'm on a windows box.

@nightuser

I'm experiencing the same problem (any package with lots of dependencies, like ghc-mod).

@ezyang

ezyang commented Oct 18, 2016

haskell/cabal#4005 might be related.

@simonmichael
Contributor

I'm seeing this in most of my appveyor builds of hledger, since switching them to GHC 8 I think. Eg: https://ci.appveyor.com/project/simonmichael/hledger/build/master-85.

@TerrorJack

I'm experiencing the same issue on Windows x64. It occurs whenever you use stack build with a heavy build plan on a multi-core machine (stack build using the global project also triggers it). This bug was not encountered with earlier versions of GHC. It can also be observed on AppVeyor.

The temporary fix is to use -j1 whenever possible on Windows.
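A sketch of applying that workaround only on Windows, assuming a POSIX-style shell in which `uname -s` reports a MINGW, MSYS, or CYGWIN system name. `jobs_flag_for` is a hypothetical helper; `-j1` is the stack flag named above.

```shell
# jobs_flag_for SYSTEM_NAME
# Print "-j1" for Windows-hosted shells and nothing otherwise.
# Hypothetical helper; not part of stack.
jobs_flag_for() {
    case "$1" in
        MINGW*|MSYS*|CYGWIN*) echo "-j1" ;;
        *) : ;;
    esac
}

# Example use (requires stack on PATH); the expansion is left unquoted
# on purpose so that an empty result adds no argument:
#   stack build $(jobs_flag_for "$(uname -s)")
```

Building with a single job is slower, so this trades build time for reliability on the affected platform only.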

@domenkozar
Contributor

I looked a bit closer around haddock phase building, but couldn't find anything obvious.

@domenkozar
Contributor

Is someone available to look into this issue (paid)? domen@enlambda.com

@ndmitchell
Contributor

The root cause is likely to be that ghc-pkg can only be called with a single writer or multiple readers simultaneously - any other combination results in these errors. The GHC build system Hadrian (https://github.com/snowleopard/hadrian) uses locks to avoid that problem. Fixing ghc-pkg to properly use file locks would be the real solution.

@domenkozar
Contributor

@ndmitchell so this is actually a stack issue, looking at src/Stack/GhcPkg.hs

@ndmitchell
Contributor

@domenkozar perhaps, perhaps not. cabal configure and cabal register also call ghc-pkg, so you have to synchronise with them as well. I believe ghc also calls ghc-pkg so that needs dealing with too.

@ezyang

ezyang commented Jan 18, 2017

Oh, @ndmitchell, that's an interesting claim. Is a concurrent read-write really not possible on Windows? It will be hard for us to fix this, because when we shell out to GHC, GHC will read from the package database; we'd have to block ALL calls to GHC before running a ghc-pkg register.

@ndmitchell
Contributor

It is possible on Windows, but the Haskell file handle functions deliberately chose to pass the flags explicitly requesting that you deny sharing. I reported this ~10 years ago to GHC, but can't find where I did so (the Linux functions pass different sharing flags).

@domenkozar
Contributor

For future reference, until this is fixed:

See workaround in input-output-hk/cardano-sl@61470aa

@domenkozar
Contributor

See haskell/cabal#4005 (comment) for further development on this issue.

@arybczak

The proper fix went into ghc-pkg and will be available in GHC 8.2.1 (see https://phabricator.haskell.org/rGHC0d86aa5904e5a06c93632357122e57e4e118fd2a for more details).

However, this still needs to be fixed in stack for older GHC versions. As pointed out in haskell/cabal#4005 (comment), we need to acquire a global read/write lock while interacting with a specific package database, where the configure/build phase is treated as a reader and the install/register phase as a writer.

This will unfortunately limit overall build parallelism, so I presume it should only be done on Windows. The idea is to port GHC.IO.Handle.Lock from base-4.10.0.0 and acquire the appropriate lock on a specific file within STACK_ROOT during each build phase.
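To make the locking idea concrete, here is a deliberately simplified sketch in a POSIX-style shell. It is not the reader/writer lock proposed above, and it is not how stack or ghc-pkg implement anything. It uses `mkdir`, which fails if the directory already exists, as a plain exclusive lock, so every wrapped command is serialized, readers included. `with_lock` and the lock path are hypothetical.

```shell
# with_lock LOCK_DIR CMD [ARGS...]
# Run CMD while holding an exclusive lock represented by LOCK_DIR.
# mkdir either creates the directory or fails, so only one caller at a
# time gets past the loop. The parent of LOCK_DIR must already exist,
# otherwise mkdir keeps failing and the loop never ends.
with_lock() {
    lock_dir=$1
    shift
    until mkdir "$lock_dir" 2>/dev/null; do
        sleep 1
    done
    lock_status=0
    "$@" || lock_status=$?
    rmdir "$lock_dir"
    return "$lock_status"
}

# Example use with a hypothetical lock path and command:
#   with_lock "$STACK_ROOT/pkgdb.lock" some-command-that-registers-a-package
```

A real fix would let readers share the lock and only make writers exclusive, which is the distinction the proposed port of GHC.IO.Handle.Lock is meant to provide.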

I'll be providing a PR that solves this. I'm not overly familiar with the stack source code, so any pointers to the part of the code I should focus on would be welcome.

@ndmitchell
Contributor

We've lived with this problem a really long time. It's great the fix is available upstream, but is it worth just waiting for GHC 8.2, rather than fixing stack?

@arybczak

@ndmitchell Well, it could be argued that people who won't/can't use 8.2 will not benefit from the fix. Although the problem can be worked around, it is pretty annoying.

@philipcraig

I will be happy to wait for GHC 8.2 to land for the fix, and happy for stack not to do anything about this.

tfausak added a commit to tfausak/rattletrap that referenced this issue May 1, 2017
@varosi

varosi commented May 4, 2017

I see the "need repro" label on this issue. It is easily reproducible: try to build Stack itself with stack build on clean caches.

@borsboom
Contributor

borsboom commented Oct 27, 2017

Can anyone confirm whether GHC 8.2.1 fixes this?

@TerrorJack

@borsboom It does.

@borsboom
Contributor

Alright, closing this issue as resolved upstream.

@robinp

robinp commented May 23, 2018

Note: passing the -j1 flag to stack (build/install) seems to mitigate this.

monacoremo pushed a commit to monacoremo/postgrest that referenced this issue Jul 17, 2021
Appveyor failed on latest release https://ci.appveyor.com/project/begriffs/postgrest/build/1.0.5
according to commercialhaskell/stack#2617 (comment)
this can be fixed by adding `-j1`.