Intermittent Appveyor failure: "you don't have permission to modify this file" #4005
Comments
Seeing this as well on https://ci.appveyor.com/project/jagajaga/pos-haskell-prototype/build/1.0.57#L24558
@arybczak and I have a diagnosis, and @arybczak is working on a solution in ghc/ghc-pkg and possibly also a workaround to use with existing ghc versions. cabal-install and stack are careful to run only one ghc-pkg register at a time; however, this is not enough to avoid the problem. The failing scenario goes like this:
This problem cannot be solved simply by changing the share mode. It would not be OK to have ghc open the file with a share mode that allows delete. That would let ghc-pkg overwrite the file, but ghc would then see a corrupted version of it (it would either appear truncated or produce a read error). Nor can the problem be solved with the atomic overwrite trick, because that simply does not work on Windows. Windows supports atomic rename but does not allow one process to continue to read the old file while another has replaced the file with new content. The solution is proper reader/writer file locking, and both ghc and ghc-pkg have to cooperate to do it. Doing the locking properly will also make it actually safe to run ghc-pkg registration updates concurrently, which is ultimately better because all tools that call ghc-pkg benefit. A workaround in cabal-install/stack is to switch from merely excluding writers from each other to reader/writer locking for registration, where configure/build counts as a reader. This will significantly delay when registration can be done, and may reduce overall build parallelism.
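To make the cabal-install/stack workaround concrete, here is a minimal sketch of an in-process reader/writer lock built on STM. It is an illustration only, not the code either tool uses: `RWLock`, `withReadLock`, and `withWriteLock` are hypothetical names, and it assumes the `stm` package that ships with GHC. It only coordinates threads inside one build tool process; it does nothing for separate processes touching the same package database.

```haskell
import Control.Concurrent.STM
import Control.Exception (bracket_)

-- | Hypothetical in-process reader/writer lock for the package database.
-- Invariant: the writer flag is only set while the reader count is zero.
data RWLock = RWLock
  { rwReaders :: TVar Int   -- ^ number of active readers (configure/build steps)
  , rwWriter  :: TVar Bool  -- ^ True while a registration is in progress
  }

newRWLock :: IO RWLock
newRWLock = RWLock <$> newTVarIO 0 <*> newTVarIO False

-- | Run an action as a reader. Many readers may run at once,
-- but none may start while a writer holds the lock.
withReadLock :: RWLock -> IO a -> IO a
withReadLock (RWLock readers writer) = bracket_ acquire release
  where
    acquire = atomically $ do
      writing <- readTVar writer
      check (not writing)            -- retry until no writer is active
      modifyTVar' readers (+ 1)
    release = atomically $ modifyTVar' readers (subtract 1)

-- | Run an action as the single writer. Blocks until there are
-- no active readers and no other writer.
withWriteLock :: RWLock -> IO a -> IO a
withWriteLock (RWLock readers writer) = bracket_ acquire release
  where
    acquire = atomically $ do
      writing <- readTVar writer
      n       <- readTVar readers
      check (not writing && n == 0)  -- retry until the lock is fully free
      writeTVar writer True
    release = atomically $ writeTVar writer False
```

A build tool would wrap each configure/build step in `withReadLock` and each `ghc-pkg register` call in `withWriteLock`. Note that this sketch gives writers no priority, so a steady stream of readers can postpone registration indefinitely, which matches the concern above about delayed registration.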
Relevant GHC ticket filed by @arybczak: https://ghc.haskell.org/trac/ghc/ticket/13194
That GHC bug seems to have been fixed with 8.2, but our AppVeyor builds are on 8.0. On the other hand, I don't think I've seen this bug happen, so maybe it stopped of its own accord?
Are we by any chance still using AppVeyor?
No, we don't. Thanks for triaging CI issues.
This kind of failure happens fairly frequently on AppVeyor:
https://ci.appveyor.com/project/23Skidoo/cabal/build/%232365%20(master)
The obvious explanation is insufficient locking, but it's not altogether clear why: we DO take an MVar lock for copying and registering. Maybe there is some sort of concurrent reader/writer problem going on.
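For readers unfamiliar with the pattern, an MVar lock of the kind mentioned here might look like the sketch below. The names `RegisterLock` and `withRegisterLock` are hypothetical and are not taken from the cabal-install source. The point of the sketch is that such a lock serializes the actions wrapped in it, but any code that reads the package database without taking the lock is not excluded at all.

```haskell
import Control.Concurrent.MVar

-- | Hypothetical mutex guarding copy/register steps.
-- The MVar is full when the lock is free and empty while it is held.
newtype RegisterLock = RegisterLock (MVar ())

newRegisterLock :: IO RegisterLock
newRegisterLock = RegisterLock <$> newMVar ()

-- | Run an action while holding the lock. 'withMVar' puts the value
-- back even if the action throws, so the lock is always released.
withRegisterLock :: RegisterLock -> IO a -> IO a
withRegisterLock (RegisterLock m) act = withMVar m (\() -> act)
```

Two threads that both call `withRegisterLock` never overlap, but a concurrent compiler invocation that merely opens the package database for reading never touches the MVar, which is consistent with the reader/writer suspicion above.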