Windows Server 2019: LockFile lock is not released by OS when process exits #37
Comments
We are looking into this. |
I found another interesting fact about this error. If you call CreateFile, LockFile, CloseFile, and OpenFile in sequence on a file in a mounted directory, OpenFile returns an error or the application waits indefinitely. If you add a Sleep after closing the file, it works (on my machine the required delay is ~34 sec). This problem occurs only if the file is placed in a mounted directory. Steps to reproduce the issue:
int main() {
}
More info: I tried a lot of base images and found this bug in:
I tried Windows 10 Pro (1909, 2004) and Windows Server 2019 Core as the host machine. However, on the mcr.microsoft.com/windows/servercore:1607 base image everything works fine. P.S. For my scenario, I can't make a Windows MinioServer in a container. UPD: I tried mcr.microsoft.com/windows/servercore:2004 and the problem is there too. |
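The observation above (the reopen succeeds once enough time has passed after the file is closed) suggests polling with a deadline rather than a fixed Sleep. A minimal C++ sketch under that assumption follows. `retry_until` is a hypothetical helper, not something from this thread, and the `attempt` callback stands in for whatever open call the application actually makes.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Hypothetical helper (an illustration, not code from the issue).
// Calls `attempt` until it returns true or `timeout` has elapsed,
// sleeping `interval` between tries.
// Returns true on success and false if the deadline passed first.
bool retry_until(const std::function<bool()>& attempt,
                 std::chrono::milliseconds timeout,
                 std::chrono::milliseconds interval) {
    assert(interval.count() > 0);  // a zero interval would busy-spin
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    for (;;) {
        if (attempt()) {
            return true;
        }
        if (std::chrono::steady_clock::now() >= deadline) {
            return false;
        }
        std::this_thread::sleep_for(interval);
    }
}
```

On Windows, `attempt` could wrap the reopen call and report whether it succeeded. Given the ~34 sec delay reported above, a timeout of about a minute might be a reasonable starting point, though that figure is a guess and not a measured bound.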
Any news, after 2 months? |
@immuzz Shouldn't this be labelled as a bug? This is surely not expected behaviour. |
Now that docker-library/mongo#385 was closed in favor of this ticket, will this get worked on? It's a shame that MongoDB Windows containers literally have no chance of working correctly on their own at this point. |
@Kellendros007 Have you reproduced this on Windows OS 2004? |
@immuzz I have now tried to reproduce it on a Windows 10 Pro 2004 (19041.572) host,
but the error is not resolved. |
@Kellendros007 Our developers are looking into it. Thank you for confirming it's reproducible on 2004. |
This issue has been open for 30 days with no updates. |
Please don't let the bot consider this issue stale :( it really has to get fixed once and for all |
@awakecoding It has moved along their Roadmap from "Backlog" to "Planned" at least, so it seems a fix is on the (distant) horizon. |
@thecloudtaylor any update on this? We've started hitting more issues with MongoDB in Windows containers, and our ugly workaround of manually deleting the WiredTiger.lock file with PowerShell before launching the container will not be able to save us this time: docker-library/mongo#435 We've been investigating issues with customers that have a working deployment of our application using a MongoDB container on Windows Server 2019, and they've been unable to get a fresh installation up and running on brand new Windows Server 2019 machines. MongoDB just fails after 20-30 minutes, then hits the WiredTiger.lock issue on restart because it wasn't launched through our PowerShell wrapper, making it much harder to diagnose. It's really hard to justify telling our customers to run everything except MongoDB inside containers, especially since they never had to go through the trouble of manually setting up the database. They like it, and they don't want to switch to Linux; for most of these customers this is their first experience using Windows containers. They feel at home on Windows and we're happy to give them the true Windows experience. I have always been a strong advocate for Windows containers on Windows, but I really need a hand here, please. |
@awakecoding You are mentioning the issue reporter who has not touched this issue since he opened it almost six months ago; I doubt they have any updates for us. If anyone can give us an update, perhaps the assignee @immuzz can, but in any case you can keep track of the issue status on the roadmap here: https://github.com/microsoft/Windows-Containers/projects/1#card-43557545 Sounds like you or your customers are using Windows in production. If you really need to get traction on this, I would suggest you follow the path of your/their Windows licensing vendor, be it Microsoft Azure or a Microsoft Partner with an SPLA, and try to get them to apply some internal / horizontal pressure. Complaints on GitHub Issues are at the bottom of the food-chain unfortunately, certainly if there are only 3 participants outside of Microsoft. Good luck! |
@rossdotpink I guess I'll be at the bottom of the food chain then; trying to apply horizontal pressure would likely be very costly when this is really critical stuff that should be addressed no matter what. I see another critical issue that could very well be related to this one, basically admitting to the fact that Windows containers currently have no graceful shutdown at all: #16 The lack of a graceful shutdown would definitely explain all the issues related to containerized MongoDB we've seen: docker-library/mongo#435 @immuzz any update on this? Sounds like both issues could very well be related. |
@awakecoding Let me check with the team owning this and get back to you. |
@immuzz thanks a lot, I appreciate it |
Adding my voice to this issue. I have Azure DevOps build servers on Server 2004 experiencing the same issue. The hope was to use build containers instead of managing local installs, but the file lock breaks a number of build workflows. As these are build boxes, is there a modern core install that is confirmed as working? The container has to match the host, so I can't rotate the image version; I have to rebuild the host server, and I would like to avoid multiple rebuilds trying to find a working version. |
So is servercore 1607 the most recent working version? Isn't that distro EOL? Does anyone know of a more recent working version? |
servercore:1607 (aka Windows Server 2016), is in Mainstream support until early 2022, and Extended support until early 2027. It's interesting that it varies by base image, I'd have expected it to vary by host version. Running most of those tests would have been using Hyper-V isolation, but the servercore:1809 test on Windows Server 2019 should have been process isolation, so I guess that doesn't make a difference either. So I guess a possible workaround is to run, e.g., MongoDB images based on servercore:1607 on newer Windows hosts in Hyper-V isolation. |
@Justin-DynamicD @TBBle wait... are you saying that this issue is not observed when using the older servercore:1607 base image with Hyper-V isolation in Windows Server 2019? In other words... this maddening issue is a regression? |
That's what #37 (comment) says at the end, as far as I understand it. |
pinging a request for update as well. |
This issue has been open for 30 days with no updates. |
So it's April with no movement. This is a pretty big deal that simply makes Windows containers unreliable in their current state. Are Windows containers simply DOA? At this point it is simpler to just sub a Dockerfile with Packer and forget they exist until the app can be ported to Linux. |
Sorry about the delay, folks. Our devs were fixing the bug, and I'm happy to announce that it's been fixed in bindflt. It's part of patch 4c. Please try it and let us know if it's working for you. Otherwise I will close this issue and mark it as fixed in a couple of days. Thanks for being patient. |
@immuzz that's good news! Will this become available through Windows Update on the base Windows Server 2019 OS, or through an update to DockerMsftProvider on PSGallery? I just want to know how to get the fix as soon as it becomes available. |
It was released as part of KB5001391 which you can download/install from the catalog now. Next Tue (the 11th) the fix will roll up into the normal patch Tue content and go out through Windows update as well as subsequently updated Azure gallery images when those are available (typically a few days later). |
@thecloudtaylor, @immuzz, unfortunately, I have some bad news. Either I'm doing something wrong or the problem hasn't gone away. I am using Windows 10 Pro 21H1 with KB5001391 installed and base image:
|
In order to isolate the issue, have you tried it on Windows Server 2019? |
@thecloudtaylor just pointed out that the base layer won't get updated until Tuesday (May 11). Could you try after it's updated and let us know? |
@immuzz, OK, I will try after Patch Tuesday on Windows 10 Pro and Windows Server 2019. |
I just tested with today's updated base on an 1809-based build of https://github.com/docker-library/mongo and it worked! (following the steps in my reproducer in docker-library/mongo#435 (comment)) 🎉 🥳 |
Thank you for the confirmation! I'm going to close this one as resolved! |
This is fantastic! Thank you for the effort put in this. |
Are we positive it is wise to use 21H1 hosts yet? See #117 |
Combining: moby/moby#39088 and docker-library/mongo#385 and StefanScherer/dockerfiles-windows#349
Description from @drnybble on (moby/moby#39088) - thank you!
Description
This problem is exhibited by running Mongo 4.0.8 with a named volume. If you restart Mongo it will not start because the WiredTiger.lock file remains locked. This is a general problem demonstrated by a sample Windows CLI program shown below.
Steps to reproduce the issue:
Using Visual Studio 2017, compile the Windows CLI application shown below. I used the C++ code generation option "Multi-threaded" so the VS redistributables did not need to be included in the Dockerfile.
Use the following Dockerfile:
Build it:
Create a named volume:
Run it:
Describe the results you received:
A file called Test.lock is located in the named volume C:\ProgramData\docker\volumes\locktest_data. Try to open it with Visual Studio Code; it fails with the error EBUSY. The file remains locked even though the owning process has exited.
Describe the results you expected:
As described by the documentation for LockFile (https://docs.microsoft.com/en-us/windows/desktop/api/fileapi/nf-fileapi-lockfile) the OS should release the lock when the process exits. If you run the LockFileTest.exe on the desktop you will find that the file is not locked after the process exits.
Additional information you deem important (e.g. issue happens only occasionally):
A workaround is to delete the file, which is possible even though it is locked. So for Mongo you can delete the WiredTiger.lock file on startup if it exists.
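The workaround can be sketched in portable C++17 as a small pre-start step. This is an illustration only: `remove_stale_lock` is a hypothetical helper, not part of MongoDB or Docker, and it assumes that no other process is legitimately using the data directory at the time it runs.

```cpp
#include <cassert>
#include <filesystem>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// Hypothetical helper (an illustration, not code from the issue).
// Deletes a leftover lock file from `data_dir` before the database starts.
// Returns true if a file was deleted, and false if there was nothing to
// delete or the deletion failed; in the failure case `ec` is set.
// Assumption: the caller knows no live process owns `data_dir`.
bool remove_stale_lock(const fs::path& data_dir,
                       const std::string& lock_name,
                       std::error_code& ec) {
    assert(!lock_name.empty());
    const fs::path lock_path = data_dir / lock_name;
    // fs::remove returns false and leaves `ec` clear when the file is absent.
    return fs::remove(lock_path, ec);
}
```

For the Mongo case described above, a wrapper would call this with the data directory and "WiredTiger.lock" before launching the server, and would treat a set `ec` as a reason to stop and investigate instead of continuing.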