mssql-server-linux:ctp2-1 hangs when used in docker swarm #99
Comments
There aren't any new required parameters. Can you take a look at the /var/opt/mssql/log/errorlog file? Anything interesting in there? |
@twright-msft thank you for your quick response. I tried again this morning but there were no entries in the errorlog after I changed the container to ctp2-1 again. This is the most recent errorlog from ctp2-0 if it is of any help:
|
The ctp2-0 error log looks as expected. |
I can confirm that it works without the --mount option. I also tried to --mount a new and empty folder and it got stuck again. Unfortunately, I don't want to lose the database content when the container gets moved around the cluster, so the --mount option is crucial.
and crash.txt:
|
Can you please provide more details about your Docker Swarm environment? Are you running Swarm in your own environment or something like Azure Container Service? |
It is all self-hosted on machines running CentOS 7. I also tried disabling SELinux for debugging, but it didn't have any effect. The storage is mounted via NFS, but to my understanding this should be transparent to the container, and it is still working with ctp2-0. In addition, the storage is used by other containers as well, so it does not seem to be a permission issue. Furthermore, some folders and files get created when I start the ctp2-1 image... Am I missing something? |
Just an idea - we ran into a permissions issue lately with trying to create directories on start on a RHEL-based container host. We resolved this by passing --privileged=true to docker run. Given that you are using an NFS mounted storage you might need to do something with --privileged or --device. See: https://docs.docker.com/engine/reference/run/#runtime-privilege-and-linux-capabilities |
Some files and directories get created on container startup, so read/write permissions appear to be fine to me. Unfortunately, --privileged=true does not exist in docker swarm mode. In my opinion, passing --privileged=true is a quick fix but not a solution for a production environment. |
Yes, we first resolved the issue by disabling SELinux. Then we discovered the --privileged flag and used that instead. Good to know that the --privileged is not available in Docker Swarm. I'm going to ask one of our engineers to take a look at this. He might have some ideas on what else we could look at. |
Are there dump files in /var/opt/mssql/log? |
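A minimal sketch of one way to check for such dump files. It assumes dump files are named core.sqlservr.*, as in the file names reported later in this thread; the function name list_sqlservr_dumps is made up for illustration.

```shell
# List SQL Server dump files (core.sqlservr.*) directly inside a log directory.
# The default path is the in-container log directory mentioned in this thread.
list_sqlservr_dumps() {
  log_dir=${1:-/var/opt/mssql/log}
  find "$log_dir" -maxdepth 1 -name 'core.sqlservr.*' | sort
}
```

With the bind mount from the original report, the same files should also be visible on the host, presumably under /data/nas/mssql/log (an assumption derived from the --mount source), so the function could be called with that path as its argument.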
We think this might be an issue with using NFS storage that showed up in CTP 2.1. We are investigating... |
Thanks again for all the help. I feel like we are getting close to identifying the problem. If it is of any help, this is my fstab file:
the /var/opt/mssql/log contains 3 files:
the core.sqlservr.06_06_2017_07_07_19.9.txt
and a folder called core.sqlservr.9.tmp. Do you need any output from this folder? |
When I ran "ls -la" inside the container against the NFS mount point, I saw a difference between the two versions.
and the ctp2-0
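The comparison described here could be reproduced along these lines. This is a sketch only: the function name and the DOCKER override variable are hypothetical, and the default target is the /var/opt/mssql mount point used in the original report.

```shell
# Run "ls -la" on the data directory inside a running container.
# DOCKER can be overridden (for example with "echo") to preview the command
# instead of executing it.
list_mount_in_container() {
  container=$1
  target=${2:-/var/opt/mssql}
  "${DOCKER:-docker}" exec "$container" ls -la "$target"
}
```

Running it once against a ctp2-1 container and once against a ctp2-0 container, then comparing ownership and permission columns, would show whether the two images see the mount differently.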
Any new information regarding this issue? |
This is a confirmed issue. We are working on a fix. It will be available in the next public release. Unfortunately, the only workaround is to use the CTP 2.0 image. |
I just tried the RC1 version and I get a different error message this time. I thought I'd share my results; maybe they are of help to you. crash.txt
crash.json
paldumper-debug.log
gdb.log
thread_information.log
|
This issue is not resolved yet in the RC1. We have a fix in testing now. We are considering the following requirements for production use of NFS remote shares:
Are these reasonable requirements? |
Your requirements seem reasonable to me. Are there other recommended NFS options besides 'nolock'? |
Thanks. Not at this time. We are targeting releasing this fix in an RC2 release in the first week of Aug. |
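To verify on a swarm node that a share is actually mounted with 'nolock', something like the following sketch may help. It assumes the share is an NFS mount listed in /proc/mounts (or a file in the same format) and that the option appears there literally as nolock, which is typical for NFSv3 mounts; both function names are invented for this example.

```shell
# Print the mount options of the NFS mount at a given mount point,
# reading a file in /proc/mounts format: device mountpoint fstype options ...
nfs_mount_options() {
  mount_point=$1
  mounts_file=${2:-/proc/mounts}
  awk -v mp="$mount_point" '$2 == mp && $3 ~ /^nfs/ { print $4 }' "$mounts_file"
}

# Succeed only if the mount point is an NFS mount carrying the nolock option.
has_nolock() {
  nfs_mount_options "$1" "${2:-/proc/mounts}" | tr ',' '\n' | grep -qx 'nolock'
}
```

For example, `has_nolock /data/nas && echo ok` would print ok only if /data/nas is an NFS mount point with nolock set; whether /data/nas is the mount point itself is an assumption based on the bind-mount source in the original report.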
@sonix07 do you still have that issue on rc2? |
@aerkefiende I've tried many times with different mounts but unfortunately it still does not work with the NFS mount. Mounting the local filesystem works, but mounting an NFS mounted folder from the host (with the nolock option) gives me the following error:
If I only mount /var/opt/mssql/data and /var/opt/mssql/log then I get the following output:
This is particularly weird to me |
Using NFS or Samba shares is a known issue in RC2 and RC1, fixed in post RC2. Sorry for the inconvenience!
Quoted from sonix07's reply, sent Friday, August 4, 2017:
@aerkefiende I've tried many times with different mounts but unfortunately it still does not work with the NFS mount. Mounting the local filesystem works, but mounting an NFS mounted folder from the host (with the nolock option) gives me the following error:
This is an evaluation version. There are [172] days left in the evaluation period.
This program has encountered a fatal error and cannot continue running.
The following diagnostic information is available:
Reason: 0x00000006
Status: 0x40000015
Message: Kernel bug check
Address: 0x6a42b4b0
Parameters: 0x6a62a250
Stacktrace: 000000006a4e8446 000000006a42b50b 000000006a41ecad
000000006a42aeab 000000006a4e69dd 000000006a4e5a1c
000000006a4e5890 000000006a4e57c5
Process: 9 - sqlservr
Thread: 13 (application thread 0x1000)
Instance Id: c0b01b06-3336-4609-bf40-ceab6be93ae4
Crash Id: da3beb17-019e-42e9-912a-d70ac8c6b008
Build stamp: a37664e45e4156e76a53fa282fd694cb49f70c2037515f5684e3ce6dfa7549bc
If I only mount /var/opt/mssql/data and /var/opt/mssql/log then I get the following output:
This is an evaluation version. There are [172] days left in the evaluation period.
2017-08-04 11:08:56.20 Server Setup step is copying system data file 'C:\templatedata\mastlog.ldf' to '/var/opt/mssql/data/mastlog.ldf'.
2017-08-04 11:08:56.25 Server Setup FAILED copying system data file 'C:\templatedata\mastlog.ldf' to '/var/opt/mssql/data/mastlog.ldf': 87(The parameter is incorrect.)
ERROR: BootstrapSystemDataDirectories() failure (HRESULT 0x80070057)
I've triple-checked the folder permissions and they are fine and owned by the right user. All folders were newly created and empty. The NFS share is also used successfully by other containers (including mssql ctp2-0). I also tried the new environment variable MSSQL_SA_PASSWORD=... instead of SA_PASSWORD=..., without any success so far.
|
@twright-msft will this be fixed for sure in the next version? I'm asking because we have about 60 days left on our CTP2-0 version and we need a reliable solution + time to test it before the trial expires. |
There will be another release before :CTP2-0 expires. |
I've been watching this issue for a while now. I'm feeling increasingly uncomfortable living with the uncertainty whether we'll have a working SQL server ready on time :( Obviously, the part of my mind that is more inclined to distressed problem-solving is already poring over alternative scenarios. That's stress I'd rather not have to deal with atm ;) |
What version are you on right now @urbanhusky? |
CTP2-0, same scenario and issue (docker swarm, NFS mounted) @twright-msft |
I am using the rc2 version but still get an error when I map a volume to an NFS-mounted directory. Below is the error. I have given full access rights to the NFS-mounted directory. Is this a known issue for NFS mounts? |
I'd like to give a few folks that are having this issue access to a private test image to confirm that we have fixed this issue. @urbanhusky @upadhyayap and anybody else that currently has this problem - please reach out to me via email so I can get you set up. twright microsoft com |
@twright-msft I cannot find your e-mail address, but I would gladly help test the image. |
My email address is: twright microsoft com |
@twright-msft I already wrote you an e-mail a week ago (08/24/2017). Please share the image with me. |
We'd love to test the fix image if you still have a need as well. We ran into this issue yesterday and had some other issues when pegged to the ctp2-0 tag. Will have one of our devops engineers reach out via email. |
After testing the private image I can now confirm that this solves my issue. Will the change be included in the upcoming release? |
Great. Yes, this fix will be included in the next public release. |
@twright-msft works as expected with the |
Thanks for confirming. Closing... |
Upgrade from ctp2-0 stuck with this output:
This is an evaluation version. There are [172] days left in the evaluation period.
I also tried to delete the database but the container is still stuck with the same output. The result is the same on all cluster nodes.
Docker swarm command to start the service:
docker service create \
  --name mssql \
  --publish 1433:1433 \
  -e 'ACCEPT_EULA=Y' \
  -e 'SA_PASSWORD=NotGoingToPasteThePassword' \
  --network sfswarm-nw \
  --replicas 1 \
  --mount type=bind,source=/data/nas/mssql,target=/var/opt/mssql \
  microsoft/mssql-server-linux:ctp2-1
docker version:
` Client:
Version: 1.12.6
API version: 1.24
Package version: docker-common-1.12.6-16.el7.centos.x86_64
docker-common-1.12.6-11.el7.centos.x86_64
Go version: go1.7.4
Git commit: 3a094bd/1.12.6
Built: Fri Apr 14 13:46:13 2017
OS/Arch: linux/amd64
Server:
Version: 1.12.6
API version: 1.24
Package version: docker-common-1.12.6-16.el7.centos.x86_64
docker-common-1.12.6-11.el7.centos.x86_64
Go version: go1.7.4
Git commit: 3a094bd/1.12.6
Built: Fri Apr 14 13:46:13 2017
OS/Arch: linux/amd64`
I've been using mssql with Docker Swarm since ctp1-2 and it was working flawlessly. Am I missing something? Do I need to pass a new parameter with ctp2-1 that I'm not currently aware of?