DNS resolution fails after 20 hours uptime #70

Open
sigmonsays opened this issue Jul 30, 2014 · 17 comments

@sigmonsays

I have just set this up for a development environment and have run into a problem after 20 hours.

  1. No containers have been stopped; skydns and skydock are up and running the latest images.
  2. The redis instance was functioning just fine.
  3. After 20 hours it is no longer resolving, although it should be.

I grabbed the logs for the skydock service, and they show a stop and a die message for the not-running redis instance. Restarting the redis instance fixes the issue.

skydock is not usable in this condition.

Somehow instances are failing TTL and not being put back into DNS.
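
To make the suspected failure mode concrete, here is a toy model of a registry whose records expire unless they are refreshed before the TTL runs out. It is only a sketch of the hypothesis above, not skydock or SkyDNS code: the `TTLRegistry` class, its methods, and the rule that a refresh fails once a record has expired are all assumptions made for illustration.

```python
class TTLRegistry:
    """Toy model of a DNS registry whose records expire unless refreshed."""

    def __init__(self):
        self._expires = {}  # name -> absolute expiry time in seconds

    def register(self, name, ttl, now):
        self._expires[name] = now + ttl

    def heartbeat(self, name, ttl, now):
        # Assumption: a refresh only succeeds while the record is still live.
        if name not in self._expires or self._expires[name] <= now:
            return False
        self._expires[name] = now + ttl
        return True

    def resolve(self, name, now):
        expiry = self._expires.get(name)
        return expiry is not None and expiry > now


registry = TTLRegistry()
registry.register("redis-1", ttl=30, now=0)
registry.heartbeat("redis-1", ttl=30, now=20)        # refreshed, valid until t=50
alive_at_40 = registry.resolve("redis-1", now=40)    # still inside the TTL
alive_at_60 = registry.resolve("redis-1", now=60)    # no heartbeat since t=20
late_refresh = registry.heartbeat("redis-1", ttl=30, now=60)
```

In this model, a single missed heartbeat window is enough for the name to disappear until something registers it again, which would match a container that only resolves again after a restart.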

@sigmonsays
Author

One more comment.

This seems to be occurring across all instance types and is not tied to one particular image (which is good news).

$
$ docker ps |grep mysql
1770a6428448 mysql:latest /bin/bash /srv/start 18 hours ago Up 13 hours 0.0.0.0:49159->3306/tcp mysql-1
$ docker ps |grep sky
b35b294a6dc4 crosbymichael/skydock:latest /go/bin/skydock -ttl 20 hours ago Up 13 hours skydock
326e25eee203 crosbymichael/skydns:latest skydns -http 0.0.0.0 20 hours ago Up 13 hours 8080/tcp, 172.17.42.1:53->53/udp skydns
$

@samos123

I had a similar issue, but in my case it was because I had updated the image. The original container then pointed to an unnamed image, so the container whose image had been updated could no longer be resolved.

To check whether you have the same problem, run docker ps and make sure that the image column shows an actual image name instead of a hex ID. If the original image name does not appear in the docker ps output, this may be the reason.
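
The check described above can be sketched as a small filter. This is a heuristic under stated assumptions: the `unnamed_images` helper is hypothetical, it expects `(container_name, image)` pairs that you have already collected from the NAMES and IMAGE columns of docker ps, and it treats any value made up of 12 to 64 lowercase hex characters as an image ID. The `old-redis` row is made-up sample data.

```python
import re

# Assumption: an untagged image shows up as a bare hex ID (short or full form).
_HEX_ID = re.compile(r"^[0-9a-f]{12,64}$")


def unnamed_images(rows):
    """Return container names whose image column is a bare hex ID."""
    return [name for name, image in rows if _HEX_ID.match(image)]


rows = [
    ("mysql-1", "mysql:latest"),
    ("skydock", "crosbymichael/skydock:latest"),
    ("old-redis", "3f2a9c1d4b7e"),  # hypothetical container left on an untagged image
]
flagged = unnamed_images(rows)
```

An image that is genuinely named with only hex characters would be flagged by mistake, so treat a hit as a prompt to look at the container rather than as proof.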

@mitar
Contributor

mitar commented Aug 30, 2014

@samos123: You are talking about #43. This ticket seems to be something else.

I have the same issue. Resolving just stops. If I restart the skydock container (docker restart skydock), it works again. I am not sure why this happens.
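
Until the cause is known, the manual workaround above could be automated with a small watchdog. The following is a sketch, not a fix: the `check_and_restart` helper and the hostname `mysql-1.dev.docker` are hypothetical, and it assumes the machine running it resolves names through SkyDNS and can run the docker CLI. The resolver and restart action are injectable so the logic can be exercised without touching a real daemon.

```python
import socket
import subprocess


def check_and_restart(hostname, resolve=socket.gethostbyname, restart=None):
    """Return True if hostname resolves; otherwise restart skydock and return False."""
    if restart is None:
        # Default action mirrors the manual workaround: docker restart skydock
        restart = lambda: subprocess.call(["docker", "restart", "skydock"])
    try:
        resolve(hostname)
        return True
    except OSError:  # socket.gaierror is a subclass of OSError
        restart()
        return False


def _failing_resolver(name):
    raise socket.gaierror("simulated lookup failure")


# Dry run with fake resolver and restart functions.
restarts = []
ok = check_and_restart(
    "mysql-1.dev.docker",
    resolve=lambda name: "172.17.0.5",
    restart=lambda: restarts.append("skydock"),
)
failed = check_and_restart(
    "mysql-1.dev.docker",
    resolve=_failing_resolver,
    restart=lambda: restarts.append("skydock"),
)
```

Run from cron or a loop, this would also give a timestamp for each failure, which is useful given that there are no logs to go on.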

@miekg

miekg commented Aug 30, 2014

What's so special about the 20 hours?

@mitar
Contributor

mitar commented Aug 30, 2014

In my case it is not 20 hours, but a few hours. What is interesting is that this had been working for a few months, and now it has started failing every few hours (after each restart).

@miekg

miekg commented Aug 30, 2014

So what changed?

@mitar
Contributor

mitar commented Aug 30, 2014

Nothing. That's the problem. And there are no logs.

@miekg

miekg commented Aug 30, 2014

Those should be relatively easy to add though.

@mitar
Contributor

mitar commented Aug 30, 2014

So, how should I debug this?

@miekg

miekg commented Aug 30, 2014

Sprinkle the code with some logging and hope it will tell you something. But skydock also still uses skydns1, which is becoming a bit annoying, as skydns2 has been out for some time.

@mitar
Contributor

mitar commented Aug 30, 2014

SkyDNS works well. I don't have to restart it. Just SkyDock.

@miekg

miekg commented Aug 30, 2014

Ah, OK. That narrows it down a bit. So yeah, I don't know anything better than more logging at this point.

@amattn

amattn commented Sep 19, 2014

We have the same issue. After some indeterminate time, roughly between 2 and 24 hours, skydock just dies.

@iautom8things

Has anything been learned about this? I'm finding that I have the same issue; I constantly have to restart my containers to get this to work.

@mitar
Contributor

mitar commented Jan 7, 2015

I started using https://github.com/blalor/docker-hosts instead.

@asbjornenge
Contributor

As a shameless self-plug: I recently put together https://github.com/asbjornenge/rainbow-dock 🌈 🚀

@amattn

amattn commented Jan 16, 2015

We migrated away from SkyDock, but one theory we had was that putting a laptop to sleep would skew the clocks enough for this issue to occur. Using a tool like SleepWatcher helped us keep the clocks aligned after wakeup. It's a little brute force, but it did help us.

http://www.bernhard-baehr.de
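
One way to test the clock-skew theory is to compare how far the wall clock moved with how much time a monotonic clock measured over the same interval. The sketch below only does the arithmetic on sample numbers. The helper names and the 5-second tolerance are assumptions, and whether a monotonic clock keeps counting during sleep depends on the platform, so in practice you would feed it readings from `time.time()` and `time.monotonic()` taken on the machine in question.

```python
def wall_clock_jump(wall_start, wall_end, mono_start, mono_end):
    """Seconds the wall clock moved beyond what the monotonic clock measured."""
    return (wall_end - wall_start) - (mono_end - mono_start)


def clock_skewed(jump_seconds, tolerance=5.0):
    """True if the wall clock jumped by more than the (assumed) tolerance."""
    return abs(jump_seconds) > tolerance


# Sample readings: the wall clock advanced 3700 s while only 100 s were measured.
jump = wall_clock_jump(1000.0, 4700.0, 50.0, 150.0)
skewed = clock_skewed(jump)

# Sample readings with both clocks in agreement.
steady = clock_skewed(wall_clock_jump(0.0, 10.0, 0.0, 10.0))
```

A large jump that lines up with the moment resolution stops would support the theory; no jump would point elsewhere.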
