This repository has been archived by the owner on Sep 26, 2021. It is now read-only.

docker-machine env xxx hangs forever #1500

Closed
chouclee opened this issue Jul 10, 2015 · 50 comments

@chouclee

The machine is running, and I can log in to it with docker-machine ssh xxx.
But docker-machine env xxx hangs forever. The debug output says "host is down", yet
docker-machine ls gives me:
NAME       ACTIVE   DRIVER       STATE     URL                         SWARM
dev-test            virtualbox   Running   tcp://192.168.99.100:2376

Possibly related issue: #1168
docker-machine version: v0.3.0
The full info:
Cmd: docker-machine -D env dev-test
Output:

shell: sh
executing: /usr/bin/VBoxManage showvminfo dev-test --machinereadable
STDOUT: name="dev-test"
groups="/"
ostype="Linux 2.6 / 3.x (64 bit)"
UUID="3928af02-deb5-4620-9433-0028c017030b"
CfgFile="/var/root/.docker/machine/machines/dev-test/dev-test/dev-test.vbox"
SnapFldr="/var/root/.docker/machine/machines/dev-test/dev-test/Snapshots"
LogFldr="/var/root/.docker/machine/machines/dev-test/dev-test/Logs"
hardwareuuid="3928af02-deb5-4620-9433-0028c017030b"
memory=1024
pagefusion="off"
vram=8
cpuexecutioncap=100
hpet="on"
chipset="piix3"
firmware="BIOS"
cpus=1
pae="on"
longmode="on"
synthcpu="off"
bootmenu="disabled"
boot1="dvd"
boot2="dvd"
boot3="disk"
boot4="none"
acpi="on"
ioapic="on"
biossystemtimeoffset=0
rtcuseutc="on"
hwvirtex="on"
nestedpaging="on"
largepages="on"
vtxvpid="on"
vtxux="on"
VMState="running"
VMStateChangeTime="2015-07-10T07:54:44.216000000"
monitorcount=1
accelerate3d="off"
accelerate2dvideo="off"
teleporterenabled="off"
teleporterport=0
teleporteraddress=""
teleporterpassword=""
tracing-enabled="off"
tracing-allow-vm-access="off"
tracing-config=""
autostart-enabled="off"
autostart-delay=0
defaultfrontend=""
storagecontrollername0="SATA"
storagecontrollertype0="IntelAhci"
storagecontrollerinstance0="0"
storagecontrollermaxportcount0="30"
storagecontrollerportcount0="30"
storagecontrollerbootable0="on"
"SATA-0-0"="/var/root/.docker/machine/machines/dev-test/boot2docker.iso"
"SATA-ImageUUID-0-0"="9ee6b7c5-49b9-4ec9-bc9e-6034d222da02"
"SATA-tempeject"="off"
"SATA-IsEjected"="off"
"SATA-1-0"="/var/root/.docker/machine/machines/dev-test/disk.vmdk"
"SATA-ImageUUID-1-0"="83cdb0e3-f525-44ee-9c1c-40dab5361d33"
"SATA-2-0"="none"
"SATA-3-0"="none"
"SATA-4-0"="none"
"SATA-5-0"="none"
"SATA-6-0"="none"
"SATA-7-0"="none"
"SATA-8-0"="none"
"SATA-9-0"="none"
"SATA-10-0"="none"
"SATA-11-0"="none"
"SATA-12-0"="none"
"SATA-13-0"="none"
"SATA-14-0"="none"
"SATA-15-0"="none"
"SATA-16-0"="none"
"SATA-17-0"="none"
"SATA-18-0"="none"
"SATA-19-0"="none"
"SATA-20-0"="none"
"SATA-21-0"="none"
"SATA-22-0"="none"
"SATA-23-0"="none"
"SATA-24-0"="none"
"SATA-25-0"="none"
"SATA-26-0"="none"
"SATA-27-0"="none"
"SATA-28-0"="none"
"SATA-29-0"="none"
natnet1="nat"
macaddress1="080027CA87FE"
cableconnected1="on"
nic1="nat"
nictype1="82540EM"
nicspeed1="0"
mtu="0"
sockSnd="64"
sockRcv="64"
tcpWndSnd="64"
tcpWndRcv="64"
Forwarding(0)="ssh,tcp,127.0.0.1,50762,,22"
hostonlyadapter2="vboxnet1"
macaddress2="0800276E6AB8"
cableconnected2="on"
nic2="hostonly"
nictype2="82540EM"
nicspeed2="0"
nic3="none"
nic4="none"
nic5="none"
nic6="none"
nic7="none"
nic8="none"
hidpointing="ps2mouse"
hidkeyboard="ps2kbd"
uart1="off"
uart2="off"
lpt1="off"
lpt2="off"
audio="none"
clipboard="disabled"
draganddrop="disabled"
SessionType="headless"
VideoMode="720,400,0"@0,0
vrde="off"
usb="off"
ehci="off"
SharedFolderNameMachineMapping1="Users"
SharedFolderPathMachineMapping1="/Users"
VRDEActiveConnection="off"
VRDEClients=0
vcpenabled="off"
vcpscreens=0
vcpfile="/var/root/.docker/machine/machines/dev-test/dev-test/dev-test.webm"
vcpwidth=1024
vcpheight=768
vcprate=512
vcpfps=25
GuestMemoryBalloon=0
GuestOSType="Linux26_64"
GuestAdditionsRunLevel=1
GuestAdditionsVersion="4.3.28 r100309"
GuestAdditionsFacility_VirtualBox Base Driver=50,1436514900104
GuestAdditionsFacility_Seamless Mode=0,1436514900104
GuestAdditionsFacility_Graphics Mode=0,1436514900104

STDERR: 
Using SSH client type: external
About to run SSH command:
ip addr show dev eth1
&{/usr/bin/ssh [/usr/bin/ssh -o PasswordAuthentication=no -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -i /var/root/.docker/machine/machines/dev-test/id_rsa -p 50762 docker@localhost ip addr show dev eth1] []  <nil> <nil> <nil> [] <nil> <nil> <nil> ?reflect.Value? false [] [] [] [] <nil>}
SSH cmd err, output: <nil>: 4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:6e:6a:b8 brd ff:ff:ff:ff:ff:ff
    inet 192.168.99.100/24 brd 192.168.99.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe6e:6ab8/64 scope link 
       valid_lft forever preferred_lft forever

SSH returned: 4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:6e:6a:b8 brd ff:ff:ff:ff:ff:ff
    inet 192.168.99.100/24 brd 192.168.99.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe6e:6ab8/64 scope link 
       valid_lft forever preferred_lft forever

END SSH

invalid certs detected; regenerating for 192.168.99.100:2376
command=configureAuth machine=dev-test
Using SSH client type: external
About to run SSH command:
cat /etc/os-release
&{/usr/bin/ssh [/usr/bin/ssh -o PasswordAuthentication=no -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -i /var/root/.docker/machine/machines/dev-test/id_rsa -p 50762 docker@localhost cat /etc/os-release] []  <nil> <nil> <nil> [] <nil> <nil> <nil> ?reflect.Value? false [] [] [] [] <nil>}
SSH cmd err, output: <nil>: NAME=Boot2Docker
VERSION=1.7.0
ID=boot2docker
ID_LIKE=tcl
VERSION_ID=1.7.0
PRETTY_NAME="Boot2Docker 1.7.0 (TCL 6.3); master : 7960f90 - Thu Jun 18 18:31:45 UTC 2015"
ANSI_COLOR="1;34"
HOME_URL="http://boot2docker.io"
SUPPORT_URL="https://github.com/boot2docker/boot2docker"
BUG_REPORT_URL="https://github.com/boot2docker/boot2docker/issues"

found compatible host: boot2docker
Using SSH client type: external
About to run SSH command:
sudo hostname dev-test && echo "dev-test" | sudo tee /var/lib/boot2docker/etc/hostname
&{/usr/bin/ssh [/usr/bin/ssh -o PasswordAuthentication=no -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -i /var/root/.docker/machine/machines/dev-test/id_rsa -p 50762 docker@localhost sudo hostname dev-test && echo "dev-test" | sudo tee /var/lib/boot2docker/etc/hostname] []  <nil> <nil> <nil> [] <nil> <nil> <nil> ?reflect.Value? false [] [] [] [] <nil>}
SSH cmd err, output: <nil>: dev-test

executing: /usr/bin/VBoxManage showvminfo dev-test --machinereadable
STDOUT: name="dev-test"
groups="/"
ostype="Linux 2.6 / 3.x (64 bit)"
UUID="3928af02-deb5-4620-9433-0028c017030b"
CfgFile="/var/root/.docker/machine/machines/dev-test/dev-test/dev-test.vbox"
SnapFldr="/var/root/.docker/machine/machines/dev-test/dev-test/Snapshots"
LogFldr="/var/root/.docker/machine/machines/dev-test/dev-test/Logs"
hardwareuuid="3928af02-deb5-4620-9433-0028c017030b"
memory=1024
pagefusion="off"
vram=8
cpuexecutioncap=100
hpet="on"
chipset="piix3"
firmware="BIOS"
cpus=1
pae="on"
longmode="on"
synthcpu="off"
bootmenu="disabled"
boot1="dvd"
boot2="dvd"
boot3="disk"
boot4="none"
acpi="on"
ioapic="on"
biossystemtimeoffset=0
rtcuseutc="on"
hwvirtex="on"
nestedpaging="on"
largepages="on"
vtxvpid="on"
vtxux="on"
VMState="running"
VMStateChangeTime="2015-07-10T07:54:44.216000000"
monitorcount=1
accelerate3d="off"
accelerate2dvideo="off"
teleporterenabled="off"
teleporterport=0
teleporteraddress=""
teleporterpassword=""
tracing-enabled="off"
tracing-allow-vm-access="off"
tracing-config=""
autostart-enabled="off"
autostart-delay=0
defaultfrontend=""
storagecontrollername0="SATA"
storagecontrollertype0="IntelAhci"
storagecontrollerinstance0="0"
storagecontrollermaxportcount0="30"
storagecontrollerportcount0="30"
storagecontrollerbootable0="on"
"SATA-0-0"="/var/root/.docker/machine/machines/dev-test/boot2docker.iso"
"SATA-ImageUUID-0-0"="9ee6b7c5-49b9-4ec9-bc9e-6034d222da02"
"SATA-tempeject"="off"
"SATA-IsEjected"="off"
"SATA-1-0"="/var/root/.docker/machine/machines/dev-test/disk.vmdk"
"SATA-ImageUUID-1-0"="83cdb0e3-f525-44ee-9c1c-40dab5361d33"
"SATA-2-0"="none"
"SATA-3-0"="none"
"SATA-4-0"="none"
"SATA-5-0"="none"
"SATA-6-0"="none"
"SATA-7-0"="none"
"SATA-8-0"="none"
"SATA-9-0"="none"
"SATA-10-0"="none"
"SATA-11-0"="none"
"SATA-12-0"="none"
"SATA-13-0"="none"
"SATA-14-0"="none"
"SATA-15-0"="none"
"SATA-16-0"="none"
"SATA-17-0"="none"
"SATA-18-0"="none"
"SATA-19-0"="none"
"SATA-20-0"="none"
"SATA-21-0"="none"
"SATA-22-0"="none"
"SATA-23-0"="none"
"SATA-24-0"="none"
"SATA-25-0"="none"
"SATA-26-0"="none"
"SATA-27-0"="none"
"SATA-28-0"="none"
"SATA-29-0"="none"
natnet1="nat"
macaddress1="080027CA87FE"
cableconnected1="on"
nic1="nat"
nictype1="82540EM"
nicspeed1="0"
mtu="0"
sockSnd="64"
sockRcv="64"
tcpWndSnd="64"
tcpWndRcv="64"
Forwarding(0)="ssh,tcp,127.0.0.1,50762,,22"
hostonlyadapter2="vboxnet1"
macaddress2="0800276E6AB8"
cableconnected2="on"
nic2="hostonly"
nictype2="82540EM"
nicspeed2="0"
nic3="none"
nic4="none"
nic5="none"
nic6="none"
nic7="none"
nic8="none"
hidpointing="ps2mouse"
hidkeyboard="ps2kbd"
uart1="off"
uart2="off"
lpt1="off"
lpt2="off"
audio="none"
clipboard="disabled"
draganddrop="disabled"
SessionType="headless"
VideoMode="720,400,0"@0,0
vrde="off"
usb="off"
ehci="off"
SharedFolderNameMachineMapping1="Users"
SharedFolderPathMachineMapping1="/Users"
VRDEActiveConnection="off"
VRDEClients=0
vcpenabled="off"
vcpscreens=0
vcpfile="/var/root/.docker/machine/machines/dev-test/dev-test/dev-test.webm"
vcpwidth=1024
vcpheight=768
vcprate=512
vcpfps=25
GuestMemoryBalloon=0
GuestOSType="Linux26_64"
GuestAdditionsRunLevel=1
GuestAdditionsVersion="4.3.28 r100309"
GuestAdditionsFacility_VirtualBox Base Driver=50,1436514900104
GuestAdditionsFacility_Seamless Mode=0,1436514900104
GuestAdditionsFacility_Graphics Mode=0,1436514900104

STDERR: 
Using SSH client type: external
About to run SSH command:
ip addr show dev eth1
&{/usr/bin/ssh [/usr/bin/ssh -o PasswordAuthentication=no -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -i /var/root/.docker/machine/machines/dev-test/id_rsa -p 50762 docker@localhost ip addr show dev eth1] []  <nil> <nil> <nil> [] <nil> <nil> <nil> ?reflect.Value? false [] [] [] [] <nil>}
SSH cmd err, output: <nil>: 4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:6e:6a:b8 brd ff:ff:ff:ff:ff:ff
    inet 192.168.99.100/24 brd 192.168.99.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe6e:6ab8/64 scope link 
       valid_lft forever preferred_lft forever

SSH returned: 4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:6e:6a:b8 brd ff:ff:ff:ff:ff:ff
    inet 192.168.99.100/24 brd 192.168.99.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe6e:6ab8/64 scope link 
       valid_lft forever preferred_lft forever

END SSH

Daemon not responding yet: dial tcp 192.168.99.100:2376: host is down
Daemon not responding yet: dial tcp 192.168.99.100:2376: host is down
Daemon not responding yet: dial tcp 192.168.99.100:2376: host is down
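
The repeated `host is down` lines appear to come from a plain TCP connect to the daemon port, so the failure can be checked without docker-machine. A minimal sketch in Python, assuming the address and port from the log above; `probe` is a hypothetical helper, not part of any Docker tool:

```python
import errno
import socket


def probe(host: str, port: int, timeout: float = 3.0) -> str:
    """Attempt a single TCP connect and describe the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer within the time limit.
        return "timeout"
    except OSError as exc:
        # Map the OS error number to its symbolic name where one exists.
        name = errno.errorcode.get(exc.errno, "unknown")
        return f"error: {name}"
```

Calling `probe("192.168.99.100", 2376)` on an affected host would be expected to report an error or a timeout rather than `open`. If it reports `open`, the hang is more likely caused by something other than basic reachability.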
@pedroxs

pedroxs commented Jul 10, 2015

I have a similar issue on Windows 7. Here is a gist of the log file.

@CpuID

CpuID commented Jul 11, 2015

I am also seeing "docker-machine env machinename" hang a lot, especially when it is in my ~/.profile: roughly 50% of the time, new terminals hang and never start. For now, pressing Ctrl+C in the new tab's shell and spawning another one 2 seconds later tends to do the job.

It is difficult to reproduce in an existing, active terminal; it happens far more often when spawning a new one.

OSX 10.10.4, Docker Machine 0.3.0. These issues were not present in 0.2.x.
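
One way to keep a hanging `docker-machine env` from blocking every new terminal is to run it under a time limit and skip the Docker variables when it does not answer. This is a sketch, not something proposed in the thread: the helper name `env_with_timeout`, the 5-second limit, and the file name in the usage note are all assumptions.

```python
import subprocess
import sys


def env_with_timeout(argv, seconds=5.0):
    """Run argv; return its stdout, or "" on failure, timeout, or missing binary."""
    try:
        done = subprocess.run(argv, capture_output=True, text=True, timeout=seconds)
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # Give up quietly so the calling shell can continue starting.
        return ""
    return done.stdout if done.returncode == 0 else ""


if __name__ == "__main__" and len(sys.argv) == 2:
    # Print the export lines only when the command finished in time.
    sys.stdout.write(env_with_timeout(["docker-machine", "env", sys.argv[1]]))
```

Saved as, say, `dm_env.py`, the profile line would become `eval "$(python3 dm_env.py machinename)"`. On a timeout nothing is printed, so the shell starts without the Docker variables instead of hanging.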

@a0s

a0s commented Jul 13, 2015

I have the same issue, with a strange exit status 255:

docker-machine -D env docker
shell: bash
executing: /Applications/VMware Fusion.app/Contents/Library/vmrun list
MAC address in VMX: 00:0c:29:b6:95:bb
IP found in DHCP lease table: 172.16.1.131
invalid certs detected; regenerating for 172.16.1.131:2376
command=configureAuth machine=docker
executing: /Applications/VMware Fusion.app/Contents/Library/vmrun list
MAC address in VMX: 00:0c:29:b6:95:bb
IP found in DHCP lease table: 172.16.1.131
Using SSH client type: external
About to run SSH command:
cat /etc/os-release
&{/usr/bin/ssh [/usr/bin/ssh -o PasswordAuthentication=no -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -i /Users/orangeudav/.docker/machine/machines/docker/id_rsa -p 22 docker@172.16.1.131 cat /etc/os-release] []  <nil> <nil> <nil> [] <nil> <nil> <nil> <nil> false [] [] [] [] <nil>}
<HAAAAAANGIIING HERE>
SSH cmd err, output: exit status 255:
Error getting SSH command: exit status 255
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://172.16.1.131:2376"
export DOCKER_CERT_PATH="/Users/orangeudav/.docker/machine/machines/docker"
export DOCKER_MACHINE_NAME="docker"
# Run this command to configure your shell:
# eval "$(docker-machine env docker)"

@chantra
Contributor

chantra commented Jul 14, 2015

I am able to reproduce this issue on Mac OS X with the virtualbox driver. It typically happens when the route to the private network (the one used by the docker CLI) is missing.
In my case, I suspect the route goes missing because I am using a VPN that tends to mess with private network routes. A fix is to run something along these lines:

sudo route add -net 192.168.99.0/24 -interface vboxnet6

Here 192.168.99.0/24 is the network range used by boot2docker, and vboxnet6 is the interface assigned to the boot2docker private network.
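
The network given to `route` has to contain the address the docker CLI connects to. A small check, assuming the addresses reported earlier in this thread; `covered_by` is a hypothetical helper name:

```python
import ipaddress


def covered_by(route_cidr: str, daemon_ip: str) -> bool:
    """Return True when daemon_ip lies inside the network route_cidr."""
    return ipaddress.ip_address(daemon_ip) in ipaddress.ip_network(route_cidr)


# Address from the original virtualbox report.
print(covered_by("192.168.99.0/24", "192.168.99.100"))  # True
# Address from the VMware Fusion report.
print(covered_by("192.168.99.0/24", "172.16.1.131"))    # False
```

The second result suggests that a route for 192.168.99.0/24 would not help the VMware Fusion report above, whatever its own cause is.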

@inkel

inkel commented Jul 15, 2015

I've tried @chantra's solution, but it didn't do the trick; docker-machine still hangs with any subcommand when run in a new shell. Interrupting with ^C and trying again works, though.

@nathanleclaire
Contributor

Hi all, for anyone encountering this issue: one possible cause is that the Docker daemon is not running.

Can you please try the following commands and paste their output (this is for VMs running boot2docker):

$ docker-machine -D ssh machinename sudo /etc/init.d/docker restart
...
$ docker-machine -D env machinename
...

Thanks!

@inkel

inkel commented Jul 15, 2015

@nathanleclaire when I try to run that command, it hangs here:

$ docker-machine -D ssh dev sudo /etc/init.d/docker restart
executing: /usr/local/bin/VBoxManage showvminfo dev --machinereadable
STDOUT: name="dev"
groups="/"
ostype="Linux 2.6 / 3.x (64-bit)"
UUID="19ac89e0-98cf-49e8-b306-ca58878604a3"
CfgFile="/Users/inkel/.docker/machine/machines/dev/dev/dev.vbox"
SnapFldr="/Users/inkel/.docker/machine/machines/dev/dev/Snapshots"
LogFldr="/Users/inkel/.docker/machine/machines/dev/dev/Logs"
hardwareuuid="19ac89e0-98cf-49e8-b306-ca58878604a3"
memory=2048
pagefusion="off"
vram=8
cpuexecutioncap=100
hpet="on"
chipset="piix3"
firmware="BIOS"
cpus=4
pae="on"
longmode="on"
synthcpu="off"
bootmenu="disabled"
boot1="dvd"
boot2="dvd"
boot3="disk"
boot4="none"
acpi="on"
ioapic="on"
biossystemtimeoffset=0
rtcuseutc="on"
hwvirtex="on"
nestedpaging="on"
largepages="on"
vtxvpid="on"
vtxux="on"
VMState="running"
VMStateChangeTime="2015-07-15T21:20:06.395000000"
monitorcount=1
accelerate3d="off"
accelerate2dvideo="off"
teleporterenabled="off"
teleporterport=0
teleporteraddress=""
teleporterpassword=""
tracing-enabled="off"
tracing-allow-vm-access="off"
tracing-config=""
autostart-enabled="off"
autostart-delay=0
defaultfrontend=""
storagecontrollername0="SATA"
storagecontrollertype0="IntelAhci"
storagecontrollerinstance0="0"
storagecontrollermaxportcount0="30"
storagecontrollerportcount0="30"
storagecontrollerbootable0="on"
"SATA-0-0"="/Users/inkel/.docker/machine/machines/dev/boot2docker.iso"
"SATA-ImageUUID-0-0"="c5d1d610-61ba-4416-8721-f65383bd9595"
"SATA-tempeject"="off"
"SATA-IsEjected"="off"
"SATA-1-0"="/Users/inkel/.docker/machine/machines/dev/disk.vmdk"
"SATA-ImageUUID-1-0"="9719a1db-dfaf-441e-8d8f-47ee0b18293b"
"SATA-2-0"="none"
"SATA-3-0"="none"
"SATA-4-0"="none"
"SATA-5-0"="none"
"SATA-6-0"="none"
"SATA-7-0"="none"
"SATA-8-0"="none"
"SATA-9-0"="none"
"SATA-10-0"="none"
"SATA-11-0"="none"
"SATA-12-0"="none"
"SATA-13-0"="none"
"SATA-14-0"="none"
"SATA-15-0"="none"
"SATA-16-0"="none"
"SATA-17-0"="none"
"SATA-18-0"="none"
"SATA-19-0"="none"
"SATA-20-0"="none"
"SATA-21-0"="none"
"SATA-22-0"="none"
"SATA-23-0"="none"
"SATA-24-0"="none"
"SATA-25-0"="none"
"SATA-26-0"="none"
"SATA-27-0"="none"
"SATA-28-0"="none"
"SATA-29-0"="none"
natnet1="nat"
macaddress1="08002743999C"
cableconnected1="on"
nic1="nat"
nictype1="virtio"
nicspeed1="0"
mtu="0"
sockSnd="64"
sockRcv="64"
tcpWndSnd="64"
tcpWndRcv="64"
Forwarding(0)="ssh,tcp,127.0.0.1,54973,,22"
hostonlyadapter2="vboxnet5"
macaddress2="080027A3AA51"
cableconnected2="on"
nic2="hostonly"
nictype2="82540EM"
nicspeed2="0"
nic3="none"
nic4="none"
nic5="none"
nic6="none"
nic7="none"
nic8="none"
hidpointing="ps2mouse"
hidkeyboard="ps2kbd"
uart1="off"
uart2="off"
lpt1="off"
lpt2="off"
audio="none"
clipboard="disabled"
draganddrop="disabled"
SessionType="headless"
VideoMode="720,400,0"@0,0
vrde="off"
usb="off"
ehci="off"
SharedFolderNameMachineMapping1="Users"
SharedFolderPathMachineMapping1="/Users"
VRDEActiveConnection="off"
VRDEClients=0
vcpenabled="off"
vcpscreens=0
vcpfile="/Users/inkel/.docker/machine/machines/dev/dev/dev.webm"
vcpwidth=1024
vcpheight=768
vcprate=512
vcpfps=25
GuestMemoryBalloon=0
GuestOSType="Linux26_64"
GuestAdditionsRunLevel=1
GuestAdditionsVersion="4.3.20 r96996"
GuestAdditionsFacility_VirtualBox Base Driver=50,1436995222662
GuestAdditionsFacility_Seamless Mode=0,1436995222662
GuestAdditionsFacility_Graphics Mode=0,1436995222662

STDERR:
Using SSH client type: external
About to run SSH command:
sudo /etc/init.d/docker restart
&{/usr/bin/ssh [/usr/bin/ssh -o PasswordAuthentication=no -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -i /Users/inkel/.docker/machine/machines/dev/id_rsa -p 54973 docker@localhost sudo /etc/init.d/docker restart] []  <nil> <nil> <nil> [] <nil> <nil> <nil> ?reflect.Value? false [] [] [] [] <nil>}

Once it finishes (if it ever does) I'll paste what's next.

@nathanleclaire
Contributor

@inkel Hm, it should run pretty much right away if it's going to succeed, so if it takes more than a few seconds something's wrong.

@inkel

inkel commented Jul 15, 2015

It just finished with:

SSH cmd err, output: <nil>:

It definitely took more than a few seconds. I'm running it again with time to see how long.

@inkel

inkel commented Jul 15, 2015

$ docker-machine -D ssh dev sudo /etc/init.d/docker restart
…same long output…
real    5m1.197s
user    0m0.045s
sys     0m0.035s
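
The `time` figures show that almost none of the five minutes was CPU time, so the command was waiting rather than working. A quick calculation, with `to_seconds` as a hypothetical helper for the `XmY.YYYs` format:

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell `time` value such as '5m1.197s' to seconds."""
    match = re.fullmatch(r"(\d+)m([\d.]+)s", stamp)
    if match is None:
        raise ValueError(f"unexpected time format: {stamp!r}")
    return int(match.group(1)) * 60 + float(match.group(2))


real = to_seconds("5m1.197s")
cpu = to_seconds("0m0.045s") + to_seconds("0m0.035s")  # user + sys
print(f"wall {real:.3f}s, cpu {cpu:.3f}s, waiting {100 * (1 - cpu / real):.2f}%")
# prints: wall 301.197s, cpu 0.080s, waiting 99.97%
```

The wall-clock time is just over 300 seconds, the same figure as the `schedule exit in 300 seconds` line in the ssh debug output further down. That may point at the persistent SSH control connection, though the numbers alone do not prove it.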

@nathanleclaire
Contributor

If you run:

$ /usr/bin/ssh -vvv -o PasswordAuthentication=no -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -i /Users/inkel/.docker/machine/machines/dev/id_rsa -p 54973 docker@localhost sudo /etc/init.d/docker restart

from the CLI on its own, what do you get?
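
The command above is the argument list from the `&{/usr/bin/ssh [...] ...}` debug line with `-vvv` added. The extraction can be sketched as below; the regular expression assumes the line layout seen in this thread, and `ssh_command_from_debug` is a hypothetical helper:

```python
import re


def ssh_command_from_debug(line: str, extra: str = "-vvv") -> str:
    """Rebuild a shell command from a '&{program [argv...] [] ...}' debug line."""
    match = re.match(r"&\{(\S+) \[(.*?)\] \[", line)
    if match is None:
        raise ValueError("not a recognised debug line")
    program, argv = match.group(1), match.group(2)
    # argv repeats the program as its first element; drop it before splicing.
    rest = argv[len(program):].lstrip() if argv.startswith(program) else argv
    return f"{program} {extra} {rest}"


# Shortened from the debug line earlier in the thread.
sample = ("&{/usr/bin/ssh [/usr/bin/ssh -p 54973 docker@localhost "
          "sudo /etc/init.d/docker restart] []  <nil> <nil>}")
print(ssh_command_from_debug(sample))
```

Because the debug line joins arguments with spaces, an argument that itself contained spaces can no longer be told apart. The result is therefore meant for pasting into a shell and checking by eye, not for automated re-execution.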

@inkel

inkel commented Jul 15, 2015

inkel@miralejos ~
$ /usr/bin/ssh -vvv -o PasswordAuthentication=no -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -i /Users/inkel/.docker/machine/machines/dev/id_rsa -p 54973 docker@localhost sudo /etc/init.d/docker restart 2>&1 | pbcopy
debug1: multiplexing control connection
debug3: fd 7 is O_NONBLOCK
debug3: fd 7 is O_NONBLOCK
debug1: channel 1: new [mux-control]
debug3: channel_post_mux_listener: new mux channel 1 fd 7
debug3: mux_master_read_cb: channel 1: hello sent
debug2: set_control_persist_exit_time: cancel scheduled exit
debug3: mux_master_read_cb: channel 1 packet type 0x00000001 len 4
debug2: process_mux_master_hello: channel 1 slave version 4
debug3: mux_master_read_cb: channel 1 packet type 0x10000004 len 4
debug2: process_mux_alive_check: channel 1: alive check
debug3: mux_master_read_cb: channel 1 packet type 0x10000002 len 92
debug2: process_mux_new_session: channel 1: request tty 0, X 1, agent 0, subsys 0, term "xterm", cmd "sudo /etc/init.d/docker restart", env 1
debug3: process_mux_new_session: got fds stdin 8, stdout 9, stderr 10
debug2: fd 9 setting O_NONBLOCK
debug3: fd 10 is O_NONBLOCK
debug1: channel 2: new [client-session]
debug2: process_mux_new_session: channel_new: 2 linked to control channel 1
debug2: channel 2: send open
debug2: callback start
debug2: client_session2_setup: id 2
debug1: Sending environment.
debug1: Sending env LANG = en_US.UTF-8
debug2: channel 2: request env confirm 0
debug1: Sending command: sudo /etc/init.d/docker restart
debug2: channel 2: request exec confirm 1
debug3: mux_session_confirm: sending success reply
debug2: callback done
debug2: channel 2: open confirm rwindow 0 rmax 32768
debug2: channel 2: rcvd adjust 2097152
debug2: channel_input_status_confirm: type 99 id 2
debug2: exec request accepted on channel 2
debug1: client_input_channel_req: channel 2 rtype exit-status reply 0
debug3: mux_exit_message: channel 2: exit message, evitval 0
debug1: client_input_channel_req: channel 2 rtype eow@openssh.com reply 0
debug2: channel 2: rcvd eow
debug2: channel 2: close_read
debug2: channel 2: input open -> closed
debug2: channel 2: rcvd eof
debug2: channel 2: output open -> drain
debug2: channel 2: obuf empty
debug2: channel 2: close_write
debug2: channel 2: output drain -> closed
debug2: channel 2: rcvd close
debug3: channel 2: will not send data after close
debug2: channel 2: send close
debug2: channel 2: is dead
debug2: channel 2: gc: notify user
debug3: mux_master_session_cleanup_cb: entering for channel 2
debug2: channel 1: rcvd close
debug2: channel 1: output open -> drain
debug2: channel 1: close_read
debug2: channel 1: input open -> closed
debug2: channel 2: gc: user detached
debug2: channel 2: is dead
debug2: channel 2: garbage collecting
debug1: channel 2: free: client-session, nchannels 3
debug3: channel 2: status: The following connections are open:
  #2 client-session (t4 r0 i3/0 o3/0 fd -1/-1 cc -1)

debug2: channel 1: obuf empty
debug2: channel 1: close_write
debug2: channel 1: output drain -> closed
debug2: channel 1: is dead (local)
debug2: channel 1: gc: notify user
debug3: mux_master_control_cleanup_cb: entering for channel 1
debug2: channel 1: gc: user detached
debug2: channel 1: is dead (local)
debug2: channel 1: garbage collecting
debug1: channel 1: free: mux-control, nchannels 2
debug3: channel 1: status: The following connections are open:

debug2: set_control_persist_exit_time: schedule exit in 300 seconds
inkel@miralejos ~
$

@inkel

inkel commented Jul 15, 2015

I think I made a mistake in my previous comment; this is the right output:

$ /usr/bin/ssh -vvv -o PasswordAuthentication=no -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -i /Users/inkel/.docker/machine/machines/dev/id_rsa -p 54973 docker@localhost sudo /etc/init.d/docker restart 2>&1
OpenSSH_6.2p2, OSSLShim 0.9.8r 8 Dec 2011
debug1: Reading configuration data /Users/inkel/.ssh/config
debug1: /Users/inkel/.ssh/config line 3: Applying options for *
debug1: Reading configuration data /etc/ssh_config
debug1: /etc/ssh_config line 20: Applying options for *
debug1: auto-mux: Trying existing master
debug1: Control socket "/Users/inkel/.ssh/master-docker@localhost:54973" does not exist
debug2: ssh_connect: needpriv 0
debug1: Connecting to localhost [::1] port 54973.
debug2: fd 5 setting O_NONBLOCK
debug1: connect to address ::1 port 54973: Connection refused
debug1: Connecting to localhost [127.0.0.1] port 54973.
debug2: fd 5 setting O_NONBLOCK
debug1: fd 5 clearing O_NONBLOCK
debug1: Connection established.
debug3: timeout: 10000 ms remain after connect
debug3: Incorrect RSA1 identifier
debug3: Could not load "/Users/inkel/.docker/machine/machines/dev/id_rsa" as a RSA1 public key
debug1: identity file /Users/inkel/.docker/machine/machines/dev/id_rsa type 1
debug1: identity file /Users/inkel/.docker/machine/machines/dev/id_rsa-cert type -1
debug1: Enabling compatibility mode for protocol 2.0
debug1: Local version string SSH-2.0-OpenSSH_6.2
debug1: Remote protocol version 2.0, remote software version OpenSSH_6.0
debug1: match: OpenSSH_6.0 pat OpenSSH*
debug2: fd 5 setting O_NONBLOCK
debug3: put_host_port: [localhost]:54973
debug3: load_hostkeys: loading entries for host "[localhost]:54973" from file "/dev/null"
debug3: load_hostkeys: loaded 0 keys
debug1: SSH2_MSG_KEXINIT sent
debug1: SSH2_MSG_KEXINIT received
debug2: kex_parse_kexinit: diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
debug2: kex_parse_kexinit: ssh-rsa-cert-v01@openssh.com,ssh-dss-cert-v01@openssh.com,ssh-rsa-cert-v00@openssh.com,ssh-dss-cert-v00@openssh.com,ssh-rsa,ssh-dss
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-gcm@openssh.com,aes256-gcm@openssh.com,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-gcm@openssh.com,aes256-gcm@openssh.com,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se
debug2: kex_parse_kexinit: hmac-md5-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-ripemd160-etm@openssh.com,hmac-sha1-96-etm@openssh.com,hmac-md5-96-etm@openssh.com,hmac-md5,hmac-sha1,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: hmac-md5-etm@openssh.com,hmac-sha1-etm@openssh.com,umac-64-etm@openssh.com,umac-128-etm@openssh.com,hmac-sha2-256-etm@openssh.com,hmac-sha2-512-etm@openssh.com,hmac-ripemd160-etm@openssh.com,hmac-sha1-96-etm@openssh.com,hmac-md5-96-etm@openssh.com,hmac-md5,hmac-sha1,umac-64@openssh.com,umac-128@openssh.com,hmac-sha2-256,hmac-sha2-512,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: none,zlib@openssh.com,zlib
debug2: kex_parse_kexinit: none,zlib@openssh.com,zlib
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit: first_kex_follows 0
debug2: kex_parse_kexinit: reserved 0
debug2: kex_parse_kexinit: ecdh-sha2-nistp256,ecdh-sha2-nistp384,ecdh-sha2-nistp521,diffie-hellman-group-exchange-sha256,diffie-hellman-group-exchange-sha1,diffie-hellman-group14-sha1,diffie-hellman-group1-sha1
debug2: kex_parse_kexinit: ssh-rsa,ssh-dss,ecdsa-sha2-nistp256
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se
debug2: kex_parse_kexinit: aes128-ctr,aes192-ctr,aes256-ctr,arcfour256,arcfour128,aes128-cbc,3des-cbc,blowfish-cbc,cast128-cbc,aes192-cbc,aes256-cbc,arcfour,rijndael-cbc@lysator.liu.se
debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,umac-64@openssh.com,hmac-sha2-256,hmac-sha2-256-96,hmac-sha2-512,hmac-sha2-512-96,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: hmac-md5,hmac-sha1,umac-64@openssh.com,hmac-sha2-256,hmac-sha2-256-96,hmac-sha2-512,hmac-sha2-512-96,hmac-ripemd160,hmac-ripemd160@openssh.com,hmac-sha1-96,hmac-md5-96
debug2: kex_parse_kexinit: none,zlib@openssh.com
debug2: kex_parse_kexinit: none,zlib@openssh.com
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit:
debug2: kex_parse_kexinit: first_kex_follows 0
debug2: kex_parse_kexinit: reserved 0
debug2: mac_setup: found hmac-md5
debug1: kex: server->client aes128-ctr hmac-md5 none
debug2: mac_setup: found hmac-md5
debug1: kex: client->server aes128-ctr hmac-md5 none
debug1: SSH2_MSG_KEX_DH_GEX_REQUEST(1024<1024<8192) sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_GROUP
debug2: dh_gen_key: priv key bits set: 124/256
debug2: bits set: 526/1024
debug1: SSH2_MSG_KEX_DH_GEX_INIT sent
debug1: expecting SSH2_MSG_KEX_DH_GEX_REPLY
debug1: Server host key: RSA 4c:94:a7:2d:b9:b3:b5:a2:1d:37:89:c8:84:d9:ed:bf
debug3: put_host_port: [127.0.0.1]:54973
debug3: put_host_port: [localhost]:54973
debug3: load_hostkeys: loading entries for host "[localhost]:54973" from file "/dev/null"
debug3: load_hostkeys: loaded 0 keys
debug1: checking without port identifier
debug3: load_hostkeys: loading entries for host "localhost" from file "/dev/null"
debug3: load_hostkeys: loaded 0 keys
Warning: Permanently added '[localhost]:54973' (RSA) to the list of known hosts.
debug2: bits set: 519/1024
debug1: ssh_rsa_verify: signature correct
debug2: kex_derive_keys
debug2: set_newkeys: mode 1
debug1: SSH2_MSG_NEWKEYS sent
debug1: expecting SSH2_MSG_NEWKEYS
debug2: set_newkeys: mode 0
debug1: SSH2_MSG_NEWKEYS received
debug1: Roaming not allowed by server
debug1: SSH2_MSG_SERVICE_REQUEST sent
debug2: service_accept: ssh-userauth
debug1: SSH2_MSG_SERVICE_ACCEPT received
debug2: key: /Users/inkel/.docker/machine/machines/dev/id_rsa (0x7feeca600150), explicit
debug1: Authentications that can continue: publickey,password,keyboard-interactive
debug3: start over, passed a different list publickey,password,keyboard-interactive
debug3: preferred publickey,keyboard-interactive
debug3: authmethod_lookup publickey
debug3: remaining preferred: keyboard-interactive
debug3: authmethod_is_enabled publickey
debug1: Next authentication method: publickey
debug1: Offering RSA public key: /Users/inkel/.docker/machine/machines/dev/id_rsa
debug3: send_pubkey_test
debug2: we sent a publickey packet, wait for reply
debug1: Server accepts key: pkalg ssh-rsa blen 279
debug2: input_userauth_pk_ok: fp 8b:f1:19:49:34:18:61:8c:ba:cd:a5:65:99:aa:ce:ea
debug3: sign_and_send_pubkey: RSA 8b:f1:19:49:34:18:61:8c:ba:cd:a5:65:99:aa:ce:ea
debug1: read PEM private key done: type RSA
debug1: Authentication succeeded (publickey).
Authenticated to localhost ([127.0.0.1]:54973).
debug1: setting up multiplex master socket
debug3: muxserver_listen: temporary control path /Users/inkel/.ssh/master-docker@localhost:54973.bGmwLqk9AzPAnw76
debug2: fd 6 setting O_NONBLOCK
debug3: fd 6 is O_NONBLOCK
debug3: fd 6 is O_NONBLOCK
debug1: channel 0: new [/Users/inkel/.ssh/master-docker@localhost:54973]
debug3: muxserver_listen: mux listener channel 0 fd 6
debug1: control_persist_detach: backgrounding master process
debug2: control_persist_detach: background process is 15146
debug2: fd 6 setting O_NONBLOCK
debug1: forking to background
debug1: Entering interactive session.
debug2: set_control_persist_exit_time: schedule exit in 300 seconds
debug1: multiplexing control connection
debug3: fd 7 is O_NONBLOCK
debug3: fd 7 is O_NONBLOCK
debug1: channel 1: new [mux-control]
debug3: channel_post_mux_listener: new mux channel 1 fd 7
debug3: mux_master_read_cb: channel 1: hello sent
debug2: set_control_persist_exit_time: cancel scheduled exit
debug3: mux_master_read_cb: channel 1 packet type 0x00000001 len 4
debug2: process_mux_master_hello: channel 1 slave version 4
debug2: mux_client_hello_exchange: master version 4
debug3: mux_client_forwards: request forwardings: 0 local, 0 remote
debug3: mux_client_request_session: entering
debug3: mux_client_request_alive: entering
debug3: mux_master_read_cb: channel 1 packet type 0x10000004 len 4
debug2: process_mux_alive_check: channel 1: alive check
debug3: mux_client_request_alive: done pid = 15147
debug3: mux_client_request_session: session request sent
debug3: mux_master_read_cb: channel 1 packet type 0x10000002 len 92
debug2: process_mux_new_session: channel 1: request tty 0, X 1, agent 0, subsys 0, term "xterm", cmd "sudo /etc/init.d/docker restart", env 1
debug3: process_mux_new_session: got fds stdin 8, stdout 9, stderr 10
debug1: channel 2: new [client-session]
debug2: process_mux_new_session: channel_new: 2 linked to control channel 1
debug2: channel 2: send open
debug2: callback start
debug2: client_session2_setup: id 2
debug2: fd 5 setting TCP_NODELAY
debug3: packet_set_tos: set IP_TOS 0x08
debug1: Sending environment.
debug1: Sending env LANG = en_US.UTF-8
debug2: channel 2: request env confirm 0
debug1: Sending command: sudo /etc/init.d/docker restart
debug2: channel 2: request exec confirm 1
debug3: mux_session_confirm: sending success reply
debug2: callback done
debug2: channel 2: open confirm rwindow 0 rmax 32768
debug1: mux_client_request_session: master session id: 2
debug2: channel 2: rcvd adjust 2097152
debug2: channel_input_status_confirm: type 99 id 2
debug2: exec request accepted on channel 2
debug1: client_input_channel_req: channel 2 rtype exit-status reply 0
debug3: mux_exit_message: channel 2: exit message, evitval 0
debug1: client_input_channel_req: channel 2 rtype eow@openssh.com reply 0
debug2: channel 2: rcvd eow
debug2: channel 2: close_read
debug2: channel 2: input open -> closed
debug2: channel 2: rcvd eof
debug2: channel 2: output open -> drain
debug2: channel 2: obuf empty
debug2: channel 2: close_write
debug2: channel 2: output drain -> closed
debug2: channel 2: rcvd close
debug3: channel 2: will not send data after close
debug2: channel 2: send close
debug2: channel 2: is dead
debug2: channel 2: gc: notify user
debug3: mux_master_session_cleanup_cb: entering for channel 2
debug2: channel 1: rcvd close
debug2: channel 1: output open -> drain
debug2: channel 1: close_read
debug2: channel 1: input open -> closed
debug2: channel 2: gc: user detached
debug2: channel 2: is dead
debug2: channel 2: garbage collecting
debug1: channel 2: free: client-session, nchannels 3
debug3: channel 2: status: The following connections are open:
  #2 client-session (t4 r0 i3/0 o3/0 fd -1/-1 cc -1)

debug2: channel 1: obuf empty
debug2: channel 1: close_write
debug2: channel 1: output drain -> closed
debug2: channel 1: is dead (local)
debug2: channel 1: gc: notify user
debug3: mux_master_control_cleanup_cb: entering for channel 1
debug2: channel 1: gc: user detached
debug2: channel 1: is dead (local)
debug2: channel 1: garbage collecting
debug1: channel 1: free: mux-control, nchannels 2
debug3: channel 1: status: The following connections are open:

debug2: set_control_persist_exit_time: schedule exit in 300 seconds
debug3: mux_client_read_packet: read header failed: Broken pipe
debug2: Received exit status from master 0

@inkel

inkel commented Jul 15, 2015

Mmm... after a few minutes, the following appeared on my screen:

debug1: ControlPersist timeout expired
debug1: channel 0: free: /Users/inkel/.ssh/master-docker@localhost:54973, nchannels 1
debug3: channel 0: status: The following connections are open:

debug3: fd 0 is not O_NONBLOCK
debug3: fd 1 is not O_NONBLOCK
Transferred: sent 3104, received 2480 bytes, in 301.1 seconds
Bytes per second: sent 10.3, received 8.2
debug1: Exit status -1

@nathanleclaire
Contributor

@inkel Odd. Do you have a VPN or any unusual setup as far as your network goes that you can think of?

@inkel

inkel commented Jul 15, 2015

No, not that I can think of. A few days ago I was still using 0.1.0 and it worked perfectly; then I updated to 0.3.0 and it started to behave oddly, as it does now. I have another coworker, @fsaravia, who's suffering from the same issue.

@inkel

inkel commented Jul 16, 2015

@nathanleclaire after talking with my colleague I've found that I was using the following version, 'cause the one in homebrew was giving me this error:

$ docker-machine -v
docker-machine version 0.3.0 (0a251fe)

He is using the latest one, so I deleted that version, ran brew install docker-machine, and now it seems to work 😸 Sorry for the trouble.

@nathanleclaire
Contributor

@inkel Nice, glad you got it working.

@pedroxs

pedroxs commented Jul 16, 2015

I just tried again with the most recent version of docker-machine, with no success. I am running on Windows 7 with Git shell. All steps and respective outputs are here.

The out.log file contains the main steps. Then there is a docker-machine-create.log and a docker-machine-env.log

After manually copying the cert files from /var/lib/boot2docker/tls/ to /c/Users/Pedro/.docker/machine/machines/test/, I tried again; the resulting log is docker-machine-env-2.log.

Even with the exported environment variables at the end, I am still not able to connect:
docker ps exited with a timeout error.

The VM did start and the docker service was running.
I can ssh into the machine, and the docker service is listening on the correct port (2376):

docker@test:~$ netstat -an
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
tcp        0      0 10.0.2.15:22            10.0.2.2:57901          ESTABLISHED
tcp        0      0 :::2376                 :::*                    LISTEN
tcp        0      0 :::22                   :::*                    LISTEN
Active UNIX domain sockets (servers and established)
Proto RefCnt Flags       Type       State         I-Node Path
unix  2      [ ACC ]     SEQPACKET  LISTENING      14092 /run/udev/control
unix  2      [ ACC ]     STREAM     LISTENING      17142 /var/run/acpid.socket
unix  2      [ ACC ]     STREAM     LISTENING      17656 /var/run/docker.sock
unix  3      [ ]         DGRAM                     14101
unix  3      [ ]         STREAM     CONNECTED      22738
unix  3      [ ]         STREAM     CONNECTED      22737
unix  3      [ ]         DGRAM                     14100

The interesting thing is that boot2docker works just fine.
Any more thoughts?
Thanks.

@chouclee
Author

This issue is really hard to reproduce... After deleting all vmboxnet_x except vmboxnet0 and vmboxnet1, and rebooting the computer (rather than docker-machine), my issue was gone.

@inkel

inkel commented Jul 20, 2015

After putting my computer to sleep a couple of times and then trying again, I've started to have the same issue. When I have a moment to fully restart it, I'll try again and see what happens.

@garystafford

I have the latest versions of all Docker apps. I have gotten the issue on and off for weeks with Docker Machine on Ubuntu Linux building VirtualBox VMs. I just added Docker Swarm and the problem got several times worse (more frequent). I have spent two hours trying to create three swarm machines. Rebooting usually fixes it, but that is not reasonable. Using docker-machine's --debug you can see where it's hanging, as other commenters have pointed out. I see it hang in a few spots with Daemon not responding yet: dial tcp 192.168.99.XXX:2376: no route to host, but the timeouts always occur at this point:

STDERR: 
Using SSH client type: external
About to run SSH command:
ip addr show dev eth1
&{/usr/bin/ssh [/usr/bin/ssh -o PasswordAuthentication=no -o IdentitiesOnly=yes -o StrictHostKeyChecking=no -o UserKnownHostsFile=/dev/null -o LogLevel=quiet -o ConnectionAttempts=3 -o ConnectTimeout=10 -i /home/gstafford/.docker/machine/machines/swarm-node-02/id_rsa -p 58710 docker@localhost ip addr show dev eth1] []  <nil> <nil> <nil> [] <nil> <nil> <nil> ?reflect.Value? false [] [] [] [] <nil>}
SSH cmd err, output: <nil>: 4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:9d:c5:3d brd ff:ff:ff:ff:ff:ff
    inet 192.168.99.105/24 brd 192.168.99.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe9d:c53d/64 scope link 
       valid_lft forever preferred_lft forever

SSH returned: 4: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 08:00:27:9d:c5:3d brd ff:ff:ff:ff:ff:ff
    inet 192.168.99.105/24 brd 192.168.99.255 scope global eth1
       valid_lft forever preferred_lft forever
    inet6 fe80::a00:27ff:fe9d:c53d/64 scope link 
       valid_lft forever preferred_lft forever

END SSH

Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Daemon not responding yet: dial tcp 192.168.99.105:2376: no route to host
Error creating machine: Maximum number of retries (60) exceeded
You will want to check the provider to make sure the machine and associated resources were properly removed.

@tehmaspc

@garystafford what does your user's ssh config look like? see: #1591

@tjrivera

@chantra's solution worked for me. Indeed, the cause was my routes being clobbered by VPN software.

@ChrisRut

I just ran into this too; @chantra's solution worked for me as well. Cisco AnyConnect 😡

@jmeickle

Also Cisco AnyConnect. I don't suppose you could add a check that recommends fixing the routing table if it's been mangled by software like this? It's VERY frustrating to diagnose; I was still able to ssh into the docker machine, which booted successfully, and had to look at many tickets before I found the discussion in this one.

@nathanleclaire
Contributor

I don't suppose you could add a check that recommends fixing the routing table if it's been mangled by software like this?

This is actually the exact solution I'm considering right now. In many cases, it seems that we can actually successfully finish the creation of the instance and get an IP address, it just happens to not be reachable.
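
A minimal sketch of what such a check could look like, written as a shell function. The function name `diagnose_dial_error` and the hint labels are hypothetical and not part of docker-machine; only the "no route to host" string comes from the debug output quoted earlier in this thread, and the other patterns are assumed examples of common dial errors.

```shell
# diagnose_dial_error ERROR_TEXT
# Maps the text of a failed connection attempt to a coarse hint.
# The hint labels are illustrative only.
diagnose_dial_error() {
  case "$1" in
    *"no route to host"*)
      # The machine has an IP, but the host cannot route to it. This is
      # the symptom reported here after VPN software changed the routes.
      echo "routing" ;;
    *"connection refused"*)
      # The host is reachable, but nothing is listening on the port yet
      # (assumption: the daemon is still starting).
      echo "daemon-not-listening" ;;
    *"timeout"*|*"timed out"*)
      echo "timeout" ;;
    *)
      echo "unknown" ;;
  esac
}
```

For example, `diagnose_dial_error "dial tcp 192.168.99.105:2376: no route to host"` prints `routing`, which a wrapper script could turn into a message suggesting that the host's routing table be inspected.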

@Mellbourn

I also have the problem of unreachable docker after sleep.

How would you do @chantra's solution in Windows 10?

@iesen

iesen commented Sep 21, 2015

I also experienced this issue after several sleeps.
Restarting fixed the issue.

@sameerr25

This happened to me too: when I have docker-machine running and then get on a VPN, the command just hangs forever.
However, I got off the VPN, ran docker-machine restart <MACHINE-NAME>, and it started working again.

@ronen

ronen commented Sep 23, 2015

This happened to me too when I have docker-machine running and then I get on a VPN, command just hangs forever.
However, I got off the VPN did machine restart docker-machine restart and it started working again.

This just happened to me too: I had docker-machine running, got on VPN, got off VPN, and docker-machine env <MACHINE-NAME> hangs.

However, I did docker-machine restart <MACHINE-NAME> and it didn't help -- docker-machine env <MACHINE-NAME> is still hanging.

@canonical-zz

I'm having this problem too on Mac OS 10.10.4 with the Cisco AnyConnect VPN.

  • Fresh boot, docker works fine
  • go on the VPN, back off
  • docker-machine env default just hangs

Tried docker-machine restart default, problem persists.

@chantra's solution seems to work for me. On my machine the interface was vboxnet0 (found using ifconfig from the Mac terminal). Way easier than rebooting every time I need to go on the VPN.
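
As a rough illustration of the "found using ifconfig" step, the helper below picks out the interface whose IPv4 address starts with a given prefix. The name `iface_for_prefix` is hypothetical, and the parsing assumes BSD/macOS-style ifconfig output (interface name at the start of a line, followed by indented `inet <address>` lines); other formats would need a different pattern.

```shell
# iface_for_prefix PREFIX
# Reads ifconfig-style text on stdin and prints the name of each interface
# that has an IPv4 address beginning with PREFIX (for example "192.168.99.").
iface_for_prefix() {
  awk -v prefix="$1" '
    # A line starting in column one begins a new interface block.
    /^[A-Za-z0-9]/ { iface = $1; sub(/:$/, "", iface) }
    # Indented "inet <addr>" lines belong to the current interface.
    $1 == "inet" && index($2, prefix) == 1 { print iface }
  '
}
```

It would be used as `ifconfig | iface_for_prefix 192.168.99.`, where the prefix is taken from the machine's address as shown by docker-machine ip.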

@chantra
Contributor

chantra commented Sep 24, 2015

Not sure whether I mentioned which VPN I was using, but FWIW, I was also using Cisco AnyConnect.


@ronen

ronen commented Sep 24, 2015

FWIW I was also using Cisco AnyConnect VPN, on OS X 10.10.5. I just tried openconnect instead and it seems to avoid the problem.

@ChrisRut

@ronen , Thanks for the tip about openconnect, I never knew that existed. GoodBye AnyConnect 😀

@ronen

ronen commented Sep 25, 2015

@ChrisRut yeah I just learned about openconnect when trying to solve this problem. GoodBye AnyConnect indeed! I found openconnect to be a bit cumbersome to use though, so I threw together a quick wrapper that lets you just type "vpn up" and "vpn down". It's at https://gist.github.com/ronen/7d486adbde5d6bfd2472 if you're interested

@blaggacao
Contributor

As it seems we have multiple issues with similar symptoms, I opened #1934 for those who experience problems on Windows 10 (or maybe other versions) hosts after sleep. In order to separate concerns, I kindly suggest relating this issue's title to VPN network issues.

@dracan

dracan commented Oct 9, 2015

I get this issue after my host machine has gone to sleep, and I come back to it later. I end up deleting everything from my c:\users\dan.docker\machine folder (other than the cache), then recreate it all again. It's quicker doing that than rebooting the host machine! ;)

@dracan

dracan commented Oct 9, 2015

Oh, and I also have to kill some processes before doing that. I tend to have 3 'VBoxHeadless.exe' processes, 3 'VBoxNetDHCP.exe', and a 'VirtualBox Interface'.

@mrumpf

mrumpf commented Oct 14, 2015

I woke up my notebook and deleted a docker-machine instance with name "dev".
I ran the following commands to create two new virtual machines:

$ docker-machine create \
    --driver virtualbox \
    --engine-env HTTP_PROXY=http://10.206.246.20:8080 \
    --engine-env HTTPS_PROXY=http://10.206.246.20:8080 \
    --virtualbox-hostonly-cidr "169.254.0.20/16" \
    registry  
$ eval $(docker-machine env registry --shell=bash)
$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS               NAMES
$ docker run -d -p 5000:5000 --restart=always --name registry registry
$ docker ps
CONTAINER ID        IMAGE               COMMAND             CREATED             STATUS              PORTS                    NAMES
7001a595cc28        registry            "docker-registry"   27 seconds ago      Up 24 seconds       0.0.0.0:5000->5000/tcp   registry
$ docker-machine ip registry
169.254.0.100
$ docker-machine create \
    --driver virtualbox \
    --engine-env HTTP_PROXY=http://10.206.246.20:8080 \
    --engine-env HTTPS_PROXY=http://10.206.246.20:8080 \
    --engine-insecure-registry "$(docker-machine env registry):5000" \
    --virtualbox-hostonly-cidr "169.254.0.20/16" \
    dev
$ docker-machine ip dev
169.254.0.101
$ eval $(docker-machine env dev --shell=bash)

The docker-machine env dev --shell=bash hangs for some minutes and dumps the following to stderr:

C:\Program Files\ConEmu>docker-machine -D env dev --shell=bash > c:\tmp\docker-machine_env.log
Maximum number of retries (60) exceeded

The stdout log output can be found in this Gist.

The strange thing is that the correct output is dumped after the maximum number of retries is reached, see the end of the log.

This is the timed stdout output:

$ time docker-machine env dev --shell=bash
Maximum number of retries (60) exceeded
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://169.254.0.101:2376"
export DOCKER_CERT_PATH="C:\Users\mrumpf\.docker\machine\machines\dev"
export DOCKER_MACHINE_NAME="dev"
# Run this command to configure your shell:
# eval "$(C:\cygwin64\home\mrumpf\bin\docker-machine.exe env dev)"

real    4m8.802s
user    0m0.015s
sys     0m0.078s

And it keeps getting stranger...
The same command for the first virtual machine is executed without any issue:

$ time docker-machine env registry --shell=bash
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://169.254.0.100:2376"
export DOCKER_CERT_PATH="C:\Users\mrumpf\.docker\machine\machines\registry"
export DOCKER_MACHINE_NAME="registry"
# Run this command to configure your shell:
# eval "$(C:\cygwin64\home\mrumpf\bin\docker-machine.exe env registry)"

real    0m2.129s
user    0m0.000s
sys     0m0.093s

@nathanleclaire
Contributor

@mrumpf If you add a trailing slash to the *PROXY options e.g. --engine-env HTTP_PROXY=http://10.206.246.20:8080/, does it succeed or continue failing?

@remram44

I'm getting this right now on Debian 8.2.

@jrep

jrep commented Oct 29, 2015

Add Juniper Junos Pulse to the list of VPN software that seems to cause this problem.

It strikes me as likely that any VPN would do this, since they all share the same fundamental purpose: overriding "ordinary" routing.

Update: by using a "split connection" (an option that my installation allows by connecting to a different VPN server), I seem to avoid this.

@remram44

Wait what? I'm not using any kind of VPN or proxy...

@jrep

jrep commented Oct 29, 2015

VPN is just one of many things that mess with network configuration. More generally, any change in network connections might (and that includes some things you might not think of as "networking"):

  • VPN enable/disable
  • Wired LAN connect/disconnect
  • WiFi enable / disable / zone change (like carrying a laptop to another area of the building)
  • Bluetooth device connect/disconnect (because of Bluetooth PAN)
  • even USB device connect/disconnect/on/off

The VPN case is noteworthy because it may offer the "split" option, which may avoid flummoxing the Docker connection from this cause. But split VPN isn't going to prevent all those other things from messing things up.

@remram44

I'm using an old-fashioned RJ45 cable...

I don't understand why docker-machine is that sensitive to these events.

@dgageot
Member

dgageot commented Oct 29, 2015

I tried rj45 plug/unplug, wifi switching. No issue on my side.

@jrep

jrep commented Nov 3, 2015

Could there be a need for some processing time? I keep running into this with a shell function that does:

docker-machine stop "${VM}"
docker-machine start "${VM}"
docker-machine ssh "${VM}" sudo /etc/init.d/docker restart
eval "$(docker-machine env "${VM}")"

When I run these commands one at a time, with my fingers, it goes fine. But when run from the script, the same command sequence works once, then hangs (or takes several minutes).

Relating this to @mrumpf's report: this VM is the second in my docker-machine ls list.

@nathanleclaire
Contributor

@jrep Ah, in that case, yes, you definitely need to wait a brief interval for the daemon to start up and begin accepting requests. It's why Machine has code to wait for Docker in between our daemon restarts during provisioning. Arguably, we should check for that on start as well.
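
A rough sketch of that waiting step for scripts like the one above, assuming bash. The helpers `retry` and `port_open` are hypothetical names, not docker-machine features, and the retry count and delay are arbitrary; port 2376 is the daemon port seen in the outputs in this thread.

```shell
# retry MAX DELAY CMD...
# Runs CMD until it succeeds, at most MAX times, sleeping DELAY seconds
# between attempts. Returns 0 on success, 1 if every attempt failed.
retry() {
  local max="$1" delay="$2" attempt=1
  shift 2
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
  return 0
}

# port_open HOST PORT
# Succeeds if a TCP connection to HOST:PORT can be opened. Relies on
# bash's /dev/tcp redirection, so it needs bash rather than plain sh.
port_open() {
  (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null
}

# Usage sketch (VM holds the machine name):
#   docker-machine start "${VM}"
#   retry 30 2 port_open "$(docker-machine ip "${VM}")" 2376 &&
#     eval "$(docker-machine env "${VM}")"
```

One caveat: when the address is unroutable rather than merely not listening yet, a single connection attempt may itself block for a while before failing, so the total wait can be longer than MAX times DELAY.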

@nathanleclaire
Contributor

This issue is very long and contains a lot of digressions. If you continue to encounter similar problems, please open a new issue at https://github.com/docker/machine/issues/new with detailed information including:

  • Which OS you are on
  • Anything atypical in networking configuration (VPN, proxy, SSH configuration settings, etc.),
  • Output of the misbehaving commands with the --debug flag
  • the VirtualBox logs from ~/.docker/machine/machines/name/name.

Thanks!

  • N
