This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

[proxy] intermittent failure in entry point test #812

Closed
rade opened this issue Jun 2, 2015 · 3 comments · Fixed by #820

Comments

@rade
Member

rade commented Jun 2, 2015

I'm seeing the following every now and then...

$ ./620_proxy_entrypoint_command_test.sh 
Proxy uses correct entrypoint and command with weavewait
...
test #3 "proxy docker_on 192.168.48.11 run -e 'WEAVE_CIDR=10.2.1.1/24' --entrypoint='grep' false ^1$ /sys/class/net/ethwe/carrier" failed:
    program terminated with code 1 instead of 0

Running the failing command repeatedly on the host produces a failure about half the time...

root@host1:~# docker -H localhost:12375 run -e 'WEAVE_CIDR=10.2.1.1/24' --entrypoint='grep' false ^1$ /sys/class/net/ethwe/carrier
1
root@host1:~# docker -H localhost:12375 run -e 'WEAVE_CIDR=10.2.1.1/24' --entrypoint='grep' false ^1$ /sys/class/net/ethwe/carrier
1
FATA[0000] Error response from daemon: Container bf6d0a5ae5735d4443613cc165e3e4ac4e62639e720120c98aead839db59092d died 
root@host1:~# docker -H localhost:12375 run -e 'WEAVE_CIDR=10.2.1.1/24' --entrypoint='grep' false ^1$ /sys/class/net/ethwe/carrier
1
FATA[0000] Error response from daemon: Container 885c2204b90952cc0274c196b63d5b890cc84750ff0934bf67153280e7a0be3f died 

The log of the failed container contains 1.
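To put a number on "about half the time", the repeated runs above could be wrapped in a small counting loop. This is only a sketch: `count_failures` is a hypothetical helper, not part of the weave test suite, and it simply reports how many of N invocations exited non-zero.

```shell
#!/bin/sh
# Hypothetical helper (not part of weave): run a command N times and
# print how many of those runs exited with a non-zero status.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Discard output; only the exit status matters here.
        "$@" >/dev/null 2>&1 || failures=$((failures + 1))
        i=$((i + 1))
    done
    echo "$failures"
}

# Example invocation, using the command from this report verbatim:
# count_failures 20 docker -H localhost:12375 run -e 'WEAVE_CIDR=10.2.1.1/24' --entrypoint='grep' false ^1$ /sys/class/net/ethwe/carrier
```

A count well above zero but below N would be consistent with a timing-dependent failure rather than a deterministic one.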

@rade rade added the bug label Jun 2, 2015
@rade
Member Author

rade commented Jun 2, 2015

I've also had test #2 and test #5 fail.

@paulbellamy
Contributor

weave attach adds the network interface, which causes weavewait to run the main command. weave attach then carries on with the rest of the network setup. If the container exits before weave attach has finished, weave attach fails with the Container <...> died error. Making the weavewait interval longer makes the race less likely, but doesn't fix it.

Options:

  1. Special-case and ignore that error (bleh)
  2. Make weave attach atomic, from the container's perspective (probably not possible...)
  3. Change weavewait to block on USR2 signal, not the interface (IMO, the nicest option)
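Option 3 might look roughly like the following. This is a shell sketch under stated assumptions, not the actual weavewait code: `wait_for_usr2_then_run` is a hypothetical stand-in for weavewait, and the foreground part of the script plays the role of weave attach, sending USR2 only once all of its setup is done.

```shell
#!/bin/sh
# Sketch only: a hypothetical stand-in for weavewait, not the real implementation.

# Block until SIGUSR2 arrives, then run the given command.
wait_for_usr2_then_run() {
    ready=0
    trap 'ready=1' USR2
    while [ "$ready" -eq 0 ]; do
        sleep 1
    done
    trap - USR2
    "$@"
}

# Demo: the background job plays the container's entrypoint wrapper,
# the foreground plays weave attach.
out=$(mktemp)
wait_for_usr2_then_run echo "main command ran" > "$out" &
waiter=$!
sleep 2                     # stand-in for interface, route and ARP setup
before=$(cat "$out")        # still empty: the command has not started
kill -USR2 "$waiter"        # setup finished; release the main command
wait "$waiter"
after=$(cat "$out")
rm -f "$out"
echo "before signal: '$before'"
echo "after signal:  '$after'"
```

Because the main command cannot start until the signal is sent, it cannot exit while the attach steps are still running. One caveat with this approach: the handler has to be installed before the signal can possibly be sent, since the default action for USR2 is to terminate the process.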

@rade rade added this to the next milestone Jun 2, 2015
@paulbellamy
Contributor

So, we can't make weave attach "atomic" because the route needs to be set up after the interface is up, and the ARP update needs to use the interface to ping our neighbours.

@rade rade closed this as completed in #820 Jun 2, 2015
rade added a commit that referenced this issue Jun 2, 2015
Use USR2 signal to tell weavewait when to continue

Fixes #812.