Don't delete record of IP allocation immediately after container death #1191
Conversation
As it stands, this change has the effect that you can starve the IP allocator by starting and stopping a lot of containers in succession. We could, if we run out of free addresses, start taking the oldest ones from the dead list. Similarly, we will be unable to donate space to a peer if we have lots of dead ones piled up.
New idea:
closed in favour of #1196
Maybe we do want to do it this way after all.
force-pushed from 4522ba3 to d6d66f6
@@ -118,6 +118,22 @@ IP addresses of external services the hosts or containers need to
 connect to. The same IP range must be used everywhere, and the
 individual IP addresses must, of course, be unique.

+If you restart a container, it will retain the same IP addresses on
+the weave network:
Minor comments, but generally looks fine.
I believe I have addressed all comments.
force-pushed from a2731b1 to 33fb077
I have filed the extra issues, rebased and updated the doc as suggested.
force-pushed from 33fb077 to f98e7f9
force-pushed from bc43ae8 to b55b737
This looks good, but I am seeing occasional failures of
However, when the test hangs, the last output is this:
In this particular example, the hang occurred on only the second iteration! It looks like the test container isn't dying for some reason?
The
Is it hanging in
I'll check. Might be an idea to see if you can reproduce it too - it's failing for me about 20% of the time.
It looks like it is; given that I've seen it hang on the second iteration, it would appear to be unrelated to address exhaustion...
and call tryPendingOps() from it so we eventually re-try operations
so they will be reclaimed if the container is restarted. Still cancel pending operations immediately a container dies, but remove them if they are destroyed, to reduce starvation when users are creating and destroying containers quickly.
force-pushed from b55b737 to e468399
Let me know when you've rebased on the weavewait race fix and I'll give it another try.
Rebased and succeeded on CircleCI
Don't delete record of IP allocation immediately after container death
This fixes #1047, without any special reference to the restart command, because the same container ID will get back the same address as long as you start/attach it within the timeout window of 5 seconds.
Perhaps the timeout should be configurable?