Skip to content
This repository has been archived by the owner on Jun 20, 2024. It is now read-only.

IP Allocation Management #563

Merged
merged 7 commits into from
May 15, 2015
Merged

IP Allocation Management #563

merged 7 commits into from
May 15, 2015

Conversation

bboreham
Copy link
Contributor

Implements the scheme described at https://github.com/weaveworks/weave/wiki/IP-allocation-design, with address ranges controlled by a CRDT gossiped between peers.

The design doc on the Wiki will be deleted when this is merged.

weave launch takes a new -iprange argument specifying the subnet to work within.

Commands updated to take advantage of IPAM: weave run, weave start, weave attach, weave expose, weave hide

Future work

Currently there is no fall-back range if you don't specify one on the command-line; it would be better to pick from a set of possibilities (checking the range is currently unused before we start using it).

How to use IPAM for WeaveDNS? It needs its own special subnet.

How should we add support for subnets in general?

We get a bit of noise in the weave logs from containers going down, now that the weave script is calling ethtool and curl via containers. It looks like "Container 1c5a03204b5eea3aeb6f43ece41c3f260a87329bfd89ded34ea3056826b59814 died"

If we hear that a container has died we should knock it out of pending?

Interrogate Docker to check container exists and is alive at the point we assign an IP

To lessen the chance of simultaneous election of two leaders at start-up, peers could wait for a bit to see if more peers arrive before running an election. Also, we could look at the command-line parameters supplied to Weave, and wait longer in the case that we have been asked to connect to someone specific.

[It would be good to move PeerName out of package router into a more
central package so both ipam and router can depend on it.]

There is no specific logic to time-out requests such as "give me some space": the logic will be re-run and the request re-sent next time something else happens (e.g. it receives another request, or some
periodic gossip). This means we may send out requests more frequently than required, but this is innocuous and it keeps things simple. It is possible for nothing to happen, e.g. if everyone else has
disconnected. We could have a timer to re-consider things.

The operation to Split() one space into two, to give space to another Peer, is conceptually simple but the implementation is fiddly to maintain the various lists and limits within MutableSpace. Perhaps a different free-list implementation would make this easier.

@bboreham bboreham force-pushed the 22_ipam branch 2 times, most recently from f64df6a to aeffb70 Compare April 16, 2015 10:19
func (defaultTime) Now() time.Time { return time.Now() }

func (alloc *Allocator) setTimeProvider(tp timeProvider) {
// fixme

This comment was marked as abuse.

@bboreham bboreham force-pushed the 22_ipam branch 5 times, most recently from 383cc2e to 06f73ee Compare April 16, 2015 16:26
@bboreham
Copy link
Contributor Author

Commits re-organised and test-passing. Also extracted -pktdebug change out to a separate PR #567

@dpw
Copy link
Contributor

dpw commented Apr 16, 2015

I can't see what I said about -pktdebug now, but what I meant was split it out into its own commit, not into its own PR (where I see you have incurred the wrath of Matthias due to leaving -debug doing nothing).

@rade rade mentioned this pull request Apr 18, 2015
@dpw
Copy link
Contributor

dpw commented Apr 23, 2015

I've seen the following bug several times. I guess I can reproduce it about one time in three, though I have better and worse runs of luck. Just reproduced it with a build from this branch checked out sometime this morning.

I'm running weave on my laptop host OS (dwragg-laptop) and in a VM (ubuntu1404):

Host:

# ~dwragg/work/weaveworks/go/src/github.com/weaveworks/weave/weave launch -alloc 10.0.1.0/24

VM:

# ~dwragg/work/weaveworks/go/src/github.com/weaveworks/weave/weave launch -alloc 10.0.1.0/24 192.168.122.1

Host:

# ~dwragg/work/weaveworks/go/src/github.com/weaveworks/weave/weave expose

VM:

# ~dwragg/work/weaveworks/go/src/github.com/weaveworks/weave/weave expose
# ~dwragg/work/weaveworks/go/src/github.com/weaveworks/weave/weave status
weave router git-1f40f87d34f0
Encryption off
Our name is f6:43:c5:fe:24:e9(ubuntu1404)
Sniffing traffic on &{34 65535 ethwe 2a:37:24:d7:30:a7 up|broadcast|multicast}
MACs:
02:e0:38:b0:f1:4d -> 02:e0:38:b0:f1:4d(dwragg-laptop) (2015-04-23 12:37:12.861200086 +0000 UTC)
2a:37:24:d7:30:a7 -> f6:43:c5:fe:24:e9(ubuntu1404) (2015-04-23 12:37:10.136586417 +0000 UTC)
92:18:02:b2:25:77 -> f6:43:c5:fe:24:e9(ubuntu1404) (2015-04-23 12:37:10.660167879 +0000 UTC)
f6:43:c5:fe:24:e9 -> f6:43:c5:fe:24:e9(ubuntu1404) (2015-04-23 12:37:11.448019404 +0000 UTC)
0e:59:47:2c:9c:46 -> 02:e0:38:b0:f1:4d(dwragg-laptop) (2015-04-23 12:37:11.843317058 +0000 UTC)
Peers:
f6:43:c5:fe:24:e9(ubuntu1404) (v2) (UID 3027955244246475338)
   -> 02:e0:38:b0:f1:4d(dwragg-laptop) [192.168.122.1:6783]
02:e0:38:b0:f1:4d(dwragg-laptop) (v2) (UID 3288747158550528852)
   -> f6:43:c5:fe:24:e9(ubuntu1404) [172.17.42.1:46939]
Routes:
unicast:
f6:43:c5:fe:24:e9 -> 00:00:00:00:00:00
02:e0:38:b0:f1:4d -> 02:e0:38:b0:f1:4d
broadcast:
f6:43:c5:fe:24:e9 -> [02:e0:38:b0:f1:4d]
02:e0:38:b0:f1:4d -> []
Reconnects:

Allocator subnet 10.0.1.0+256
10.0.1.1 -> f6:43:c5:fe:24:e9(ubuntu1404) (0, 2, 126)
10.0.1.128 -> 02:e0:38:b0:f1:4d(dwragg-laptop) (0, 0, 127)
Address ranges we own:
  10.0.1.1+127 (1/0)
# /home/dwragg/work/weaveworks/go/src/github.com/weaveworks/weave/weave reset

Host:

# docker logs weave
[...]
panic: Trying to report free on space I don't own
[...]

Yes, by resetting weave on one machine, I killed the weave router on another. See https://gist.github.com/dpw/3766c53f9ae700f8b26e for full panic.

@bboreham
Copy link
Contributor Author

The panic after tombstone should be addressed by bboreham@5e10c23

@dpw
Copy link
Contributor

dpw commented Apr 28, 2015

Kicking off round 2 on the implementation: Although GetFor is gone, there is still a file called getfor.go.

@dpw
Copy link
Contributor

dpw commented Apr 28, 2015

The code under ipam/ring and ipam/space pass addresses around as net.IP, but usually need to store and operate on them as integers. So there is lots of clutter converting them back and forth, and odd things like utils.Add and utils.Subtract. And the code appears to assume 32-bit IPv4 addresses throughout, so it is not like the abstraction provided by net.IP is buying you anything.

So simply do type address uint32 in one place, and then use that type throughout the rest of the ipam code. Only use net.IP where it is really relevant (e.g. when parsing and formatting IP addresses in the HTTP API to the allocator).

@dpw
Copy link
Contributor

dpw commented Apr 28, 2015

It seems that space.Space and its associated functions are not actually public - allocator.go works purely with space.Set. It would be nice to make that explicit by lower casing, so that the true API to the Space code is clearer.

Actually, I'd go a bit further. space.Set should probably be called Space (as it is the abstraction exposed by the space module). What is currently called Space should be called something like range, except that is a reserved word in go, so spaceRange or something.

@dpw
Copy link
Contributor

dpw commented Apr 28, 2015

I think a rebase is in order so I can do another pass on some of the parts already commented on without it all becoming terribly confusing.

@bboreham bboreham force-pushed the 22_ipam branch 7 times, most recently from a450c37 to 62f145f Compare April 29, 2015 11:46
@dpw
Copy link
Contributor

dpw commented Apr 29, 2015

The separation of SpaceSet and Space seems to make things rather verbose, as we often end up having similar code in two forms (Allocate, NumFreeAddresses, etc.). It looks like fundamentally we are managing two sets of addresses: those this peer owns, and those this peer has allocated. I'm not sure why we need the separation to implement that, and by avoiding it, I think we can reduce the line count considerably.

(GiveUpSpace does use a heuristic which is based on knowledge of Spaces, but as noted in my line comment, I'm not sure what the rationale for that is.)

I can't see a strong need for the use of bitsets to represent allocated addresses. A peer will manage addresses for thousands rather than millions of containers, so we can afford to use a few bytes to track each one.

On the other hand, we do need to represent the ranges of addresses owned by a peer compactly, because a peer could own a whole /8 or bigger.

Given that there are no stringent performance requirements, we can use a common representation for both sets. The "owned" set suggests a representation based on ranges. I've sketched this idea out in https://gist.github.com/dpw/dd984ed746e80350e485 . I've tried to implement the main operations from SpaceSet; the few omissions should be easy to add. It's 212 lines, including some tests (although no doubt it could use a few more).

@dpw
Copy link
Contributor

dpw commented Apr 29, 2015

I have nothing against asserts. I like asserts. Some of my best friends are asserts. But...

Asserts are a bit like executable comments. And similarly to comments, the best assert is one that is not there because it would be obvious from the code. So an assertion should only be added when it cannot reasonably be made superfluous by improvements to the code around it.

That leaves plenty of cases where asserts are appropriate. And obviousness is subjective. But I feel that some parts of the IPAM code lean a bit too heavily on asserts.

And because I assume we'll see Assert spread to other parts of the codebase, I'd prefer it if the main variant of Assert did not take a message. The condition in an assert should generally be self-explanatory (if it's not, that is a hint that the code around it needs further consideration). Although it is unfortunate that golang does not have a built-in assert that would include the condition expression as the panic message, I doubt the message is vital: the function name and line number should be sufficient to track down the failing assert. My worry is that the message (combined with golang's cumbersome error handling) will create a temptation to use Assert for conditions that would be better regarded as errors. If Assert just takes a condition, it will be clear that it is only for sanity checks.

@tomwilkie
Copy link
Contributor

The asserts in ring were added to prevent future, potentially innocuous
looking changes from breaching invariants. There are subleties to the
algorithm that need to be expressed. For sure I've learnt on the side of
including more, but I don't think that causes any harm. None of them should
be triggerable by any combination of user actions.

I am concerned that a bug in ipam will take down the router; I'm thinking
perhaps catching these panics and closing the ipam actor thread (with a big
error in the logs) may be preferable? If we can make future requests error
gracefully.

I don't mind re:including a message in the assert. I don't see the harm in
being verbose, but I will go ahead and remove them.

On Wednesday, 29 April 2015, David Wragg notifications@github.com wrote:

I have nothing against asserts. I like asserts. Some of my best friends
are asserts. But...

Asserts are a bit like executable comments. And similarly to comments, the
best assert is one that is not there because it would be obvious from the
code. So an assertion should only be added when it cannot reasonably be
made superfluous by improvements to the code around it.

That leaves plenty of cases where asserts are appropriate. And obviousness
is subjective. But I feel that some parts of the IPAM code lean a bit too
heavily on asserts.

And because I assume we'll see Assert spread to other parts of the
codebase, I'd prefer it if the main variant of Assert did not take a
message. The condition in an assert should generally be self-explanatory
(if it's not, that is a hint that the code around it needs further
consideration). Although it is unfortunate that golang does not have a
built-in assert that would include the condition expression as the panic
message, I doubt the message is vital: the function name and line number
should be sufficient to track down the failing assert. My worry is that the
message (combined with golang's cumbersome error handling) will create a
temptation to use Assert for conditions that would be better regarded as
errors. If Assert just takes a condition, it will be clear that it is
only for sanity checks.


Reply to this email directly or view it on GitHub
#563 (comment).

@tomwilkie
Copy link
Contributor

Discussed this with David and he'd prefer to not special case assertions inside the IPAM code. I've written it anyway, but I'm not suggesting we include this code in this PR: tomwilkie@46570f1

} else {
return false
}
}

This comment was marked as abuse.

This comment was marked as abuse.

@rade
Copy link
Member

rade commented May 14, 2015

I've put the "allow Gossiper.Gossip to return nil" and "introduce PeerUID type" changes on master in ace56da and c7dd48d, respectively.

@bboreham bboreham force-pushed the 22_ipam branch 4 times, most recently from 2547271 to 17072ac Compare May 14, 2015 16:32
Tom Wilkie and others added 3 commits May 14, 2015 17:33
The basic data structure used to share information about
IP allocations between peers
Using the Ring CRDT and Weave gossip, provides an interface
for the weave script to allocate IP addresses in `weave run`,
`weave start` and `weave expose`.

See docs/ipam.md for implementation details, and site/features.md
for user-visible changes.
}
}
if httpAddr == "" || iprangeCIDR == "" {
router.NewGossip("IPallocation", &ipam.DummyAllocator{})

This comment was marked as abuse.

This comment was marked as abuse.

This comment was marked as abuse.

This comment was marked as abuse.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants