
Handle scaling / replacing of nodes in conjunction with provisioning ansible / weave network addresses / config #355

Open
tayzlor opened this issue Jun 30, 2015 · 2 comments

Comments

@tayzlor
Member

tayzlor commented Jun 30, 2015

When adding nodes to the cluster we will run into problems with the way we assign the weave bridge address, and with scaling in general. We need to think about the approach for this.
Some options off the top of my head (this needs way more thought):

  • form a tighter integration with terraform to derive the values of the bridges / CIDR blocks (possibly).
  • use a fact from the machine being provisioned that is dynamic / unique per host.
  • use the backend service discovery (consul) to populate the values, where consul would hold the master state of the cluster at any point.

We also need to handle nodes being removed / replaced in the cluster without breaking the rest of the cluster configuration or the weave network.

This is discussed over here -
#351 (comment)
#351
#342

@tayzlor tayzlor changed the title Handle scaling nodes provisioning ansible / weave network issues Handle scaling / replacing of nodes in conjunction with provisioning ansible / weave network addresses / config Jun 30, 2015
@pmbauer
Contributor

pmbauer commented Jun 30, 2015

The above bullet list covers it well - there are two choices for maintaining allocations.

  1. use a unique, per-machine fact to compute the weave bridge and CIDR blocks
  2. use shared, coordinated state to track the allocations, both for consistency and for GCing bridge and block values

For one data point, coreos fabric uses the 2nd approach. If Apollo took this route, we'd want to use consul (with consistent reads) to allocate the bridge and blocks when machines are provisioned, and would likely need a GC process for reclaiming unused blocks.
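For a rough idea of what the 2nd approach could look like, here is a sketch against Consul's KV check-and-set API (the key layout under apollo/weave/blocks and the block numbering are made up for illustration, not an existing Apollo convention):

```python
# Sketch only: claim a weave block for a host atomically via Consul's KV
# check-and-set (cas=0 means "create only if the key does not exist yet").
# Assumes a local Consul agent; the key layout is illustrative.
import requests

CONSUL_KV = "http://127.0.0.1:8500/v1/kv"

def allocate_block(hostname, block):
    """Return True if this host won the claim for `block`."""
    resp = requests.put(
        "{}/apollo/weave/blocks/{}".format(CONSUL_KV, block),
        params={"cas": 0},   # only succeeds if the key is absent
        data=hostname,       # record the owner for later GC
    )
    resp.raise_for_status()
    return resp.text.strip() == "true"

# Try candidate /24 blocks in order until one is free.
for i in range(1, 255):
    candidate = "10.2.{}.0".format(i)   # the /24 mask is implied
    if allocate_block("mesos-slave-3", candidate):
        print("allocated", candidate)
        break
```

A GC process would then periodically scan these keys and release any block whose owner is no longer registered in consul.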

As another data point, Udacity chose the 1st approach (since we only ever plan to target AWS), computing the bridge and CIDR blocks from the VPC IP assigned to each host by AWS. This requires some careful CIDR math and VPC subnet construction.

You can use the same strategy described below with different starting VPC subnet offsets and masks to create multiple (or larger) clusters in a VPC. The example below scales to 1024 hosts and 1024 * 254 containers (260,096).

e.g.
Given an AWS subnet:
CIDR: 10.0.4.0/22
Number of addresses: 1024
Range: 10.0.4.0 - 10.0.7.255

And an assigned IP: 10.0.X.Y

We can compute the weave CIDR block for each host by shifting the last two octets of the AWS-assigned IP up one position.

Weave CIDR block: 10.X.Y.0/24
Weave bridge: 10.X.0.Y/14 (note that Y is never 0)
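To make the shift concrete, here is a minimal sketch of the octet math (the helper name and the ipaddress sanity checks are mine for illustration, not taken from the linked commit):

```python
# Minimal sketch of the shift described above, assuming the 10.0.4.0/22
# subnet from the example.
import ipaddress

def weave_addresses(aws_ip):
    """Shift the last two octets of the AWS-assigned IP up one position."""
    _, _, x, y = aws_ip.split(".")
    weave_cidr = "10.{}.{}.0/24".format(x, y)    # per-host container block
    weave_bridge = "10.{}.0.{}/14".format(x, y)  # host bridge address (Y is never 0)
    ipaddress.ip_network(weave_cidr, strict=True)  # sanity-check the results
    ipaddress.ip_interface(weave_bridge)
    return weave_cidr, weave_bridge

print(weave_addresses("10.0.5.17"))  # ('10.5.17.0/24', '10.5.0.17/14')
```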

Advantages:
1. no coordination required
2. elastic - we can add servers at will
3. CIDR blocks allocated to a server that goes down are automatically GC'd - they are re-used when AWS re-assigns that server's IP to a new server

Here's code for it: https://github.com/udacity/Apollo/commit/e459832a30a097ba58cffb0b1c4710cc459b19be

@tayzlor
Member Author

tayzlor commented Jul 20, 2015

Could potentially be addressed by #395
