Skip to content
This repository has been archived by the owner on Sep 30, 2020. It is now read-only.

Create etcd and workers in private subnets, controllers in public subnet #152

Closed
andrejvanderzee opened this issue Dec 11, 2016 · 27 comments

Comments

@andrejvanderzee
Copy link
Contributor

andrejvanderzee commented Dec 11, 2016

My desired HA setup has three AZs each with a public and private subnet. The private subnets have its own route table that routes external, non-VPC traffic to the public NAT gateway in the same AZ. The public subnets route non-VPC traffic to an IGW.

Preferably I want to place etcd and worker nodes in the private subnets, and the controllers in the public subnet (only the controllers require external exposure via ELB).

I have been experimenting with kube-aws (really love it!) but it looks like my goal is not supported. Would it make sense to add options for controllers, etcd and workers that allows creation in existing subnets (we can create our own VPCs, subnets and routes easily)? Something like this:

subnets:
  - type: etcd
    subnetID: <privateID1>
  - type: etcd
    subnetID: <privateID2>
  - type: etcd
    subnetID: <privateID3>
  - type: controller
    subnetID: <publicID1>
  - type: controller
    subnetID: <publicID2>
  - type: controller
    subnetID: <publicID3>
  - type: worker
    subnetID: <privateID1>
  - type: worker
    subnetID: <privateID2>
  - type: worker
    subnetID: <privateID3>

When someone provides availabilityZone and instanceCIDR instead of subnetID kube-aws could create the subnets like it does now.

@andrejvanderzee andrejvanderzee changed the title Place etcd and workers in private subnets, controllers in public subnet Create etcd and workers in private subnets, controllers in public subnet Dec 11, 2016
@pieterlange
Copy link
Contributor

I'd rather have controllers and etcd nodes be deployed through some default nodepools.

You can still deploy your desired situation by manually adding the subnets to the stack-template.json.

@andrejvanderzee
Copy link
Contributor Author

It looks like nodepools are under development now, and it looks awesome :-) I hope it supports having a dedicated route table for each nodepool, as we have a clear use case for private subnets with dedicated route table in each AZ.

If you guys need any testing on nodepools, I would be glad to help.

@neoandroid
Copy link
Contributor

neoandroid commented Dec 13, 2016

I have a PoC to achieve what you want here:
https://github.com/neoandroid/kube-aws/tree/masters-private-subnet

I've added two new parameters on cluster.yaml (i.e. workerPrivateSubnet and controllerPrivateSubnet) to know if the user wants a private subnet or not. After that the user can define the private subnets in: workerSubnets and controllerSubnets parameters. Controllers are still reachable from everywhere via the ELB. The etcd nodes are on the public subnet but it would be very easy to do the same and move them to a private subnet, probably the same subnet where controllers live.
Forgot to mention that everything in private subnets goes through NAT gateways and you can use already pre-allocated EIPs for the NAT gateways.

It would be great if this PoC can be merged here, what do you think?? If you like the idea I can make a PR so we can discuss details before merging (e.g. add missing tests, discuss support for other scenarios,...)

@andrejvanderzee
Copy link
Contributor Author

@neoandroid If the controllers can still be reached from a public-facing ELB, as you mentioned, then I would prefer them to be in the private subnet too. Wasn't sure this was possible. Furthermore I think it is important that every private subnet has its own NAT gateway for HA.

@neoandroid
Copy link
Contributor

neoandroid commented Dec 13, 2016

It's up to you to choose if you want your controllers in their own private subnet or not, the option is already in the PoC.
The NAT gateways are in the public subnet, and you have one NAT gateway per AZ so if you also put controllers and workers in all AZ you'll get an HA solution.

@andrejvanderzee
Copy link
Contributor Author

andrejvanderzee commented Dec 13, 2016

OK that exactly covers my use-case 👍 I guess its up to @pieterlange @mumoshu to evaluate if this is "the right path"?

If I can be of any help in testing and/or code reviewing, I would be happy to make some time.

@cknowles
Copy link
Contributor

There's a little overlap with #44, a lot of which is covered by the node pools work (but not all). We've started to unify how node pools and default config works a little. I think since there are many different use cases around topology we likely need to make kube-aws flexible to these but also show best practice in key cases such as NAT+private subnet+ELB. If best practice is a case of having a particular node pool setup, perhaps we can provide different templates based on the preferred network topology.

@neoandroid
Copy link
Contributor

Indeed, I was also tracking that issue. I don't mind changing the way we introduce private subnets support for the different pieces (and it would be great if we can select whether a "piece" is on a private subnet or not).
If the PoC can help, fine, otherwise I'm happy to help developing support for the use-case described here in the way we decide it's better.

@cknowles
Copy link
Contributor

cknowles commented Dec 13, 2016

Awesome. We've got a very similar setup to what you've done, I think the only real difference is our VPC and shared infra such as NATs are setup as a separate bit of CloudFormation as it forms the base of a few other deployments such as RDS and Lambdas.

What are you acceptable prerequisites that need to be there prior to running kube-aws? Is it just the EIP reservation?

@andrejvanderzee nodepools does indeed support a route table for each nodepool

Oh, the main reason we'd want a node in a public subnet is if you want to SSH to it. Ideally that'd be a bastion but right now it's only easy in the setup if the controller is used.

@andrejvanderzee
Copy link
Contributor Author

Our preference would be to let kube-aws create everything up to the VPC, but we do not mind setting up private/public subnets and NAT gateways. We do prefer to push all nodes to the private subnets and indeed we will need a bastion-node for SSH access in each public subnet.

Will start trying the nodepools soon to see how far it will get me.

@andrejvanderzee
Copy link
Contributor Author

andrejvanderzee commented Dec 13, 2016

Another feature that is common in our clusters is to have one Kubernetes Ingress controller (such as Traefik) in each AZ to regulate outside traffic into the cluster. Probably I would prefer to have one dedicated, exclusive node in each AZ. Not sure if its out-of-scope to kube-aws, but it would be nice to be able to create and/or provision Ingress nodes + ELB automatically.

@neoandroid
Copy link
Contributor

For us it's fine with only the EIP allocations.
Whats steps do you think we should take now to add this functionality to kube-aws? Shall we first explore (i.e. develop a PoC) to achieve the same result but using node pools? ... ?

@mumoshu
Copy link
Contributor

mumoshu commented Dec 15, 2016

Thanks for bringing this up 🙇

@neoandroid I like your PoC 👍

Would you mind submitting a pull request so that we all can start reviewing?

I guess I can comment on it regarding:

  • How to extend its support for subnet distinction to node pools
    • Basically, stack-template.json and cluster.yaml in both config/templates and nodepool/config/templates should be updated in the same way
  • preferable configuration file format from the perspective of a maintainer
    worker:
      # I'd like to make it explicitly prefixed with `private` so that our users can't accidentally mix private/public subnets which would results in complete chaos
      # I'd also like to make it under the `worker:` configuration key so that we can be explicit that it is specific to workers
      privateSubnets:
      - instanceCIDR: ...
         availabilityZone: ...
         #There's already a `routeTableId` configuration key with the same semantic in the top level of cluster.yaml
         #But it would also be added here as an alias if that is friendlier
         #routeTableId: ...
         natGateway:
           id: (specified to reuse an existing NAT gateway)
           eipAllocationId: (specified to create a NAT gateway while reusing an existing EIP for it)
           # Omit both of them to create a NAT gateway with newly assigned EIP
         #In the future, we can even reuse an existing subnet by specifying its id...
         #subnetId: ...

@andrejvanderzee @c-knowles Would you mind reviewing @neoandroid's PR once it is ready to be reviewed?

I'm not exactly sure your requirements but anyways @neoandroid's PoC seemed general enough to support various use-cases. I'm looking forward for it to become the basis for further improvements on this area.

@neoandroid
Copy link
Contributor

PR opened in #169
I will add later today the support for etcd cluster in its private subnet and after that I will start looking at adding the same support for node pools.
The configuration file format you propose looks good to me so I'll go that way.

@andrejvanderzee
Copy link
Contributor Author

@neoandroid Let me know when node pools are integrated, then I will start testing my use case and do a code review.

@cknowles
Copy link
Contributor

@andrejvanderzee I think dedicated nodes for an Ingres controller should be achievable in kube-aws once the nodepools work along with the node taints work is fully baked. You'd roll out a node pool just for the Ingress controller, taint those nodes and then allow your Ingress controller to tolerate those taints.

@mumoshu I'm wondering two things for review purposes. Do we intend to keep supporting new functionality outside of node pools? When I did my original pulls around subnets, I was directed to node pools as a way to solve the issues and I think node pool will also help here in terms of having one node pool for etcd and one for controller. Perhaps that becomes the default setup? Second thing I'm thinking is that I'd prefer to allow setting of a network topology which defines the options which are mandatory/optional. I think this will help clarify it for the users of kube-aws but also help up perform validation. I don't think we need to do this right now for this change but something we should think about how we can separate this part out.

@andrejvanderzee
Copy link
Contributor Author

@c-knowles Nodepools with tainted nodes would indeed solve the issue of having dedicated ingress controller nodes. Looking forward to those features :-)

@mumoshu
Copy link
Contributor

mumoshu commented Dec 18, 2016

@andrejvanderzee FYI, node pools and tainted nodes are already usable since 0.9.2 😉
Please see the doc and look for the experimenta.taints configuration key in cluster.yaml for more info.

@andrejvanderzee
Copy link
Contributor Author

@mumoshu Tried but failed so I opened issue #176

@mumoshu
Copy link
Contributor

mumoshu commented Dec 22, 2016

@c-knowles For the first thing, almost yes, especially for workers. As I believe we don't have the necessity to launch worker nodes inside the main cluster anymore, I'm ok if new features for worker nodes be available only in node pools.

However, node pools are meant to be for workers. Separating etcd and controller nodes to respective cloudformation stack(s)/sub-cluster(s) isn't something could be done shortly IMHO.
So new features may still be added to main clusters. More concretely, worker.privateSubnets could be supported only in node pools while etcd.privateSubnets and controller.privateSubnets should be supported only in main clusters for the time being.

@mumoshu
Copy link
Contributor

mumoshu commented Dec 22, 2016

@c-knowles For the second thing, would you mind describing a bit more concrete overview/examples of "allow setting of network topology` part? I feel to like the idea.

@cknowles
Copy link
Contributor

cknowles commented Jan 8, 2017

@mumoshu Specifically I mean a user of kube-aws would be allow to choose a type network topology such as "everything private" or "everything public" and then all our network options would be defined from there plus what options the user has to provide. We could even pose this setup as a sequence of questions which the user answer in order to complete a valid cluster YAML(s).

@mumoshu
Copy link
Contributor

mumoshu commented Feb 1, 2017

@c-knowles
Since v0.9.4-rc.1, "everything private" can be achieved with mapPublicIPs: false + routeTableId: your-preconfigured-rtb or

subnets:
- name: privateSubnet1
  instanceCIDR: xxx
  availabilityZone: xxx
  private: true

worker:
  subnets:
  - name: privateSubnet1
controller:
  subnets:
  - name: privateSubnet1
  loadBalancer:
    private: true
etcd:
  subnets:
  - name: privateSubnet1

More shorter configuration syntax could be added but I'm not yet sure the concrete picture of it.

@mumoshu
Copy link
Contributor

mumoshu commented Feb 1, 2017

@andrejvanderzee I guess your original req could be achieved since v0.9.4-rc.1 with something like:

subnets:
- name: privateSubnet1
  instanceCIDR: xxx
  availabilityZone: xxx
  private: true
- name: publicSubnet1
  instanceCIDR: xxx
  availabilityZone: xxx
  # deaults to false = public subnet
  #private: false

worker:
  subnets:
  - name: privateSubnet1
controller:
  subnets:
  - name: publicSubnet1
  loadBalancer:
    private: false
etcd:
  subnets:
  - name: privateSubnet1

Again, more shorter configuration syntax could be added but I'm not yet sure the concrete picture of it!

@mumoshu
Copy link
Contributor

mumoshu commented Feb 1, 2017

Or if you'd like to reuse subnets, please configure those subnets appropriately beforehand and:

subnets:
- name: privateSubnet1
  #instanceCIDR: xxx
  # instead of instanceCIDR, specify id and ensure that the subnet already has a correctly configured route table assigned
  id: subnet-xxxxxxxx
  availabilityZone: xxx
  private: true
- name: publicSubnet1
  #instanceCIDR: xxx
  # instead of instanceCIDR, specify id and ensure that the subnet already has a correctly configured route table assigned
  id: subnet-xxxxxxxx
  availabilityZone: xxx
  # deaults to false = public subnet
  #private: false

worker:
  subnets:
  - name: privateSubnet1
controller:
  subnets:
  - name: publicSubnet1
  loadBalancer:
    private: false
etcd:
  subnets:
  - name: privateSubnet1

@mumoshu
Copy link
Contributor

mumoshu commented Feb 1, 2017

Please see comments for the subnets: key in the updated cluster.yaml for more options!

@mumoshu
Copy link
Contributor

mumoshu commented Feb 16, 2017

I'm closing this as the requested feature is already available since v0.9.3-rc.1.
Thanks everyone for your contributions!

@mumoshu mumoshu closed this as completed Feb 16, 2017
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants