
Add a bastion host and place instances + kube apiserver behind it #1

Merged 25 commits from bastion-host into master on Mar 23, 2018

Conversation

@bendrucker commented Mar 6, 2018

This adds private networking to our typhoon fork. It makes the following changes relative to upstream typhoon:

  • Private subnet in each availability zone (including egress gateways for IPv4 + IPv6)
  • Moves workers into the private subnet
  • Moves controllers and the apiserver load balancer into the private subnet
  • Launches a bastion instance in the public subnet to allow administrative access
  • Configures terraform to run ssh provisioning steps via the bastion (see the sketch at the end of this description)

This change set is the minimum diff from upstream to achieve private networking. Next steps:

  • Lock down security groups. Because everything was public they allow access to certain ports from 0.0.0.0/0. We can restrict them to our VPC CIDR block.
  • Rebase against upstream. Struggling with the nuances of NLBs ate up the time I had planned to use to sync up with typhoon upstream. We now have fairly significant diffs and I'd like to try a rebase to get a sense of how it'll go.
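
For reference, a minimal sketch of the provisioning piece (resource and record names here are illustrative, not the exact code in this diff): Terraform ssh connections accept bastion_* settings, so the existing provisioning steps only need to be pointed at the bastion.

# Sketch only: run the existing remote-exec provisioning through the bastion.
resource "null_resource" "copy-secrets" {
  count = "${var.controller_count}"

  connection {
    type         = "ssh"
    user         = "core"
    host         = "${element(aws_instance.controllers.*.private_ip, count.index)}"
    bastion_host = "${aws_route53_record.bastion.fqdn}"
    bastion_user = "core"
  }

  provisioner "remote-exec" {
    inline = ["echo connected through the bastion"]
  }
}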

@bendrucker (Author):

Here is the subnet layout:

name az cidr
bastion-test-private-0 us-east-1a 10.0.128.0/20
bastion-test-private-1 us-east-1b 10.0.144.0/20
bastion-test-private-2 us-east-1c 10.0.160.0/20
bastion-test-private-3 us-east-1d 10.0.176.0/20
bastion-test-private-4 us-east-1e 10.0.192.0/20
bastion-test-private-5 us-east-1f 10.0.208.0/20
bastion-test-public-0 us-east-1a 10.0.0.0/20
bastion-test-public-1 us-east-1b 10.0.16.0/20
bastion-test-public-2 us-east-1c 10.0.32.0/20
bastion-test-public-3 us-east-1d 10.0.48.0/20
bastion-test-public-4 us-east-1e 10.0.64.0/20
bastion-test-public-5 us-east-1f 10.0.80.0/20
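
For anyone wondering how that layout falls out of the VPC CIDR: both sets are /20 slices of the /16, with the private set offset by 8 netnums. A sketch, assuming the default 10.0.0.0/16 host_cidr (the exact expressions in the diff may differ):

# Sketch: carve /20s out of a /16. Public subnets use netnums 0-5,
# private subnets are offset by 8, which starts them at 10.0.128.0/20.
resource "aws_subnet" "public" {
  count             = "${length(data.aws_availability_zones.all.names)}"
  vpc_id            = "${aws_vpc.network.id}"
  availability_zone = "${element(data.aws_availability_zones.all.names, count.index)}"
  cidr_block        = "${cidrsubnet(var.host_cidr, 4, count.index)}"
}

resource "aws_subnet" "private" {
  count             = "${length(data.aws_availability_zones.all.names)}"
  vpc_id            = "${aws_vpc.network.id}"
  availability_zone = "${element(data.aws_availability_zones.all.names, count.index)}"
  cidr_block        = "${cidrsubnet(var.host_cidr, 4, count.index + 8)}"
}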

Changes still coming:

  • Launch bastion in ASG behind load balancer

@bendrucker changed the title from "WIP: Add a bastion host and place instances + kube apiserver behind it" to "Add a bastion host and place instances + kube apiserver behind it" on Mar 15, 2018
@bendrucker (Author):

@eladidan Complete with a load balancer/ASG now, with desired/min/max = 1 (configurable). I can walk you through how I test this as well as leave a stack up for you to look at. The name is bastion-test and it's in the k8s-playground sub-account in us-east-1.
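
Roughly what the sizing looks like; the variable and target group names below are illustrative rather than the exact ones in the diff:

variable "bastion_count" {
  description = "Desired/min/max number of bastion instances"
  default     = 1
}

resource "aws_autoscaling_group" "bastion" {
  name                 = "${var.cluster_name}-bastion"
  desired_capacity     = "${var.bastion_count}"
  min_size             = "${var.bastion_count}"
  max_size             = "${var.bastion_count}"
  vpc_zone_identifier  = ["${aws_subnet.public.*.id}"]
  launch_configuration = "${aws_launch_configuration.bastion.name}"
  target_group_arns    = ["${aws_lb_target_group.bastion.arn}"]
}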

(Commit message: copy-secrets hangs for ~1 min waiting for an instance; it succeeds, but it's clearer to wait for the ASG before trying.)
@eladidan left a comment:

@bendrucker very superficial review. Let's meet and discuss this.
I'm concerned about how we're going to manage keys and users, and I want to make sure we don't regress in functionality relative to what we have with tunnelbox, though I don't want to add tons of scope here. I'd like to align on the long-term vision.

lifecycle {
  # override the default destroy and replace update behavior
  create_before_destroy = true
  ignore_changes        = ["image_id"]


why?

@bendrucker (Author):

To avoid applying updates via AMI changes. Otherwise a new AMI means terraform will want to destroy everything and re-create it. We can (and should) automatically apply patches to existing instances instead of updating to new AMIs as a general practice.

Typhoon recommends https://github.com/coreos/container-linux-update-operator
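
For context, the AMI comes from a data source along these lines (owner ID and filters reproduced from memory, so treat them as approximate). With most_recent = true, every new stable Container Linux release changes image_id, which is exactly what ignore_changes is guarding against:

data "aws_ami" "coreos" {
  most_recent = true
  owners      = ["595879546273"] # CoreOS owner ID, from memory; verify before relying on it

  filter {
    name   = "virtualization-type"
    values = ["hvm"]
  }

  filter {
    name   = "name"
    values = ["CoreOS-stable-*"]
  }
}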


We can (and should) automatically apply patches to existing instances instead of updating to new AMIs as a general practice.

This is a decision that goes way beyond the scope of this PR, and it has not been our policy up until today. Let's not get in the habit of sneaking policy changes in as implementation details in a PR.

The bastion instance is not part of the kubernetes cluster, so I don't see how the operator reference is relevant?

AMI updates are infrequent so I don't think doing a rolling update of the ASG (bringing instances down 1 by 1 and creating new ones) is bad practice, and makes infrastructure more immutable.

@bendrucker (Author):

More info, per our discussion just now. This is consistent with how the controllers and workers are managed in typhoon. The data source will return the latest stable AMI.

Typhoon tells terraform to ignore these changes in order to not require a destroy/recreate any time there's an update. There's no way to defer these changes: any terraform apply will want to make them, and it's easy to be forced into recreating resources when you just want to apply another change.

We don't have to do in-place updates either. We can:

  1. Deploy a new cluster when we want to use new AMIs. Changes are only ignored for the given instance of a module.
  2. Create a variable (map) containing AMI IDs per region and then update it manually. That's the most consistent with aether/CloudFormation (sketch below).

Like some of the other things you've mentioned, totally valid, but I think we should land private networking and come back to it.
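
A sketch of what option 2 could look like; variable names, regions, and AMI IDs below are placeholders:

variable "coreos_ami_ids" {
  description = "Pinned Container Linux AMI per region, updated manually"
  type        = "map"

  default = {
    "us-east-1" = "ami-00000000" # placeholder, not a real AMI ID
    "us-west-2" = "ami-11111111" # placeholder, not a real AMI ID
  }
}

resource "aws_launch_configuration" "bastion" {
  image_id      = "${lookup(var.coreos_ami_ids, var.region)}" # var.region is illustrative
  instance_type = "${var.bastion_type}"                       # illustrative

  lifecycle {
    create_before_destroy = true
  }
}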

@bendrucker (Author):

I don't think doing a rolling update of the ASG (bringing instances down 1 by 1 and creating new ones) is bad practice

CloudFormation provides additional functionality for updating autoscaling groups (including rolling updates). Terraform chose not to try to replicate this, more here: hashicorp/terraform#1552

One option (probably mentioned in that thread somewhere) is to use terraform to create a CloudFormation stack solely to take advantage of UpdatePolicy for ASGs.
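
Roughly what that could look like (untested sketch; the rolling update settings are just examples):

# Sketch: let CloudFormation own the ASG so its UpdatePolicy handles rolling
# replacement, while terraform still owns the launch configuration.
resource "aws_cloudformation_stack" "bastion_asg" {
  name = "${var.cluster_name}-bastion-asg"

  template_body = <<EOF
Resources:
  BastionASG:
    Type: AWS::AutoScaling::AutoScalingGroup
    Properties:
      LaunchConfigurationName: "${aws_launch_configuration.bastion.name}"
      MinSize: "1"
      MaxSize: "1"
      VPCZoneIdentifier: !Split [",", "${join(",", aws_subnet.public.*.id)}"]
    UpdatePolicy:
      AutoScalingRollingUpdate:
        MinInstancesInService: "0"
        MaxBatchSize: "1"
EOF
}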


lifecycle {
  # override the default destroy and replace update behavior
  create_before_destroy = true


what does this mean?

@bendrucker (Author):

create_before_destroy overrides the default terraform lifecycle for resource replacement. Normally terraform will destroy the old resource and then create a new one. Setting that attribute means it will instead launch the replacement resource before destroying the previous one.

In this case, a change requiring replacement of the ASG will be rolled out by creating a new ASG and then destroying the old one once the new one is created.

}

resource "aws_launch_configuration" "bastion" {
  image_id = "${data.aws_ami.coreos.image_id}"


since this is not running kubernetes or docker, maybe use AWS linux AMI?

@bendrucker (Author) commented Mar 20, 2018:

That would mean using an entirely different configuration strategy than the other machines. Most importantly, I copied the Container Linux configs from controllers/workers. Configuring AWS Linux is different enough that I really don't think it's worth it.


I do.
We would want different auditing tools and different dependencies installed on the bastion host.
I understand the extra scope here, so I'm willing to push this back to a different PR, but since the purpose of this instance is vastly different, and it is publicly accessible over ssh (via the NLB), we should specialize the instance to that purpose.

cidr_blocks = ["0.0.0.0/0"]
}

egress {


needed? isn't this default?

@bendrucker (Author):

In AWS yes, but terraform overrides that (which I prefer)

hashicorp/terraform#1765
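
In other words, because terraform strips the AWS default rule, outbound access has to be declared explicitly when it's wanted. The generic allow-all form looks like this (standalone illustration, not the literal rule from this diff):

resource "aws_security_group" "example" {
  # Hypothetical group for illustration; the real rules live in typhoon's
  # security group files and should be scoped to the VPC CIDR over time.
  vpc_id = "${aws_vpc.network.id}"

  egress {
    protocol         = "-1"
    from_port        = 0
    to_port          = 0
    cidr_blocks      = ["0.0.0.0/0"]
    ipv6_cidr_blocks = ["::/0"]
  }
}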


good to know, thanks

name = "${format("bastion.%s.%s.", var.cluster_name, var.dns_zone)}"
type = "A"

# AWS recommends their special "alias" records for ELBs


modify comment, LB not ELB

subnet_id = "${element(aws_subnet.public.*.id, count.index)}"
}

resource "aws_subnet" "private" {


@hulbert please review as well

@@ -52,6 +52,60 @@ resource "aws_subnet" "public" {
resource "aws_route_table_association" "public" {
count = "${length(data.aws_availability_zones.all.names)}"


define ${length(data.aws_availability_zones.all.names)} as local

@bendrucker (Author):

This is repeated a few times and I'm looking to minimize the diff here. Similar to another comment I'd like to keep private networking and general refactoring/modifications to typhoon separate.
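
(For reference, the suggested refactor would look roughly like the following; the route table reference is illustrative.)

locals {
  az_count = "${length(data.aws_availability_zones.all.names)}"
}

resource "aws_route_table_association" "public" {
  count          = "${local.az_count}"
  subnet_id      = "${element(aws_subnet.public.*.id, count.index)}"
  route_table_id = "${aws_route_table.default.id}" # route table name illustrative
}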

instance_port = 443
instance_protocol = "tcp"
# Network Load Balancer for apiservers
resource "aws_lb" "apiserver" {


"apiserver" could create a lot of confusion with our actual API. Let's maybe rename to "k8s_apiserver" ?

@bendrucker (Author):

Definitely possible but I think we need to save this discussion. I'd rather wait on changing other bits of typhoon that don't have to do with private networking if we have concerns about naming confusion.

}


resource "aws_eip" "nat" {


nat?


I think you may need the EIP to explicitly depend on the IGW

@bendrucker (Author):

nat?

Can you expand?

I think you may need the EIP to explicitly depend on the IGW

Backwards (gateways use an EIP and need it allocated first). The egress gateway is actually for ipv6 and the NAT gateway is for ipv4.


Can you expand?

why name the elastic ip nat?


Backwards (gateways use an EIP and need it allocated first). The egress gateway is actually for ipv6 and the NAT gateway is for ipv4.

From my understanding an EIP address is accessible via the IGW and not the other way around.

@bendrucker (Author) commented Mar 20, 2018:

Oh. Because it's going to be associated with the NAT gateway. I don't need to reference it anywhere else. The NAT gateway references it as allocation_id and that "claims" it.

@bendrucker (Author):

Sorry, missing a word up there. The NAT gateway consumes the EIP. The Internet gateway allows instances in the public subnet (including the NAT) to access the internet.

I believe the confusing bit is that the internet gateway provides NAT for instances in the public subnet. And the NAT gateway and egress-only internet gateway provide routing for instances within the private subnet.
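
In code form the pattern looks roughly like this (simplified to a single NAT gateway for illustration; the diff may create one per AZ):

# The NAT gateway lives in a public subnet and claims the EIP via allocation_id;
# the internet gateway handles the public subnets themselves.
resource "aws_eip" "nat" {
  vpc = true
}

resource "aws_nat_gateway" "nat" {
  allocation_id = "${aws_eip.nat.id}"
  subnet_id     = "${element(aws_subnet.public.*.id, 0)}"
}

# Private subnets send IPv4 out via the NAT gateway and IPv6 out via the
# egress-only internet gateway.
resource "aws_route_table" "private" {
  vpc_id = "${aws_vpc.network.id}"

  route {
    cidr_block     = "0.0.0.0/0"
    nat_gateway_id = "${aws_nat_gateway.nat.id}"
  }

  route {
    ipv6_cidr_block        = "::/0"
    egress_only_gateway_id = "${aws_egress_only_internet_gateway.egress_igw.id}"
  }
}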


resource "aws_egress_only_internet_gateway" "egress_igw" {
vpc_id = "${aws_vpc.network.id}"
}


newline

@bendrucker (Author):

I want to make sure we don't regress in terms of functionality with what we have with tunnelbox

I'd love to have tunnelbox provide the primary day-to-day egress for engineers connecting to the cluster. Definitely a discussion for tomorrow. Tunnelbox works great for port forwarding, but it doesn't seem straightforward to use it as a jump host for SSH. I think that would mean handling SSH certificate auth on the instances and replicating the complicated interactions with vault that tunnelbox relies on.

@eladidan left a comment:

@bendrucker this looks awesome.
Assuming that the network diagram will go in a separate PR, this looks gtg!

(Commit message: using indented arrays means the template call needs matching indentation.)
@bendrucker (Author):

@eladidan Tested multiple keys and caught a bug there. The yaml file is indented so joining over \n- isn't enough. Luckily a later step reads the yaml (and I believe converts it to JSON for actual use) and so terraform plan failed because of the indentation issue. Swapped over to [] and tested that the changes are considered valid and would trigger a rebuild of all the instances. We can double check by having you connect as core@ after I rebuild the playground cluster.
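
For reference, a sketch of the flow-style approach (template and variable names are illustrative): rendering the key list with jsonencode() produces a ["key1","key2"] array, which is valid YAML at any indentation, unlike a string joined with \n-.

# Sketch: assumes the Container Linux Config template contains a line like
#   ssh_authorized_keys: ${ssh_authorized_keys}
data "template_file" "bastion-config" {
  template = "${file("${path.module}/cl/bastion.yaml.tmpl")}"

  vars {
    # jsonencode renders ["key1","key2"], valid YAML flow syntax at any depth
    ssh_authorized_keys = "${jsonencode(var.ssh_keys)}"
  }
}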

@bendrucker merged commit be283d4 into master on Mar 23, 2018
@bendrucker deleted the bastion-host branch on March 23, 2018 22:16
@bendrucker restored the bastion-host branch on March 28, 2018 21:35
@erikbryant deleted the bastion-host branch on October 8, 2020 16:37