Add alpha support for local libvirt (cl only) #266

Closed
wants to merge 1 commit

Conversation

@squeed squeed commented Jul 6, 2018

This modifies the bare-metal platform slightly for direct libvirt provisioning. It uses terraform-provider-libvirt, and can bring up a test cluster in under 4 minutes.

Testing

Run through the Libvirt howto. Works on my machine :-).

@dghubble dghubble left a comment


Hey Casey, thanks for the unexpected effort. This may idle for a while however.

There are other topics I want to address before supporting a new platform (as exciting as that is). I'll also need to do plenty of messing about with https://github.com/dmacvicar/terraform-provider-libvirt since I last gave it a shot.

Make a note of the absolute path to this image; you'll need it later.


## DNS
@dghubble (Member)

Seems overly complicated. I'm not keen to have users set up local split-horizon DNS, or to answer questions about it when it's inevitably "not working" for someone.


```
cluster_name = "hestia"
base_image_path = "/home/user/coreos.img"
domain = "hestia.k8s"
```
@dghubble (Member)

Match bare-metal.

@squeed (Author)

Ah, but Hestia is the goddess of the home. Because it's local development. eh? eh?

```
domain = "hestia.k8s"

controller_names = ["node1", "node2"]
controller_ips = ["192.168.120.10", "192.168.120.11"]
```
@dghubble (Member)

No IPs. Users should not be permitted to hardcode IPs.

@squeed (Author)

Having stable IPs for the controllers makes some things easier for DNS. They could be autogenerated from the node_ip_pool. The only downside to this is that they could change if the controller count changes.
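
For illustration, that autogeneration could look roughly like this in Terraform 0.11 syntax (a hypothetical sketch; the `cidrhost` offset and the surrounding resource and variable names are assumptions, not code from this PR):

```hcl
# Hypothetical sketch: derive stable controller IPs from node_ip_pool
# rather than hardcoding them. The +10 offset is arbitrary.
resource "libvirt_domain" "controller" {
  count = "${var.controller_count}"
  name  = "${element(var.controller_names, count.index)}"

  network_interface {
    network_id = "${libvirt_network.network.id}"
    # cidrhost() deterministically picks the Nth host address in the pool,
    # e.g. cidrhost("192.168.120.0/24", 10) => "192.168.120.10"
    addresses = ["${cidrhost(var.node_ip_pool, count.index + 10)}"]
  }
}
```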


In this tutorial, boot and provision a Kubernetes v1.11.0 cluster using a local libvirt instance.

The libvirt architecture is meant for development and testing, rather than long-term use. It does not support load-balancing between multiple API servers, for example.
@dghubble (Member)

Phrasing is crucial here. Development/testing can't be grounds for the architecture or conventions to differ between Typhoon's platforms. We can say it's only suitable for local use though; that's fine.

All Typhoon clusters should allow multi-master setups. This can be treated just like bare-metal, where load balancing is left to how the end user chooses to resolve / balance across masters.

@squeed (Author)

Updated the phrasing

sudo dnf install libvirt-client qemu-img
```

## CoreOS Base image
@dghubble (Member)

Typhoon consistently refers to these as "Container Linux Images". It accommodates Flatcar and future forks.

```
default = "default"
}

variable "libvirt_endpoint" {
```
@dghubble (Member)

Remove

```
default = "br42"
}

variable "libvirt_storage_pool" {
```
@dghubble (Member)

Ideally this gets removed too

```
default = "qemu:///system"
}

variable "dns_server" {
```
@dghubble (Member)

Remove

@squeed (Author)

We do need something like this; libvirt requires an upstream DNS server.


```
# optional

variable "cluster_domain_suffix" {
```
@dghubble (Member)

We can remove this too. Its only justified use was in real production clusters.

@squeed (Author)

Do you mind if I keep it? Just to limit divergence from other platforms.

```
description = "The domain name to use for the cluster (e.g. k8s.example.com)"
}

variable "k8s_hostname" {
```
@dghubble (Member)

:(


squeed commented Jul 9, 2018

Yeah, I threw this together quickly to scratch a personal itch and figured I'd throw over a PR. It's cool if it stews for a bit.

DNS is tricky. The ideal is to be able to spin up a cluster without overly complicated external configuration, like a load-balancer or external DNS records. We're also limited by libvirt and terraform-provider-libvirt - we can't set up CNAMEs or add multiple IPs to a single name.

The absolute requirements are:

  • console.cluster.k8s resolves to a controller machine

The nice-to-haves are:

  • nodeN.cluster.k8s resolves to the current IP of the running virtual machine
  • No static IPs
  • Minimal host-side configuration

Solutions I can think of:

  • Fixed IPs, user manually updates /etc/hosts. Node IPs need to be stable, but can be autogenerated by Terraform
    • Pros: Easy to understand
    • Cons: Annoying busywork, resizing cluster is awkward (nodes may change IPs)
  • User overrides their /etc/resolv.conf to pass all queries to the libvirt dnsmasq.
    • Pros: Don't need NetworkManager. No fixed IPs
    • Cons: Can't run more than one cluster. Fragile. Causes a local DNS outage when the cluster goes down. (I've done this in the past, and it's no fun)
  • Terraform writes /etc/NetworkManager/dnsmasq.d/... directly
    • Pros: Can have any console name, matching bare metal. Slightly less fragile.
    • Cons: Still requires split-horizon
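
As a sketch of the third option (hypothetical; the `local_file` resource, file path, and gateway address are assumptions, not code from this PR):

```hcl
# Hypothetical sketch: Terraform writes the NetworkManager dnsmasq snippet
# so host queries for the cluster domain are forwarded to libvirt's dnsmasq
# (assumed here to listen on the network gateway, 192.168.120.1).
resource "local_file" "split_horizon" {
  content  = "server=/${var.machine_domain}/192.168.120.1\n"
  filename = "/etc/NetworkManager/dnsmasq.d/${var.cluster_name}.conf"
}
```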

I can't immediately think of a way to avoid split-horizon. If you want world-resolvable domain names, then you should probably be using the bare-metal platform anyways?


squeed commented Jul 9, 2018

Okay, I managed to cut down on most of the DNS insanity. Here's the solution I came up with:

  • I returned to k8s_domain_name, which works the same as with bare-metal
  • If you set libvirt_create_k8s_domain_name, it will configure the libvirt dnsmasq to respond to k8s_domain_name and return the first controller IP. If you don't set it, then you're on your own, same as bare-metal.

We still have to set a single domain for all machine hostnames, since the etcd servers rely on that.

This modifies the bare-metal platform slightly for direct libvirt
provisioning. It uses terraform-provider-libvirt, and can bring up a
test cluster in under 4 minutes.
@MalloZup

@squeed ciao, which network options will you need for this project? Speaking from the terraform-provider-libvirt perspective.


squeed commented Aug 30, 2018

@MalloZup the big thing that would make this a lot easier would be the ability to manage domain names via terraform-provider-libvirt. For example, right now I have to shell out to virsh: https://github.com/poseidon/typhoon/pull/266/files#diff-32eed82232b8dd46a8490b50515327ebR28
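
That shell-out can be approximated as follows (a hypothetical sketch; the `null_resource` wrapper and argument shapes are illustrative, not the exact code in the diff):

```hcl
# Hypothetical sketch: publish k8s_domain_name via libvirt's dnsmasq by
# shelling out to virsh, since the provider can't manage DNS records itself.
resource "null_resource" "k8s_domain_name" {
  provisioner "local-exec" {
    command = "virsh net-update ${var.cluster_name} add dns-host \"<host ip='${element(var.controller_ips, 0)}'><hostname>${var.k8s_domain_name}</hostname></host>\" --live --config"
  }
}
```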

```
domain = "${var.machine_domain}"
addresses = ["${var.node_ip_pool}"]

dns_forwarder {
```

This needs updating for terraform-provider-libvirt 0.5.1:

```
dns {
  forwarders = [
    {
      address = "${var.dns_server}"
    }
  ]
}
```


remoe commented Jan 20, 2019

I've ported the latest Typhoon (Kube 1.13.2) to libVirt / CoreOS. It works great!

```
$ uname -a
Linux master.example.k8s 4.14.88-coreos #1 SMP x86_64 QEMU Virtual CPU version 2.5+ GenuineIntel GNU/Linux
```

```
NAMESPACE     NAME                                      READY   STATUS    RESTARTS   AGE
kube-system   calico-node-bb6xf                         1/1     Running   0          20h
kube-system   calico-node-cwvwz                         1/1     Running   0          20h
kube-system   calico-node-ptk6d                         1/1     Running   1          20h
kube-system   coredns-59468df95-gp82k                   1/1     Running   0          20h
kube-system   coredns-59468df95-jlfdv                   1/1     Running   0          20h
kube-system   kube-apiserver-xqx5f                      1/1     Running   2          20h
kube-system   kube-controller-manager-6d468b4bc-hzs9t   1/1     Running   0          20h
kube-system   kube-controller-manager-6d468b4bc-x8f89   1/1     Running   0          20h
kube-system   kube-proxy-2ts52                          1/1     Running   0          20h
kube-system   kube-proxy-dxmps                          1/1     Running   0          20h
kube-system   kube-proxy-kwm8p                          1/1     Running   0          20h
kube-system   kube-scheduler-8444bcf5-2jd9g             1/1     Running   0          20h
kube-system   kube-scheduler-8444bcf5-gpnhq             1/1     Running   0          20h
kube-system   pod-checkpointer-6p5hk                    1/1     Running   0          20h
kube-system   pod-checkpointer-6p5hk-master.example.k8s 1/1     Running   0          20h

NAME                  STATUS   ROLES               AGE   VERSION
master.example.k8s    Ready    controller,master   21h   v1.13.2
worker1.example.k8s   Ready    node                21h   v1.13.2
worker2.example.k8s   Ready    node                21h   v1.13.2
```

I used a libVirt NAT network for the private network. For the bastion host I used a bridge network to allow access from the LAN. At first I tried to run an HTTP/HTTPS proxy to route external requests to the bastion host, but HTTP_PROXY/NO_PROXY never worked with the Kubernetes control plane. So the best approach was to use a gateway for this.
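
The NAT network described here maps to roughly this resource (a sketch; the resource name and address range are illustrative, not from the actual port):

```hcl
# Hypothetical sketch of the private NAT network for the cluster nodes,
# using terraform-provider-libvirt's libvirt_network resource.
resource "libvirt_network" "nat" {
  name      = "k8s-nat"
  mode      = "nat"
  domain    = "company.k8s"
  addresses = ["192.168.120.0/24"]
}
```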

I used Terraform 0.11.11 and libVirt provider 0.5.1. The base Terraform looks similar to this:

```
module "libvirt-example" {
  source = "//libvirt/container-linux/kubernetes"

  cluster_name    = "mycluster"
  os_image = "images/coreos_production_qemu_image.img"

  controller_names = ["master"]
  worker_names = [ "worker1", "worker2" ]

  machine_domain = "company.k8s"
  k8s_domain_name = "master.company.k8s"
  machine_public_domain = "company-public.k8s"

  ssh_authorized_key = "..."
  ssh_private_key = "${file("~/.ssh/id_rsa")}"

  asset_dir = "/assets"
}
```

Sorry, I currently don't have a public repository of it.

@dghubble (Member)

I'm unlikely to expand support to libvirt; I just don't use it enough among all the regular clusters.

@dghubble dghubble closed this Sep 11, 2021