From efcce417beee846752fd3cc166856fc25de357e3 Mon Sep 17 00:00:00 2001
From: Bo Huang
Date: Fri, 21 Aug 2020 08:31:34 -0700
Subject: [PATCH] Migrate to Fedora CoreOS (#50)

* Update Grafana from v6.4.4 to v6.5.0
* https://grafana.com/docs/guides/whats-new-in-v6-5/
* Update Grafana from v6.5.0 to v6.5.1
* https://github.com/grafana/grafana/releases/tag/v6.5.1
* Fix DigitalOcean controller and worker ipv4/ipv6 outputs (#594)
* Fix controller and worker ipv4/ipv6 outputs to be lists of strings
* With Terraform v0.11 syntax, an enclosing list was required to coerce the output to be a list of strings
* With Terraform v0.12 syntax, the enclosing list shouldn't be needed
* Update mkdocs-material from v4.5.0 to v4.5.1
* Introduce cluster creation without local writes to asset_dir
* Allow generated assets (TLS materials, manifests) to be securely distributed to controller node(s) via file provisioner (i.e. ssh-agent) as an assets bundle file, rather than relying on assets being locally rendered to disk in an asset_dir and then securely distributed
* Change `asset_dir` from required to optional. Left unset, asset_dir defaults to "" and no assets will be written to files on the machine that runs terraform apply
* Enhancement: Managed cluster assets are kept only in Terraform state, which supports different backends (GCS, S3, etcd, etc) and optional encryption. terraform apply accesses state, runs in-memory, and distributes sensitive materials to controllers without making use of local disk (simplifies use in CI systems)
* Enhancement: Improve asset unpack and layout process to position etcd certificates and control plane certificates more cleanly, without unneeded secret materials

Details:
* Terraform file provisioner support for distributing directories of contents (with unknown structure) has been limited to reading from a local directory, meaning local writes to asset_dir were required.
  https://github.com/poseidon/typhoon/issues/585 discusses the problem and newer or upcoming Terraform features that might help.
* Observation: Terraform provisioner support for single files works well, but iteration isn't viable. We're also constrained to Terraform language features on the apply side (no extra plugins, no shelling out) and CoreOS / Fedora tools on the receive side.
* Take a map representation of the contents that would have been splayed out in asset_dir and pack/encode them into a single file format devised for easy unpacking. Use an awk one-liner on the receive side to unpack. In practice, this has worked well and it's rather nice that a single assets file is transferred by file provisioner (all or none)

Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/162
* Add/update docs for asset_dir and kubeconfig usage
* Original tutorials favored including the platform (e.g. google-cloud) in modules (e.g. google-cloud-yavin). Prefer naming conventions where each module / cluster has a simple name (e.g. yavin) since the platform is usually redundant
* Retain the example cluster naming themes per platform
* Reduce apiserver metrics cardinality and extraneous labels
* Stop mapping node labels to targets discovered via Kubernetes nodes (e.g. etcd, kubelet, cadvisor). It is rarely useful to store node labels (e.g. kubernetes.io/os=linux) on these metrics
* kube-apiserver's apiserver_request_duration_seconds_bucket metric has a high cardinality that includes labels for the API group, verb, scope, resource, and component for each object type, including for each CRD. This one metric has ~10k time series in a typical cluster (between 10% and 40% of the total)
* Removing the apiserver request duration outright would make latency alerts a NoOp and break a Grafana apiserver panel. Instead, drop series that have a "group" label. Effectively, only request durations for core Kubernetes APIs will be kept (e.g. cardinality won't grow with each CRD added).
  This reduces the metric to ~2k unique series
* Reduce kube-controller-manager pod eviction timeout from 5m to 1m
* Reduce time to delete pods on unready nodes from 5m to 1m
* Present since v1.13.3, but mistakenly removed in v1.16.0 static pod control plane migration

Related:
* https://github.com/poseidon/terraform-render-bootstrap/pull/148
* https://github.com/poseidon/terraform-render-bootstrap/pull/164
* Update Kubernetes from v1.16.3 to v1.17.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.17.md/#v1170
* Update systemd services for the v1.17.x hyperkube
* Binary asset locations within the upstream hyperkube image changed https://github.com/kubernetes/kubernetes/pull/84662
* Fix Container Linux and Flatcar Linux kubelet.service (rkt-fly with fairly dated CoreOS kubelet-wrapper)
* Fix Fedora CoreOS kubelet.service (podman)
* Fix Fedora CoreOS bootstrap.service
* Fix delete-node kubectl usage for workers where nodes may delete themselves on shutdown (e.g. preemptible instances)
* Update Calico from v3.10.1 to v3.10.2
* https://docs.projectcalico.org/v3.10/release-notes/
* Update CHANGES and tutorial notes for release
* Update recommended Terraform and provider plugin versions
* Update the rough count of resources created per cluster since it's not been refreshed in a while (will vary based on cluster options)
* Fix minor example typo in README
* Update mkdocs-material from v4.5.1 to v4.6.0
* Update Grafana from v6.5.1 to v6.5.2
* https://github.com/grafana/grafana/releases/tag/v6.5.2
* Update kube-state-metrics from v1.8.0 to v1.9.0-rc.1
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.0-rc.1
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.0-rc.0
* Add Kubelet kubeconfig output for DigitalOcean
* Allow the raw kubelet kubeconfig to be consumed via Terraform output
* Update kube-state-metrics from v1.9.0-rc.1 to v1.9.0
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.0
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.0-rc.1
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.0-rc.0
* Update CoreDNS from v1.6.5 to v1.6.6
* https://coredns.io/2019/12/11/coredns-1.6.6-release/
* Update Prometheus from v2.14.0 to v2.15.0
* https://github.com/prometheus/prometheus/releases/tag/v2.15.0
* Update Prometheus from v2.15.0 to v2.15.1
* https://github.com/prometheus/prometheus/releases/tag/v2.15.1
* Update Calico from v3.10.2 to v3.11.1
* https://docs.projectcalico.org/v3.11/release-notes/
* Rename CLC files and favor Terraform list index syntax
* Rename Container Linux Config (CLC) files to *.yaml to align with Fedora CoreOS Config (FCC) files and for syntax highlighting
* Replace common uses of Terraform `element` (which wraps around) with `list[index]` syntax to surface index errors
* Inline Container Linux kubelet.service, deprecate kubelet-wrapper
* Change kubelet.service on Container Linux nodes to ExecStart Kubelet inline to replace the use of the host OS kubelet-wrapper script
* Express rkt run flags and volume mounts in a clear, uniform way to make the Kubelet service easier to audit, manage, and understand
* Eliminate reliance on a Container Linux kubelet-wrapper script
* Typhoon for Fedora CoreOS developed a kubelet.service that similarly uses an inline ExecStart (except with podman instead of rkt) and a more minimal set of volume mounts. Adopt the volume improvements:
  * Change Kubelet /etc/kubernetes volume to read-only
  * Change Kubelet /etc/resolv.conf volume to read-only
  * Remove unneeded /var/lib/cni volume mount

Background:
* kubelet-wrapper was added in CoreOS around the time of Kubernetes v1.0 to simplify running a CoreOS-built hyperkube ACI image via rkt-fly. The script defaults are no longer ideal (e.g. rkt's notion of trust dates back to quay.io ACI image serving and signing, which informed the OCI standard images we use today, though they still lack rkt's signing ideas).
* Shipping kubelet-wrapper was regretted at CoreOS, but remains in the distro for compatibility. The script is not updated to track hyperkube changes, but it is stable and kubelet.env overrides bridge most gaps
* Typhoon Container Linux nodes have used kubelet-wrapper to rkt/rkt-fly run the Kubelet via the official k8s.gcr.io hyperkube image using overrides (new image registry, new image format, restart handling, new mounts, new entrypoint in v1.17).
* Observation: Most of what it takes to run a Kubelet container is defined in Typhoon, not in kubelet-wrapper. The wrapper's value is now undermined by having to work around its dated defaults. Typhoon may be better served defining kubelet.service explicitly
* Typhoon for Fedora CoreOS developed a kubelet.service without the use of a host OS kubelet-wrapper, which is both clearer and eliminated some volume mounts
* Disable Kubelet 127.0.0.1:10248 healthz endpoint
* Kubelet runs a healthz server listening on 127.0.0.1:10248 by default. It's unused by Typhoon and can be disabled
* https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/
* Enable kube-proxy metrics and allow Prometheus scrapes
* Configure kube-proxy --metrics-bind-address=0.0.0.0 (default 127.0.0.1) to serve metrics on 0.0.0.0:10249
* Add firewall rules to allow Prometheus (resides on a worker) to scrape kube-proxy service endpoints on controllers or workers
* Add a clusterIP: None service for kube-proxy endpoint discovery
* Reduce Prometheus addon's node-exporter tolerations
* Change node-exporter DaemonSet tolerations from tolerating all possible NoSchedule taints to tolerating the master taint and the not ready taint (we'd like metrics regardless)
* Users who add custom node taints must add their custom taints to the addon node-exporter DaemonSet.
  As an addon, it's expected that users copy and manipulate manifests out-of-band in their own systems
* Ensure /etc/kubernetes exists following Kubelet inlining
* Inlining the Kubelet service removed the need for the kubelet.env file declared in Ignition. However, on some platforms, this removed the guarantee that /etc/kubernetes exists. Bare-Metal and DigitalOcean distribute the kubelet kubeconfig through Terraform file provisioner (scp) and place it in (now missing) /etc/kubernetes
* https://github.com/poseidon/typhoon/pull/606
* Fix bare-metal and DigitalOcean Ignition to ensure the desired directory exists following first boot from disk
* Cloud platforms with worker pools distribute the kubeconfig through Ignition user data (no impact or need)
* Update Prometheus from v2.15.1 to v2.15.2
* https://github.com/prometheus/prometheus/releases/tag/v2.15.2
* Allow terraform-provider-google v3.x plugin versions
* Typhoon Google Cloud is compatible with `terraform-provider-google` v3.x releases
* No v3.x specific features are used, so v2.19+ provider versions are still allowed, to ease migrations
* Update kube-state-metrics from v1.9.0 to v1.9.1
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.1
* Remove unneeded Kubelet /var/run mount on Fedora CoreOS
* /var/run symlinks to /run (already mounted)
* Fix bare-metal instruction for watching install to disk
* Original instructions were to watch install to disk by SSH'ing via port 2222 following Typhoon v1.10.1.
  Restore that message, since the version number in the instruction was incorrectly bumped on each release
* Update AWS Fedora CoreOS AMI filter for fedora-coreos-31
* Select the most recent fedora-coreos-31 AMI on AWS, instead of the most recent fedora-coreos-30 AMI (Nov 27, 2019)
* Evaluated with fedora-coreos-31.20200108.2.0-hvm
* Update Kubernetes from v1.17.0 to v1.17.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.17.md#v1171
* Update Calico from v3.11.1 to v3.11.2
* https://docs.projectcalico.org/v3.11/release-notes/
* Fix link in maintenance docs
* Also fix a version mention, since Terraform v0.12 was added in Typhoon v1.15.0
* Update Grafana from v6.5.2 to v6.5.3
* https://github.com/grafana/grafana/releases/tag/v6.5.3
* Update kube-state-metrics from v1.9.1 to v1.9.2
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.2
* Update bare-metal Fedora CoreOS image location
* Use Fedora CoreOS production download streams (change)
* Use live PXE kernel and initramfs images
* https://getfedora.org/coreos/download/
* Update docs example to use public images (cache is still recommended at large scale) and stable stream
* Update nginx-ingress from v0.26.1 to v0.27.1
* Change runAsUser from 33 to 101 for new alpine-based image
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.27.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.27.1
* Update Kubernetes from v1.17.1 to v1.17.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.17.md#v1172
* Update kube-state-metrics from v1.9.2 to v1.9.3
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.3
* Promote Fedora CoreOS from preview to alpha in docs
* Add an announcement to the website as well
* Fix minor typo in announcement date
* Update Grafana from v6.5.3 to v6.6.0
* https://github.com/grafana/grafana/releases/tag/v6.6.0
* Update nginx-ingress from v0.27.1 to v0.28.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.28.0
* Add module for Fedora CoreOS on Google Cloud
* Add Typhoon Fedora CoreOS on Google Cloud as alpha
* Add docs on uploading the Fedora CoreOS GCP gzipped tarball to Google Cloud storage to create a boot disk image
* Update kube-state-metrics from v1.9.3 to v1.9.4
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.4
* Update Calico from v3.11.2 to v3.12.0
* https://docs.projectcalico.org/release-notes/#v3120
* Remove reverse packet filter override, since Calico no longer relies on the setting
* https://github.com/coreos/fedora-coreos-tracker/issues/219
* https://github.com/projectcalico/felix/pull/2189
* Update Grafana from v6.6.0 to v6.6.1
* https://github.com/grafana/grafana/releases/tag/v6.6.1
* Update docs generation packages
* Update mkdocs-material from v4.6.0 to v4.6.2
* Add guide for Typhoon with Flatcar Linux on Google Cloud
* Add docs on manually uploading a Flatcar Linux GCE/GCP gzipped tarball image as a Compute Engine image for use with the Typhoon container-linux module
* Set status of Flatcar Linux on Google Cloud to alpha
* Update Fedora CoreOS kernel arguments to align with upstream
* Align bare-metal kernel arguments with upstream docs
* Add missing initrd argument, which can cause issues if not present. Fix #638
* Add tty0 and ttyS0 consoles (matches Container Linux)
* Remove unused coreos.inst=yes

Related: https://docs.fedoraproject.org/en-US/fedora-coreos/bare-metal/
* Update Kubernetes from v1.17.2 to v1.17.3
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.17.md#v1173
* Set docker log driver to json-file on Fedora CoreOS
* Fix the last minor issue for Fedora CoreOS clusters to pass CNCF's Kubernetes conformance tests
* Kubelet supports a seldom used feature `kubectl logs --limit-bytes=N` to trim a log stream to a desired length. Kubelet handles this in the CRI driver.
  The Kubelet docker shim only supports the limit bytes feature when Docker is configured with the default `json-file` logging driver
* CNCF conformance tests started requiring limit-bytes be supported, indirectly forcing the log driver choice until either the Kubelet or the conformance tests are fixed
* Fedora CoreOS defaults Docker to use `journald` (desired). For now, as a workaround to offer conformant clusters, the log driver can be set back to `json-file`. RHEL CoreOS likely won't have noticed the non-conformance since it's using the crio runtime
* https://github.com/kubernetes/kubernetes/issues/86367

Note: When upstream has a fix, the aim is to drop the docker config override and use the journald default
* Promote Fedora CoreOS AWS/bare-metal to beta
* Remove alpha warnings from docs headers
* Update recommended Terraform versions and providers
* Sync the documented Terraform versions and provider plugin versions to those that are actively used/tested by the author
* Update CHANGELOG sections and links
* Add guide for Typhoon with Flatcar Linux on DigitalOcean
* Add docs on manually uploading a Flatcar Linux DigitalOcean bin image as a custom image and using a data reference
* Set status of Flatcar Linux on DigitalOcean to alpha
* IPv6 is not supported for DigitalOcean custom images
* Update Prometheus from v2.15.2 to v2.16.0
* https://github.com/prometheus/prometheus/releases/tag/v2.16.0
* Change Kubelet /var/lib/calico mount to read-only (#643)
* Kubelet only requires read access to /var/lib/calico

Signed-off-by: Suraj Deshmukh
* Update CoreDNS from v1.6.6 to v1.6.7
* https://coredns.io/2020/01/28/coredns-1.6.7-release/
* Update mkdocs-material from v4.6.2 to v4.6.3
* Fix worker_node_labels for initial Fedora CoreOS
* Add Terraform strip markers to consume beginning and trailing whitespace in templated Kubelet arguments for podman (Fedora CoreOS only)
* Fix initial `worker_node_labels` being quietly ignored on Fedora CoreOS cloud platforms that offer the feature
* Close https://github.com/poseidon/typhoon/issues/650
* Update Grafana from v6.6.1 to v6.6.2
* https://github.com/grafana/grafana/releases/tag/v6.6.2
* Update kube-state-metrics from v1.9.4 to v1.9.5
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.5
* Update nginx-ingress from v0.28.0 to v0.29.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.29.0
* Update nginx-ingress from v0.29.0 to v0.30.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.30.0
* Update node-exporter from v0.18.1 to v1.0.0-rc.0
* Update mdadm alert rule; node-exporter adds `state` label to `node_md_disks` and removes `node_md_disks_active`
* https://github.com/prometheus/node_exporter/releases/tag/v1.0.0-rc.0
* Use a route table with separate (rather than inline) routes
* Allow users to extend the route table using a data reference and adding route resources (e.g. unusual peering setups)
* Note: Internally connecting AWS clusters can reduce cross-cloud flexibility and inhibits blue-green cluster patterns. It is not recommended
* Update etcd from v3.4.3 to v3.4.4
* https://github.com/etcd-io/etcd/releases/tag/v3.4.4
* Add automatic worker deletion on Fedora CoreOS clouds
* On clouds where workers can scale down or be preempted (AWS, GCP, Azure), shutdown runs delete-node.service to remove a node and prevent NotReady nodes from lingering
* Add the delete-node.service that wasn't carried over from Container Linux and port it to use podman
* Change Container Linux etcd-member to fetch with docker://
* Quay has historically generated ACI signatures for images to facilitate rkt's notions of verification (it allowed authors to actually sign images, though `--trust-keys-from-https` is in use since etcd and most authors don't sign images). OCI standardization didn't adopt verification ideas and checking signatures has fallen out of favor.
* Fix an issue where Quay no longer seems to be generating ACI signatures for new images (e.g.
  quay.io/coreos/etcd:v3.4.4)
* Don't be alarmed by rkt `--insecure-options=image`. It refers to disabling image signature checking (i.e. docker pull doesn't check signatures either)
* System containers for Kubelet and bootstrap have transitioned to the docker:// transport, so there is precedent and this brings all the system containers on Container Linux controllers into alignment
* Refresh Prometheus alerts and Grafana dashboards
* Add 2 min wait before KubeNodeUnreachable to be less noisy on preemptible clusters
* Add a BlackboxProbeFailure alert for any failing probes for services annotated `prometheus.io/probe: true`
* Upgrade terraform-provider-azurerm to v2.0+
* Add support for `terraform-provider-azurerm` v2.0+. Require `terraform-provider-azurerm` v2.0+ and drop v1.x support since the Azure provider major release is not backwards compatible
* Use Azure's new Linux VM and Linux VM Scale Set resources
* Change controller's Azure disk caching to None
* Associate subnets (in addition to NICs) with security groups (aesthetic)
* If set, change `worker_priority` from `Low` to `Spot` (action required)

Related:
* https://www.terraform.io/docs/providers/azurerm/guides/2.0-upgrade-guide.html
* Accept initial worker node labels and taints map on bare-metal
* Add `worker_node_labels` map from node name to a list of initial node label strings
* Add `worker_node_taints` map from node name to a list of initial node taint strings
* Unlike cloud platforms, bare-metal node labels and taints are defined via a map from node name to list of labels/taints. Bare-metal clusters may have heterogeneous hardware so per node labels and taints are accepted
* Only worker node names are allowed. Workloads are not scheduled on controller nodes so altering their labels/taints isn't suitable

```
module "mercury" {
  ...
  worker_node_labels = {
    "node2" = ["role=special"]
  }
  worker_node_taints = {
    "node2" = ["role=special:NoSchedule"]
  }
}
```

Related: https://github.com/poseidon/typhoon/issues/429
* Add support for Flatcar Linux on Azure
* Accept `os_image` "flatcar-stable" and "flatcar-beta" to use Kinvolk's Flatcar Linux images from the Azure Marketplace

Note: Flatcar Linux Azure Marketplace images require terms be accepted before use
* Update Calico from v3.12.0 to v3.13.1
* https://docs.projectcalico.org/v3.13/release-notes/
* Update Kubernetes from v1.17.3 to v1.17.4
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.17.md#v1174
* Update recommended Terraform versions and providers
* Sync the documented Terraform versions and provider plugin versions to those that are actively used/tested by the author
* Remove Container Linux Update Operator (CLUO) addon
* Stop providing example manifests for the Container Linux Update Operator (CLUO)
* CLUO requires patches to support Kubernetes v1.16+, but the project and push access is rather unowned
* CLUO hasn't been in active use in our clusters and won't be relevant beyond Container Linux. Not to say folks can't patch it and run it on their own.
  Examples just aren't provided here

Related: https://github.com/coreos/container-linux-update-operator/pull/197
* Promote Fedora CoreOS AWS and Google Cloud
* Promote Fedora CoreOS AWS to stable
* Promote Fedora CoreOS GCP to beta
* Update etcd from v3.4.4 to v3.4.5
* https://github.com/etcd-io/etcd/releases/tag/v3.4.5
* Update Prometheus from v2.16.0 to v2.17.0-rc.3
* https://github.com/prometheus/prometheus/releases/tag/v2.17.0-rc.3
* Update Grafana from v6.6.2 to v6.7.1
* https://github.com/grafana/grafana/releases/tag/v6.7.1
* Switch from upstream hyperkube image to individual images
* Kubernetes plans to stop releasing the hyperkube container image
* Upstream will continue to publish `kube-apiserver`, `kube-controller-manager`, `kube-scheduler`, and `kube-proxy` container images to `k8s.gcr.io`
* Upstream will publish Kubelet only as a binary for distros to package, either as a DEB/RPM on traditional distros or a container image on container-optimized operating systems
* Typhoon will package the upstream Kubelet (checksummed) and its dependencies as a container image for use on CoreOS Container Linux, Flatcar Linux, and Fedora CoreOS
* Update the Typhoon container image security policy to list `quay.io/poseidon/kubelet` as an official distributed artifact

Hyperkube: https://github.com/kubernetes/kubernetes/pull/88676
Kubelet Container Image: https://github.com/poseidon/kubelet
Kubelet Quay Repo: https://quay.io/repository/poseidon/kubelet
* Fix image tag for Container Linux AWS workers
* #669 left one reference to the original SHA tagged image before the v1.17.4 image tag was applied
* Update Prometheus from v2.17.0-rc.3 to v2.17.0
* https://github.com/prometheus/prometheus/releases/tag/v2.17.0
* Rename DigitalOcean image variable to os_image
* Rename variable `image` to `os_image` to match the naming used for the same purpose on other supported platforms (e.g.
  AWS, Azure, GCP)
* Deprecate asset_dir variable and remove docs
* Remove docs for the `asset_dir` variable and deprecate it in CHANGES. It will be removed in an upcoming release
* Typhoon v1.17.0 introduced a new mechanism for managing and distributing generated assets that stopped relying on writing out to disk. `asset_dir` became optional and defaulted to being unset / off (recommended)
* Add Fedora CoreOS to issue template and docs
* Update several Container Linux references to start referring to Flatcar Linux
* Update docs and mentions of Fedora CoreOS
* Update Kubernetes from v1.17.4 to v1.18.0
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md
* Update docs from Kubernetes v1.17.4 to v1.18.0
* Set docker log driver to journald on Fedora CoreOS
* Before Kubernetes v1.18.0, Kubelet only supported kubectl `--limit-bytes` with the Docker `json-file` log driver so the Fedora CoreOS default was overridden for conformance. See https://github.com/poseidon/typhoon/pull/642
* Kubelet v1.18+ implemented support for other docker log drivers, so the Fedora CoreOS default `journald` can be used again

Rel: https://github.com/kubernetes/kubernetes/issues/86367
* Update Prometheus from v2.17.0 to v2.17.1
* https://github.com/prometheus/prometheus/releases/tag/v2.17.1
* Add CoreOS Container Linux EOL recommendation to CHANGES
* Recommend that users who have not yet tried Fedora CoreOS or Flatcar Linux do so. Likely, Container Linux will reach EOL and platform support / stability ratings will be in a mixed state. Nevertheless, folks should migrate by September.
* Fix delete-node.service kubectl service exec's
* Fix delete-node service that runs on worker (cloud-only) shutdown to delete a Kubernetes node.
  Regressed in #669 (unreleased)
* Use rkt `--exec` to invoke kubectl binary in the kubelet image
* Use podman `--entrypoint` to invoke the kubectl binary in the kubelet image
* Fix Fedora CoreOS AMI to filter for stable images
* Fix issue observed in us-east-1 where AMI filters chose the latest testing channel release, rather than the stable channel
* Fedora CoreOS AMI filter selects the latest image with a matching name, x86_64, and hvm, excluding dev images. Add a filter for "Fedora CoreOS stable", which seems to be the only distinguishing metadata indicating the channel
* Add support for Fedora CoreOS snippets
* Refresh snippets customization docs
* Requires terraform-provider-ct v0.5+
* Allow bootstrap re-apply for Fedora CoreOS GCP
* Problem: Fedora CoreOS images are manually uploaded to GCP. When a cluster is created with a stale image, Zincati immediately checks for the latest stable image, fetches, and reboots. In practice, this can unfortunately occur exactly during the initial cluster bootstrap phase.
* Recommended: Upload the latest Fedora CoreOS image regularly
* Mitigation: Allow a failed bootstrap.service run (which won't touch the done ConditionPathExists) to be re-run by running `terraform apply` again. Add a known issue to CHANGES
* Update docs to show the current Fedora CoreOS stable version to reduce the likelihood that users see this issue

Longer term ideas:
* Ideal: Fedora CoreOS publishes a stable channel. Instances will always boot with the latest image in a channel. The problem disappears since it works the same way AWS does
* Timer: Consider some timer-based approach to have zincati delay any system reboots for the first ~30 min of a machine's life. Possibly just configured on the controller node https://github.com/coreos/zincati/pull/251
* External coordination: For Container Linux, locksmith filled a similar role and was disabled to allow CLUO to coordinate reboots.
  By running atop Kubernetes, it was not possible for the reboot to occur before cluster bootstrap
* Rely on https://github.com/coreos/zincati/issues/115 to delay the reboot since bootstrap involves an SSH session
* Use path-based activation of zincati on controllers and set that path at the end of the bootstrap process

Rel: https://github.com/coreos/fedora-coreos-tracker/issues/239
* Change default kube-system DaemonSet tolerations
* Change kube-proxy, flannel, and calico-node DaemonSet tolerations to tolerate `node.kubernetes.io/not-ready` and `node-role.kubernetes.io/master` (i.e. controllers) explicitly, rather than tolerating all taints
* kube-system DaemonSets will no longer tolerate custom node taints by default. Instead, custom node taints must be enumerated to opt-in to scheduling/executing the kube-system DaemonSets
* Consider setting the daemonset_tolerations variable of terraform-render-bootstrap at a later date

Background: Tolerating all taints ruled out use-cases where certain nodes might legitimately need to keep kube-proxy or CNI networking disabled

Related: https://github.com/poseidon/terraform-render-bootstrap/pull/179
* Fix bootstrap regression when networking="flannel"
* Fix bootstrap error for missing `manifests-networking/crd*yaml` when `networking = "flannel"`
* Clean up manifest-networking directory left during bootstrap
* Regressed in v1.18.0 changes for Calico https://github.com/poseidon/typhoon/pull/675
* Rename Container Linux snippets variable for consistency
* Rename controller_clc_snippets to controller_snippets (cloud platforms)
* Rename worker_clc_snippets to worker_snippets (cloud platforms)
* Rename clc_snippets to snippets (bare-metal)
* Update flannel from v0.11.0 to v0.12.0
* https://github.com/coreos/flannel/releases/tag/v0.12.0
* Fix UDP outbound and clock sync timeouts on Azure workers
* Add "lb" outbound rule for worker TCP _and_ UDP traffic
* Fix Azure worker nodes clock synchronization being inactive due to timeouts reaching
  the CoreOS / Flatcar NTP pool
* Fix Azure worker nodes not providing outbound UDP connectivity

Background: Azure provides VMs outbound connectivity either by having a public IP or via an SNAT masquerade feature bundled with their virtual load balancing abstraction (in contrast with, say, a NAT gateway). Azure worker nodes have only a private IP, but are associated with the cluster load balancer's backend pool and ingress frontend IP. Outbound traffic uses SNAT with this frontend IP. A subtle detail with Azure SNAT seems to be that since both inbound lb_rule's are TCP only, outbound UDP traffic isn't SNAT'd (highlights the reasons Azure shouldn't have conflated inbound load balancing with outbound SNAT concepts). However, by adding a separate outbound rule and disabling outbound SNAT on our ingress lb_rule's, we can tell Azure to continue load balancing as before, and support outbound SNAT for worker traffic of both the TCP and UDP protocols.

Fixes clock synchronization timeouts:

```
systemd-timesyncd[786]: Timed out waiting for reply from 45.79.36.123:123 (3.flatcar.pool.ntp.org)
```

Azure controller nodes have their own public IP, so controller (and etcd) nodes have not had clock synchronization or outbound UDP issues
* Fix terraform fmt
* Refresh Prometheus rules/alerts and Grafana dashboards
* Refresh upstream Prometheus rules and alerts and Grafana dashboards
* Add Loki recording rules for convenience
* Update Grafana from v6.7.1 to v6.7.2
* https://github.com/grafana/grafana/releases/tag/v6.7.2
* Update etcd from v3.4.5 to v3.4.7
* https://github.com/etcd-io/etcd/releases/tag/v3.4.7
* https://github.com/etcd-io/etcd/releases/tag/v3.4.6
* Update Kubernetes from v1.18.0 to v1.18.1
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1181
* Add support for Fedora CoreOS on DigitalOcean
* Add `digital-ocean/fedora-coreos/kubernetes` module
* DigitalOcean custom uploaded images do not permit droplet IPv6 networking
* Update CHANGES for
  v1.18.1 release
* Change order of modules in the README
* Fix docs TOC to include Fedora CoreOS DigitalOcean
* Change `container-linux` module preference to Flatcar Linux
* No change to Fedora CoreOS modules
* For Container Linux AWS and Azure, change the `os_image` default from coreos-stable to flatcar-stable
* For Container Linux GCP and DigitalOcean, change `os_image` to be required since users should upload a Flatcar Linux image and set the variable
* For Container Linux bare-metal, recommend users change the `os_channel` to Flatcar Linux. No actual module change.
* Add support for Fedora CoreOS on Azure
* Add `azure/fedora-coreos/kubernetes` module
* Fix Fedora CoreOS Azure MTU with Calico
* With Calico VXLAN on Fedora CoreOS the 1450 MTU should be used
* Update Kubernetes from v1.18.1 to v1.18.2
* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#changelog-since-v1181
* Remove temporary workaround for v1.18.0 apply issue
* In v1.18.0, kubectl apply would fail to apply manifests if any single manifest was unable to validate. For example, if a CRD and CR were defined in the same directory, apply would fail since the CR would be invalid as the CRD wouldn't exist
* Typhoon's temporary workaround was to separate CNI CRD manifests and explicitly apply them first. No longer needed in v1.18.1+
* Kubernetes v1.18.1 restored the prior behavior where kubectl apply applies as many valid manifests as it can. In the example above, the CRD would be applied and the CR could be applied if the kubectl apply was re-run (allowing for apply loops).
* Upstream fix: https://github.com/kubernetes/kubernetes/pull/89864
* Revert Flatcar Linux Azure to manual upload images
* Initial support for Flatcar Linux on Azure used the Flatcar Linux Azure Marketplace images (e.g.
`flatcar-stable`) in https://github.com/poseidon/typhoon/pull/664 * Flatcar Linux Azure Marketplace images have some unresolved items https://github.com/poseidon/typhoon/issues/703 * Until the Marketplace items are resolved, revert to requiring Flatcar Linux's images be manually uploaded (like GCP and DigitalOcean) * Fix bootstrap mount to use shared volume SELinux label * Race: During initial bootstrap, static control plane pods could hang with Permission denied to bootstrap secrets. A manual fix involved restarting Kubelet, which relabeled mounts. The race had no effect on subsequent reboots. * bootstrap.service runs podman with a private unshared mount of /etc/kubernetes/bootstrap-secrets which uses an SELinux MCS label with a category pair. However, bootstrap-secrets should be shared as it's mounted by Docker pods kube-apiserver, kube-scheduler, and kube-controller-manager. Restarting Kubelet was a manual fix because Kubelet relabels all /etc/kubernetes * Fix bootstrap Pod to use the shared volume label, which leaves bootstrap-secrets files with SELinux level s0 without MCS * Also allow failed bootstrap.service to be re-applied. This was missing on bare-metal and AWS * Fix race condition creating DigitalOcean firewall rules * DigitalOcean firewall rules should reference Terraform tag resources rather than using tag strings. Otherwise, terraform apply can fail (needs a rerun) if a tag has not yet been created * Update Prometheus from v2.17.1 to v2.17.2 * https://github.com/prometheus/prometheus/releases/tag/v2.17.2 * Remove extraneous sudo from layout asset unpacking * Update Calico from v3.13.1 to v3.13.3 * https://docs.projectcalico.org/v3.13/release-notes/ * Enable Kubelet TLS bootstrap and NodeRestriction * Enable bootstrap token authentication on kube-apiserver * Generate the bootstrap.kubernetes.io/token Secret that may be used as a bootstrap token * Generate a bootstrap kubeconfig (with a bootstrap token) to be securely distributed to nodes.
Each Kubelet will use the bootstrap kubeconfig to authenticate to kube-apiserver as `system:bootstrappers` and send a node-unique CSR for kube-controller-manager to automatically approve to issue a Kubelet certificate and kubeconfig (expires in 72 hours) * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the `system:node-bootstrapper` ClusterRole * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the csr nodeclient ClusterRole * Add ClusterRoleBinding for bootstrap token subjects (`system:bootstrappers`) to have the csr selfnodeclient ClusterRole * Enable NodeRestriction admission controller to limit the scope of Node or Pod objects a Kubelet can modify to those of the node itself * Ability for a Kubelet to delete its Node object is retained as preemptible nodes or those in auto-scaling instance groups need to be able to remove themselves on shutdown. This need continues to have precedence over any risk of a node deleting itself maliciously Security notes: 1. Issued Kubelet certificates authenticate as user `system:node:NAME` and group `system:nodes` and are limited in their authorization to perform API operations by Node authorization and NodeRestriction admission. Previously, a Kubelet's authorization was broader. This is the primary security motivation. 2. The bootstrap kubeconfig credential has the same sensitivity as the previous generated TLS client-certificate kubeconfig. It must be distributed securely to nodes. Its compromise still allows an attacker to obtain a Kubelet kubeconfig 3. Bootstrapping Kubelet kubeconfig's with a limited lifetime offers a slight security improvement. * An attacker who obtains the kubeconfig can likely obtain the bootstrap kubeconfig as well, to obtain the ability to renew their access * A compromised bootstrap kubeconfig could plausibly be handled by replacing the bootstrap token Secret, distributing the token to new nodes, and expiration. 
Whereas a compromised TLS-client certificate kubeconfig can't be revoked (no CRL). However, replacing a bootstrap token can be impractical in real cluster environments, so the limited lifetime is mostly a theoretical benefit. * Cluster CSR objects are visible via kubectl which is nice 4. Bootstrapping node-unique Kubelet kubeconfigs means Kubelet clients have more identity information, which can improve the utility of audits and future features Rel: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/ Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/185 * Add Fedora CoreOS Azure docs to site navigation * Fix missing Fedora CoreOS Azure docs * Update recommended Terraform provider versions * Sync the Terraform provider plugin versions to those actively used and tested by the author * Fix terraform fmt * Use Terraform element wrap-around for AWS controllers subnet_id (#714) * Fix Terraform plan error when controller_count exceeds available AWS zones (e.g. 
5 controllers) * Update Grafana from v6.7.2 to v7.0.0-beta1 * https://github.com/grafana/grafana/releases/tag/v7.0.0-beta1 * Update Prometheus from v2.17.2 to v2.18.0-rc.1 * https://github.com/prometheus/prometheus/releases/tag/v2.18.0-rc.1 * Update nginx-ingress from v0.30.0 to v0.32.0 * Add support for IngressClass and RBAC authorization * Since our nginx ingress controller example uses the flag `--ingress-class=public`, add an IngressClass to go along with it Rel: https://kubernetes.io/docs/concepts/services-networking/ingress/#ingress-class * Update Prometheus from v2.18.0-rc.1 to v2.18.0 * https://github.com/prometheus/prometheus/releases/tag/v2.18.0 * Update Prometheus from v2.18.0 to v2.18.1 * https://github.com/prometheus/prometheus/releases/tag/v2.18.1 * Update Grafana from v7.0.0-beta1 to v7.0.0-beta2 * https://github.com/grafana/grafana/releases/tag/v7.0.0-beta2 * Use Fedora CoreOS image streams on Google Cloud * Add `os_stream` variable to set a Fedora CoreOS stream to `stable` (default), `testing`, or `next` * Deprecate `os_image` variable. Remove docs about uploading Fedora CoreOS images manually, this is no longer needed * https://docs.fedoraproject.org/en-US/fedora-coreos/update-streams/ Rel: https://github.com/coreos/fedora-coreos-docs/pull/70 * Fix Calico install-cni crash loop on Pod restarts * Set a consistent MCS level/range for Calico install-cni * Note: Rebooting a node was a workaround, because Kubelet relabels /etc/kubernetes(/cni/net.d) Background: * On SELinux enforcing systems, the Calico CNI install-cni container ran with default SELinux context and a random MCS pair. install-cni places CNI configs by first creating a temporary file and then moving it into place, which means the file MCS categories depend on the container's SELinux context.
* A calico-node Pod restart creates a new install-cni container with a different MCS pair that cannot access the earlier written file (it places configs every time), causing the init container to error and calico-node to crash loop * https://github.com/projectcalico/cni-plugin/issues/874 ``` mv: inter-device move failed: '/calico.conf.tmp' to '/host/etc/cni/net.d/10-calico.conflist'; unable to remove target: Permission denied Failed to mv files. This may be caused by selinux configuration on the host, or something else. ``` Note, this isn't a host SELinux configuration issue. Related: * https://github.com/poseidon/terraform-render-bootstrap/pull/186 * Update Calico from v3.13.3 to v3.14.0 * https://docs.projectcalico.org/v3.14/release-notes/ * Update Grafana from v7.0.0-beta2 to v7.0.0-beta3 * https://github.com/grafana/grafana/releases/tag/v7.0.0-beta3 * Support Fedora CoreOS OS image streams on AWS * Add `os_stream` variable to set the stream to stable (default), testing, or next * Remove unused os_image variable on Fedora CoreOS AWS * Highlight SELinux enforcing mode in features * Restore use of Flatcar Linux Azure Marketplace image * Switch Flatcar Linux Azure to use the Marketplace image from Kinvolk (offer `flatcar-container-linux-free`) * Accepting Azure Marketplace terms is still necessary, update docs to show accepting the free offer rather than BYOL * Upstream Flatcar: https://github.com/flatcar-linux/Flatcar/issues/82 * Typhoon: https://github.com/poseidon/typhoon/issues/703 * Update Grafana from v7.0.0-beta3 to v7.0.0 * https://github.com/grafana/grafana/releases/tag/7.0.0 * Update kube-state-metrics from v1.9.5 to v1.9.6 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.6 * Update node-exporter from v1.0.0-rc.0 to v1.0.0-rc.1 * https://github.com/prometheus/node_exporter/releases/tag/v1.0.0-rc.1 * Rollback Grafana to v7.0.0-beta3, v7.0.0 image is missing * Grafana hasn't published the v7.0.0 image yet * Use new Azure subnet to set
address_prefixes list * Update Azure subnet `address_prefix` to `address_prefixes` list * Fix warning that `address_prefix` is deprecated * Require `terraform-provider-azurerm` v2.8.0+ (action required) Rel: https://github.com/terraform-providers/terraform-provider-azurerm/pull/6493 * Update Grafana from v7.0.0-beta2 to v7.0.0 * https://grafana.com/docs/grafana/latest/guides/whats-new-in-v7-0/ * Update etcd from v3.4.7 to v3.4.8 * https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.4.md#v348-2020-05-18 * Fix Fedora CoreOS on GCP proposing controller recreate * With Fedora CoreOS image stream support (#727), the latest resolved image will change over the lifecycle of a cluster. * Fix issue where an image diff proposed replacing a Fedora CoreOS controller on GCP, introduced in #727 (unreleased) * Also ignore image diffs to the GCP managed instance group of workers. This aligns with worker AMI diffs being ignored on AWS and similar on Azure, since workers update themselves. Background: * Controller nodes should strictly not be recreated by Terraform, they are stateful (etcd) and should not be replaced * Across cloud platforms, OS image diffs are ignored since both Flatcar Linux and Fedora CoreOS nodes update themselves. For workers, user-data or disk size diffs (where relevant) are allowed to recreate worker templates/configs since these are considered to be user-initiated declarations that a reprovision should be done * Set Kubelet image via kubelet.service KUBELET_IMAGE * Write the systemd kubelet.service to use `KUBELET_IMAGE` as the Kubelet. This provides a nice way to use systemd dropins to temporarily override the image (e.g. during a registry outage) Note: Only Typhoon Kubelet images and registries are supported.
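A minimal sketch of such a temporary override, written as a systemd drop-in. The drop-in path, file name, and image tag below are illustrative assumptions rather than values taken from this change, and the exact image reference format may differ between the podman-based (Fedora CoreOS) and rkt-based (Flatcar Linux) units:

```
# /etc/systemd/system/kubelet.service.d/10-image-override.conf (hypothetical path)
[Service]
# Assumed alternate image reference; substitute a supported registry and tag
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.3
```

After adding the drop-in, `systemctl daemon-reload` followed by `systemctl restart kubelet.service` should pick it up; removing the file and repeating those steps reverts to the image set by the unit itself.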
* Update Kubernetes from v1.18.2 to v1.18.3 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md * Upgrade docs packages and refresh content * Promote DigitalOcean from alpha to beta for Fedora CoreOS and Flatcar Linux * Upgrade mkdocs-material and PyPI packages for docs * Replace docs mentions of Container Linux with Flatcar Linux and move docs/cl to docs/flatcar-linux * Deprecate CoreOS Container Linux support. It's still usable for some time, but start removing docs * Update etcd from v3.4.8 to v3.4.9 * https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.4.md#v349-2020-05-20 * Update recommended Terraform provider versions * Sync Terraform provider plugin versions to those actively used internally * Fix terraform fmt * Update node-exporter from v1.0.0-rc.1 to v1.0.0 * https://github.com/prometheus/node_exporter/releases/tag/v1.0.0 * Update Grafana from v7.0.0 to v7.0.1 * https://github.com/grafana/grafana/releases/tag/v7.0.1 * Update mkdocs-material from v5.2.0 to v5.2.2 * https://github.com/squidfunk/mkdocs-material/releases/tag/5.2.2 * Update Github issue template to use drop-downs (#747) * Create a stricter bug report template * Highlight topics that are not accepted in issues: operation, support, debugging, advice, or Kubernetes concepts * Add a section to strongly suggest bug reports link a PR or describe a solution.
This may be able to weed out topics that aren't focused bug reports * Update the fallback issue template * Even "blank" issues need to fill out the fallback template * Update Calico from v3.14.0 to v3.14.1 * https://docs.projectcalico.org/v3.14/release-notes/ * Change Kubelet container image publishing * Build Kubelet container images internally and publish to Quay and Dockerhub (new) as an alternative in case of registry outage or breach * Use our infra to provide single and multi-arch (default) Kubelet images for possible future use * Docs: Show how to use alternative Kubelet images via snippets and a systemd dropin (builds on #737) Changes: * Update docs with changes to Kubelet image building * If you prefer to trust images built by Quay/Dockerhub, automated image builds are still available with unique tags (albeit with some limitations): * Quay automated builds are tagged `build-{short_sha}` (limit: only amd64) * Dockerhub automated builds are tagged `build-{tag}` and `build-master` (limit: only amd64, no shas) Links: * Kubelet: https://github.com/poseidon/kubelet * Docs: https://typhoon.psdn.io/topics/security/#container-images * Registries: * quay.io/poseidon/kubelet * docker.io/psdn/kubelet * Tweak minor style elements of issue templates * Update kube-state-metrics from v1.9.6 to v1.9.7 * https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.7 * Update Grafana from v7.0.1 to v7.0.3 * https://github.com/grafana/grafana/releases/tag/v7.0.2 * https://github.com/grafana/grafana/releases/tag/v7.0.3 * Update Prometheus from v2.18.1 to v2.19.0-rc.0 * https://github.com/prometheus/prometheus/releases/tag/v2.19.0-rc.0 * Fix Fedora CoreOS docs for selecting a stream * Fedora CoreOS image `os_stream` stable, testing, and next have been configurable since v1.18.3 * Remove mention of outdated `os_image` variable * Update security disclosure contact email * Use security@psdn.io across github.com/poseidon projects * Use strict mode for Container Linux Configs *
Enable terraform-provider-ct `strict` mode for parsing Container Linux Configs and snippets * Fix Container Linux Config systemd unit syntax `enable` (old) to `enabled` * Align with Fedora CoreOS which uses strict mode already * Update Prometheus from v2.19.0-rc.0 to v2.19.0 * https://github.com/prometheus/prometheus/releases/tag/v2.19.0 * Remove unused Kubelet cert / key Terraform state * Generated Kubelet TLS certificate and key are no longer used or distributed to machines since Kubelet TLS bootstrap is used instead. Remove the certificate and key from state * Remove unused Kubelet lock-file and exit-on-lock-contention * Kubelet `--lock-file` and `--exit-on-lock-contention` date back to usage of bootkube and at one point running Kubelet in a "self-hosted" style whereby an on-host Kubelet (rkt) started pods, but then a Kubelet DaemonSet was scheduled and able to take over (hence self-hosted). `lock-file` and `exit-on-lock-contention` flags supported this pivot. The pattern has been out of favor (in bootkube too) for years because of dueling Kubelet complexity * Typhoon runs Kubelet as a container via an on-host systemd unit using podman (Fedora CoreOS) or rkt (Flatcar Linux).
In fact, Typhoon no longer uses bootkube or control plane pivot (let alone Kubelet pivot) and uses static pods since v1.16.0 * https://github.com/poseidon/typhoon/pull/536 * Update node-exporter from v1.0.0 to v1.0.1 * https://github.com/prometheus/node_exporter/releases/tag/v1.0.1 * Update mkdocs packages for website * Fix typo in DigitalOcean docs title * Update nginx-ingress from v0.32.0 to v0.33.0 * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-0.33.0 * Update Kubernetes from v1.18.3 to v1.18.4 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1184 * Update recommended Terraform provider versions * Sync Terraform provider plugin versions with those used internally * Rename controller node label and NoSchedule taint * Remove node label `node.kubernetes.io/master` from controller nodes * Use `node.kubernetes.io/controller` (present since v1.9.5, [#160](https://github.com/poseidon/typhoon/pull/160)) to node select controllers * Rename controller NoSchedule taint from `node-role.kubernetes.io/master` to `node-role.kubernetes.io/controller` * Tolerate the new taint name for workloads that may run on controller nodes and stop tolerating `node-role.kubernetes.io/master` taint * Fix Kubelet starting before hostname set on FCOS AWS * Fedora CoreOS `kubelet.service` can start before the hostname is set. Kubelet reads the hostname to determine the node name to register. If the hostname was read as localhost, Kubelet will continue trying to register as localhost (problem) * This race manifests as a node that appears NotReady, the Kubelet is trying to register as localhost, while the host itself (by then) has an AWS provided hostname. Restarting kubelet.service is a manual fix so Kubelet re-reads the hostname * This race could only be shown on AWS, not on Google Cloud or Azure despite attempts. Bare-metal and DigitalOcean differ and use hostname-override (e.g. 
afterburn) so they're not affected * Wait for nodes to have a non-localhost hostname in the oneshot that awaits /etc/resolv.conf. Typhoon has no valid cases for a node hostname being localhost (not even single-node clusters) Related Openshift: https://github.com/openshift/machine-config-operator/pull/1813 Close https://github.com/poseidon/typhoon/issues/765 * Reduce Calico MTU on Fedora CoreOS Azure * Change the Calico VXLAN interface MTU from 1450 to 1410 * VXLAN on Azure should support MTU 1450. However, there is history where performance measures have shown that 1410 is needed to have expected performance. Flatcar Linux has the same MTU 1410 override and note * FCOS 31.20200323.3.2 was known to perform fine with 1450, but now in 31.20200517.3.0 the right value seems to be 1410 * Add experimental Cilium CNI provider * Accept experimental CNI `networking` mode "cilium" * Run Cilium v1.8.0-rc4 with overlay vxlan tunnels and a minimal set of features. We're interested in: * IPAM: Divide pod_cidr into /24 subnets per node * CNI networking pod-to-pod, pod-to-external * BPF masquerade * NetworkPolicy as defined by Kubernetes (no L7 Policy) * Continue using kube-proxy with Cilium probe mode * Firewall changes: * Require UDP 8472 for vxlan (Linux kernel default) between nodes * Optional ICMP echo(8) between nodes for host reachability (health) * Optional TCP 4240 between nodes for endpoint reachability (health) Known Issues: * Containers with `hostPort` don't listen on all host addresses, these workloads must use `hostNetwork` for now https://github.com/cilium/cilium/issues/12116 * Erroneous warning on Fedora CoreOS https://github.com/cilium/cilium/issues/10256 Note: This is experimental.
It is not listed in docs and may be changed or removed without a deprecation notice Related: * https://github.com/poseidon/terraform-render-bootstrap/pull/192 * https://github.com/cilium/cilium/issues/12217 * Update Cilium from v1.8.0-rc4 to v1.8.0 * https://github.com/cilium/cilium/releases/tag/v1.8.0 * Update Prometheus from v2.19.0 to v2.19.1 * https://github.com/prometheus/prometheus/releases/tag/v2.19.1 * Update Grafana from v7.0.3 to v7.0.4 * https://github.com/grafana/grafana/releases/tag/v7.0.4 * Update mkdocs-material from v5.3.0 to v5.3.3 * Update Calico from v3.14.1 to v3.15.0 * https://docs.projectcalico.org/v3.15/release-notes/ * Update Kubernetes from v1.18.4 to v1.18.5 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1185 * Update Prometheus from v2.19.1 to v2.19.2 * https://github.com/prometheus/prometheus/releases/tag/v2.19.2 * Update recommended Terraform provider versions * Sync Terraform provider plugin versions with those used internally * Revert "Update Prometheus from v2.19.1 to v2.19.2" * Prometheus has not published v2.19.2 * This reverts commit 81b6f54169119702c3cc6a3ecabca77f8646b444. * Isolate each DigitalOcean cluster in its own VPC * DigitalOcean introduced Virtual Private Cloud (VPC) support to match other clouds and enhance the prior "private networking" feature. Before, droplets belonging to different clusters (but residing in the same region) could reach one another (although Typhoon firewall rules prohibit this). Now, droplets in a VPC reside in their own network * https://www.digitalocean.com/docs/networking/vpc/ * Create droplet instances in a VPC per cluster. This matches the design of Typhoon AWS, Azure, and GCP.
* Require `terraform-provider-digitalocean` v1.16.0+ (action required) * Output `vpc_id` for use with an attached DigitalOcean loadbalancer * Remove os_image variable on Google Cloud Fedora CoreOS * In v1.18.3, the `os_stream` variable was added to select a Fedora CoreOS image stream (stable, testing, next) on AWS and Google Cloud (which publish official streams) * Remove `os_image` variable deprecated in v1.18.3. Manually uploaded images are no longer needed * Fix terraform fmt in firewall rules * Promote Fedora CoreOS on Google Cloud to stable status * Allow using Flatcar Linux edge on Azure * Set Kubelet cgroup driver to systemd when Flatcar Linux edge is chosen Note: Typhoon module status assumes use of the stable variant of an OS channel/stream. It's possible to use earlier variants and those are sometimes tested or developed against, but stable is the recommendation * Remove CoreOS Container Linux image names from docs * Remove coreos-stable, coreos-beta, and coreos-alpha channel references from docs * CoreOS Container Linux is end of life (see changelog) * Update Grafana from v7.0.4 to v7.0.5 * https://github.com/grafana/grafana/releases/tag/v7.0.5 * Update Cilium from v1.8.0 to v1.8.1 * https://github.com/cilium/cilium/releases/tag/v1.8.1 * Update Prometheus from v2.19.1 to v2.19.2 * https://github.com/prometheus/prometheus/releases/tag/v2.19.2 * Update Grafana from v7.0.5 to v7.0.6 * https://github.com/grafana/grafana/releases/tag/v7.0.6 * Update mkdocs-material from v5.3.3 to v5.4.0 * Update Kubernetes from v1.18.5 to v1.18.6 * https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1186 * https://github.com/poseidon/terraform-render-bootstrap/pull/201 * Update ingress-nginx from v0.33.0 to v0.34.1 * Switch to ingress-nginx controller images from us.gcr.io (eu, asia can also be used if desired) * https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.34.1 *
https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.34.0 * Update recommended Terraform provider versions * Sync Terraform provider plugin versions with those used internally * Show Cilium as a CNI provider option in docs * Start to show Cilium as a CNI option * https://github.com/cilium/cilium * Update Grafana from v7.0.6 to v7.1.0 * https://github.com/grafana/grafana/releases/tag/v7.1.0 * Update etcd from v3.4.9 to v3.4.10 * https://github.com/etcd-io/etcd/releases/tag/v3.4.10 * Declare etcd data directory permissions * Set etcd data directory /var/lib/etcd permissions to 700 * On Flatcar Linux, /var/lib/etcd is pre-existing and Ignition v2 doesn't overwrite the directory. Update the Container Linux config, but add the manual chmod workaround to bootstrap for Flatcar Linux users * https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.4.md#v3410-2020-07-16 * https://github.com/etcd-io/etcd/pull/11798 * Update CoreDNS from v1.6.7 to v1.7.0 * https://coredns.io/2020/06/15/coredns-1.7.0-release/ * Update Grafana dashboard with revised metrics names * Update Cilium from v1.8.1 to v1.8.2 * https://github.com/cilium/cilium/releases/tag/v1.8.2 * Fix some links in docs (#788) * Update Grafana from v7.1.0 to v7.1.1 * https://github.com/grafana/grafana/releases/tag/v7.1.1 * Update Prometheus from v2.19.2 to v2.20.0 * https://github.com/prometheus/prometheus/releases/tag/v2.20.0 * Migrate to Fedora CoreOS * bastion only needs base fcos config * Revert "bastion only needs base fcos config" This reverts commit 2984f3e70be6f8ecdc6a96f70cf809b260c919d6. 
* support custom bastion snippet as a variable * Fix flannel support on Fedora CoreOS * Fedora CoreOS now ships systemd-udev's `default.link` while Flannel relies on being able to pick its own MAC address for the `flannel.1` link for tunneled traffic to reach cni0 on the destination side, without being dropped * This change first appeared in FCOS testing-devel 32.20200624.20.1 and is the behavior going forward in FCOS since it was added to align FCOS network naming / configs with the rest of Fedora and address issues related to the default being missing * Flatcar Linux (and Container Linux) has a specific flannel.link configuration builtin, so it was not affected * https://github.com/coreos/fedora-coreos-tracker/issues/574#issuecomment-665487296 Note: Typhoon's recommended and default CNI provider is Calico, unless `networking` is set to flannel directly. * Relax terraform-provider-matchbox version constraint * Allow use of terraform-provider-matchbox v0.3+ (which allows v0.3.0 <= version < v1.0) for any pre 1.0 release * Before, the requirement was v0.3.0 <= version < v0.4.0 * Update from coreos/flannel-cni to poseidon/flannel-cni * Update CNI plugins from v0.6.0 to v0.8.6 to fix several CVEs * Update the base image to alpine:3.12 * Use `flannel-cni` as an init container and remove sleep * https://github.com/poseidon/terraform-render-bootstrap/pull/205 * https://github.com/poseidon/flannel-cni * https://quay.io/repository/poseidon/flannel-cni Background * Switch from github.com/coreos/flannel-cni v0.3.0 which was last published by me in 2017 and is no longer accessible to me to maintain or patch * Port to the poseidon/flannel-cni rewrite, which releases v0.4.0 to continue the prior release numbering * Update mkdocs-material from v5.4.0 to v5.5.1 * use scoop's fork of terraform-render-bootstrap * policy arn * Revert "policy arn" This reverts commit 4579af8bda4a5720e791c56fa02c14ecb767a537.
* workers and controllers need to stay private * fedora coreos 32 * fedora coreos 32 * Support Fedora CoreOS OS image streams on AWS * fix mistakes in resolving merging conflicts * add new security components * fix json format * Update Grafana from v7.1.1 to v7.1.3 * https://github.com/grafana/grafana/releases/tag/v7.1.3 * https://github.com/grafana/grafana/releases/tag/v7.1.2 * Allow terraform-provider-aws v3.0+ plugin * Typhoon AWS is compatible with terraform-provider-aws v3.x releases * Continue to allow v2.23+, no v3.x specific features are used * Set required provider versions in the worker module, since it can be used independently Related: * https://github.com/terraform-providers/terraform-provider-aws/releases/tag/v3.0.0 * Update recommended Terraform provider versions * Sync Terraform provider plugin versions to those used internally * fix ssl cert mounts * Migrate from Terraform v0.12.x to v0.13.x * Recommend Terraform v0.13.x * Support automatic install of poseidon's provider plugins * Update tutorial docs for Terraform v0.13.x * Add migration guide for Terraform v0.13.x (best-effort) * Require Terraform v0.12.26+ (migration compatibility) * Require `terraform-provider-ct` v0.6.1 * Require `terraform-provider-matchbox` v0.4.1 * Require `terraform-provider-digitalocean` v1.20+ Related: * https://www.hashicorp.com/blog/announcing-hashicorp-terraform-0-13/ * https://www.terraform.io/upgrade-guides/0-13.html * https://registry.terraform.io/providers/poseidon/ct/latest * https://registry.terraform.io/providers/poseidon/matchbox/latest * apiserver nlb should be internal * update terraform-render-bootstrap with latest upstream * Update Terraform migration guide SHA * Mention the first master branch SHA that introduced Terraform v0.13 forward compatibility * Link the migration guide on Github until a release is available and website docs are published * Update Kubernetes from v1.18.6 to v1.18.8 * 
https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1188 * Update recommended Terraform provider versions * Sync Terraform provider plugin versions to those used internally * Update mkdocs-material from v5.5.1 to v5.5.6 * Fix minor details in docs * try relabeling /etc/kubernetes/bootstrap-secrets by explicitly mounting to kubelet * relabeling does not need explicitly mounting to kubelet * need to update the type label of bootstrap-secret in the newest typhoon * update terraform-render-bootstrap with latest upstream * rm unnecessary volume mounts on etcd * rm output/ Co-authored-by: Dalton Hubble Co-authored-by: Arve Knudsen Co-authored-by: Suraj Deshmukh Co-authored-by: Ben Drucker Co-authored-by: Eldon --- .github/ISSUE_TEMPLATE.md | 33 --- .github/ISSUE_TEMPLATE/bug_report.md | 39 +++ .github/ISSUE_TEMPLATE/config.yml | 5 + CHANGES.md | 218 ++++++++++++++- README.md | 36 +-- addons/grafana/dashboards-coredns.yaml | 10 +- addons/grafana/deployment.yaml | 2 +- addons/nginx-ingress/aws/class.yaml | 6 + addons/nginx-ingress/aws/deployment.yaml | 2 +- .../nginx-ingress/aws/rbac/cluster-role.yaml | 9 + addons/nginx-ingress/azure/class.yaml | 6 + addons/nginx-ingress/azure/deployment.yaml | 2 +- .../azure/rbac/cluster-role.yaml | 9 + addons/nginx-ingress/bare-metal/class.yaml | 6 + .../nginx-ingress/bare-metal/deployment.yaml | 2 +- .../bare-metal/rbac/cluster-role.yaml | 9 + addons/nginx-ingress/digital-ocean/class.yaml | 6 + .../digital-ocean/daemonset.yaml | 2 +- .../digital-ocean/rbac/cluster-role.yaml | 9 + addons/nginx-ingress/google-cloud/class.yaml | 6 + .../google-cloud/deployment.yaml | 2 +- .../google-cloud/rbac/cluster-role.yaml | 9 + addons/prometheus/deployment.yaml | 2 +- .../kube-state-metrics/deployment.yaml | 2 +- .../exporters/node-exporter/daemonset.yaml | 2 +- addons/prometheus/rules.yaml | 8 +- aws/container-linux/kubernetes/README.md | 6 +- aws/container-linux/kubernetes/bootstrap.tf | 4 +- 
.../kubernetes/cl/controller.yaml | 5 +- aws/container-linux/kubernetes/controllers.tf | 9 +- aws/container-linux/kubernetes/network.tf | 1 - aws/container-linux/kubernetes/outputs.tf | 1 + aws/container-linux/kubernetes/security.tf | 203 ++++++++++++-- aws/container-linux/kubernetes/versions.tf | 10 +- .../kubernetes/workers/cl/worker.yaml | 22 +- .../kubernetes/workers/outputs.tf | 1 + .../kubernetes/workers/versions.tf | 12 +- .../kubernetes/workers/workers.tf | 6 +- aws/fedora-coreos/kubernetes/README.md | 9 +- aws/fedora-coreos/kubernetes/ami.tf | 10 +- aws/fedora-coreos/kubernetes/bastion.tf | 243 ++++++++++++++++ aws/fedora-coreos/kubernetes/bootstrap.tf | 7 +- aws/fedora-coreos/kubernetes/controllers.tf | 98 ++++++- aws/fedora-coreos/kubernetes/fcc/bastion.yaml | 3 + .../kubernetes/fcc/controller.yaml | 59 ++-- .../kubernetes/ignition-configs-bucket.tf | 15 + aws/fedora-coreos/kubernetes/network.tf | 100 ++++++- aws/fedora-coreos/kubernetes/nlb.tf | 30 +- aws/fedora-coreos/kubernetes/outputs.tf | 80 +++++- aws/fedora-coreos/kubernetes/security.tf | 227 +++++++++++++-- aws/fedora-coreos/kubernetes/ssh.tf | 28 +- aws/fedora-coreos/kubernetes/variables.tf | 78 +++++- aws/fedora-coreos/kubernetes/versions.tf | 10 +- aws/fedora-coreos/kubernetes/workers.tf | 16 +- aws/fedora-coreos/kubernetes/workers/ami.tf | 10 +- .../kubernetes/workers/fcc/worker.yaml | 38 ++- .../kubernetes/workers/outputs.tf | 9 + .../kubernetes/workers/variables.tf | 37 ++- .../kubernetes/workers/versions.tf | 12 +- .../kubernetes/workers/workers.tf | 103 ++++++- azure/container-linux/kubernetes/README.md | 4 +- azure/container-linux/kubernetes/bootstrap.tf | 2 +- .../kubernetes/cl/controller.yaml | 6 +- .../container-linux/kubernetes/controllers.tf | 16 +- azure/container-linux/kubernetes/network.tf | 4 +- azure/container-linux/kubernetes/security.tf | 92 ++++++ azure/container-linux/kubernetes/variables.tf | 2 +- azure/container-linux/kubernetes/versions.tf | 10 +- 
.../kubernetes/workers/cl/worker.yaml | 24 +- .../kubernetes/workers/variables.tf | 2 +- .../kubernetes/workers/versions.tf | 12 +- .../kubernetes/workers/workers.tf | 14 +- azure/fedora-coreos/kubernetes/README.md | 15 +- azure/fedora-coreos/kubernetes/bootstrap.tf | 5 +- azure/fedora-coreos/kubernetes/controllers.tf | 8 +- .../kubernetes/fcc/controller.yaml | 44 ++- azure/fedora-coreos/kubernetes/network.tf | 4 +- azure/fedora-coreos/kubernetes/security.tf | 92 ++++++ azure/fedora-coreos/kubernetes/versions.tf | 10 +- .../kubernetes/workers/fcc/worker.yaml | 25 +- .../kubernetes/workers/versions.tf | 12 +- .../kubernetes/workers/workers.tf | 6 +- .../container-linux/kubernetes/README.md | 4 +- .../container-linux/kubernetes/bootstrap.tf | 2 +- .../kubernetes/cl/controller.yaml | 5 +- .../kubernetes/cl/install.yaml | 2 +- .../container-linux/kubernetes/cl/worker.yaml | 19 +- .../container-linux/kubernetes/profiles.tf | 16 +- .../container-linux/kubernetes/versions.tf | 14 +- bare-metal/fedora-coreos/kubernetes/README.md | 6 +- .../fedora-coreos/kubernetes/bootstrap.tf | 2 +- .../kubernetes/fcc/controller.yaml | 45 ++- .../fedora-coreos/kubernetes/fcc/worker.yaml | 23 +- .../fedora-coreos/kubernetes/versions.tf | 14 +- .../container-linux/kubernetes/README.md | 4 +- .../container-linux/kubernetes/bootstrap.tf | 2 +- .../kubernetes/cl/controller.yaml | 5 +- .../container-linux/kubernetes/cl/worker.yaml | 23 +- .../container-linux/kubernetes/controllers.tf | 13 +- .../container-linux/kubernetes/network.tf | 38 ++- .../container-linux/kubernetes/outputs.tf | 8 + .../container-linux/kubernetes/versions.tf | 18 +- .../container-linux/kubernetes/workers.tf | 11 +- .../fedora-coreos/kubernetes/README.md | 4 +- .../fedora-coreos/kubernetes/bootstrap.tf | 2 +- .../fedora-coreos/kubernetes/controllers.tf | 13 +- .../kubernetes/fcc/controller.yaml | 44 ++- .../fedora-coreos/kubernetes/fcc/worker.yaml | 25 +- .../fedora-coreos/kubernetes/network.tf | 37 ++- 
.../fedora-coreos/kubernetes/outputs.tf | 8 + .../fedora-coreos/kubernetes/versions.tf | 18 +- .../fedora-coreos/kubernetes/workers.tf | 11 +- docs/advanced/customization.md | 37 ++- docs/advanced/worker-pools.md | 28 +- docs/architecture/digitalocean.md | 1 + docs/architecture/operating-systems.md | 2 +- docs/fedora-coreos/aws.md | 44 +-- docs/fedora-coreos/azure.md | 45 +-- docs/fedora-coreos/bare-metal.md | 50 ++-- docs/fedora-coreos/digitalocean.md | 44 +-- docs/fedora-coreos/google-cloud.md | 62 ++-- docs/{cl => flatcar-linux}/aws.md | 44 +-- docs/{cl => flatcar-linux}/azure.md | 54 ++-- docs/{cl => flatcar-linux}/bare-metal.md | 52 ++-- .../digitalocean.md} | 46 +-- docs/{cl => flatcar-linux}/google-cloud.md | 44 +-- docs/index.md | 45 ++- docs/topics/faq.md | 11 +- docs/topics/hardware.md | 2 +- docs/topics/maintenance.md | 264 ++++-------------- docs/topics/performance.md | 2 +- docs/topics/security.md | 40 ++- .../container-linux/kubernetes/README.md | 6 +- .../container-linux/kubernetes/bootstrap.tf | 2 +- .../kubernetes/cl/controller.yaml | 5 +- .../container-linux/kubernetes/controllers.tf | 8 +- .../container-linux/kubernetes/network.tf | 26 ++ .../container-linux/kubernetes/versions.tf | 8 +- .../kubernetes/workers/cl/worker.yaml | 22 +- .../kubernetes/workers/versions.tf | 12 +- .../kubernetes/workers/workers.tf | 6 +- .../fedora-coreos/kubernetes/README.md | 12 +- .../fedora-coreos/kubernetes/bootstrap.tf | 2 +- .../fedora-coreos/kubernetes/controllers.tf | 7 +- .../kubernetes/fcc/controller.yaml | 44 ++- .../fedora-coreos/kubernetes/image.tf | 6 + .../fedora-coreos/kubernetes/network.tf | 26 ++ .../fedora-coreos/kubernetes/variables.tf | 5 +- .../fedora-coreos/kubernetes/versions.tf | 8 +- .../fedora-coreos/kubernetes/workers.tf | 2 +- .../kubernetes/workers/fcc/worker.yaml | 25 +- .../fedora-coreos/kubernetes/workers/image.tf | 6 + .../kubernetes/workers/variables.tf | 5 +- .../kubernetes/workers/versions.tf | 12 +- 
.../kubernetes/workers/workers.tf | 5 +- mkdocs.yml | 47 ++-- .../bootstrap-apiserver.yaml | 59 ---- .../bootstrap-controller-manager.yaml | 36 --- .../bootstrap-scheduler.yaml | 23 -- .../bgpconfigurations-crd.yaml | 13 - output/manifests-networking/bgppeers-crd.yaml | 13 - .../calico-cluster-role-binding.yaml | 12 - .../calico-cluster-role.yaml | 68 ----- .../manifests-networking/calico-config.yaml | 39 --- .../calico-service-account.yaml | 5 - output/manifests-networking/calico.yaml | 146 ---------- .../clusterinformations-crd.yaml | 13 - .../felixconfigurations-crd.yaml | 13 - .../globalnetworkpolicies-crd.yaml | 13 - .../globalnetworksets-crd.yaml | 13 - output/manifests-networking/ippools-crd.yaml | 13 - .../networkpolicies-crd.yaml | 13 - output/manifests/kube-apiserver-secret.yaml | 14 - output/manifests/kube-apiserver.yaml | 88 ------ .../kube-controller-manager-disruption.yaml | 11 - .../kube-controller-manager-role-binding.yaml | 12 - .../manifests/kube-controller-manager-sa.yaml | 5 - .../kube-controller-manager-secret.yaml | 9 - output/manifests/kube-controller-manager.yaml | 82 ------ output/manifests/kube-dns-deployment.yaml | 154 ---------- output/manifests/kube-dns-sa.yaml | 5 - output/manifests/kube-dns-svc.yaml | 20 -- output/manifests/kube-proxy-role-binding.yaml | 12 - output/manifests/kube-proxy-sa.yaml | 5 - output/manifests/kube-proxy.yaml | 67 ----- .../manifests/kube-scheduler-disruption.yaml | 11 - output/manifests/kube-scheduler.yaml | 58 ---- .../kube-system-rbac-role-binding.yaml | 12 - output/manifests/kubeconfig-in-cluster.yaml | 22 -- .../pod-checkpointer-role-binding.yaml | 13 - output/manifests/pod-checkpointer-role.yaml | 12 - output/manifests/pod-checkpointer-sa.yaml | 5 - output/manifests/pod-checkpointer.yaml | 72 ----- requirements.txt | 8 +- 194 files changed, 2910 insertions(+), 2292 deletions(-) delete mode 100644 .github/ISSUE_TEMPLATE.md create mode 100644 .github/ISSUE_TEMPLATE/bug_report.md create mode 100644 
.github/ISSUE_TEMPLATE/config.yml create mode 100644 addons/nginx-ingress/aws/class.yaml create mode 100644 addons/nginx-ingress/azure/class.yaml create mode 100644 addons/nginx-ingress/bare-metal/class.yaml create mode 100644 addons/nginx-ingress/digital-ocean/class.yaml create mode 100644 addons/nginx-ingress/google-cloud/class.yaml create mode 100644 aws/fedora-coreos/kubernetes/bastion.tf create mode 100644 aws/fedora-coreos/kubernetes/fcc/bastion.yaml create mode 100644 aws/fedora-coreos/kubernetes/ignition-configs-bucket.tf rename docs/{cl => flatcar-linux}/aws.md (89%) rename docs/{cl => flatcar-linux}/azure.md (89%) rename docs/{cl => flatcar-linux}/bare-metal.md (89%) rename docs/{cl/digital-ocean.md => flatcar-linux/digitalocean.md} (88%) rename docs/{cl => flatcar-linux}/google-cloud.md (91%) create mode 100644 google-cloud/fedora-coreos/kubernetes/image.tf create mode 100644 google-cloud/fedora-coreos/kubernetes/workers/image.tf delete mode 100644 output/bootstrap-manifests/bootstrap-apiserver.yaml delete mode 100644 output/bootstrap-manifests/bootstrap-controller-manager.yaml delete mode 100644 output/bootstrap-manifests/bootstrap-scheduler.yaml delete mode 100644 output/manifests-networking/bgpconfigurations-crd.yaml delete mode 100644 output/manifests-networking/bgppeers-crd.yaml delete mode 100644 output/manifests-networking/calico-cluster-role-binding.yaml delete mode 100644 output/manifests-networking/calico-cluster-role.yaml delete mode 100644 output/manifests-networking/calico-config.yaml delete mode 100644 output/manifests-networking/calico-service-account.yaml delete mode 100644 output/manifests-networking/calico.yaml delete mode 100644 output/manifests-networking/clusterinformations-crd.yaml delete mode 100644 output/manifests-networking/felixconfigurations-crd.yaml delete mode 100644 output/manifests-networking/globalnetworkpolicies-crd.yaml delete mode 100644 output/manifests-networking/globalnetworksets-crd.yaml delete mode 100644 
output/manifests-networking/ippools-crd.yaml delete mode 100644 output/manifests-networking/networkpolicies-crd.yaml delete mode 100644 output/manifests/kube-apiserver-secret.yaml delete mode 100644 output/manifests/kube-apiserver.yaml delete mode 100644 output/manifests/kube-controller-manager-disruption.yaml delete mode 100644 output/manifests/kube-controller-manager-role-binding.yaml delete mode 100644 output/manifests/kube-controller-manager-sa.yaml delete mode 100644 output/manifests/kube-controller-manager-secret.yaml delete mode 100644 output/manifests/kube-controller-manager.yaml delete mode 100644 output/manifests/kube-dns-deployment.yaml delete mode 100644 output/manifests/kube-dns-sa.yaml delete mode 100644 output/manifests/kube-dns-svc.yaml delete mode 100644 output/manifests/kube-proxy-role-binding.yaml delete mode 100644 output/manifests/kube-proxy-sa.yaml delete mode 100644 output/manifests/kube-proxy.yaml delete mode 100644 output/manifests/kube-scheduler-disruption.yaml delete mode 100644 output/manifests/kube-scheduler.yaml delete mode 100644 output/manifests/kube-system-rbac-role-binding.yaml delete mode 100644 output/manifests/kubeconfig-in-cluster.yaml delete mode 100644 output/manifests/pod-checkpointer-role-binding.yaml delete mode 100644 output/manifests/pod-checkpointer-role.yaml delete mode 100644 output/manifests/pod-checkpointer-sa.yaml delete mode 100644 output/manifests/pod-checkpointer.yaml diff --git a/.github/ISSUE_TEMPLATE.md b/.github/ISSUE_TEMPLATE.md deleted file mode 100644 index 6f9dfaf37..000000000 --- a/.github/ISSUE_TEMPLATE.md +++ /dev/null @@ -1,33 +0,0 @@ - - -## Bug - -### Environment - -* Platform: aws, azure, bare-metal, google-cloud, digital-ocean -* OS: fedora-coreos, flatcar-linux -* Release: Typhoon version or Git SHA (reporting latest is **not** helpful) -* Terraform: `terraform version` (reporting latest is **not** helpful) -* Plugins: Provider plugin versions (reporting latest is **not** helpful) - -### Problem 
- -Describe the problem. - -### Desired Behavior - -Describe the goal. - -### Steps to Reproduce - -Provide clear steps to reproduce the issue unless already covered. - -## Feature Request - -### Feature - -Describe the feature and what problem it solves. - -### Tradeoffs - -What are the pros and cons of this feature? How will it be exercised and maintained? diff --git a/.github/ISSUE_TEMPLATE/bug_report.md b/.github/ISSUE_TEMPLATE/bug_report.md new file mode 100644 index 000000000..cca945fa7 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/bug_report.md @@ -0,0 +1,39 @@ +--- +name: Bug report +about: Report a bug to improve the project +title: '' +labels: '' +assignees: '' + +--- + + + +**Description** + +A clear and concise description of what the bug is. + +**Steps to Reproduce** + +Provide clear steps to reproduce the bug. + +- [ ] Relevant error messages if appropriate (concise, not a dump of everything). +- [ ] Explored using a vanilla cluster from the [tutorials](https://typhoon.psdn.io/#documentation). Ruled out [customizations](https://typhoon.psdn.io/advanced/customization/). + +**Expected behavior** + +A clear and concise description of what you expected to happen. + +**Environment** + +* Platform: aws, azure, bare-metal, google-cloud, digital-ocean +* OS: fedora-coreos, flatcar-linux (include release version) +* Release: Typhoon version or Git SHA (reporting latest is **not** helpful) +* Terraform: `terraform version` (reporting latest is **not** helpful) +* Plugins: Provider plugin versions (reporting latest is **not** helpful) + +**Possible Solution** + + + +Link to a PR or description. 
diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml new file mode 100644 index 000000000..1a270766e --- /dev/null +++ b/.github/ISSUE_TEMPLATE/config.yml @@ -0,0 +1,5 @@ +blank_issues_enabled: true +contact_links: + - name: Security + url: https://typhoon.psdn.io/topics/security/ + about: Report security vulnerabilities diff --git a/CHANGES.md b/CHANGES.md index 26a194eca..3b7952e25 100644 --- a/CHANGES.md +++ b/CHANGES.md @@ -4,39 +4,245 @@ Notable changes between versions. ## Latest +### v1.18.8 + +* Kubernetes [v1.18.8](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1188) +* Migrate from Terraform v0.12.x to v0.13.x ([#804](https://github.com/poseidon/typhoon/pull/804)) (**action required**) + * Recommend Terraform v0.13.x ([migration guide](https://typhoon.psdn.io/topics/maintenance/#terraform-versions)) + * Support automatic install of poseidon's provider plugins ([poseidon/ct](https://registry.terraform.io/providers/poseidon/ct/latest), [poseidon/matchbox](https://registry.terraform.io/providers/poseidon/matchbox/latest)) + * Require Terraform v0.12.26+ (migration compatibility) + * Require `terraform-provider-ct` v0.6.1 + * Require `terraform-provider-matchbox` v0.4.1 +* Update etcd from v3.4.9 to [v3.4.10](https://github.com/etcd-io/etcd/releases/tag/v3.4.10) +* Update CoreDNS from v1.6.7 to [v1.7.0](https://coredns.io/2020/06/15/coredns-1.7.0-release/) +* Update Cilium from v1.8.1 to [v1.8.2](https://github.com/cilium/cilium/releases/tag/v1.8.2) +* Update [coreos/flannel-cni](https://github.com/coreos/flannel-cni) to [poseidon/flannel-cni](https://github.com/poseidon/flannel-cni) ([#798](https://github.com/poseidon/typhoon/pull/798)) + * Update CNI plugins and fix CVEs with Flannel CNI (non-default) + * Transition to a poseidon maintained container image + +### AWS + +* Allow `terraform-provider-aws` v3.0+ ([#803](https://github.com/poseidon/typhoon/pull/803)) + * Recommend updating 
`terraform-provider-aws` to v3.0+ + * Continue to allow v2.23+, no v3.x specific features are used + +### DigitalOcean + +* Require `terraform-provider-digitalocean` v1.21+ for Terraform v0.13.x (unenforced) +* Require `terraform-provider-digitalocean` v1.20+ for Terraform v0.12.x + +### Fedora CoreOS + +* Fix support for Flannel with Fedora CoreOS ([#795](https://github.com/poseidon/typhoon/pull/795)) + * Configure `flannel.1` link to select its own MAC address to solve flannel + pod-to-pod traffic drops starting with default link changes in Fedora CoreOS + 32.20200629.3.0 ([details](https://github.com/coreos/fedora-coreos-tracker/issues/574#issuecomment-665487296)) + +#### Addons + +* Update Prometheus from v2.19.2 to [v2.20.0](https://github.com/prometheus/prometheus/releases/tag/v2.20.0) +* Update Grafana from v7.0.6 to [v7.1.3](https://github.com/grafana/grafana/releases/tag/v7.1.3) + +## v1.18.6 + +* Kubernetes [v1.18.6](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1186) +* Update Calico from v3.15.0 to [v3.15.1](https://docs.projectcalico.org/v3.15/release-notes/) +* Update Cilium from v1.8.0 to [v1.8.1](https://github.com/cilium/cilium/releases/tag/v1.8.1) + +#### Addons + +* Update nginx-ingress from v0.33.0 to [v0.34.1](https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.34.1) + * [ingress-nginx](https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.34.0) will publish images only to gcr.io +* Update Prometheus from v2.19.1 to [v2.19.2](https://github.com/prometheus/prometheus/releases/tag/v2.19.2) +* Update Grafana from v7.0.4 to [v7.0.6](https://github.com/grafana/grafana/releases/tag/v7.0.6) + +## v1.18.5 + +* Kubernetes [v1.18.5](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1185) +* Add Cilium v1.8.0 as a (experimental) CNI provider option ([#760](https://github.com/poseidon/typhoon/pull/760)) + * Set `networking` to "cilium" to enable +* Update 
Calico from v3.14.1 to [v3.15.0](https://docs.projectcalico.org/v3.15/release-notes/) + +#### DigitalOcean + +* Isolate each cluster in an independent DigitalOcean VPC ([#776](https://github.com/poseidon/typhoon/pull/776)) + * Create droplets in a VPC per cluster (matches Typhoon AWS, Azure, and GCP) + * Require `terraform-provider-digitalocean` v1.16.0+ (action required) + * Output `vpc_id` for use with an attached DigitalOcean [loadbalancer](https://github.com/poseidon/typhoon/blob/v1.18.5/docs/architecture/digitalocean.md#custom-load-balancer) + +### Fedora CoreOS + +#### Google Cloud + +* Promote Fedora CoreOS to stable +* Remove `os_image` variable deprecated in v1.18.3 ([#777](https://github.com/poseidon/typhoon/pull/777)) + * Use `os_stream` to select a Fedora CoreOS image stream + +### Flatcar Linux + +#### Azure + +* Allow using Flatcar Linux Edge by setting `os_image` to "flatcar-edge" ([#778](https://github.com/poseidon/typhoon/pull/778)) + +#### Addons + +* Update Prometheus from v2.19.0 to [v2.19.1](https://github.com/prometheus/prometheus/releases/tag/v2.19.1) +* Update Grafana from v7.0.3 to [v7.0.4](https://github.com/grafana/grafana/releases/tag/v7.0.4) + +## v1.18.4 + +* Kubernetes [v1.18.4](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1184) +* Update Kubelet image publishing ([#749](https://github.com/poseidon/typhoon/pull/749)) + * Build Kubelet images internally and publish to Quay and Dockerhub + * [quay.io/poseidon/kubelet](https://quay.io/repository/poseidon/kubelet) (official) + * [docker.io/psdn/kubelet](https://hub.docker.com/r/psdn/kubelet) (fallback) + * Continue offering automated image builds with an alternate tag strategy (see [docs](https://typhoon.psdn.io/topics/security/#container-images)) + * [Document](https://typhoon.psdn.io/advanced/customization/#kubelet) use of alternate Kubelet images during registry incidents +* Update Calico from v3.14.0 to 
[v3.14.1](https://docs.projectcalico.org/v3.14/release-notes/) + * Fix [CVE-2020-13597](https://github.com/kubernetes/kubernetes/issues/91507) +* Rename controller NoSchedule taint from `node-role.kubernetes.io/master` to `node-role.kubernetes.io/controller` ([#764](https://github.com/poseidon/typhoon/pull/764)) + * Tolerate the new taint name for workloads that may run on controller nodes +* Remove node label `node.kubernetes.io/master` from controller nodes ([#764](https://github.com/poseidon/typhoon/pull/764)) + * Use `node.kubernetes.io/controller` (present since v1.9.5, [#160](https://github.com/poseidon/typhoon/pull/160)) to node select controllers +* Remove unused Kubelet `-lock-file` and `-exit-on-lock-contention` ([#758](https://github.com/poseidon/typhoon/pull/758)) + +### Fedora CoreOS + +#### Azure + +* Use `strict` Fedora CoreOS Config (FCC) snippet parsing ([#755](https://github.com/poseidon/typhoon/pull/755)) +* Reduce Calico vxlan interface MTU to maintain performance ([#767](https://github.com/poseidon/typhoon/pull/766)) + +#### AWS + +* Fix Kubelet service race with hostname update ([#766](https://github.com/poseidon/typhoon/pull/766)) + * Wait for a hostname to avoid Kubelet trying to register as `localhost` + +### Flatcar Linux + +* Use `strict` Container Linux Config (CLC) snippet parsing ([#755](https://github.com/poseidon/typhoon/pull/755)) + * Require `terraform-provider-ct` v0.4+, recommend v0.5+ (**action required**) + +### Addons + +* Update nginx-ingress from v0.32.0 to [v0.33.0](https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.33.0) +* Update Prometheus from v2.18.1 to [v2.19.0](https://github.com/prometheus/prometheus/releases/tag/v2.19.0) +* Update node-exporter from v1.0.0-rc.1 to [v1.0.1](https://github.com/prometheus/node_exporter/releases/tag/v1.0.1) +* Update kube-state-metrics from v1.9.6 to v1.9.7 +* Update Grafana from v7.0.0 to v7.0.3 + +## v1.18.3 + +* Kubernetes 
[v1.18.3](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1183) +* Use Kubelet [TLS bootstrap](https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/) with bootstrap token authentication ([#713](https://github.com/poseidon/typhoon/pull/713)) + * Enable Node [Authorization](https://kubernetes.io/docs/reference/access-authn-authz/node/) and [NodeRestriction](https://kubernetes.io/docs/reference/access-authn-authz/admission-controllers/#noderestriction) to reduce authorization scope + * Renew Kubelet certificates every 72 hours +* Update etcd from v3.4.7 to [v3.4.9](https://github.com/etcd-io/etcd/releases/tag/v3.4.9) +* Update Calico from v3.13.1 to [v3.14.0](https://docs.projectcalico.org/v3.14/release-notes/) +* Add CoreDNS node affinity preference for controller nodes ([#188](https://github.com/poseidon/terraform-render-bootstrap/pull/188)) +* Deprecate CoreOS Container Linux support (no OS [updates](https://coreos.com/os/eol/) after May 2020) + * Use a `fedora-coreos` module for Fedora CoreOS + * Use a `container-linux` module for Flatcar Linux + +### AWS + +* Fix Terraform plan error when `controller_count` exceeds AWS zones (e.g. 
5 controllers) ([#714](https://github.com/poseidon/typhoon/pull/714)) + * Regressed in v1.17.1 ([#605](https://github.com/poseidon/typhoon/pull/605)) + +### Azure + +* Update Azure subnets to set `address_prefixes` list ([#730](https://github.com/poseidon/typhoon/pull/730)) + * Fix warning that `address_prefix` is deprecated + * Require `terraform-provider-azurerm` v2.8.0+ (action required) + +### DigitalOcean + +* Promote DigitalOcean to beta on both Fedora CoreOS and Flatcar Linux + +### Fedora CoreOS + +* Fix Calico `install-cni` crashloop on Pod restarts ([#724](https://github.com/poseidon/typhoon/pull/724)) + * SELinux enforcement requires consistent file context MCS level + * Restarting a node resolved the issue as a previous workaround + +#### AWS + +* Support Fedora CoreOS [image streams](https://docs.fedoraproject.org/en-US/fedora-coreos/update-streams/) ([#727](https://github.com/poseidon/typhoon/pull/727)) + * Add `os_stream` variable to set the stream to `stable` (default), `testing`, or `next` + * Remove unused `os_image` variable + +#### Google + +* Support Fedora CoreOS [image streams](https://docs.fedoraproject.org/en-US/fedora-coreos/update-streams/) ([#723](https://github.com/poseidon/typhoon/pull/723)) + * Add `os_stream` variable to set the stream to `stable` (default), `testing`, or `next` + * Deprecate `os_image` variable. 
Manual image uploads are no longer needed + +### Flatcar Linux + +#### Azure + +* Use the Flatcar Linux Azure Marketplace image + * Restore [#664](https://github.com/poseidon/typhoon/pull/664) (reverted in [#707](https://github.com/poseidon/typhoon/pull/707)) but use Flatcar Linux new free offer (not byol) +* Change `os_image` to use a `flatcar-stable` default + +#### Google + +* Promote Flatcar Linux to beta + +### Addons + +* Update nginx-ingress from v0.30.0 to [v0.32.0](https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.32.0) + * Add support for [IngressClass](https://kubernetes.io/docs/concepts/services-networking/ingress/#ingress-class) +* Update Prometheus from v2.17.1 to v2.18.1 + * Update kube-state-metrics from v1.9.5 to [v1.9.6](https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.6) + * Update node-exporter from v1.0.0-rc.0 to [v1.0.0-rc.1](https://github.com/prometheus/node_exporter/releases/tag/v1.0.0-rc.1) +* Update Grafana from v6.7.2 to [v7.0.0](https://grafana.com/docs/grafana/latest/guides/whats-new-in-v7-0/) + +## v1.18.2 + * Kubernetes [v1.18.2](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1182) * Choose Fedora CoreOS or Flatcar Linux (**action required**) * Use a `fedora-coreos` module for Fedora CoreOS * Use a `container-linux` module for Flatcar Linux +* Change Container Linux modules' defaults from CoreOS Container Linux to [Flatcar Container Linux](https://typhoon.psdn.io/architecture/operating-systems/) ([#702](https://github.com/poseidon/typhoon/pull/702)) * CoreOS Container Linux [won't receive updates](https://coreos.com/os/eol/) after May 2020 ### Fedora CoreOS +* Fix bootstrap race condition from SELinux unshared content label ([#708](https://github.com/poseidon/typhoon/pull/708)) + #### Azure * Add support for Fedora CoreOS ([#704](https://github.com/poseidon/typhoon/pull/704)) -### Flatcar Linux / Container Linux +#### DigitalOcean + +* Fix race condition creating 
firewall allow rules ([#709](https://github.com/poseidon/typhoon/pull/709)) + +### Flatcar Linux #### AWS -* Change Container Linux `os_image` default from `coreos-stable` to `flatcar-stable` ([#702](https://github.com/poseidon/typhoon/pull/702)) +* Change `os_image` default from `coreos-stable` to `flatcar-stable` ([#702](https://github.com/poseidon/typhoon/pull/702)) #### Azure -* Change Container Linux `os_image` default from `coreos-stable` to `flatcar-stable` ([#702](https://github.com/poseidon/typhoon/pull/702)) +* Change `os_image` to be required. Recommend uploading a Flatcar Linux image (**action required**) ([#702](https://github.com/poseidon/typhoon/pull/702)) +* Disable Flatcar Linux Azure Marketplace image [support](https://github.com/poseidon/typhoon/pull/664) (**breaking**, [#707](https://github.com/poseidon/typhoon/pull/707)) + * Revert to manual uploading until marketplace issue is closed ([#703](https://github.com/poseidon/typhoon/issues/703)) #### Bare-Metal -* Container Linux users should change [os_channel](https://typhoon.psdn.io/cl/bare-metal/#required) from a CoreOS channel to a Flatcar channel +* Recommend changing [os_channel](https://typhoon.psdn.io/cl/bare-metal/#required) from `coreos-stable` to `flatcar-stable` #### Google -* Change Container Linux `os_image` to be required. Container Linux users should upload a Flatcar Linux image and set it (**action required**) ([#702](https://github.com/poseidon/typhoon/pull/702)) +* Change `os_image` to be required. Recommend uploading a Flatcar Linux image (**action required**) ([#702](https://github.com/poseidon/typhoon/pull/702)) #### DigitalOcean -* Change Container Linux `os_image` to be required. Container Linux users should upload a Flatcar Linux image and set it (**action required**) ([#702](https://github.com/poseidon/typhoon/pull/702)) +* Change `os_image` to be required. 
Recommend uploading a Flatcar Linux image (**action required**) ([#702](https://github.com/poseidon/typhoon/pull/702)) +* Fix race condition creating firewall allow rules ([#709](https://github.com/poseidon/typhoon/pull/709)) ## v1.18.1 diff --git a/README.md b/README.md index 277b3fea7..e3c01309c 100644 --- a/README.md +++ b/README.md @@ -27,9 +27,9 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster ## Features -* Kubernetes v1.18.2 (upstream) -* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking -* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) +* Kubernetes v1.18.8 (upstream) +* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking +* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing * Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/cl/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization * Ready for Ingress, Prometheus, Grafana, CSI, or other [addons](https://typhoon.psdn.io/addons/overview/) @@ -44,35 +44,25 @@ Typhoon is available for [Fedora CoreOS](https://getfedora.org/coreos/). 
| AWS | Fedora CoreOS | [aws/fedora-coreos/kubernetes](aws/fedora-coreos/kubernetes) | stable | | Azure | Fedora CoreOS | [azure/fedora-coreos/kubernetes](azure/fedora-coreos/kubernetes) | alpha | | Bare-Metal | Fedora CoreOS | [bare-metal/fedora-coreos/kubernetes](bare-metal/fedora-coreos/kubernetes) | beta | -| DigitalOcean | Fedora CoreOS | [digital-ocean/fedora-coreos/kubernetes](digital-ocean/fedora-coreos/kubernetes) | alpha | -| Google Cloud | Fedora CoreOS | [google-cloud/fedora-coreos/kubernetes](google-cloud/fedora-coreos/kubernetes) | beta | +| DigitalOcean | Fedora CoreOS | [digital-ocean/fedora-coreos/kubernetes](digital-ocean/fedora-coreos/kubernetes) | beta | +| Google Cloud | Fedora CoreOS | [google-cloud/fedora-coreos/kubernetes](google-cloud/fedora-coreos/kubernetes) | stable | -Typhoon is available for [Flatcar Container Linux](https://www.flatcar-linux.org/releases/). +Typhoon is available for [Flatcar Linux](https://www.flatcar-linux.org/releases/). | Platform | Operating System | Terraform Module | Status | |---------------|------------------|------------------|--------| | AWS | Flatcar Linux | [aws/container-linux/kubernetes](aws/container-linux/kubernetes) | stable | | Azure | Flatcar Linux | [azure/container-linux/kubernetes](azure/container-linux/kubernetes) | alpha | | Bare-Metal | Flatcar Linux | [bare-metal/container-linux/kubernetes](bare-metal/container-linux/kubernetes) | stable | -| DigitalOcean | Flatcar Linux | [digital-ocean/container-linux/kubernetes](digital-ocean/container-linux/kubernetes) | alpha | -| Google Cloud | Flatcar Linux | [google-cloud/container-linux/kubernetes](google-cloud/container-linux/kubernetes) | alpha | - -Typhoon is available for CoreOS Container Linux ([no updates](https://coreos.com/os/eol/) after May 2020). 
- -| Platform | Operating System | Terraform Module | Status | -|---------------|------------------|------------------|--------| -| AWS | Container Linux | [aws/container-linux/kubernetes](aws/container-linux/kubernetes) | stable | -| Azure | Container Linux | [azure/container-linux/kubernetes](azure/container-linux/kubernetes) | alpha | -| Bare-Metal | Container Linux | [bare-metal/container-linux/kubernetes](bare-metal/container-linux/kubernetes) | stable | -| Digital Ocean | Container Linux | [digital-ocean/container-linux/kubernetes](digital-ocean/container-linux/kubernetes) | beta | -| Google Cloud | Container Linux | [google-cloud/container-linux/kubernetes](google-cloud/container-linux/kubernetes) | stable | +| DigitalOcean | Flatcar Linux | [digital-ocean/container-linux/kubernetes](digital-ocean/container-linux/kubernetes) | beta | +| Google Cloud | Flatcar Linux | [google-cloud/container-linux/kubernetes](google-cloud/container-linux/kubernetes) | beta | ## Documentation * [Docs](https://typhoon.psdn.io) * Architecture [concepts](https://typhoon.psdn.io/architecture/concepts/) and [operating systems](https://typhoon.psdn.io/architecture/operating-systems/) * Fedora CoreOS tutorials for [AWS](docs/fedora-coreos/aws.md), [Azure](docs/fedora-coreos/azure.md), [Bare-Metal](docs/fedora-coreos/bare-metal.md), [DigitalOcean](docs/fedora-coreos/digitalocean.md), and [Google Cloud](docs/fedora-coreos/google-cloud.md) -* Flatcar Linux tutorials for [AWS](docs/cl/aws.md), [Azure](docs/cl/azure.md), [Bare-Metal](docs/cl/bare-metal.md), [DigitalOcean](docs/cl/digital-ocean.md), and [Google Cloud](docs/cl/google-cloud.md) +* Flatcar Linux tutorials for [AWS](docs/flatcar-linux/aws.md), [Azure](docs/flatcar-linux/azure.md), [Bare-Metal](docs/flatcar-linux/bare-metal.md), [DigitalOcean](docs/flatcar-linux/digitalocean.md), and [Google Cloud](docs/flatcar-linux/google-cloud.md) ## Usage @@ -80,7 +70,7 @@ Define a Kubernetes cluster by using the Terraform module for your 
chosen platfo ```tf module "yavin" { - source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.18.2" + source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.18.8" # Google Cloud cluster_name = "yavin" @@ -119,9 +109,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou $ export KUBECONFIG=/home/user/.kube/configs/yavin-config $ kubectl get nodes NAME ROLES STATUS AGE VERSION -yavin-controller-0.c.example-com.internal Ready 6m v1.18.2 -yavin-worker-jrbf.c.example-com.internal Ready 5m v1.18.2 -yavin-worker-mzdm.c.example-com.internal Ready 5m v1.18.2 +yavin-controller-0.c.example-com.internal Ready 6m v1.18.8 +yavin-worker-jrbf.c.example-com.internal Ready 5m v1.18.8 +yavin-worker-mzdm.c.example-com.internal Ready 5m v1.18.8 ``` List the pods. diff --git a/addons/grafana/dashboards-coredns.yaml b/addons/grafana/dashboards-coredns.yaml index eb4395b79..8bd507aaa 100644 --- a/addons/grafana/dashboards-coredns.yaml +++ b/addons/grafana/dashboards-coredns.yaml @@ -72,7 +72,7 @@ data: "steppedLine": false, "targets": [ { - "expr": "sum(rate(coredns_dns_request_count_total{instance=~\"$instance\"}[5m])) by (proto)", + "expr": "sum(rate(coredns_dns_requests_total{instance=~\"$instance\"}[5m])) by (proto)", "format": "time_series", "intervalFactor": 2, "legendFormat": "{{proto}}", @@ -163,7 +163,7 @@ data: "steppedLine": false, "targets": [ { - "expr": "sum(rate(coredns_dns_request_type_count_total{instance=~\"$instance\"}[5m])) by (type)", + "expr": "sum(rate(coredns_dns_requests_total{instance=~\"$instance\"}[5m])) by (type)", "format": "time_series", "intervalFactor": 2, "legendFormat": "{{type}}", @@ -254,7 +254,7 @@ data: "steppedLine": false, "targets": [ { - "expr": "sum(rate(coredns_dns_request_count_total{instance=~\"$instance\"}[5m])) by (zone)", + "expr": "sum(rate(coredns_dns_requests_total{instance=~\"$instance\"}[5m])) by (zone)", "format": 
"time_series", "intervalFactor": 2, "legendFormat": "{{zone}}", @@ -463,7 +463,7 @@ data: "steppedLine": false, "targets": [ { - "expr": "sum(rate(coredns_dns_response_rcode_count_total{instance=~\"$instance\"}[5m])) by (rcode)", + "expr": "sum(rate(coredns_dns_responses_total{instance=~\"$instance\"}[5m])) by (rcode)", "format": "time_series", "intervalFactor": 2, "legendFormat": "{{rcode}}", @@ -790,7 +790,7 @@ data: "steppedLine": false, "targets": [ { - "expr": "sum(coredns_cache_size{instance=~\"$instance\"}) by (type)", + "expr": "sum(coredns_cache_entries{instance=~\"$instance\"}) by (type)", "format": "time_series", "intervalFactor": 2, "legendFormat": "{{type}}", diff --git a/addons/grafana/deployment.yaml b/addons/grafana/deployment.yaml index 57a0a2476..eb2b9ec56 100644 --- a/addons/grafana/deployment.yaml +++ b/addons/grafana/deployment.yaml @@ -23,7 +23,7 @@ spec: spec: containers: - name: grafana - image: docker.io/grafana/grafana:6.7.2 + image: docker.io/grafana/grafana:7.1.3 env: - name: GF_PATHS_CONFIG value: "/etc/grafana/custom.ini" diff --git a/addons/nginx-ingress/aws/class.yaml b/addons/nginx-ingress/aws/class.yaml new file mode 100644 index 000000000..bbc8015c3 --- /dev/null +++ b/addons/nginx-ingress/aws/class.yaml @@ -0,0 +1,6 @@ +apiVersion: networking.k8s.io/v1beta1 +kind: IngressClass +metadata: + name: public +spec: + controller: k8s.io/ingress-nginx diff --git a/addons/nginx-ingress/aws/deployment.yaml b/addons/nginx-ingress/aws/deployment.yaml index 56b74e882..e07337d52 100644 --- a/addons/nginx-ingress/aws/deployment.yaml +++ b/addons/nginx-ingress/aws/deployment.yaml @@ -22,7 +22,7 @@ spec: spec: containers: - name: nginx-ingress-controller - image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0 + image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1 args: - /nginx-ingress-controller - --ingress-class=public diff --git a/addons/nginx-ingress/aws/rbac/cluster-role.yaml 
b/addons/nginx-ingress/aws/rbac/cluster-role.yaml index 5682d3974..90edbeb17 100644 --- a/addons/nginx-ingress/aws/rbac/cluster-role.yaml +++ b/addons/nginx-ingress/aws/rbac/cluster-role.yaml @@ -51,3 +51,12 @@ rules: - ingresses/status verbs: - update + - apiGroups: + - "networking.k8s.io" + resources: + - ingressclasses + verbs: + - get + - list + - watch + diff --git a/addons/nginx-ingress/azure/class.yaml b/addons/nginx-ingress/azure/class.yaml new file mode 100644 index 000000000..bbc8015c3 --- /dev/null +++ b/addons/nginx-ingress/azure/class.yaml @@ -0,0 +1,6 @@ +apiVersion: networking.k8s.io/v1beta1 +kind: IngressClass +metadata: + name: public +spec: + controller: k8s.io/ingress-nginx diff --git a/addons/nginx-ingress/azure/deployment.yaml b/addons/nginx-ingress/azure/deployment.yaml index 56b74e882..e07337d52 100644 --- a/addons/nginx-ingress/azure/deployment.yaml +++ b/addons/nginx-ingress/azure/deployment.yaml @@ -22,7 +22,7 @@ spec: spec: containers: - name: nginx-ingress-controller - image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0 + image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1 args: - /nginx-ingress-controller - --ingress-class=public diff --git a/addons/nginx-ingress/azure/rbac/cluster-role.yaml b/addons/nginx-ingress/azure/rbac/cluster-role.yaml index 5682d3974..90edbeb17 100644 --- a/addons/nginx-ingress/azure/rbac/cluster-role.yaml +++ b/addons/nginx-ingress/azure/rbac/cluster-role.yaml @@ -51,3 +51,12 @@ rules: - ingresses/status verbs: - update + - apiGroups: + - "networking.k8s.io" + resources: + - ingressclasses + verbs: + - get + - list + - watch + diff --git a/addons/nginx-ingress/bare-metal/class.yaml b/addons/nginx-ingress/bare-metal/class.yaml new file mode 100644 index 000000000..bbc8015c3 --- /dev/null +++ b/addons/nginx-ingress/bare-metal/class.yaml @@ -0,0 +1,6 @@ +apiVersion: networking.k8s.io/v1beta1 +kind: IngressClass +metadata: + name: public +spec: + controller: 
k8s.io/ingress-nginx diff --git a/addons/nginx-ingress/bare-metal/deployment.yaml b/addons/nginx-ingress/bare-metal/deployment.yaml index ac86bd5fe..6a49f3d81 100644 --- a/addons/nginx-ingress/bare-metal/deployment.yaml +++ b/addons/nginx-ingress/bare-metal/deployment.yaml @@ -22,7 +22,7 @@ spec: spec: containers: - name: nginx-ingress-controller - image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0 + image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1 args: - /nginx-ingress-controller - --ingress-class=public diff --git a/addons/nginx-ingress/bare-metal/rbac/cluster-role.yaml b/addons/nginx-ingress/bare-metal/rbac/cluster-role.yaml index 5682d3974..90edbeb17 100644 --- a/addons/nginx-ingress/bare-metal/rbac/cluster-role.yaml +++ b/addons/nginx-ingress/bare-metal/rbac/cluster-role.yaml @@ -51,3 +51,12 @@ rules: - ingresses/status verbs: - update + - apiGroups: + - "networking.k8s.io" + resources: + - ingressclasses + verbs: + - get + - list + - watch + diff --git a/addons/nginx-ingress/digital-ocean/class.yaml b/addons/nginx-ingress/digital-ocean/class.yaml new file mode 100644 index 000000000..bbc8015c3 --- /dev/null +++ b/addons/nginx-ingress/digital-ocean/class.yaml @@ -0,0 +1,6 @@ +apiVersion: networking.k8s.io/v1beta1 +kind: IngressClass +metadata: + name: public +spec: + controller: k8s.io/ingress-nginx diff --git a/addons/nginx-ingress/digital-ocean/daemonset.yaml b/addons/nginx-ingress/digital-ocean/daemonset.yaml index 1bf474d66..538efbbe3 100644 --- a/addons/nginx-ingress/digital-ocean/daemonset.yaml +++ b/addons/nginx-ingress/digital-ocean/daemonset.yaml @@ -22,7 +22,7 @@ spec: spec: containers: - name: nginx-ingress-controller - image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0 + image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1 args: - /nginx-ingress-controller - --ingress-class=public diff --git a/addons/nginx-ingress/digital-ocean/rbac/cluster-role.yaml 
b/addons/nginx-ingress/digital-ocean/rbac/cluster-role.yaml index 5682d3974..90edbeb17 100644 --- a/addons/nginx-ingress/digital-ocean/rbac/cluster-role.yaml +++ b/addons/nginx-ingress/digital-ocean/rbac/cluster-role.yaml @@ -51,3 +51,12 @@ rules: - ingresses/status verbs: - update + - apiGroups: + - "networking.k8s.io" + resources: + - ingressclasses + verbs: + - get + - list + - watch + diff --git a/addons/nginx-ingress/google-cloud/class.yaml b/addons/nginx-ingress/google-cloud/class.yaml new file mode 100644 index 000000000..bbc8015c3 --- /dev/null +++ b/addons/nginx-ingress/google-cloud/class.yaml @@ -0,0 +1,6 @@ +apiVersion: networking.k8s.io/v1beta1 +kind: IngressClass +metadata: + name: public +spec: + controller: k8s.io/ingress-nginx diff --git a/addons/nginx-ingress/google-cloud/deployment.yaml b/addons/nginx-ingress/google-cloud/deployment.yaml index 56b74e882..e07337d52 100644 --- a/addons/nginx-ingress/google-cloud/deployment.yaml +++ b/addons/nginx-ingress/google-cloud/deployment.yaml @@ -22,7 +22,7 @@ spec: spec: containers: - name: nginx-ingress-controller - image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller:0.30.0 + image: us.gcr.io/k8s-artifacts-prod/ingress-nginx/controller:v0.34.1 args: - /nginx-ingress-controller - --ingress-class=public diff --git a/addons/nginx-ingress/google-cloud/rbac/cluster-role.yaml b/addons/nginx-ingress/google-cloud/rbac/cluster-role.yaml index 5682d3974..90edbeb17 100644 --- a/addons/nginx-ingress/google-cloud/rbac/cluster-role.yaml +++ b/addons/nginx-ingress/google-cloud/rbac/cluster-role.yaml @@ -51,3 +51,12 @@ rules: - ingresses/status verbs: - update + - apiGroups: + - "networking.k8s.io" + resources: + - ingressclasses + verbs: + - get + - list + - watch + diff --git a/addons/prometheus/deployment.yaml b/addons/prometheus/deployment.yaml index 35569e836..a0dbc4832 100644 --- a/addons/prometheus/deployment.yaml +++ b/addons/prometheus/deployment.yaml @@ -20,7 +20,7 @@ spec: serviceAccountName: 
prometheus containers: - name: prometheus - image: quay.io/prometheus/prometheus:v2.17.1 + image: quay.io/prometheus/prometheus:v2.20.0 args: - --web.listen-address=0.0.0.0:9090 - --config.file=/etc/prometheus/prometheus.yaml diff --git a/addons/prometheus/exporters/kube-state-metrics/deployment.yaml b/addons/prometheus/exporters/kube-state-metrics/deployment.yaml index bf9274fe9..fb5389a57 100644 --- a/addons/prometheus/exporters/kube-state-metrics/deployment.yaml +++ b/addons/prometheus/exporters/kube-state-metrics/deployment.yaml @@ -24,7 +24,7 @@ spec: serviceAccountName: kube-state-metrics containers: - name: kube-state-metrics - image: quay.io/coreos/kube-state-metrics:v1.9.5 + image: quay.io/coreos/kube-state-metrics:v1.9.7 ports: - name: metrics containerPort: 8080 diff --git a/addons/prometheus/exporters/node-exporter/daemonset.yaml b/addons/prometheus/exporters/node-exporter/daemonset.yaml index da3f723af..2a30c37be 100644 --- a/addons/prometheus/exporters/node-exporter/daemonset.yaml +++ b/addons/prometheus/exporters/node-exporter/daemonset.yaml @@ -28,7 +28,7 @@ spec: hostPID: true containers: - name: node-exporter - image: quay.io/prometheus/node-exporter:v1.0.0-rc.0 + image: quay.io/prometheus/node-exporter:v1.0.1 args: - --path.procfs=/host/proc - --path.sysfs=/host/sys diff --git a/addons/prometheus/rules.yaml b/addons/prometheus/rules.yaml index 69026c1ef..359cad7e6 100644 --- a/addons/prometheus/rules.yaml +++ b/addons/prometheus/rules.yaml @@ -882,10 +882,10 @@ data: { "alert": "KubeClientCertificateExpiration", "annotations": { - "message": "A client certificate used to authenticate to the apiserver is expiring in less than 7.0 days.", + "message": "A client certificate used to authenticate to the apiserver is expiring in less than 1.0 hours.", "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclientcertificateexpiration" }, - "expr": 
"apiserver_client_certificate_expiration_seconds_count{job=\"apiserver\"} > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job=\"apiserver\"}[5m]))) < 604800\n", + "expr": "apiserver_client_certificate_expiration_seconds_count{job=\"apiserver\"} > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job=\"apiserver\"}[5m]))) < 3600\n", "labels": { "severity": "warning" } @@ -893,10 +893,10 @@ data: { "alert": "KubeClientCertificateExpiration", "annotations": { - "message": "A client certificate used to authenticate to the apiserver is expiring in less than 24.0 hours.", + "message": "A client certificate used to authenticate to the apiserver is expiring in less than 0.1 hours.", "runbook_url": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubeclientcertificateexpiration" }, - "expr": "apiserver_client_certificate_expiration_seconds_count{job=\"apiserver\"} > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job=\"apiserver\"}[5m]))) < 86400\n", + "expr": "apiserver_client_certificate_expiration_seconds_count{job=\"apiserver\"} > 0 and on(job) histogram_quantile(0.01, sum by (job, le) (rate(apiserver_client_certificate_expiration_seconds_bucket{job=\"apiserver\"}[5m]))) < 300\n", "labels": { "severity": "critical" } diff --git a/aws/container-linux/kubernetes/README.md b/aws/container-linux/kubernetes/README.md index b45143847..9c8fd0142 100644 --- a/aws/container-linux/kubernetes/README.md +++ b/aws/container-linux/kubernetes/README.md @@ -11,11 +11,11 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster ## Features -* Kubernetes v1.18.2 (upstream) -* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking +* Kubernetes 
v1.18.8 (upstream) +* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization -* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/) +* Ready for Ingress, Prometheus, Grafana, CSI, and other optional [addons](https://typhoon.psdn.io/addons/overview/) ## Docs diff --git a/aws/container-linux/kubernetes/bootstrap.tf b/aws/container-linux/kubernetes/bootstrap.tf index ed4ac9121..5e39beafa 100644 --- a/aws/container-linux/kubernetes/bootstrap.tf +++ b/aws/container-linux/kubernetes/bootstrap.tf @@ -1,6 +1,6 @@ # Kubernetes assets (kubeconfig, manifests) module "bootstrap" { - source = "git::https://github.com/takescoop/terraform-render-bootstrap.git?ref=a7ad7d74586fd698be22b75b4cc314c2d3dc4bcd" + source = "git::https://github.com/takescoop/terraform-render-bootstrap.git?ref=d3132edba9f84ad210376f0632d435c08d6ce3e4" cluster_name = var.cluster_name api_servers = concat(list(format("%s.%s", var.cluster_name, var.dns_zone)), var.apiserver_aliases) @@ -17,5 +17,3 @@ module "bootstrap" { # scoop apiserver_arguments = var.apiserver_arguments } - - diff --git a/aws/container-linux/kubernetes/cl/controller.yaml b/aws/container-linux/kubernetes/cl/controller.yaml index abf5f0e65..351915478 100644 --- a/aws/container-linux/kubernetes/cl/controller.yaml +++ b/aws/container-linux/kubernetes/cl/controller.yaml @@ -52,6 +52,7 @@ systemd: Description=Kubelet via Hyperkube Wants=rpc-statd.service 
[Service] + Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.8 Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver} ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests @@ -91,7 +92,7 @@ systemd: --mount volume=var-log,target=/var/log \ --volume opt-cni-bin,kind=host,source=/opt/cni/bin \ --mount volume=opt-cni-bin,target=/opt/cni/bin \ - docker://quay.io/poseidon/kubelet:v1.18.2 -- \ + $${KUBELET_IMAGE} -- \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ @@ -135,7 +136,7 @@ systemd: --volume script,kind=host,source=/opt/bootstrap/apply \ --mount volume=script,target=/apply \ --insecure-options=image \ - docker://quay.io/poseidon/kubelet:v1.18.2 \ + docker://quay.io/poseidon/kubelet:v1.18.8 \ --net=host \ --dns=host \ --exec=/apply diff --git a/aws/container-linux/kubernetes/controllers.tf b/aws/container-linux/kubernetes/controllers.tf index fdf20fa63..1470d2bb0 100644 --- a/aws/container-linux/kubernetes/controllers.tf +++ b/aws/container-linux/kubernetes/controllers.tf @@ -72,10 +72,10 @@ resource "aws_s3_bucket_object" "controller-ignitions" { # Controller Ignition configs data "ct_config" "controller-ignitions" { - count = var.controller_count - content = data.template_file.controller-configs.*.rendered[count.index] - pretty_print = false - snippets = var.controller_snippets + count = var.controller_count + content = data.template_file.controller-configs.*.rendered[count.index] + strict = true + snippets = var.controller_snippets } # Controller Container Linux configs @@ -155,4 +155,3 @@ resource "aws_iam_instance_profile" "controller" { name = "${var.cluster_name}-controller" role = aws_iam_role.controller.id } - diff --git a/aws/container-linux/kubernetes/network.tf b/aws/container-linux/kubernetes/network.tf index 5818b471f..fec50d288 100644 --- a/aws/container-linux/kubernetes/network.tf +++ b/aws/container-linux/kubernetes/network.tf @@ -139,4 
+139,3 @@ resource "aws_nat_gateway" "nat" { resource "aws_egress_only_internet_gateway" "egress_igw" { vpc_id = aws_vpc.network.id } - diff --git a/aws/container-linux/kubernetes/outputs.tf b/aws/container-linux/kubernetes/outputs.tf index c2cc22a75..3e605f974 100644 --- a/aws/container-linux/kubernetes/outputs.tf +++ b/aws/container-linux/kubernetes/outputs.tf @@ -127,3 +127,4 @@ output "worker_autoscaling_group" { value = module.workers.autoscaling_group description = "Name of the workers autoscaling group" } + diff --git a/aws/container-linux/kubernetes/security.tf b/aws/container-linux/kubernetes/security.tf index 422754d86..21de9f18a 100644 --- a/aws/container-linux/kubernetes/security.tf +++ b/aws/container-linux/kubernetes/security.tf @@ -13,6 +13,30 @@ resource "aws_security_group" "controller" { } } +resource "aws_security_group_rule" "controller-icmp" { + count = var.networking == "cilium" ? 1 : 0 + + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "icmp" + from_port = 8 + to_port = 0 + source_security_group_id = aws_security_group.worker.id +} + +resource "aws_security_group_rule" "controller-icmp-self" { + count = var.networking == "cilium" ? 1 : 0 + + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "icmp" + from_port = 8 + to_port = 0 + self = true +} + resource "aws_security_group_rule" "controller-ssh" { security_group_id = aws_security_group.controller.id @@ -54,27 +78,64 @@ resource "aws_security_group_rule" "controller-etcd-metrics-self" { self = true } -resource "aws_security_group_rule" "controller-apiserver" { +resource "aws_security_group_rule" "controller-cilium-health" { + count = var.networking == "cilium" ? 
1 : 0 + security_group_id = aws_security_group.controller.id - type = "ingress" - protocol = "tcp" - from_port = 6443 - to_port = 6443 - cidr_blocks = ["0.0.0.0/0"] + type = "ingress" + protocol = "tcp" + from_port = 4240 + to_port = 4240 + source_security_group_id = aws_security_group.worker.id } -# Allow Prometheus to scrape kube-proxy -resource "aws_security_group_rule" "kube-proxy-metrics" { +resource "aws_security_group_rule" "controller-cilium-health-self" { + count = var.networking == "cilium" ? 1 : 0 + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "tcp" + from_port = 4240 + to_port = 4240 + self = true +} + +# IANA VXLAN default +resource "aws_security_group_rule" "controller-vxlan" { + count = var.networking == "flannel" ? 1 : 0 + security_group_id = aws_security_group.controller.id type = "ingress" - protocol = "tcp" - from_port = 10249 - to_port = 10249 + protocol = "udp" + from_port = 4789 + to_port = 4789 source_security_group_id = aws_security_group.worker.id } +resource "aws_security_group_rule" "controller-vxlan-self" { + count = var.networking == "flannel" ? 1 : 0 + + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "udp" + from_port = 4789 + to_port = 4789 + self = true +} + +resource "aws_security_group_rule" "controller-apiserver" { + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "tcp" + from_port = 6443 + to_port = 6443 + cidr_blocks = ["0.0.0.0/0"] +} + # Allow Prometheus to scrape kube-scheduler resource "aws_security_group_rule" "controller-scheduler-metrics" { security_group_id = aws_security_group.controller.id @@ -97,20 +158,21 @@ resource "aws_security_group_rule" "controller-manager-metrics" { source_security_group_id = aws_security_group.worker.id } -resource "aws_security_group_rule" "controller-vxlan" { - count = var.networking == "flannel" ? 
1 : 0 +# Linux VXLAN default +resource "aws_security_group_rule" "controller-linux-vxlan" { + count = var.networking == "cilium" ? 1 : 0 security_group_id = aws_security_group.controller.id type = "ingress" protocol = "udp" - from_port = 4789 - to_port = 4789 + from_port = 8472 + to_port = 8472 source_security_group_id = aws_security_group.worker.id } -resource "aws_security_group_rule" "controller-vxlan-self" { - count = var.networking == "flannel" ? 1 : 0 +resource "aws_security_group_rule" "controller-linux-vxlan-self" { + count = var.networking == "cilium" ? 1 : 0 security_group_id = aws_security_group.controller.id @@ -132,6 +194,17 @@ resource "aws_security_group_rule" "controller-node-exporter" { source_security_group_id = aws_security_group.worker.id } +# Allow Prometheus to scrape kube-proxy +resource "aws_security_group_rule" "kube-proxy-metrics" { + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "tcp" + from_port = 10249 + to_port = 10249 + source_security_group_id = aws_security_group.worker.id +} + # Allow apiserver to access kubelets for exec, log, port-forward resource "aws_security_group_rule" "controller-kubelet" { security_group_id = aws_security_group.controller.id @@ -153,6 +226,28 @@ resource "aws_security_group_rule" "controller-kubelet-self" { self = true } +# Allow Prometheus to scrape kube-scheduler +resource "aws_security_group_rule" "controller-scheduler-metrics" { + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "tcp" + from_port = 10251 + to_port = 10251 + source_security_group_id = aws_security_group.worker.id +} + +# Allow Prometheus to scrape kube-controller-manager +resource "aws_security_group_rule" "controller-manager-metrics" { + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "tcp" + from_port = 10252 + to_port = 10252 + source_security_group_id = aws_security_group.worker.id +} + resource 
"aws_security_group_rule" "controller-bgp" { security_group_id = aws_security_group.controller.id @@ -237,6 +332,30 @@ resource "aws_security_group" "worker" { } } +resource "aws_security_group_rule" "worker-icmp" { + count = var.networking == "cilium" ? 1 : 0 + + security_group_id = aws_security_group.worker.id + + type = "ingress" + protocol = "icmp" + from_port = 8 + to_port = 0 + source_security_group_id = aws_security_group.controller.id +} + +resource "aws_security_group_rule" "worker-icmp-self" { + count = var.networking == "cilium" ? 1 : 0 + + security_group_id = aws_security_group.worker.id + + type = "ingress" + protocol = "icmp" + from_port = 8 + to_port = 0 + self = true +} + resource "aws_security_group_rule" "worker-ssh-bastion" { security_group_id = aws_security_group.worker.id @@ -277,6 +396,31 @@ resource "aws_security_group_rule" "worker-https" { cidr_blocks = ["0.0.0.0/0"] } +resource "aws_security_group_rule" "worker-cilium-health" { + count = var.networking == "cilium" ? 1 : 0 + + security_group_id = aws_security_group.worker.id + + type = "ingress" + protocol = "tcp" + from_port = 4240 + to_port = 4240 + source_security_group_id = aws_security_group.controller.id +} + +resource "aws_security_group_rule" "worker-cilium-health-self" { + count = var.networking == "cilium" ? 1 : 0 + + security_group_id = aws_security_group.worker.id + + type = "ingress" + protocol = "tcp" + from_port = 4240 + to_port = 4240 + self = true +} + +# IANA VXLAN default resource "aws_security_group_rule" "worker-vxlan" { count = var.networking == "flannel" ? 1 : 0 @@ -301,6 +445,31 @@ resource "aws_security_group_rule" "worker-vxlan-self" { self = true } +# Linux VXLAN default +resource "aws_security_group_rule" "worker-linux-vxlan" { + count = var.networking == "cilium" ? 
1 : 0 + + security_group_id = aws_security_group.worker.id + + type = "ingress" + protocol = "udp" + from_port = 8472 + to_port = 8472 + source_security_group_id = aws_security_group.controller.id +} + +resource "aws_security_group_rule" "worker-linux-vxlan-self" { + count = var.networking == "cilium" ? 1 : 0 + + security_group_id = aws_security_group.worker.id + + type = "ingress" + protocol = "udp" + from_port = 4789 + to_port = 4789 + self = true +} + # Allow Prometheus to scrape node-exporter daemonset resource "aws_security_group_rule" "worker-node-exporter" { security_group_id = aws_security_group.worker.id diff --git a/aws/container-linux/kubernetes/versions.tf b/aws/container-linux/kubernetes/versions.tf index f7a10ff15..75b52097f 100644 --- a/aws/container-linux/kubernetes/versions.tf +++ b/aws/container-linux/kubernetes/versions.tf @@ -1,11 +1,15 @@ # Terraform version and plugin versions terraform { - required_version = "~> 0.12.6" + required_version = ">= 0.12.26, < 0.14.0" required_providers { - aws = "~> 2.23" - ct = "~> 0.3" + aws = ">= 2.23, <= 4.0" template = "~> 2.1" null = "~> 2.1" + + ct = { + source = "poseidon/ct" + version = "~> 0.6.1" + } } } diff --git a/aws/container-linux/kubernetes/workers/cl/worker.yaml b/aws/container-linux/kubernetes/workers/cl/worker.yaml index 5d329ac7b..6bdb9a34b 100644 --- a/aws/container-linux/kubernetes/workers/cl/worker.yaml +++ b/aws/container-linux/kubernetes/workers/cl/worker.yaml @@ -2,11 +2,11 @@ systemd: units: - name: docker.service - enable: true + enabled: true - name: locksmithd.service mask: true - name: wait-for-dns.service - enable: true + enabled: true contents: | [Unit] Description=Wait for DNS entries @@ -19,12 +19,13 @@ systemd: [Install] RequiredBy=kubelet.service - name: kubelet.service - enable: true + enabled: true contents: | [Unit] - Description=Kubelet via Hyperkube + Description=Kubelet Wants=rpc-statd.service [Service] + 
Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.8 Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver} ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests @@ -64,20 +65,19 @@ systemd: --mount volume=var-log,target=/var/log \ --volume opt-cni-bin,kind=host,source=/opt/cni/bin \ --mount volume=opt-cni-bin,target=/opt/cni/bin \ - docker://quay.io/poseidon/kubelet:v1.18.2 -- \ + $${KUBELET_IMAGE} -- \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ --cgroup-driver=$${KUBELET_CGROUP_DRIVER} \ --client-ca-file=/etc/kubernetes/ca.crt \ --cloud-provider=aws \ --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ - --kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ --node-labels=node.kubernetes.io/node \ %{~ for label in split(",", node_labels) ~} @@ -85,6 +85,7 @@ systemd: %{~ endfor ~} --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ + --rotate-certificates \ --volume-plugin-dir=/var/lib/kubelet/volumeplugins ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid Restart=always @@ -92,7 +93,7 @@ systemd: [Install] WantedBy=multi-user.target - name: delete-node.service - enable: true + enabled: true contents: | [Unit] Description=Waiting to delete Kubernetes node on shutdown @@ -113,6 +114,7 @@ storage: ${kubeconfig} - path: /etc/sysctl.d/max-user-watches.conf filesystem: root + mode: 0644 contents: inline: | fs.inotify.max_user_watches=16184 @@ -128,7 +130,7 @@ storage: --volume config,kind=host,source=/etc/kubernetes \ --mount volume=config,target=/etc/kubernetes \ --insecure-options=image \ - docker://quay.io/poseidon/kubelet:v1.18.2 \ + 
docker://quay.io/poseidon/kubelet:v1.18.8 \ --net=host \ --dns=host \ --exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname) diff --git a/aws/container-linux/kubernetes/workers/outputs.tf b/aws/container-linux/kubernetes/workers/outputs.tf index ea9141c33..e9e2b65ae 100644 --- a/aws/container-linux/kubernetes/workers/outputs.tf +++ b/aws/container-linux/kubernetes/workers/outputs.tf @@ -17,3 +17,4 @@ output "autoscaling_group" { description = "Name of the workers autoscaling group" value = aws_autoscaling_group.workers.name } + diff --git a/aws/container-linux/kubernetes/workers/versions.tf b/aws/container-linux/kubernetes/workers/versions.tf index ac97c6ac8..564a6ff38 100644 --- a/aws/container-linux/kubernetes/workers/versions.tf +++ b/aws/container-linux/kubernetes/workers/versions.tf @@ -1,4 +1,14 @@ +# Terraform version and plugin versions terraform { - required_version = ">= 0.12" + required_version = ">= 0.12.26, < 0.14.0" + required_providers { + aws = ">= 2.23, <= 4.0" + template = "~> 2.1" + + ct = { + source = "poseidon/ct" + version = "~> 0.6.1" + } + } } diff --git a/aws/container-linux/kubernetes/workers/workers.tf b/aws/container-linux/kubernetes/workers/workers.tf index fbbff2e2d..33df7f84c 100644 --- a/aws/container-linux/kubernetes/workers/workers.tf +++ b/aws/container-linux/kubernetes/workers/workers.tf @@ -98,9 +98,9 @@ resource "aws_s3_bucket_object" "worker-ignition" { # Worker Ignition config data "ct_config" "worker-ignition" { - content = data.template_file.worker-config.rendered - pretty_print = false - snippets = var.snippets + content = data.template_file.worker-config.rendered + strict = true + snippets = var.snippets } # Worker Container Linux config diff --git a/aws/fedora-coreos/kubernetes/README.md b/aws/fedora-coreos/kubernetes/README.md index 3acb9ecd8..9b3a8e457 100644 --- a/aws/fedora-coreos/kubernetes/README.md +++ b/aws/fedora-coreos/kubernetes/README.md @@ -11,13 +11,12 @@ Typhoon 
distributes upstream Kubernetes, architectural conventions, and cluster ## Features -* Kubernetes v1.18.2 (upstream) -* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking -* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) +* Kubernetes v1.18.8 (upstream) +* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking +* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing * Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization -* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/) +* Ready for Ingress, Prometheus, Grafana, CSI, and other optional [addons](https://typhoon.psdn.io/addons/overview/) ## Docs Please see the [official docs](https://typhoon.psdn.io) and the AWS [tutorial](https://typhoon.psdn.io/fedora-coreos/aws/). 
- diff --git a/aws/fedora-coreos/kubernetes/ami.tf b/aws/fedora-coreos/kubernetes/ami.tf index e32ce159f..a7ab184bd 100644 --- a/aws/fedora-coreos/kubernetes/ami.tf +++ b/aws/fedora-coreos/kubernetes/ami.tf @@ -13,16 +13,8 @@ data "aws_ami" "fedora-coreos" { values = ["hvm"] } - filter { - name = "name" - values = ["fedora-coreos-31.*.*.*-hvm"] - } - filter { name = "description" - values = ["Fedora CoreOS stable*"] + values = ["Fedora CoreOS ${var.os_stream} *"] } - - # try to filter out dev images (AWS filters can't) - name_regex = "^fedora-coreos-31.[0-9]*.[0-9]*.[0-9]*-hvm*" } diff --git a/aws/fedora-coreos/kubernetes/bastion.tf b/aws/fedora-coreos/kubernetes/bastion.tf new file mode 100644 index 000000000..6f8874474 --- /dev/null +++ b/aws/fedora-coreos/kubernetes/bastion.tf @@ -0,0 +1,243 @@ +locals { + user_data_bastion = { + ignition = { + version = "3.0.0" + config = { + merge = [ + { + source = var.base_ignition_config_path + }, + { + source = "s3://${aws_s3_bucket.ignition_configs.id}/${aws_s3_bucket_object.bastion_config.id}" + } + ] + } + } + } +} + +resource "aws_autoscaling_group" "bastion" { + name = "${var.cluster_name}-bastion ${aws_launch_configuration.bastion.name}" + + # count + desired_capacity = var.bastion_count + min_size = var.bastion_count + max_size = var.bastion_count + default_cooldown = 30 + health_check_grace_period = 30 + + # network + vpc_zone_identifier = aws_subnet.private.*.id + + # template + launch_configuration = aws_launch_configuration.bastion.name + + # target groups to which instances should be added + target_group_arns = [ + aws_lb_target_group.bastion.id + ] + + min_elb_capacity = 1 + + lifecycle { + # override the default destroy and replace update behavior + create_before_destroy = true + ignore_changes = [launch_configuration] + } + + tags = [{ + key = "Name" + value = "${var.cluster_name}-bastion" + propagate_at_launch = true + }] +} + +resource "aws_launch_configuration" "bastion" { + image_id = coalesce(var.ami, 
data.aws_ami.fedora-coreos.image_id) + instance_type = var.bastion_type + + user_data = jsonencode(local.user_data_bastion) + + # network + security_groups = [ + aws_security_group.bastion_external.id + ] + + # iam + iam_instance_profile = aws_iam_instance_profile.bastion.name + + lifecycle { + // Override the default destroy and replace update behavior + create_before_destroy = true + } +} + +resource "aws_s3_bucket_object" "bastion_config" { + bucket = aws_s3_bucket.ignition_configs.id + key = "bastion.json" + content = data.ct_config.bastion_ign.rendered +} + +data "ct_config" "bastion_ign" { + content = file("${path.module}/fcc/bastion.yaml") + strict = true + snippets = var.bastion_snippets +} + +resource "aws_security_group" "bastion_external" { + name_prefix = "${var.cluster_name}-bastion-external-" + description = "Allows access to the bastion from the internet" + + vpc_id = aws_vpc.network.id + + tags = { + Name = "${var.cluster_name}-bastion-external" + } + + ingress { + protocol = "tcp" + from_port = 22 + to_port = 22 + cidr_blocks = ["0.0.0.0/0"] + } + + egress { + protocol = -1 + from_port = 0 + to_port = 0 + cidr_blocks = ["0.0.0.0/0"] + } + + lifecycle { + create_before_destroy = true + } +} + +resource "aws_security_group" "bastion_internal" { + name_prefix = "${var.cluster_name}-bastion-internal-" + description = "Allows access to a host from the bastion" + + vpc_id = aws_vpc.network.id + + tags = { + Name = "${var.cluster_name}-bastion-internal" + } + + ingress { + protocol = "tcp" + from_port = 22 + to_port = 22 + security_groups = [aws_security_group.bastion_external.id] + } + + lifecycle { + create_before_destroy = true + } +} + +resource "aws_lb" "bastion" { + name = "${var.cluster_name}-bastion" + load_balancer_type = "network" + + subnets = aws_subnet.public.*.id +} + + +resource "aws_lb_listener" "bastion" { + load_balancer_arn = aws_lb.bastion.arn + protocol = "TCP" + port = "22" + + default_action { + type = "forward" + target_group_arn = 
aws_lb_target_group.bastion.arn + } +} + +resource "aws_lb_target_group" "bastion" { + name = "${var.cluster_name}-bastion" + vpc_id = aws_vpc.network.id + target_type = "instance" + + protocol = "TCP" + port = 22 + + health_check { + protocol = "TCP" + port = 22 + + healthy_threshold = 3 + unhealthy_threshold = 3 + + interval = 10 + } +} + +resource "aws_route53_record" "bastion" { + depends_on = [aws_autoscaling_group.bastion] + + zone_id = var.dns_zone_id + + name = format("bastion.%s.%s.", var.cluster_name, var.dns_zone) + type = "A" + + alias { + name = aws_lb.bastion.dns_name + zone_id = aws_lb.bastion.zone_id + evaluate_target_health = false + } +} + +resource "aws_iam_role_policy" "bastion_read_base_ignition_config" { + name = "read-base-ignition-config" + role = aws_iam_role.bastion.id + policy = var.base_ignition_config_read_policy +} + +resource "aws_iam_role_policy" "bastion_read_ignition_config" { + name = "read-ignition-configs" + role = aws_iam_role.bastion.id + policy = data.aws_iam_policy_document.bastion_read_ignition_config.json +} + +data "aws_iam_policy_document" "bastion_read_ignition_config" { + statement { + effect = "Allow" + actions = ["s3:GetObject"] + resources = ["${aws_s3_bucket.ignition_configs.arn}/${aws_s3_bucket_object.bastion_config.id}"] + } +} + +resource "aws_iam_role_policy" "bastion_instance_read_ec2" { + name = "instance-read-ec2" + role = aws_iam_role.bastion.id + policy = data.aws_iam_policy_document.bastion_instance_read_ec2.json +} + +data "aws_iam_policy_document" "bastion_instance_read_ec2" { + statement { + actions = ["ec2:Describe*"] + resources = ["*"] + } +} + +data "aws_iam_policy_document" "bastion_assume_role" { + statement { + actions = ["sts:AssumeRole"] + + principals { + type = "Service" + identifiers = ["ec2.amazonaws.com"] + } + } +} + +resource "aws_iam_role" "bastion" { + name = "${var.cluster_name}-bastion" + assume_role_policy = data.aws_iam_policy_document.bastion_assume_role.json +} + +resource 
"aws_iam_instance_profile" "bastion" { + name = "${var.cluster_name}-bastion" + role = aws_iam_role.bastion.id +} diff --git a/aws/fedora-coreos/kubernetes/bootstrap.tf b/aws/fedora-coreos/kubernetes/bootstrap.tf index d35fd11ef..f1d42504c 100644 --- a/aws/fedora-coreos/kubernetes/bootstrap.tf +++ b/aws/fedora-coreos/kubernetes/bootstrap.tf @@ -1,9 +1,9 @@ # Kubernetes assets (kubeconfig, manifests) module "bootstrap" { - source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc" + source = "git::https://github.com/takescoop/terraform-render-bootstrap.git?ref=d3132edba9f84ad210376f0632d435c08d6ce3e4" cluster_name = var.cluster_name - api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)] + api_servers = concat(list(format("%s.%s", var.cluster_name, var.dns_zone)), var.apiserver_aliases) etcd_servers = aws_route53_record.etcds.*.fqdn asset_dir = var.asset_dir networking = var.networking @@ -15,5 +15,8 @@ module "bootstrap" { enable_aggregation = var.enable_aggregation trusted_certs_dir = "/etc/pki/tls/certs" + + # scoop + apiserver_arguments = var.apiserver_arguments } diff --git a/aws/fedora-coreos/kubernetes/controllers.tf b/aws/fedora-coreos/kubernetes/controllers.tf index e4bd9ddba..ae1330601 100644 --- a/aws/fedora-coreos/kubernetes/controllers.tf +++ b/aws/fedora-coreos/kubernetes/controllers.tf @@ -17,14 +17,33 @@ resource "aws_route53_record" "etcds" { resource "aws_instance" "controllers" { count = var.controller_count - tags = { - Name = "${var.cluster_name}-controller-${count.index}" - } + tags = map( + "Name", "${var.cluster_name}-controller-${count.index}", + "kubernetes.io/cluster/${var.cluster_name}", "owned" + ) instance_type = var.controller_type - ami = data.aws_ami.fedora-coreos.image_id - user_data = data.ct_config.controller-ignitions.*.rendered[count.index] + ami = coalesce(var.ami, data.aws_ami.fedora-coreos.image_id) + iam_instance_profile = 
aws_iam_instance_profile.controller.name + + user_data = < /dev/null; do sleep 1; done' [Install] RequiredBy=kubelet.service @@ -51,9 +52,10 @@ systemd: enabled: true contents: | [Unit] - Description=Kubelet via Hyperkube (System Container) + Description=Kubelet (System Container) Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.8 ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests ExecStartPre=/bin/mkdir -p /opt/cni/bin @@ -72,34 +74,35 @@ systemd: --volume /run:/run \ --volume /sys/fs/cgroup:/sys/fs/cgroup:ro \ --volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \ - --volume /etc/pki/tls/certs:/usr/share/ca-certificates:ro \ + --volume /etc/pki/tls/certs:/etc/pki/tls/certs:ro \ + --volume /etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro \ --volume /var/lib/calico:/var/lib/calico:ro \ --volume /var/lib/docker:/var/lib/docker \ --volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \ --volume /var/log:/var/log \ --volume /var/run/lock:/var/run/lock:z \ --volume /opt/cni/bin:/opt/cni/bin:z \ - quay.io/poseidon/kubelet:v1.18.2 \ + $${KUBELET_IMAGE} \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ --cgroup-driver=systemd \ --cgroups-per-qos=true \ --enforce-node-allocatable=pods \ --client-ca-file=/etc/kubernetes/ca.crt \ + --cloud-provider=aws \ --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ - --kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ - --node-labels=node.kubernetes.io/master \ --node-labels=node.kubernetes.io/controller="true" \ --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ - 
--register-with-taints=node-role.kubernetes.io/master=:NoSchedule \ + --register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \ + --rotate-certificates \ --volume-plugin-dir=/var/lib/kubelet/volumeplugins ExecStop=-/usr/bin/podman stop kubelet Delegate=yes @@ -116,17 +119,20 @@ systemd: Type=oneshot RemainAfterExit=true WorkingDirectory=/opt/bootstrap + ExecStartPre=-/usr/bin/podman rm bootstrap ExecStart=/usr/bin/podman run --name bootstrap \ --network host \ - --volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,Z \ + --volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,z \ --volume /opt/bootstrap/assets:/assets:ro,Z \ --volume /opt/bootstrap/apply:/apply:ro,Z \ --entrypoint=/apply \ - quay.io/poseidon/kubelet:v1.18.2 + quay.io/poseidon/kubelet:v1.18.8 ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done ExecStartPost=-/usr/bin/podman stop bootstrap storage: directories: + - path: /var/lib/etcd + mode: 0700 - path: /etc/kubernetes - path: /opt/bootstrap files: @@ -150,12 +156,14 @@ storage: chmod -R 500 /etc/ssl/etcd mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/ mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/ - sudo mkdir -p /etc/kubernetes/manifests - sudo mv static-manifests/* /etc/kubernetes/manifests/ - sudo mkdir -p /opt/bootstrap/assets - sudo mv manifests /opt/bootstrap/assets/manifests - sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/ + mkdir -p /etc/kubernetes/manifests + mv static-manifests/* /etc/kubernetes/manifests/ + mkdir -p /opt/bootstrap/assets + rm -rf /opt/bootstrap/assets/manifests/manifests + mv manifests /opt/bootstrap/assets/manifests + mv manifests-networking/* /opt/bootstrap/assets/manifests/ rm -rf assets auth static-manifests tls manifests-networking + chcon -R -u system_u -t container_file_t /etc/kubernetes/bootstrap-secrets - path: /opt/bootstrap/apply mode: 0544 contents: @@ -174,6 +182,18 @@ storage: contents: inline: | fs.inotify.max_user_watches=16184 + - 
path: /etc/sysctl.d/reverse-path-filter.conf + contents: + inline: | + net.ipv4.conf.default.rp_filter=0 + net.ipv4.conf.*.rp_filter=0 + - path: /etc/systemd/network/50-flannel.link + contents: + inline: | + [Match] + OriginalName=flannel* + [Link] + MACAddressPolicy=none - path: /etc/systemd/system.conf.d/accounting.conf contents: inline: | @@ -204,8 +224,3 @@ storage: ETCD_PEER_CERT_FILE=/etc/ssl/certs/etcd/peer.crt ETCD_PEER_KEY_FILE=/etc/ssl/certs/etcd/peer.key ETCD_PEER_CLIENT_CERT_AUTH=true -passwd: - users: - - name: core - ssh_authorized_keys: - - ${ssh_authorized_key} diff --git a/aws/fedora-coreos/kubernetes/ignition-configs-bucket.tf b/aws/fedora-coreos/kubernetes/ignition-configs-bucket.tf new file mode 100644 index 000000000..794fb7471 --- /dev/null +++ b/aws/fedora-coreos/kubernetes/ignition-configs-bucket.tf @@ -0,0 +1,15 @@ +resource "aws_s3_bucket" "ignition_configs" { + bucket = "${var.cluster_name}-ignition-configs" + + server_side_encryption_configuration { + rule { + apply_server_side_encryption_by_default { + sse_algorithm = "aws:kms" + } + } + } + + versioning { + enabled = true + } +} diff --git a/aws/fedora-coreos/kubernetes/network.tf b/aws/fedora-coreos/kubernetes/network.tf index bdb4bff1e..fec50d288 100644 --- a/aws/fedora-coreos/kubernetes/network.tf +++ b/aws/fedora-coreos/kubernetes/network.tf @@ -1,3 +1,7 @@ +locals { + az_count = length(data.aws_availability_zones.all.names) +} + data "aws_availability_zones" "all" { } @@ -22,22 +26,20 @@ resource "aws_internet_gateway" "gateway" { } } -resource "aws_route_table" "default" { +resource "aws_route_table" "public" { vpc_id = aws_vpc.network.id - tags = { - "Name" = var.cluster_name - } + tags = map("Name", "${var.cluster_name}-public") } -resource "aws_route" "egress-ipv4" { - route_table_id = aws_route_table.default.id +resource "aws_route" "internet_gateway" { + route_table_id = aws_route_table.public.id destination_cidr_block = "0.0.0.0/0" gateway_id = 
aws_internet_gateway.gateway.id } -resource "aws_route" "egress-ipv6" { - route_table_id = aws_route_table.default.id +resource "aws_route" "ipv6_internet_gateway" { + route_table_id = aws_route_table.public.id destination_ipv6_cidr_block = "::/0" gateway_id = aws_internet_gateway.gateway.id } @@ -45,7 +47,7 @@ resource "aws_route" "egress-ipv6" { # Subnets (one per availability zone) resource "aws_subnet" "public" { - count = length(data.aws_availability_zones.all.names) + count = local.az_count vpc_id = aws_vpc.network.id availability_zone = data.aws_availability_zones.all.names[count.index] @@ -55,15 +57,85 @@ resource "aws_subnet" "public" { map_public_ip_on_launch = true assign_ipv6_address_on_creation = true - tags = { - "Name" = "${var.cluster_name}-public-${count.index}" - } + tags = merge( + var.subnet_tags_public, + map("Name", "${var.cluster_name}-public-${count.index}") + ) } resource "aws_route_table_association" "public" { - count = length(data.aws_availability_zones.all.names) + count = local.az_count - route_table_id = aws_route_table.default.id + route_table_id = aws_route_table.public.id subnet_id = aws_subnet.public.*.id[count.index] } +resource "aws_subnet" "private" { + count = local.az_count + + vpc_id = aws_vpc.network.id + availability_zone = data.aws_availability_zones.all.names[count.index] + + cidr_block = cidrsubnet(var.host_cidr, 4, count.index + 8) + ipv6_cidr_block = cidrsubnet(aws_vpc.network.ipv6_cidr_block, 8, count.index + 8) + assign_ipv6_address_on_creation = true + + tags = merge( + var.subnet_tags_private, + map("Name", "${var.cluster_name}-private-${count.index}") + ) +} + +resource "aws_route_table" "private" { + count = local.az_count + + vpc_id = aws_vpc.network.id + tags = map("Name", "${var.cluster_name}-private") +} + +resource "aws_route" "nat_gateway" { + count = local.az_count + + route_table_id = aws_route_table.private.*.id[count.index] + + destination_cidr_block = "0.0.0.0/0" + nat_gateway_id = 
aws_nat_gateway.nat.*.id[count.index] +} + +resource "aws_route" "egress_only_gateway" { + count = local.az_count + + route_table_id = aws_route_table.private.*.id[count.index] + + destination_ipv6_cidr_block = "::/0" + egress_only_gateway_id = aws_egress_only_internet_gateway.egress_igw.id +} + +resource "aws_route_table_association" "private" { + count = local.az_count + + route_table_id = aws_route_table.private.*.id[count.index] + subnet_id = aws_subnet.private.*.id[count.index] +} + + +resource "aws_eip" "nat" { + count = local.az_count + + vpc = true +} + +resource "aws_nat_gateway" "nat" { + depends_on = [ + aws_internet_gateway.gateway, + ] + + count = local.az_count + + allocation_id = aws_eip.nat.*.id[count.index] + subnet_id = aws_subnet.public.*.id[count.index] +} + +resource "aws_egress_only_internet_gateway" "egress_igw" { + vpc_id = aws_vpc.network.id +} diff --git a/aws/fedora-coreos/kubernetes/nlb.tf b/aws/fedora-coreos/kubernetes/nlb.tf index d26a03c50..57f91823c 100644 --- a/aws/fedora-coreos/kubernetes/nlb.tf +++ b/aws/fedora-coreos/kubernetes/nlb.tf @@ -19,7 +19,7 @@ resource "aws_lb" "nlb" { load_balancer_type = "network" internal = true - subnets = aws_subnet.public.*.id + subnets = aws_subnet.private.*.id enable_cross_zone_load_balancing = true } @@ -36,35 +36,11 @@ resource "aws_lb_listener" "apiserver-https" { } } -# Forward HTTP ingress traffic to workers -resource "aws_lb_listener" "ingress-http" { - load_balancer_arn = aws_lb.nlb.arn - protocol = "TCP" - port = 80 - - default_action { - type = "forward" - target_group_arn = module.workers.target_group_http - } -} - -# Forward HTTPS ingress traffic to workers -resource "aws_lb_listener" "ingress-https" { - load_balancer_arn = aws_lb.nlb.arn - protocol = "TCP" - port = 443 - - default_action { - type = "forward" - target_group_arn = module.workers.target_group_https - } -} - # Target group of controllers resource "aws_lb_target_group" "controllers" { name = 
"${var.cluster_name}-controllers" vpc_id = aws_vpc.network.id - target_type = "instance" + target_type = "ip" protocol = "TCP" port = 6443 @@ -88,7 +64,7 @@ resource "aws_lb_target_group_attachment" "controllers" { count = var.controller_count target_group_arn = aws_lb_target_group.controllers.arn - target_id = aws_instance.controllers.*.id[count.index] + target_id = aws_instance.controllers.*.private_ip[count.index] port = 6443 } diff --git a/aws/fedora-coreos/kubernetes/outputs.tf b/aws/fedora-coreos/kubernetes/outputs.tf index d9afc7bd3..3e605f974 100644 --- a/aws/fedora-coreos/kubernetes/outputs.tf +++ b/aws/fedora-coreos/kubernetes/outputs.tf @@ -21,9 +21,14 @@ output "vpc_id" { description = "ID of the VPC for creating worker instances" } -output "subnet_ids" { +output "private_subnet_ids" { + value = aws_subnet.private.*.id + description = "List of private subnet IDs" +} + +output "public_subnet_ids" { value = aws_subnet.public.*.id - description = "List of subnet IDs for creating worker instances" + description = "List of public subnet IDs" } output "worker_security_groups" { @@ -35,6 +40,11 @@ output "kubeconfig" { value = module.bootstrap.kubeconfig-kubelet } +output "kube_ca" { + description = "Base64-encoded CA cert data for Kubernetes apiserver" + value = module.bootstrap.ca_cert +} + # Outputs for custom load balancing output "nlb_id" { @@ -52,3 +62,69 @@ output "worker_target_group_https" { value = module.workers.target_group_https } +# Scoop outputs + +output "bastion_dns_name" { + value = aws_lb.bastion.dns_name + description = "DNS name of the network load balancer for distributing traffic to bastion hosts" + + depends_on = [ + aws_autoscaling_group.bastion + ] +} + +output "apiserver_dns_name" { + value = aws_route53_record.apiserver.fqdn + description = "DNS name of the Route53 record used to access the Kubernetes apiserver" +} + +output "bootstrap_controller_ip" { + value = aws_instance.controllers.0.private_ip + description = "IP address of 
the controller instance used to bootstrap the cluster" +} + +output "nat_ips" { + value = aws_eip.nat.*.public_ip + description = "List of NAT IPs where public traffic from this cluster will originate" +} + +output "private_route_tables" { + value = aws_route_table.private.*.id + description = "IDs of the private route tables that can be used to add additional private routes" +} + +output "private_route_tables_count" { + value = length(aws_route_table.private) + description = "Number of private route tables that are created" +} + +output "public_route_tables" { + value = aws_route_table.public.*.id + description = "IDs of the public route tables" +} + +output "public_route_tables_count" { + value = length(aws_route_table.public) + description = "Number of public route tables that are created" +} + +output "depends_id" { + value = null_resource.bootstrap.id + description = "Resource ID that will be defined when the cluster is ready" +} + +output "controller_role" { + value = aws_iam_role.controller.arn + description = "Instance role ARN attached to controller instances via instance profile" +} + +output "worker_role" { + value = module.workers.instance_role + description = "Instance role ARN attached to worker instances via instance profile" +} + +output "worker_autoscaling_group" { + value = module.workers.autoscaling_group + description = "Name of the workers autoscaling group" +} + diff --git a/aws/fedora-coreos/kubernetes/security.tf b/aws/fedora-coreos/kubernetes/security.tf index 60727af85..fa91522d9 100644 --- a/aws/fedora-coreos/kubernetes/security.tf +++ b/aws/fedora-coreos/kubernetes/security.tf @@ -13,14 +13,38 @@ resource "aws_security_group" "controller" { } } +resource "aws_security_group_rule" "controller-icmp" { + count = var.networking == "cilium" ? 
1 : 0 + + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "icmp" + from_port = 8 + to_port = 0 + source_security_group_id = aws_security_group.worker.id +} + +resource "aws_security_group_rule" "controller-icmp-self" { + count = var.networking == "cilium" ? 1 : 0 + + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "icmp" + from_port = 8 + to_port = 0 + self = true +} + resource "aws_security_group_rule" "controller-ssh" { security_group_id = aws_security_group.controller.id - type = "ingress" - protocol = "tcp" - from_port = 22 - to_port = 22 - cidr_blocks = ["0.0.0.0/0"] + type = "ingress" + protocol = "tcp" + from_port = 22 + to_port = 22 + source_security_group_id = aws_security_group.worker.id } resource "aws_security_group_rule" "controller-etcd" { @@ -44,39 +68,41 @@ resource "aws_security_group_rule" "controller-etcd-metrics" { source_security_group_id = aws_security_group.worker.id } -# Allow Prometheus to scrape kube-proxy -resource "aws_security_group_rule" "kube-proxy-metrics" { +resource "aws_security_group_rule" "controller-etcd-metrics-self" { security_group_id = aws_security_group.controller.id - type = "ingress" - protocol = "tcp" - from_port = 10249 - to_port = 10249 - source_security_group_id = aws_security_group.worker.id + type = "ingress" + protocol = "tcp" + from_port = 2381 + to_port = 2381 + self = true } -# Allow Prometheus to scrape kube-scheduler -resource "aws_security_group_rule" "controller-scheduler-metrics" { +resource "aws_security_group_rule" "controller-cilium-health" { + count = var.networking == "cilium" ? 
1 : 0 + security_group_id = aws_security_group.controller.id type = "ingress" protocol = "tcp" - from_port = 10251 - to_port = 10251 + from_port = 4240 + to_port = 4240 source_security_group_id = aws_security_group.worker.id } -# Allow Prometheus to scrape kube-controller-manager -resource "aws_security_group_rule" "controller-manager-metrics" { +resource "aws_security_group_rule" "controller-cilium-health-self" { + count = var.networking == "cilium" ? 1 : 0 + security_group_id = aws_security_group.controller.id - type = "ingress" - protocol = "tcp" - from_port = 10252 - to_port = 10252 - source_security_group_id = aws_security_group.worker.id + type = "ingress" + protocol = "tcp" + from_port = 4240 + to_port = 4240 + self = true } +# IANA VXLAN default resource "aws_security_group_rule" "controller-vxlan" { count = var.networking == "flannel" ? 1 : 0 @@ -111,6 +137,31 @@ resource "aws_security_group_rule" "controller-apiserver" { cidr_blocks = ["0.0.0.0/0"] } +# Linux VXLAN default +resource "aws_security_group_rule" "controller-linux-vxlan" { + count = var.networking == "cilium" ? 1 : 0 + + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "udp" + from_port = 8472 + to_port = 8472 + source_security_group_id = aws_security_group.worker.id +} + +resource "aws_security_group_rule" "controller-linux-vxlan-self" { + count = var.networking == "cilium" ? 
1 : 0 + + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "udp" + from_port = 8472 + to_port = 8472 + self = true +} + # Allow Prometheus to scrape node-exporter daemonset resource "aws_security_group_rule" "controller-node-exporter" { security_group_id = aws_security_group.controller.id @@ -122,6 +173,18 @@ resource "aws_security_group_rule" "controller-node-exporter" { source_security_group_id = aws_security_group.worker.id } + +# Allow Prometheus to scrape kube-proxy +resource "aws_security_group_rule" "kube-proxy-metrics" { + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "tcp" + from_port = 10249 + to_port = 10249 + source_security_group_id = aws_security_group.worker.id +} + # Allow apiserver to access kubelets for exec, log, port-forward resource "aws_security_group_rule" "controller-kubelet" { security_group_id = aws_security_group.controller.id @@ -143,6 +206,28 @@ resource "aws_security_group_rule" "controller-kubelet-self" { self = true } +# Allow Prometheus to scrape kube-scheduler +resource "aws_security_group_rule" "controller-scheduler-metrics" { + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "tcp" + from_port = 10251 + to_port = 10251 + source_security_group_id = aws_security_group.worker.id +} + +# Allow Prometheus to scrape kube-controller-manager +resource "aws_security_group_rule" "controller-manager-metrics" { + security_group_id = aws_security_group.controller.id + + type = "ingress" + protocol = "tcp" + from_port = 10252 + to_port = 10252 + source_security_group_id = aws_security_group.worker.id +} + resource "aws_security_group_rule" "controller-bgp" { security_group_id = aws_security_group.controller.id @@ -227,14 +312,48 @@ resource "aws_security_group" "worker" { } } -resource "aws_security_group_rule" "worker-ssh" { +resource "aws_security_group_rule" "worker-icmp" { + count = var.networking == "cilium" ? 
1 : 0 + + security_group_id = aws_security_group.worker.id + + type = "ingress" + protocol = "icmp" + from_port = 8 + to_port = 0 + source_security_group_id = aws_security_group.controller.id +} + +resource "aws_security_group_rule" "worker-icmp-self" { + count = var.networking == "cilium" ? 1 : 0 + security_group_id = aws_security_group.worker.id - type = "ingress" - protocol = "tcp" - from_port = 22 - to_port = 22 - cidr_blocks = ["0.0.0.0/0"] + type = "ingress" + protocol = "icmp" + from_port = 8 + to_port = 0 + self = true +} + +resource "aws_security_group_rule" "worker-ssh-bastion" { + security_group_id = aws_security_group.worker.id + + type = "ingress" + protocol = "tcp" + from_port = 22 + to_port = 22 + source_security_group_id = aws_security_group.bastion_external.id +} + +resource "aws_security_group_rule" "worker-ssh-self" { + security_group_id = aws_security_group.worker.id + + type = "ingress" + protocol = "tcp" + from_port = 22 + to_port = 22 + self = true } resource "aws_security_group_rule" "worker-http" { @@ -257,6 +376,31 @@ resource "aws_security_group_rule" "worker-https" { cidr_blocks = ["0.0.0.0/0"] } +resource "aws_security_group_rule" "worker-cilium-health" { + count = var.networking == "cilium" ? 1 : 0 + + security_group_id = aws_security_group.worker.id + + type = "ingress" + protocol = "tcp" + from_port = 4240 + to_port = 4240 + source_security_group_id = aws_security_group.controller.id +} + +resource "aws_security_group_rule" "worker-cilium-health-self" { + count = var.networking == "cilium" ? 1 : 0 + + security_group_id = aws_security_group.worker.id + + type = "ingress" + protocol = "tcp" + from_port = 4240 + to_port = 4240 + self = true +} + +# IANA VXLAN default resource "aws_security_group_rule" "worker-vxlan" { count = var.networking == "flannel" ? 
1 : 0 @@ -281,6 +425,31 @@ resource "aws_security_group_rule" "worker-vxlan-self" { self = true } +# Linux VXLAN default +resource "aws_security_group_rule" "worker-linux-vxlan" { + count = var.networking == "cilium" ? 1 : 0 + + security_group_id = aws_security_group.worker.id + + type = "ingress" + protocol = "udp" + from_port = 8472 + to_port = 8472 + source_security_group_id = aws_security_group.controller.id +} + +resource "aws_security_group_rule" "worker-linux-vxlan-self" { + count = var.networking == "cilium" ? 1 : 0 + + security_group_id = aws_security_group.worker.id + + type = "ingress" + protocol = "udp" + from_port = 8472 + to_port = 8472 + self = true +} + # Allow Prometheus to scrape node-exporter daemonset resource "aws_security_group_rule" "worker-node-exporter" { security_group_id = aws_security_group.worker.id diff --git a/aws/fedora-coreos/kubernetes/ssh.tf b/aws/fedora-coreos/kubernetes/ssh.tf index b6396d028..5efecb3b7 100644 --- a/aws/fedora-coreos/kubernetes/ssh.tf +++ b/aws/fedora-coreos/kubernetes/ssh.tf @@ -12,13 +12,21 @@ resource "null_resource" "copy-controller-secrets" { count = var.controller_count depends_on = [ + aws_autoscaling_group.bastion, module.bootstrap, ] connection { - type = "ssh" - host = aws_instance.controllers.*.public_ip[count.index] - user = "core" + type = "ssh" + + host = aws_instance.controllers.*.private_ip[count.index] + user = var.ssh_user + private_key = var.ssh_private_key + + bastion_host = aws_lb.bastion.dns_name + bastion_user = var.ssh_user + bastion_private_key = var.ssh_private_key + timeout = "15m" } @@ -43,9 +51,16 @@ resource "null_resource" "bootstrap" { ] connection { - type = "ssh" - host = aws_instance.controllers[0].public_ip - user = "core" + type = "ssh" + + host = aws_instance.controllers[0].private_ip + user = var.ssh_user + private_key = var.ssh_private_key + + bastion_host = aws_lb.bastion.dns_name + bastion_user = var.ssh_user + bastion_private_key = var.ssh_private_key + timeout = "15m" 
+ } @@ -55,4 +70,3 @@ resource "null_resource" "bootstrap" { ] } } - diff --git a/aws/fedora-coreos/kubernetes/variables.tf b/aws/fedora-coreos/kubernetes/variables.tf index 13284f634..5347fa2c1 100644 --- a/aws/fedora-coreos/kubernetes/variables.tf +++ b/aws/fedora-coreos/kubernetes/variables.tf @@ -41,9 +41,9 @@ variable "worker_type" { default = "t3.small" } -variable "os_image" { +variable "os_stream" { type = string - description = "AMI channel for Fedora CoreOS (not yet used)" + description = "Fedora CoreOS image stream for instances (e.g. stable, testing, next)" default = "stable" } @@ -89,13 +89,14 @@ variable "worker_snippets" { default = [] } -# configuration - -variable "ssh_authorized_key" { - type = string - description = "SSH public key for user 'core'" +variable "bastion_snippets" { + type = list(string) + description = "Bastion Fedora CoreOS Config snippets" + default = [] } +# configuration + variable "asset_dir" { type = string description = "Absolute path to a directory where generated assets should be placed (contains secrets)" @@ -161,3 +162,66 @@ variable "cluster_domain_suffix" { default = "cluster.local" } +# Scoop variables + +variable "apiserver_aliases" { + type = list(string) + description = "List of alternate DNS names that can be used to address the Kubernetes API" + default = [] +} + +variable "apiserver_arguments" { + type = list(string) + default = [] + description = "Custom arguments to pass to the kube-apiserver" +} + +variable "bastion_type" { + type = string + default = "t2.micro" + description = "Bastion EC2 instance type" +} + +variable "bastion_count" { + type = number + default = 1 + description = "Number of bastion hosts to run" +} + +variable "ami" { + type = string + description = "Custom AMI to use to launch instances. When no value is set for a role, the latest stable CoreOS AMI is used."
+ default = "" +} + +variable "base_ignition_config_path" { + type = string + description = "The full path of the S3 object that stores base ignition config" +} + +variable "base_ignition_config_read_policy" { + type = string + description = "The contents of the IAM policy that allows reading base ignition config" +} + +variable "ssh_user" { + type = string + description = "Username for provisioning via SSH" +} + +variable "ssh_private_key" { + type = string + description = "SSH private key to use with provisioners" +} + +variable "subnet_tags_private" { + type = map(string) + description = "Tags to apply to private subnets" + default = {} +} + +variable "subnet_tags_public" { + type = map(string) + description = "Tags to apply to public subnets" + default = {} +} diff --git a/aws/fedora-coreos/kubernetes/versions.tf b/aws/fedora-coreos/kubernetes/versions.tf index bd2776f08..75b52097f 100644 --- a/aws/fedora-coreos/kubernetes/versions.tf +++ b/aws/fedora-coreos/kubernetes/versions.tf @@ -1,11 +1,15 @@ # Terraform version and plugin versions terraform { - required_version = "~> 0.12.6" + required_version = ">= 0.12.26, < 0.14.0" required_providers { - aws = "~> 2.23" - ct = "~> 0.4" + aws = ">= 2.23, <= 4.0" template = "~> 2.1" null = "~> 2.1" + + ct = { + source = "poseidon/ct" + version = "~> 0.6.1" + } } } diff --git a/aws/fedora-coreos/kubernetes/workers.tf b/aws/fedora-coreos/kubernetes/workers.tf index e8b57e620..41436793e 100644 --- a/aws/fedora-coreos/kubernetes/workers.tf +++ b/aws/fedora-coreos/kubernetes/workers.tf @@ -1,24 +1,30 @@ module "workers" { - source = "./workers" - name = var.cluster_name + source = "./workers" + cluster_name = var.cluster_name + name = var.cluster_name # AWS vpc_id = aws_vpc.network.id - subnet_ids = aws_subnet.public.*.id + subnet_ids = aws_subnet.private.*.id security_groups = [aws_security_group.worker.id] worker_count = var.worker_count instance_type = var.worker_type - os_image = var.os_image + os_stream = var.os_stream 
disk_size = var.disk_size spot_price = var.worker_price target_groups = var.worker_target_groups # configuration kubeconfig = module.bootstrap.kubeconfig-kubelet - ssh_authorized_key = var.ssh_authorized_key service_cidr = var.service_cidr cluster_domain_suffix = var.cluster_domain_suffix snippets = var.worker_snippets node_labels = var.worker_node_labels + + # scoop + ami = var.ami + base_ignition_config_path = var.base_ignition_config_path + base_ignition_config_read_policy = var.base_ignition_config_read_policy + ignition_config_bucket = aws_s3_bucket.ignition_configs.id } diff --git a/aws/fedora-coreos/kubernetes/workers/ami.tf b/aws/fedora-coreos/kubernetes/workers/ami.tf index e32ce159f..a7ab184bd 100644 --- a/aws/fedora-coreos/kubernetes/workers/ami.tf +++ b/aws/fedora-coreos/kubernetes/workers/ami.tf @@ -13,16 +13,8 @@ data "aws_ami" "fedora-coreos" { values = ["hvm"] } - filter { - name = "name" - values = ["fedora-coreos-31.*.*.*-hvm"] - } - filter { name = "description" - values = ["Fedora CoreOS stable*"] + values = ["Fedora CoreOS ${var.os_stream} *"] } - - # try to filter out dev images (AWS filters can't) - name_regex = "^fedora-coreos-31.[0-9]*.[0-9]*.[0-9]*-hvm*" } diff --git a/aws/fedora-coreos/kubernetes/workers/fcc/worker.yaml b/aws/fedora-coreos/kubernetes/workers/fcc/worker.yaml index 0501b8de9..18af3d751 100644 --- a/aws/fedora-coreos/kubernetes/workers/fcc/worker.yaml +++ b/aws/fedora-coreos/kubernetes/workers/fcc/worker.yaml @@ -9,11 +9,12 @@ systemd: enabled: true contents: | [Unit] - Description=Wait for DNS entries + Description=Wait for DNS and hostname Before=kubelet.service [Service] Type=oneshot RemainAfterExit=true + ExecStartPre=/bin/sh -c 'while [ `hostname -s` == "localhost" ]; do sleep 1; done;' ExecStart=/bin/sh -c 'while ! 
/usr/bin/grep '^[^#[:space:]]' /etc/resolv.conf > /dev/null; do sleep 1; done' [Install] RequiredBy=kubelet.service @@ -21,9 +22,10 @@ systemd: enabled: true contents: | [Unit] - Description=Kubelet via Hyperkube (System Container) + Description=Kubelet (System Container) Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.8 ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests ExecStartPre=/bin/mkdir -p /opt/cni/bin @@ -42,28 +44,29 @@ systemd: --volume /run:/run \ --volume /sys/fs/cgroup:/sys/fs/cgroup:ro \ --volume /sys/fs/cgroup/systemd:/sys/fs/cgroup/systemd \ - --volume /etc/pki/tls/certs:/usr/share/ca-certificates:ro \ + --volume /etc/pki/tls/certs:/etc/pki/tls/certs:ro \ + --volume /etc/pki/ca-trust/extracted:/etc/pki/ca-trust/extracted:ro \ --volume /var/lib/calico:/var/lib/calico:ro \ --volume /var/lib/docker:/var/lib/docker \ --volume /var/lib/kubelet:/var/lib/kubelet:rshared,z \ --volume /var/log:/var/log \ --volume /var/run/lock:/var/run/lock:z \ --volume /opt/cni/bin:/opt/cni/bin:z \ - quay.io/poseidon/kubelet:v1.18.2 \ + $${KUBELET_IMAGE} \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ --cgroup-driver=systemd \ --cgroups-per-qos=true \ --enforce-node-allocatable=pods \ --client-ca-file=/etc/kubernetes/ca.crt \ + --cloud-provider=aws \ --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ - --kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ --node-labels=node.kubernetes.io/node \ %{~ for label in split(",", node_labels) ~} @@ -71,6 +74,7 @@ systemd: %{~ endfor ~} --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ + 
--rotate-certificates \ --volume-plugin-dir=/var/lib/kubelet/volumeplugins ExecStop=-/usr/bin/podman stop kubelet Delegate=yes @@ -87,7 +91,7 @@ systemd: Type=oneshot RemainAfterExit=true ExecStart=/bin/true - ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.2 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME' + ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.8 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME' [Install] WantedBy=multi-user.target storage: @@ -103,6 +107,18 @@ storage: contents: inline: | fs.inotify.max_user_watches=16184 + - path: /etc/sysctl.d/reverse-path-filter.conf + contents: + inline: | + net.ipv4.conf.default.rp_filter=0 + net.ipv4.conf.*.rp_filter=0 + - path: /etc/systemd/network/50-flannel.link + contents: + inline: | + [Match] + OriginalName=flannel* + [Link] + MACAddressPolicy=none - path: /etc/systemd/system.conf.d/accounting.conf contents: inline: | @@ -110,9 +126,3 @@ storage: DefaultCPUAccounting=yes DefaultMemoryAccounting=yes DefaultBlockIOAccounting=yes -passwd: - users: - - name: core - ssh_authorized_keys: - - ${ssh_authorized_key} - diff --git a/aws/fedora-coreos/kubernetes/workers/outputs.tf b/aws/fedora-coreos/kubernetes/workers/outputs.tf index 22f378855..ea9141c33 100644 --- a/aws/fedora-coreos/kubernetes/workers/outputs.tf +++ b/aws/fedora-coreos/kubernetes/workers/outputs.tf @@ -8,3 +8,12 @@ output "target_group_https" { value = aws_lb_target_group.workers-https.arn } +output "instance_role" { + description = "IAM role ARN attached to instances via instance profile" + value = aws_iam_role.worker.arn +} + +output "autoscaling_group" { + description = "Name of the workers autoscaling group" + value = aws_autoscaling_group.workers.name +} diff --git 
a/aws/fedora-coreos/kubernetes/workers/variables.tf b/aws/fedora-coreos/kubernetes/workers/variables.tf index 6c21d0a05..0f2f1322a 100644 --- a/aws/fedora-coreos/kubernetes/workers/variables.tf +++ b/aws/fedora-coreos/kubernetes/workers/variables.tf @@ -1,3 +1,8 @@ +variable "cluster_name" { + type = string + description = "Name of the cluster the workers belong to" +} + variable "name" { type = string description = "Unique name for the worker pool" @@ -34,9 +39,9 @@ variable "instance_type" { default = "t3.small" } -variable "os_image" { +variable "os_stream" { type = string - description = "AMI channel for Fedora CoreOS (not yet used)" + description = "Fedora CoreOS image stream for instances (e.g. stable, testing, next)" default = "stable" } @@ -83,11 +88,6 @@ variable "kubeconfig" { description = "Must be set to `kubeconfig` output by cluster" } -variable "ssh_authorized_key" { - type = string - description = "SSH public key for user 'core'" -} - variable "service_cidr" { type = string description = < 0 ? 
var.spot_price : null enable_monitoring = false - user_data = data.ct_config.worker-ignition.rendered + user_data = jsonencode(local.user_data_worker) # storage root_block_device { @@ -62,13 +86,22 @@ resource "aws_launch_configuration" "worker" { # network security_groups = var.security_groups + # iam + iam_instance_profile = aws_iam_instance_profile.worker.name + lifecycle { // Override the default destroy and replace update behavior create_before_destroy = true - ignore_changes = [image_id] + ignore_changes = [image_id, user_data] } } +resource "aws_s3_bucket_object" "worker-ignition" { + bucket = var.ignition_config_bucket + key = "worker.json" + content = data.ct_config.worker-ignition.rendered +} + # Worker Ignition config data "ct_config" "worker-ignition" { content = data.template_file.worker-config.rendered @@ -82,10 +115,62 @@ data "template_file" "worker-config" { vars = { kubeconfig = indent(10, var.kubeconfig) - ssh_authorized_key = var.ssh_authorized_key cluster_dns_service_ip = cidrhost(var.service_cidr, 10) cluster_domain_suffix = var.cluster_domain_suffix node_labels = join(",", var.node_labels) } } +resource "aws_iam_role_policy" "worker_read_base_ignition_config" { + name = "read-base-ignition-config" + role = aws_iam_role.worker.id + policy = var.base_ignition_config_read_policy +} + +resource "aws_iam_role_policy" "worker_read_ignition_configs" { + name = "read-ignition-configs" + role = aws_iam_role.worker.id + policy = data.aws_iam_policy_document.worker_read_ignition_configs.json +} + +data "aws_iam_policy_document" "worker_read_ignition_configs" { + statement { + effect = "Allow" + actions = ["s3:GetObject"] + resources = ["arn:aws:s3:::${var.ignition_config_bucket}/${aws_s3_bucket_object.worker-ignition.id}"] + } +} + +resource "aws_iam_role_policy" "worker_instance_read_ec2" { + name = "instance-read-ec2" + role = aws_iam_role.worker.id + policy = data.aws_iam_policy_document.worker_instance_read_ec2.json +} + +data 
"aws_iam_policy_document" "worker_instance_read_ec2" { + statement { + actions = ["ec2:Describe*"] + resources = ["*"] + } +} + +data "aws_iam_policy_document" "assume_role" { + statement { + actions = ["sts:AssumeRole"] + + principals { + type = "Service" + identifiers = ["ec2.amazonaws.com"] + } + } +} + +resource "aws_iam_role" "worker" { + name = "${var.name}-worker" + assume_role_policy = data.aws_iam_policy_document.assume_role.json +} + +resource "aws_iam_instance_profile" "worker" { + name = "${var.name}-worker" + role = aws_iam_role.worker.id +} diff --git a/azure/container-linux/kubernetes/README.md b/azure/container-linux/kubernetes/README.md index 29bff5080..8ac228b8e 100644 --- a/azure/container-linux/kubernetes/README.md +++ b/azure/container-linux/kubernetes/README.md @@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster ## Features -* Kubernetes v1.18.2 (upstream) -* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking +* Kubernetes v1.18.8 (upstream) +* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [low-priority](https://typhoon.psdn.io/cl/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization * Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/) diff --git a/azure/container-linux/kubernetes/bootstrap.tf b/azure/container-linux/kubernetes/bootstrap.tf index ef9aea940..60eb2fff0 100644 --- 
a/azure/container-linux/kubernetes/bootstrap.tf +++ b/azure/container-linux/kubernetes/bootstrap.tf @@ -1,6 +1,6 @@ # Kubernetes assets (kubeconfig, manifests) module "bootstrap" { - source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc" + source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=8ef2fe7c992a8c15d696bd3e3a97be713b025e64" cluster_name = var.cluster_name api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)] diff --git a/azure/container-linux/kubernetes/cl/controller.yaml b/azure/container-linux/kubernetes/cl/controller.yaml index 3246c8ee5..7262265fe 100644 --- a/azure/container-linux/kubernetes/cl/controller.yaml +++ b/azure/container-linux/kubernetes/cl/controller.yaml @@ -52,6 +52,8 @@ systemd: Description=Kubelet via Hyperkube Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.8 + Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver} ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests ExecStartPre=/bin/mkdir -p /opt/cni/bin @@ -90,7 +92,7 @@ systemd: --mount volume=var-log,target=/var/log \ --volume opt-cni-bin,kind=host,source=/opt/cni/bin \ --mount volume=opt-cni-bin,target=/opt/cni/bin \ - docker://quay.io/poseidon/kubelet:v1.18.2 -- \ + $${KUBELET_IMAGE} -- \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ @@ -133,7 +135,7 @@ systemd: --volume script,kind=host,source=/opt/bootstrap/apply \ --mount volume=script,target=/apply \ --insecure-options=image \ - docker://quay.io/poseidon/kubelet:v1.18.2 \ + docker://quay.io/poseidon/kubelet:v1.18.8 \ --net=host \ --dns=host \ --exec=/apply diff --git a/azure/container-linux/kubernetes/controllers.tf b/azure/container-linux/kubernetes/controllers.tf index 671e7fb8f..38a79b983 100644 --- a/azure/container-linux/kubernetes/controllers.tf +++ 
b/azure/container-linux/kubernetes/controllers.tf @@ -53,21 +53,22 @@ resource "azurerm_linux_virtual_machine" "controllers" { storage_account_type = "Premium_LRS" } + # CoreOS Container Linux or Flatcar Container Linux source_image_reference { publisher = local.flavor == "flatcar" ? "Kinvolk" : "CoreOS" - offer = local.flavor == "flatcar" ? "flatcar-container-linux" : "CoreOS" + offer = local.flavor == "flatcar" ? "flatcar-container-linux-free" : "CoreOS" sku = local.channel version = "latest" } - # Gross hack just for Flatcar Linux + # Gross hack for Flatcar Linux dynamic "plan" { for_each = local.flavor == "flatcar" ? [1] : [] content { name = local.channel publisher = "kinvolk" - product = "flatcar-container-linux" + product = "flatcar-container-linux-free" } } @@ -138,10 +139,10 @@ resource "azurerm_network_interface_backend_address_pool_association" "controlle # Controller Ignition configs data "ct_config" "controller-ignitions" { - count = var.controller_count - content = data.template_file.controller-configs.*.rendered[count.index] - pretty_print = false - snippets = var.controller_snippets + count = var.controller_count + content = data.template_file.controller-configs.*.rendered[count.index] + strict = true + snippets = var.controller_snippets } # Controller Container Linux configs @@ -156,6 +157,7 @@ data "template_file" "controller-configs" { etcd_domain = "${var.cluster_name}-etcd${count.index}.${var.dns_zone}" # etcd0=https://cluster-etcd0.example.com,etcd1=https://cluster-etcd1.example.com,... etcd_initial_cluster = join(",", data.template_file.etcds.*.rendered) + cgroup_driver = local.flavor == "flatcar" && local.channel == "edge" ? 
"systemd" : "cgroupfs" kubeconfig = indent(10, module.bootstrap.kubeconfig-kubelet) ssh_authorized_key = var.ssh_authorized_key cluster_dns_service_ip = cidrhost(var.service_cidr, 10) diff --git a/azure/container-linux/kubernetes/network.tf b/azure/container-linux/kubernetes/network.tf index ea92a5a7d..562156117 100644 --- a/azure/container-linux/kubernetes/network.tf +++ b/azure/container-linux/kubernetes/network.tf @@ -21,7 +21,7 @@ resource "azurerm_subnet" "controller" { name = "controller" virtual_network_name = azurerm_virtual_network.network.name - address_prefix = cidrsubnet(var.host_cidr, 1, 0) + address_prefixes = [cidrsubnet(var.host_cidr, 1, 0)] } resource "azurerm_subnet_network_security_group_association" "controller" { @@ -34,7 +34,7 @@ resource "azurerm_subnet" "worker" { name = "worker" virtual_network_name = azurerm_virtual_network.network.name - address_prefix = cidrsubnet(var.host_cidr, 1, 1) + address_prefixes = [cidrsubnet(var.host_cidr, 1, 1)] } resource "azurerm_subnet_network_security_group_association" "worker" { diff --git a/azure/container-linux/kubernetes/security.tf b/azure/container-linux/kubernetes/security.tf index feb6fef54..c258ec2d6 100644 --- a/azure/container-linux/kubernetes/security.tf +++ b/azure/container-linux/kubernetes/security.tf @@ -7,6 +7,21 @@ resource "azurerm_network_security_group" "controller" { location = azurerm_resource_group.cluster.location } +resource "azurerm_network_security_rule" "controller-icmp" { + resource_group_name = azurerm_resource_group.cluster.name + + name = "allow-icmp" + network_security_group_name = azurerm_network_security_group.controller.name + priority = "1995" + access = "Allow" + direction = "Inbound" + protocol = "Icmp" + source_port_range = "*" + destination_port_range = "*" + source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix] + destination_address_prefix = azurerm_subnet.controller.address_prefix +} + resource 
"azurerm_network_security_rule" "controller-ssh" { resource_group_name = azurerm_resource_group.cluster.name @@ -100,6 +115,22 @@ resource "azurerm_network_security_rule" "controller-apiserver" { destination_address_prefix = azurerm_subnet.controller.address_prefix } +resource "azurerm_network_security_rule" "controller-cilium-health" { + resource_group_name = azurerm_resource_group.cluster.name + count = var.networking == "cilium" ? 1 : 0 + + name = "allow-cilium-health" + network_security_group_name = azurerm_network_security_group.controller.name + priority = "2019" + access = "Allow" + direction = "Inbound" + protocol = "Tcp" + source_port_range = "*" + destination_port_range = "4240" + source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix] + destination_address_prefix = azurerm_subnet.controller.address_prefix +} + resource "azurerm_network_security_rule" "controller-vxlan" { resource_group_name = azurerm_resource_group.cluster.name @@ -115,6 +146,21 @@ resource "azurerm_network_security_rule" "controller-vxlan" { destination_address_prefix = azurerm_subnet.controller.address_prefix } +resource "azurerm_network_security_rule" "controller-linux-vxlan" { + resource_group_name = azurerm_resource_group.cluster.name + + name = "allow-linux-vxlan" + network_security_group_name = azurerm_network_security_group.controller.name + priority = "2021" + access = "Allow" + direction = "Inbound" + protocol = "Udp" + source_port_range = "*" + destination_port_range = "8472" + source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix] + destination_address_prefix = azurerm_subnet.controller.address_prefix +} + # Allow Prometheus to scrape node-exporter daemonset resource "azurerm_network_security_rule" "controller-node-exporter" { resource_group_name = azurerm_resource_group.cluster.name @@ -191,6 +237,21 @@ resource "azurerm_network_security_group" "worker" { location = 
azurerm_resource_group.cluster.location } +resource "azurerm_network_security_rule" "worker-icmp" { + resource_group_name = azurerm_resource_group.cluster.name + + name = "allow-icmp" + network_security_group_name = azurerm_network_security_group.worker.name + priority = "1995" + access = "Allow" + direction = "Inbound" + protocol = "Icmp" + source_port_range = "*" + destination_port_range = "*" + source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix] + destination_address_prefix = azurerm_subnet.worker.address_prefix +} + resource "azurerm_network_security_rule" "worker-ssh" { resource_group_name = azurerm_resource_group.cluster.name @@ -236,6 +297,22 @@ resource "azurerm_network_security_rule" "worker-https" { destination_address_prefix = azurerm_subnet.worker.address_prefix } +resource "azurerm_network_security_rule" "worker-cilium-health" { + resource_group_name = azurerm_resource_group.cluster.name + count = var.networking == "cilium" ? 
1 : 0 + + name = "allow-cilium-health" + network_security_group_name = azurerm_network_security_group.worker.name + priority = "2014" + access = "Allow" + direction = "Inbound" + protocol = "Tcp" + source_port_range = "*" + destination_port_range = "4240" + source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix] + destination_address_prefix = azurerm_subnet.worker.address_prefix +} + resource "azurerm_network_security_rule" "worker-vxlan" { resource_group_name = azurerm_resource_group.cluster.name @@ -251,6 +328,21 @@ resource "azurerm_network_security_rule" "worker-vxlan" { destination_address_prefix = azurerm_subnet.worker.address_prefix } +resource "azurerm_network_security_rule" "worker-linux-vxlan" { + resource_group_name = azurerm_resource_group.cluster.name + + name = "allow-linux-vxlan" + network_security_group_name = azurerm_network_security_group.worker.name + priority = "2016" + access = "Allow" + direction = "Inbound" + protocol = "Udp" + source_port_range = "*" + destination_port_range = "8472" + source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix] + destination_address_prefix = azurerm_subnet.worker.address_prefix +} + # Allow Prometheus to scrape node-exporter daemonset resource "azurerm_network_security_rule" "worker-node-exporter" { resource_group_name = azurerm_resource_group.cluster.name diff --git a/azure/container-linux/kubernetes/variables.tf b/azure/container-linux/kubernetes/variables.tf index 44827db49..50b57aed3 100644 --- a/azure/container-linux/kubernetes/variables.tf +++ b/azure/container-linux/kubernetes/variables.tf @@ -48,7 +48,7 @@ variable "worker_type" { variable "os_image" { type = string - description = "Channel for a Container Linux derivative (coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta)" + description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha, 
flatcar-edge, coreos-stable, coreos-beta, coreos-alpha)" default = "flatcar-stable" } diff --git a/azure/container-linux/kubernetes/versions.tf b/azure/container-linux/kubernetes/versions.tf index f9653cab3..e90c976c6 100644 --- a/azure/container-linux/kubernetes/versions.tf +++ b/azure/container-linux/kubernetes/versions.tf @@ -1,12 +1,16 @@ # Terraform version and plugin versions terraform { - required_version = "~> 0.12.6" + required_version = ">= 0.12.26, < 0.14.0" required_providers { - azurerm = "~> 2.0" - ct = "~> 0.3" + azurerm = "~> 2.8" template = "~> 2.1" null = "~> 2.1" + + ct = { + source = "poseidon/ct" + version = "~> 0.6.1" + } } } diff --git a/azure/container-linux/kubernetes/workers/cl/worker.yaml b/azure/container-linux/kubernetes/workers/cl/worker.yaml index 607ce00ab..211576746 100644 --- a/azure/container-linux/kubernetes/workers/cl/worker.yaml +++ b/azure/container-linux/kubernetes/workers/cl/worker.yaml @@ -2,11 +2,11 @@ systemd: units: - name: docker.service - enable: true + enabled: true - name: locksmithd.service mask: true - name: wait-for-dns.service - enable: true + enabled: true contents: | [Unit] Description=Wait for DNS entries @@ -19,12 +19,14 @@ systemd: [Install] RequiredBy=kubelet.service - name: kubelet.service - enable: true + enabled: true contents: | [Unit] - Description=Kubelet via Hyperkube + Description=Kubelet Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.8 + Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver} ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests ExecStartPre=/bin/mkdir -p /opt/cni/bin @@ -63,19 +65,19 @@ systemd: --mount volume=var-log,target=/var/log \ --volume opt-cni-bin,kind=host,source=/opt/cni/bin \ --mount volume=opt-cni-bin,target=/opt/cni/bin \ - docker://quay.io/poseidon/kubelet:v1.18.2 -- \ + $${KUBELET_IMAGE} -- \ --anonymous-auth=false \ --authentication-token-webhook \ 
--authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ + --cgroup-driver=$${KUBELET_CGROUP_DRIVER} \ --client-ca-file=/etc/kubernetes/ca.crt \ --cloud-provider=aws \ --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ - --kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ --node-labels=node.kubernetes.io/node \ %{~ for label in split(",", node_labels) ~} @@ -83,6 +85,7 @@ systemd: %{~ endfor ~} --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ + --rotate-certificates \ --volume-plugin-dir=/var/lib/kubelet/volumeplugins ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid Restart=always @@ -90,7 +93,7 @@ systemd: [Install] WantedBy=multi-user.target - name: delete-node.service - enable: true + enabled: true contents: | [Unit] Description=Waiting to delete Kubernetes node on shutdown @@ -111,6 +114,7 @@ storage: ${kubeconfig} - path: /etc/sysctl.d/max-user-watches.conf filesystem: root + mode: 0644 contents: inline: | fs.inotify.max_user_watches=16184 @@ -126,7 +130,7 @@ storage: --volume config,kind=host,source=/etc/kubernetes \ --mount volume=config,target=/etc/kubernetes \ --insecure-options=image \ - docker://quay.io/poseidon/kubelet:v1.18.2 \ + docker://quay.io/poseidon/kubelet:v1.18.8 \ --net=host \ --dns=host \ --exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname | tr '[:upper:]' '[:lower:]') diff --git a/azure/container-linux/kubernetes/workers/variables.tf b/azure/container-linux/kubernetes/workers/variables.tf index 0ebd606fe..48197d3ef 100644 --- a/azure/container-linux/kubernetes/workers/variables.tf +++ b/azure/container-linux/kubernetes/workers/variables.tf @@ -46,7 +46,7 @@ variable "vm_type" { variable "os_image" { type = 
string - description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, coreos-stable, coreos-beta, coreos-alpha)" + description = "Channel for a Container Linux derivative (flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge, coreos-stable, coreos-beta, coreos-alpha)" default = "flatcar-stable" } diff --git a/azure/container-linux/kubernetes/workers/versions.tf b/azure/container-linux/kubernetes/workers/versions.tf index ac97c6ac8..b8f7c72ea 100644 --- a/azure/container-linux/kubernetes/workers/versions.tf +++ b/azure/container-linux/kubernetes/workers/versions.tf @@ -1,4 +1,14 @@ +# Terraform version and plugin versions terraform { - required_version = ">= 0.12" + required_version = ">= 0.12.26, < 0.14.0" + required_providers { + azurerm = "~> 2.8" + template = "~> 2.1" + + ct = { + source = "poseidon/ct" + version = "~> 0.6.1" + } + } } diff --git a/azure/container-linux/kubernetes/workers/workers.tf b/azure/container-linux/kubernetes/workers/workers.tf index 3e151e6e5..9070051d8 100644 --- a/azure/container-linux/kubernetes/workers/workers.tf +++ b/azure/container-linux/kubernetes/workers/workers.tf @@ -24,21 +24,22 @@ resource "azurerm_linux_virtual_machine_scale_set" "workers" { caching = "ReadWrite" } + # CoreOS Container Linux or Flatcar Container Linux source_image_reference { publisher = local.flavor == "flatcar" ? "Kinvolk" : "CoreOS" - offer = local.flavor == "flatcar" ? "flatcar-container-linux" : "CoreOS" + offer = local.flavor == "flatcar" ? "flatcar-container-linux-free" : "CoreOS" sku = local.channel version = "latest" } - # Gross hack just for Flatcar Linux + # Gross hack for Flatcar Linux dynamic "plan" { for_each = local.flavor == "flatcar" ? 
[1] : [] content { name = local.channel publisher = "kinvolk" - product = "flatcar-container-linux" + product = "flatcar-container-linux-free" } } @@ -96,9 +97,9 @@ resource "azurerm_monitor_autoscale_setting" "workers" { # Worker Ignition configs data "ct_config" "worker-ignition" { - content = data.template_file.worker-config.rendered - pretty_print = false - snippets = var.snippets + content = data.template_file.worker-config.rendered + strict = true + snippets = var.snippets } # Worker Container Linux configs @@ -110,6 +111,7 @@ data "template_file" "worker-config" { ssh_authorized_key = var.ssh_authorized_key cluster_dns_service_ip = cidrhost(var.service_cidr, 10) cluster_domain_suffix = var.cluster_domain_suffix + cgroup_driver = local.flavor == "flatcar" && local.channel == "edge" ? "systemd" : "cgroupfs" node_labels = join(",", var.node_labels) } } diff --git a/azure/fedora-coreos/kubernetes/README.md b/azure/fedora-coreos/kubernetes/README.md index 17b80679d..10d9cc8bf 100644 --- a/azure/fedora-coreos/kubernetes/README.md +++ b/azure/fedora-coreos/kubernetes/README.md @@ -11,13 +11,22 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster ## Features -* Kubernetes v1.18.2 (upstream) -* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking -* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) +* Kubernetes v1.18.8 (upstream) +* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking +* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing * Advanced features like 
[worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot priority](https://typhoon.psdn.io/fedora-coreos/azure/#low-priority) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/) customization * Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/) ## Docs Please see the [official docs](https://typhoon.psdn.io) and the Azure [tutorial](https://typhoon.psdn.io/fedora-coreos/azure/). +* Kubernetes v1.18.6 (upstream) +* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking +* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing +* Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/) customization +* Ready for Ingress, Prometheus, Grafana, CSI, and other [addons](https://typhoon.psdn.io/addons/overview/) + +## Docs + +Please see the [official docs](https://typhoon.psdn.io) and the Digital Ocean [tutorial](https://typhoon.psdn.io/fedora-coreos/digitalocean/). 
diff --git a/azure/fedora-coreos/kubernetes/bootstrap.tf b/azure/fedora-coreos/kubernetes/bootstrap.tf index 152f4adcb..c882c2948 100644 --- a/azure/fedora-coreos/kubernetes/bootstrap.tf +++ b/azure/fedora-coreos/kubernetes/bootstrap.tf @@ -1,6 +1,6 @@ # Kubernetes assets (kubeconfig, manifests) module "bootstrap" { - source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc" + source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=8ef2fe7c992a8c15d696bd3e3a97be713b025e64" cluster_name = var.cluster_name api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)] @@ -10,8 +10,9 @@ module "bootstrap" { networking = var.networking # only effective with Calico networking + # we should be able to use 1450 MTU, but in practice, 1410 was needed network_encapsulation = "vxlan" - network_mtu = "1450" + network_mtu = "1410" pod_cidr = var.pod_cidr service_cidr = var.service_cidr diff --git a/azure/fedora-coreos/kubernetes/controllers.tf b/azure/fedora-coreos/kubernetes/controllers.tf index b82992315..02853e33a 100644 --- a/azure/fedora-coreos/kubernetes/controllers.tf +++ b/azure/fedora-coreos/kubernetes/controllers.tf @@ -113,10 +113,10 @@ resource "azurerm_network_interface_backend_address_pool_association" "controlle # Controller Ignition configs data "ct_config" "controller-ignitions" { - count = var.controller_count - content = data.template_file.controller-configs.*.rendered[count.index] - pretty_print = false - snippets = var.controller_snippets + count = var.controller_count + content = data.template_file.controller-configs.*.rendered[count.index] + strict = true + snippets = var.controller_snippets } # Controller Fedora CoreOS configs diff --git a/azure/fedora-coreos/kubernetes/fcc/controller.yaml b/azure/fedora-coreos/kubernetes/fcc/controller.yaml index 95af2d12c..6949ba04e 100644 --- a/azure/fedora-coreos/kubernetes/fcc/controller.yaml +++ 
b/azure/fedora-coreos/kubernetes/fcc/controller.yaml @@ -28,7 +28,7 @@ systemd: --network host \ --volume /var/lib/etcd:/var/lib/etcd:rw,Z \ --volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \ - quay.io/coreos/etcd:v3.4.7 + quay.io/coreos/etcd:v3.4.10 ExecStop=/usr/bin/podman stop etcd [Install] WantedBy=multi-user.target @@ -51,9 +51,10 @@ systemd: enabled: true contents: | [Unit] - Description=Kubelet via Hyperkube (System Container) + Description=Kubelet (System Container) Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.8 ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests ExecStartPre=/bin/mkdir -p /opt/cni/bin @@ -79,10 +80,11 @@ systemd: --volume /var/log:/var/log \ --volume /var/run/lock:/var/run/lock:z \ --volume /opt/cni/bin:/opt/cni/bin:z \ - quay.io/poseidon/kubelet:v1.18.2 \ + $${KUBELET_IMAGE} \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ --cgroup-driver=systemd \ --cgroups-per-qos=true \ --enforce-node-allocatable=pods \ @@ -90,16 +92,14 @@ systemd: --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ - --kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ - --node-labels=node.kubernetes.io/master \ --node-labels=node.kubernetes.io/controller="true" \ --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ - --register-with-taints=node-role.kubernetes.io/master=:NoSchedule \ + --register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \ + --rotate-certificates \ --volume-plugin-dir=/var/lib/kubelet/volumeplugins ExecStop=-/usr/bin/podman stop kubelet Delegate=yes @@ -119,15 +119,17 @@ systemd: 
ExecStartPre=-/usr/bin/podman rm bootstrap ExecStart=/usr/bin/podman run --name bootstrap \ --network host \ - --volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,Z \ + --volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,z \ --volume /opt/bootstrap/assets:/assets:ro,Z \ --volume /opt/bootstrap/apply:/apply:ro,Z \ --entrypoint=/apply \ - quay.io/poseidon/kubelet:v1.18.2 + quay.io/poseidon/kubelet:v1.18.8 ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done ExecStartPost=-/usr/bin/podman stop bootstrap storage: directories: + - path: /var/lib/etcd + mode: 0700 - path: /etc/kubernetes - path: /opt/bootstrap files: @@ -151,11 +153,11 @@ storage: chmod -R 500 /etc/ssl/etcd mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/ mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/ - sudo mkdir -p /etc/kubernetes/manifests - sudo mv static-manifests/* /etc/kubernetes/manifests/ - sudo mkdir -p /opt/bootstrap/assets - sudo mv manifests /opt/bootstrap/assets/manifests - sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/ + mkdir -p /etc/kubernetes/manifests + mv static-manifests/* /etc/kubernetes/manifests/ + mkdir -p /opt/bootstrap/assets + mv manifests /opt/bootstrap/assets/manifests + mv manifests-networking/* /opt/bootstrap/assets/manifests/ rm -rf assets auth static-manifests tls manifests-networking - path: /opt/bootstrap/apply mode: 0544 @@ -175,6 +177,18 @@ storage: contents: inline: | fs.inotify.max_user_watches=16184 + - path: /etc/sysctl.d/reverse-path-filter.conf + contents: + inline: | + net.ipv4.conf.default.rp_filter=0 + net.ipv4.conf.*.rp_filter=0 + - path: /etc/systemd/network/50-flannel.link + contents: + inline: | + [Match] + OriginalName=flannel* + [Link] + MACAddressPolicy=none - path: /etc/systemd/system.conf.d/accounting.conf contents: inline: | diff --git a/azure/fedora-coreos/kubernetes/network.tf b/azure/fedora-coreos/kubernetes/network.tf index ea92a5a7d..562156117 100644 --- 
a/azure/fedora-coreos/kubernetes/network.tf +++ b/azure/fedora-coreos/kubernetes/network.tf @@ -21,7 +21,7 @@ resource "azurerm_subnet" "controller" { name = "controller" virtual_network_name = azurerm_virtual_network.network.name - address_prefix = cidrsubnet(var.host_cidr, 1, 0) + address_prefixes = [cidrsubnet(var.host_cidr, 1, 0)] } resource "azurerm_subnet_network_security_group_association" "controller" { @@ -34,7 +34,7 @@ resource "azurerm_subnet" "worker" { name = "worker" virtual_network_name = azurerm_virtual_network.network.name - address_prefix = cidrsubnet(var.host_cidr, 1, 1) + address_prefixes = [cidrsubnet(var.host_cidr, 1, 1)] } resource "azurerm_subnet_network_security_group_association" "worker" { diff --git a/azure/fedora-coreos/kubernetes/security.tf b/azure/fedora-coreos/kubernetes/security.tf index feb6fef54..c258ec2d6 100644 --- a/azure/fedora-coreos/kubernetes/security.tf +++ b/azure/fedora-coreos/kubernetes/security.tf @@ -7,6 +7,21 @@ resource "azurerm_network_security_group" "controller" { location = azurerm_resource_group.cluster.location } +resource "azurerm_network_security_rule" "controller-icmp" { + resource_group_name = azurerm_resource_group.cluster.name + + name = "allow-icmp" + network_security_group_name = azurerm_network_security_group.controller.name + priority = "1995" + access = "Allow" + direction = "Inbound" + protocol = "Icmp" + source_port_range = "*" + destination_port_range = "*" + source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix] + destination_address_prefix = azurerm_subnet.controller.address_prefix +} + resource "azurerm_network_security_rule" "controller-ssh" { resource_group_name = azurerm_resource_group.cluster.name @@ -100,6 +115,22 @@ resource "azurerm_network_security_rule" "controller-apiserver" { destination_address_prefix = azurerm_subnet.controller.address_prefix } +resource "azurerm_network_security_rule" "controller-cilium-health" { + 
resource_group_name = azurerm_resource_group.cluster.name + count = var.networking == "cilium" ? 1 : 0 + + name = "allow-cilium-health" + network_security_group_name = azurerm_network_security_group.controller.name + priority = "2019" + access = "Allow" + direction = "Inbound" + protocol = "Tcp" + source_port_range = "*" + destination_port_range = "4240" + source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix] + destination_address_prefix = azurerm_subnet.controller.address_prefix +} + resource "azurerm_network_security_rule" "controller-vxlan" { resource_group_name = azurerm_resource_group.cluster.name @@ -115,6 +146,21 @@ resource "azurerm_network_security_rule" "controller-vxlan" { destination_address_prefix = azurerm_subnet.controller.address_prefix } +resource "azurerm_network_security_rule" "controller-linux-vxlan" { + resource_group_name = azurerm_resource_group.cluster.name + + name = "allow-linux-vxlan" + network_security_group_name = azurerm_network_security_group.controller.name + priority = "2021" + access = "Allow" + direction = "Inbound" + protocol = "Udp" + source_port_range = "*" + destination_port_range = "8472" + source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix] + destination_address_prefix = azurerm_subnet.controller.address_prefix +} + # Allow Prometheus to scrape node-exporter daemonset resource "azurerm_network_security_rule" "controller-node-exporter" { resource_group_name = azurerm_resource_group.cluster.name @@ -191,6 +237,21 @@ resource "azurerm_network_security_group" "worker" { location = azurerm_resource_group.cluster.location } +resource "azurerm_network_security_rule" "worker-icmp" { + resource_group_name = azurerm_resource_group.cluster.name + + name = "allow-icmp" + network_security_group_name = azurerm_network_security_group.worker.name + priority = "1995" + access = "Allow" + direction = "Inbound" + protocol = "Icmp" + 
source_port_range = "*" + destination_port_range = "*" + source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix] + destination_address_prefix = azurerm_subnet.worker.address_prefix +} + resource "azurerm_network_security_rule" "worker-ssh" { resource_group_name = azurerm_resource_group.cluster.name @@ -236,6 +297,22 @@ resource "azurerm_network_security_rule" "worker-https" { destination_address_prefix = azurerm_subnet.worker.address_prefix } +resource "azurerm_network_security_rule" "worker-cilium-health" { + resource_group_name = azurerm_resource_group.cluster.name + count = var.networking == "cilium" ? 1 : 0 + + name = "allow-cilium-health" + network_security_group_name = azurerm_network_security_group.worker.name + priority = "2014" + access = "Allow" + direction = "Inbound" + protocol = "Tcp" + source_port_range = "*" + destination_port_range = "4240" + source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix] + destination_address_prefix = azurerm_subnet.worker.address_prefix +} + resource "azurerm_network_security_rule" "worker-vxlan" { resource_group_name = azurerm_resource_group.cluster.name @@ -251,6 +328,21 @@ resource "azurerm_network_security_rule" "worker-vxlan" { destination_address_prefix = azurerm_subnet.worker.address_prefix } +resource "azurerm_network_security_rule" "worker-linux-vxlan" { + resource_group_name = azurerm_resource_group.cluster.name + + name = "allow-linux-vxlan" + network_security_group_name = azurerm_network_security_group.worker.name + priority = "2016" + access = "Allow" + direction = "Inbound" + protocol = "Udp" + source_port_range = "*" + destination_port_range = "8472" + source_address_prefixes = [azurerm_subnet.controller.address_prefix, azurerm_subnet.worker.address_prefix] + destination_address_prefix = azurerm_subnet.worker.address_prefix +} + # Allow Prometheus to scrape node-exporter daemonset resource 
"azurerm_network_security_rule" "worker-node-exporter" { resource_group_name = azurerm_resource_group.cluster.name diff --git a/azure/fedora-coreos/kubernetes/versions.tf b/azure/fedora-coreos/kubernetes/versions.tf index f9653cab3..e90c976c6 100644 --- a/azure/fedora-coreos/kubernetes/versions.tf +++ b/azure/fedora-coreos/kubernetes/versions.tf @@ -1,12 +1,16 @@ # Terraform version and plugin versions terraform { - required_version = "~> 0.12.6" + required_version = ">= 0.12.26, < 0.14.0" required_providers { - azurerm = "~> 2.0" - ct = "~> 0.3" + azurerm = "~> 2.8" template = "~> 2.1" null = "~> 2.1" + + ct = { + source = "poseidon/ct" + version = "~> 0.6.1" + } } } diff --git a/azure/fedora-coreos/kubernetes/workers/fcc/worker.yaml b/azure/fedora-coreos/kubernetes/workers/fcc/worker.yaml index 664f0c6a2..e9e382786 100644 --- a/azure/fedora-coreos/kubernetes/workers/fcc/worker.yaml +++ b/azure/fedora-coreos/kubernetes/workers/fcc/worker.yaml @@ -21,9 +21,10 @@ systemd: enabled: true contents: | [Unit] - Description=Kubelet via Hyperkube (System Container) + Description=Kubelet (System Container) Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.8 ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests ExecStartPre=/bin/mkdir -p /opt/cni/bin @@ -49,10 +50,11 @@ systemd: --volume /var/log:/var/log \ --volume /var/run/lock:/var/run/lock:z \ --volume /opt/cni/bin:/opt/cni/bin:z \ - quay.io/poseidon/kubelet:v1.18.2 \ + $${KUBELET_IMAGE} \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ --cgroup-driver=systemd \ --cgroups-per-qos=true \ --enforce-node-allocatable=pods \ @@ -60,10 +62,8 @@ systemd: --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ - 
--kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ --node-labels=node.kubernetes.io/node \ %{~ for label in split(",", node_labels) ~} @@ -71,6 +71,7 @@ systemd: %{~ endfor ~} --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ + --rotate-certificates \ --volume-plugin-dir=/var/lib/kubelet/volumeplugins ExecStop=-/usr/bin/podman stop kubelet Delegate=yes @@ -87,7 +88,7 @@ systemd: Type=oneshot RemainAfterExit=true ExecStart=/bin/true - ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.2 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME' + ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.8 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME' [Install] WantedBy=multi-user.target storage: @@ -103,6 +104,18 @@ storage: contents: inline: | fs.inotify.max_user_watches=16184 + - path: /etc/sysctl.d/reverse-path-filter.conf + contents: + inline: | + net.ipv4.conf.default.rp_filter=0 + net.ipv4.conf.*.rp_filter=0 + - path: /etc/systemd/network/50-flannel.link + contents: + inline: | + [Match] + OriginalName=flannel* + [Link] + MACAddressPolicy=none - path: /etc/systemd/system.conf.d/accounting.conf contents: inline: | diff --git a/azure/fedora-coreos/kubernetes/workers/versions.tf b/azure/fedora-coreos/kubernetes/workers/versions.tf index ac97c6ac8..b8f7c72ea 100644 --- a/azure/fedora-coreos/kubernetes/workers/versions.tf +++ b/azure/fedora-coreos/kubernetes/workers/versions.tf @@ -1,4 +1,14 @@ +# Terraform version and plugin versions terraform { - required_version = ">= 0.12" + required_version = ">= 0.12.26, < 0.14.0" + required_providers { + azurerm = "~> 2.8" + template = "~> 2.1" + + ct = { + source = "poseidon/ct" + 
version = "~> 0.6.1" + } + } } diff --git a/azure/fedora-coreos/kubernetes/workers/workers.tf b/azure/fedora-coreos/kubernetes/workers/workers.tf index a5c2c209c..6ecb81a50 100644 --- a/azure/fedora-coreos/kubernetes/workers/workers.tf +++ b/azure/fedora-coreos/kubernetes/workers/workers.tf @@ -72,9 +72,9 @@ resource "azurerm_monitor_autoscale_setting" "workers" { # Worker Ignition configs data "ct_config" "worker-ignition" { - content = data.template_file.worker-config.rendered - pretty_print = false - snippets = var.snippets + content = data.template_file.worker-config.rendered + strict = true + snippets = var.snippets } # Worker Fedora CoreOS configs diff --git a/bare-metal/container-linux/kubernetes/README.md b/bare-metal/container-linux/kubernetes/README.md index 5c52402e8..7a10f4ddc 100644 --- a/bare-metal/container-linux/kubernetes/README.md +++ b/bare-metal/container-linux/kubernetes/README.md @@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster ## Features -* Kubernetes v1.18.2 (upstream) -* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking +* Kubernetes v1.18.8 (upstream) +* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization * Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/) diff --git a/bare-metal/container-linux/kubernetes/bootstrap.tf b/bare-metal/container-linux/kubernetes/bootstrap.tf index 6d577ae0c..b71284888 100644 --- 
a/bare-metal/container-linux/kubernetes/bootstrap.tf +++ b/bare-metal/container-linux/kubernetes/bootstrap.tf @@ -1,6 +1,6 @@ # Kubernetes assets (kubeconfig, manifests) module "bootstrap" { - source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc" + source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=8ef2fe7c992a8c15d696bd3e3a97be713b025e64" cluster_name = var.cluster_name api_servers = [var.k8s_domain_name] diff --git a/bare-metal/container-linux/kubernetes/cl/controller.yaml b/bare-metal/container-linux/kubernetes/cl/controller.yaml index ea9c2d9d1..112340ca9 100644 --- a/bare-metal/container-linux/kubernetes/cl/controller.yaml +++ b/bare-metal/container-linux/kubernetes/cl/controller.yaml @@ -60,6 +60,7 @@ systemd: Description=Kubelet via Hyperkube Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.8 Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver} ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests @@ -103,7 +104,7 @@ systemd: --mount volume=etc-iscsi,target=/etc/iscsi \ --volume usr-sbin-iscsiadm,kind=host,source=/usr/sbin/iscsiadm \ --mount volume=usr-sbin-iscsiadm,target=/sbin/iscsiadm \ - docker://quay.io/poseidon/kubelet:v1.18.2 -- \ + $${KUBELET_IMAGE} -- \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ @@ -147,7 +148,7 @@ systemd: --volume script,kind=host,source=/opt/bootstrap/apply \ --mount volume=script,target=/apply \ --insecure-options=image \ - docker://quay.io/poseidon/kubelet:v1.18.2 \ + docker://quay.io/poseidon/kubelet:v1.18.8 \ --net=host \ --dns=host \ --exec=/apply diff --git a/bare-metal/container-linux/kubernetes/cl/install.yaml b/bare-metal/container-linux/kubernetes/cl/install.yaml index e8562c932..6a36f19dc 100644 --- a/bare-metal/container-linux/kubernetes/cl/install.yaml +++ 
b/bare-metal/container-linux/kubernetes/cl/install.yaml @@ -2,7 +2,7 @@ systemd: units: - name: installer.service - enable: true + enabled: true contents: | [Unit] Requires=network-online.target diff --git a/bare-metal/container-linux/kubernetes/cl/worker.yaml b/bare-metal/container-linux/kubernetes/cl/worker.yaml index 164323cf5..a581195b3 100644 --- a/bare-metal/container-linux/kubernetes/cl/worker.yaml +++ b/bare-metal/container-linux/kubernetes/cl/worker.yaml @@ -2,11 +2,11 @@ systemd: units: - name: docker.service - enable: true + enabled: true - name: locksmithd.service mask: true - name: kubelet.path - enable: true + enabled: true contents: | [Unit] Description=Watch for kubeconfig @@ -15,7 +15,7 @@ systemd: [Install] WantedBy=multi-user.target - name: wait-for-dns.service - enable: true + enabled: true contents: | [Unit] Description=Wait for DNS entries @@ -30,9 +30,10 @@ systemd: - name: kubelet.service contents: | [Unit] - Description=Kubelet via Hyperkube + Description=Kubelet Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.8 Environment=KUBELET_CGROUP_DRIVER=${cgroup_driver} ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests @@ -76,20 +77,19 @@ systemd: --mount volume=etc-iscsi,target=/etc/iscsi \ --volume usr-sbin-iscsiadm,kind=host,source=/usr/sbin/iscsiadm \ --mount volume=usr-sbin-iscsiadm,target=/sbin/iscsiadm \ - docker://quay.io/poseidon/kubelet:v1.18.2 -- \ + $${KUBELET_IMAGE} -- \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ --cgroup-driver=$${KUBELET_CGROUP_DRIVER} \ --client-ca-file=/etc/kubernetes/ca.crt \ --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ --hostname-override=${domain_name} \ - 
--kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ --node-labels=node.kubernetes.io/node \ %{~ for label in compact(split(",", node_labels)) ~} @@ -100,6 +100,7 @@ systemd: %{~ endfor ~} --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ + --rotate-certificates \ --volume-plugin-dir=/var/lib/kubelet/volumeplugins ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid Restart=always @@ -111,6 +112,7 @@ storage: directories: - path: /etc/kubernetes filesystem: root + mode: 0755 files: - path: /etc/hostname filesystem: root @@ -120,6 +122,7 @@ storage: ${domain_name} - path: /etc/sysctl.d/max-user-watches.conf filesystem: root + mode: 0644 contents: inline: | fs.inotify.max_user_watches=16184 diff --git a/bare-metal/container-linux/kubernetes/profiles.tf b/bare-metal/container-linux/kubernetes/profiles.tf index 7ec6cc3a7..eee772539 100644 --- a/bare-metal/container-linux/kubernetes/profiles.tf +++ b/bare-metal/container-linux/kubernetes/profiles.tf @@ -141,10 +141,10 @@ resource "matchbox_profile" "controllers" { } data "ct_config" "controller-ignitions" { - count = length(var.controllers) - content = data.template_file.controller-configs.*.rendered[count.index] - pretty_print = false - snippets = lookup(var.snippets, var.controllers.*.name[count.index], []) + count = length(var.controllers) + content = data.template_file.controller-configs.*.rendered[count.index] + strict = true + snippets = lookup(var.snippets, var.controllers.*.name[count.index], []) } data "template_file" "controller-configs" { @@ -171,10 +171,10 @@ resource "matchbox_profile" "workers" { } data "ct_config" "worker-ignitions" { - count = length(var.workers) - content = data.template_file.worker-configs.*.rendered[count.index] - pretty_print = false - snippets = lookup(var.snippets, var.workers.*.name[count.index], []) + count = length(var.workers) + content = 
data.template_file.worker-configs.*.rendered[count.index] + strict = true + snippets = lookup(var.snippets, var.workers.*.name[count.index], []) } data "template_file" "worker-configs" { diff --git a/bare-metal/container-linux/kubernetes/versions.tf b/bare-metal/container-linux/kubernetes/versions.tf index f7f7aaf69..1efd6a18c 100644 --- a/bare-metal/container-linux/kubernetes/versions.tf +++ b/bare-metal/container-linux/kubernetes/versions.tf @@ -1,12 +1,20 @@ # Terraform version and plugin versions terraform { - required_version = "~> 0.12.6" + required_version = ">= 0.12.26, < 0.14.0" required_providers { - matchbox = "~> 0.3.0" - ct = "~> 0.3" template = "~> 2.1" null = "~> 2.1" + + ct = { + source = "poseidon/ct" + version = "~> 0.6.1" + } + + matchbox = { + source = "poseidon/matchbox" + version = "~> 0.4.1" + } } } diff --git a/bare-metal/fedora-coreos/kubernetes/README.md b/bare-metal/fedora-coreos/kubernetes/README.md index d9da6a1c2..cb448d84b 100644 --- a/bare-metal/fedora-coreos/kubernetes/README.md +++ b/bare-metal/fedora-coreos/kubernetes/README.md @@ -11,9 +11,9 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster ## Features -* Kubernetes v1.18.2 (upstream) -* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking -* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) +* Kubernetes v1.18.8 (upstream) +* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking +* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing * Advanced features like 
[snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization * Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/) diff --git a/bare-metal/fedora-coreos/kubernetes/bootstrap.tf b/bare-metal/fedora-coreos/kubernetes/bootstrap.tf index daa929a7c..fdc3da51e 100644 --- a/bare-metal/fedora-coreos/kubernetes/bootstrap.tf +++ b/bare-metal/fedora-coreos/kubernetes/bootstrap.tf @@ -1,6 +1,6 @@ # Kubernetes assets (kubeconfig, manifests) module "bootstrap" { - source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc" + source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=8ef2fe7c992a8c15d696bd3e3a97be713b025e64" cluster_name = var.cluster_name api_servers = [var.k8s_domain_name] diff --git a/bare-metal/fedora-coreos/kubernetes/fcc/controller.yaml b/bare-metal/fedora-coreos/kubernetes/fcc/controller.yaml index c16cc4afd..4fa1342fa 100644 --- a/bare-metal/fedora-coreos/kubernetes/fcc/controller.yaml +++ b/bare-metal/fedora-coreos/kubernetes/fcc/controller.yaml @@ -28,7 +28,7 @@ systemd: --network host \ --volume /var/lib/etcd:/var/lib/etcd:rw,Z \ --volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \ - quay.io/coreos/etcd:v3.4.7 + quay.io/coreos/etcd:v3.4.10 ExecStop=/usr/bin/podman stop etcd [Install] WantedBy=multi-user.target @@ -50,9 +50,10 @@ systemd: - name: kubelet.service contents: | [Unit] - Description=Kubelet via Hyperkube (System Container) + Description=Kubelet (System Container) Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.8 ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests ExecStartPre=/bin/mkdir -p /opt/cni/bin @@ -80,10 +81,11 @@ systemd: --volume /opt/cni/bin:/opt/cni/bin:z \ --volume /etc/iscsi:/etc/iscsi \ --volume /sbin/iscsiadm:/sbin/iscsiadm \ - quay.io/poseidon/kubelet:v1.18.2 \ + 
$${KUBELET_IMAGE} \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ --cgroup-driver=systemd \ --cgroups-per-qos=true \ --enforce-node-allocatable=pods \ @@ -91,17 +93,15 @@ systemd: --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ --hostname-override=${domain_name} \ - --kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ - --node-labels=node.kubernetes.io/master \ --node-labels=node.kubernetes.io/controller="true" \ --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ - --register-with-taints=node-role.kubernetes.io/master=:NoSchedule \ + --register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \ + --rotate-certificates \ --volume-plugin-dir=/var/lib/kubelet/volumeplugins ExecStop=-/usr/bin/podman stop kubelet Delegate=yes @@ -127,17 +127,20 @@ systemd: Type=oneshot RemainAfterExit=true WorkingDirectory=/opt/bootstrap + ExecStartPre=-/usr/bin/podman rm bootstrap ExecStart=/usr/bin/podman run --name bootstrap \ --network host \ - --volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,Z \ + --volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,z \ --volume /opt/bootstrap/assets:/assets:ro,Z \ --volume /opt/bootstrap/apply:/apply:ro,Z \ --entrypoint=/apply \ - quay.io/poseidon/kubelet:v1.18.2 + quay.io/poseidon/kubelet:v1.18.8 ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done ExecStartPost=-/usr/bin/podman stop bootstrap storage: directories: + - path: /var/lib/etcd + mode: 0700 - path: /etc/kubernetes - path: /opt/bootstrap files: @@ -161,11 +164,11 @@ storage: chmod -R 500 /etc/ssl/etcd mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/ mv tls/k8s/* 
/etc/kubernetes/bootstrap-secrets/ - sudo mkdir -p /etc/kubernetes/manifests - sudo mv static-manifests/* /etc/kubernetes/manifests/ - sudo mkdir -p /opt/bootstrap/assets - sudo mv manifests /opt/bootstrap/assets/manifests - sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/ + mkdir -p /etc/kubernetes/manifests + mv static-manifests/* /etc/kubernetes/manifests/ + mkdir -p /opt/bootstrap/assets + mv manifests /opt/bootstrap/assets/manifests + mv manifests-networking/* /opt/bootstrap/assets/manifests/ rm -rf assets auth static-manifests tls manifests-networking - path: /opt/bootstrap/apply mode: 0544 @@ -185,6 +188,18 @@ storage: contents: inline: | fs.inotify.max_user_watches=16184 + - path: /etc/sysctl.d/reverse-path-filter.conf + contents: + inline: | + net.ipv4.conf.default.rp_filter=0 + net.ipv4.conf.*.rp_filter=0 + - path: /etc/systemd/network/50-flannel.link + contents: + inline: | + [Match] + OriginalName=flannel* + [Link] + MACAddressPolicy=none - path: /etc/systemd/system.conf.d/accounting.conf contents: inline: | diff --git a/bare-metal/fedora-coreos/kubernetes/fcc/worker.yaml b/bare-metal/fedora-coreos/kubernetes/fcc/worker.yaml index 5133d36e6..d0aee9a60 100644 --- a/bare-metal/fedora-coreos/kubernetes/fcc/worker.yaml +++ b/bare-metal/fedora-coreos/kubernetes/fcc/worker.yaml @@ -20,9 +20,10 @@ systemd: - name: kubelet.service contents: | [Unit] - Description=Kubelet via Hyperkube (System Container) + Description=Kubelet (System Container) Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.8 ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests ExecStartPre=/bin/mkdir -p /opt/cni/bin @@ -50,10 +51,11 @@ systemd: --volume /opt/cni/bin:/opt/cni/bin:z \ --volume /etc/iscsi:/etc/iscsi \ --volume /sbin/iscsiadm:/sbin/iscsiadm \ - quay.io/poseidon/kubelet:v1.18.2 \ + $${KUBELET_IMAGE} \ --anonymous-auth=false \ --authentication-token-webhook \ 
--authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ --cgroup-driver=systemd \ --cgroups-per-qos=true \ --enforce-node-allocatable=pods \ @@ -61,11 +63,9 @@ systemd: --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ --hostname-override=${domain_name} \ - --kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ --node-labels=node.kubernetes.io/node \ %{~ for label in compact(split(",", node_labels)) ~} @@ -76,6 +76,7 @@ systemd: %{~ endfor ~} --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ + --rotate-certificates \ --volume-plugin-dir=/var/lib/kubelet/volumeplugins ExecStop=-/usr/bin/podman stop kubelet Delegate=yes @@ -105,6 +106,18 @@ storage: contents: inline: | fs.inotify.max_user_watches=16184 + - path: /etc/sysctl.d/reverse-path-filter.conf + contents: + inline: | + net.ipv4.conf.default.rp_filter=0 + net.ipv4.conf.*.rp_filter=0 + - path: /etc/systemd/network/50-flannel.link + contents: + inline: | + [Match] + OriginalName=flannel* + [Link] + MACAddressPolicy=none - path: /etc/systemd/system.conf.d/accounting.conf contents: inline: | diff --git a/bare-metal/fedora-coreos/kubernetes/versions.tf b/bare-metal/fedora-coreos/kubernetes/versions.tf index fd7df2ffa..3a493e596 100644 --- a/bare-metal/fedora-coreos/kubernetes/versions.tf +++ b/bare-metal/fedora-coreos/kubernetes/versions.tf @@ -1,11 +1,19 @@ # Terraform version and plugin versions terraform { - required_version = "~> 0.12.6" + required_version = ">= 0.12.26, < 0.14.0" required_providers { - matchbox = "~> 0.3.0" - ct = "~> 0.4" template = "~> 2.1" null = "~> 2.1" + + ct = { + source = "poseidon/ct" + version = "~> 0.6.1" + } + + matchbox = { + source = "poseidon/matchbox" + version = "~> 0.4.1" + } } } diff --git 
a/digital-ocean/container-linux/kubernetes/README.md b/digital-ocean/container-linux/kubernetes/README.md index 73b5fad91..87321693b 100644 --- a/digital-ocean/container-linux/kubernetes/README.md +++ b/digital-ocean/container-linux/kubernetes/README.md @@ -11,8 +11,8 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster ## Features -* Kubernetes v1.18.2 (upstream) -* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking +* Kubernetes v1.18.8 (upstream) +* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization * Ready for Ingress, Prometheus, Grafana, CSI, and other [addons](https://typhoon.psdn.io/addons/overview/) diff --git a/digital-ocean/container-linux/kubernetes/bootstrap.tf b/digital-ocean/container-linux/kubernetes/bootstrap.tf index 8fe3c1b49..3b024cb58 100644 --- a/digital-ocean/container-linux/kubernetes/bootstrap.tf +++ b/digital-ocean/container-linux/kubernetes/bootstrap.tf @@ -1,6 +1,6 @@ # Kubernetes assets (kubeconfig, manifests) module "bootstrap" { - source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc" + source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=8ef2fe7c992a8c15d696bd3e3a97be713b025e64" cluster_name = var.cluster_name api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)] diff --git a/digital-ocean/container-linux/kubernetes/cl/controller.yaml b/digital-ocean/container-linux/kubernetes/cl/controller.yaml index 
0816ca90a..cec5785c3 100644 --- a/digital-ocean/container-linux/kubernetes/cl/controller.yaml +++ b/digital-ocean/container-linux/kubernetes/cl/controller.yaml @@ -62,6 +62,7 @@ systemd: After=coreos-metadata.service Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.8 EnvironmentFile=/run/metadata/coreos ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests @@ -101,7 +102,7 @@ systemd: --mount volume=var-log,target=/var/log \ --volume opt-cni-bin,kind=host,source=/opt/cni/bin \ --mount volume=opt-cni-bin,target=/opt/cni/bin \ - docker://quay.io/poseidon/kubelet:v1.18.2 -- \ + $${KUBELET_IMAGE} -- \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ @@ -144,7 +145,7 @@ systemd: --volume script,kind=host,source=/opt/bootstrap/apply \ --mount volume=script,target=/apply \ --insecure-options=image \ - docker://quay.io/poseidon/kubelet:v1.18.2 \ + docker://quay.io/poseidon/kubelet:v1.18.8 \ --net=host \ --dns=host \ --exec=/apply diff --git a/digital-ocean/container-linux/kubernetes/cl/worker.yaml b/digital-ocean/container-linux/kubernetes/cl/worker.yaml index 2248e0770..4bb07e1cb 100644 --- a/digital-ocean/container-linux/kubernetes/cl/worker.yaml +++ b/digital-ocean/container-linux/kubernetes/cl/worker.yaml @@ -2,11 +2,11 @@ systemd: units: - name: docker.service - enable: true + enabled: true - name: locksmithd.service mask: true - name: kubelet.path - enable: true + enabled: true contents: | [Unit] Description=Watch for kubeconfig @@ -15,7 +15,7 @@ systemd: [Install] WantedBy=multi-user.target - name: wait-for-dns.service - enable: true + enabled: true contents: | [Unit] Description=Wait for DNS entries @@ -30,11 +30,12 @@ systemd: - name: kubelet.service contents: | [Unit] - Description=Kubelet via Hyperkube + Description=Kubelet Requires=coreos-metadata.service After=coreos-metadata.service Wants=rpc-statd.service [Service] 
+ Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.8 EnvironmentFile=/run/metadata/coreos ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests @@ -74,23 +75,23 @@ systemd: --mount volume=var-log,target=/var/log \ --volume opt-cni-bin,kind=host,source=/opt/cni/bin \ --mount volume=opt-cni-bin,target=/opt/cni/bin \ - docker://quay.io/poseidon/kubelet:v1.18.2 -- \ + $${KUBELET_IMAGE} -- \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ --client-ca-file=/etc/kubernetes/ca.crt \ --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ --hostname-override=$${COREOS_DIGITALOCEAN_IPV4_PRIVATE_0} \ - --kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ --node-labels=node.kubernetes.io/node \ --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ + --rotate-certificates \ --volume-plugin-dir=/var/lib/kubelet/volumeplugins ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid Restart=always @@ -98,7 +99,7 @@ systemd: [Install] WantedBy=multi-user.target - name: delete-node.service - enable: true + enabled: true contents: | [Unit] Description=Waiting to delete Kubernetes node on shutdown @@ -113,9 +114,11 @@ storage: directories: - path: /etc/kubernetes filesystem: root + mode: 0755 files: - path: /etc/sysctl.d/max-user-watches.conf filesystem: root + mode: 0644 contents: inline: | fs.inotify.max_user_watches=16184 @@ -131,7 +134,7 @@ storage: --volume config,kind=host,source=/etc/kubernetes \ --mount volume=config,target=/etc/kubernetes \ --insecure-options=image \ - docker://quay.io/poseidon/kubelet:v1.18.2 \ + docker://quay.io/poseidon/kubelet:v1.18.8 \ --net=host \ 
--dns=host \ --exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname) diff --git a/digital-ocean/container-linux/kubernetes/controllers.tf b/digital-ocean/container-linux/kubernetes/controllers.tf index 6d183b952..dcb6f0dda 100644 --- a/digital-ocean/container-linux/kubernetes/controllers.tf +++ b/digital-ocean/container-linux/kubernetes/controllers.tf @@ -46,9 +46,10 @@ resource "digitalocean_droplet" "controllers" { size = var.controller_type # network - # only official DigitalOcean images support IPv6 - ipv6 = local.is_official_image private_networking = true + vpc_uuid = digitalocean_vpc.network.id + # TODO: Only official DigitalOcean images support IPv6 + ipv6 = false user_data = data.ct_config.controller-ignitions.*.rendered[count.index] ssh_keys = var.ssh_fingerprints @@ -69,10 +70,10 @@ resource "digitalocean_tag" "controllers" { # Controller Ignition configs data "ct_config" "controller-ignitions" { - count = var.controller_count - content = data.template_file.controller-configs.*.rendered[count.index] - pretty_print = false - snippets = var.controller_snippets + count = var.controller_count + content = data.template_file.controller-configs.*.rendered[count.index] + strict = true + snippets = var.controller_snippets } # Controller Container Linux configs diff --git a/digital-ocean/container-linux/kubernetes/network.tf b/digital-ocean/container-linux/kubernetes/network.tf index bc5434852..8ddcc6296 100644 --- a/digital-ocean/container-linux/kubernetes/network.tf +++ b/digital-ocean/container-linux/kubernetes/network.tf @@ -1,7 +1,22 @@ +# Network VPC +resource "digitalocean_vpc" "network" { + name = var.cluster_name + region = var.region + description = "Network for ${var.cluster_name} cluster" +} + resource "digitalocean_firewall" "rules" { name = var.cluster_name - tags = ["${var.cluster_name}-controller", "${var.cluster_name}-worker"] + tags = [ + digitalocean_tag.controllers.name, + digitalocean_tag.workers.name + 
] + + inbound_rule { + protocol = "icmp" + source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name] + } # allow ssh, internal flannel, internal node-exporter, internal kubelet inbound_rule { @@ -10,12 +25,27 @@ resource "digitalocean_firewall" "rules" { source_addresses = ["0.0.0.0/0", "::/0"] } + # Cilium health + inbound_rule { + protocol = "tcp" + port_range = "4240" + source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name] + } + + # IANA vxlan (flannel, calico) inbound_rule { protocol = "udp" port_range = "4789" source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name] } + # Linux vxlan (Cilium) + inbound_rule { + protocol = "udp" + port_range = "8472" + source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name] + } + # Allow Prometheus to scrape node-exporter inbound_rule { protocol = "tcp" @@ -30,6 +60,7 @@ resource "digitalocean_firewall" "rules" { source_tags = [digitalocean_tag.workers.name] } + # Kubelet inbound_rule { protocol = "tcp" port_range = "10250" @@ -59,7 +90,7 @@ resource "digitalocean_firewall" "rules" { resource "digitalocean_firewall" "controllers" { name = "${var.cluster_name}-controllers" - tags = ["${var.cluster_name}-controller"] + tags = [digitalocean_tag.controllers.name] # etcd inbound_rule { @@ -93,7 +124,7 @@ resource "digitalocean_firewall" "controllers" { resource "digitalocean_firewall" "workers" { name = "${var.cluster_name}-workers" - tags = ["${var.cluster_name}-worker"] + tags = [digitalocean_tag.workers.name] # allow HTTP/HTTPS ingress inbound_rule { @@ -114,4 +145,3 @@ resource "digitalocean_firewall" "workers" { source_addresses = ["0.0.0.0/0"] } } - diff --git a/digital-ocean/container-linux/kubernetes/outputs.tf b/digital-ocean/container-linux/kubernetes/outputs.tf index 429893c58..616eaf48a 100644 --- a/digital-ocean/container-linux/kubernetes/outputs.tf +++ b/digital-ocean/container-linux/kubernetes/outputs.tf @@ -2,6 +2,8 @@ 
output "kubeconfig-admin" { value = module.bootstrap.kubeconfig-admin } +# Outputs for Kubernetes Ingress + output "controllers_dns" { value = digitalocean_record.controllers[0].fqdn } @@ -45,3 +47,9 @@ output "worker_tag" { value = digitalocean_tag.workers.name } +# Outputs for custom load balancing + +output "vpc_id" { + description = "ID of the cluster VPC" + value = digitalocean_vpc.network.id +} diff --git a/digital-ocean/container-linux/kubernetes/versions.tf b/digital-ocean/container-linux/kubernetes/versions.tf index c0e31a276..807b39006 100644 --- a/digital-ocean/container-linux/kubernetes/versions.tf +++ b/digital-ocean/container-linux/kubernetes/versions.tf @@ -1,12 +1,20 @@ # Terraform version and plugin versions terraform { - required_version = "~> 0.12.6" + required_version = ">= 0.12.26, < 0.14.0" required_providers { - digitalocean = "~> 1.3" - ct = "~> 0.3" - template = "~> 2.1" - null = "~> 2.1" + template = "~> 2.1" + null = "~> 2.1" + + ct = { + source = "poseidon/ct" + version = "~> 0.6.1" + } + + digitalocean = { + source = "digitalocean/digitalocean" + version = "~> 1.20" + } } } diff --git a/digital-ocean/container-linux/kubernetes/workers.tf b/digital-ocean/container-linux/kubernetes/workers.tf index fa9b85d61..c89898090 100644 --- a/digital-ocean/container-linux/kubernetes/workers.tf +++ b/digital-ocean/container-linux/kubernetes/workers.tf @@ -35,9 +35,10 @@ resource "digitalocean_droplet" "workers" { size = var.worker_type # network - # only official DigitalOcean images support IPv6 - ipv6 = local.is_official_image private_networking = true + vpc_uuid = digitalocean_vpc.network.id + # only official DigitalOcean images support IPv6 + ipv6 = local.is_official_image user_data = data.ct_config.worker-ignition.rendered ssh_keys = var.ssh_fingerprints @@ -58,9 +59,9 @@ resource "digitalocean_tag" "workers" { # Worker Ignition config data "ct_config" "worker-ignition" { - content = data.template_file.worker-config.rendered - pretty_print = 
false - snippets = var.worker_snippets + content = data.template_file.worker-config.rendered + strict = true + snippets = var.worker_snippets } # Worker Container Linux config diff --git a/digital-ocean/fedora-coreos/kubernetes/README.md b/digital-ocean/fedora-coreos/kubernetes/README.md index 94ce6985f..6000d0aac 100644 --- a/digital-ocean/fedora-coreos/kubernetes/README.md +++ b/digital-ocean/fedora-coreos/kubernetes/README.md @@ -11,9 +11,9 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster ## Features -* Kubernetes v1.18.2 (upstream) +* Kubernetes v1.18.8 (upstream) * Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking -* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) +* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing * Advanced features like [snippets](https://typhoon.psdn.io/advanced/customization/) customization * Ready for Ingress, Prometheus, Grafana, CSI, and other [addons](https://typhoon.psdn.io/addons/overview/) diff --git a/digital-ocean/fedora-coreos/kubernetes/bootstrap.tf b/digital-ocean/fedora-coreos/kubernetes/bootstrap.tf index 2f95cc53f..331905b00 100644 --- a/digital-ocean/fedora-coreos/kubernetes/bootstrap.tf +++ b/digital-ocean/fedora-coreos/kubernetes/bootstrap.tf @@ -1,6 +1,6 @@ # Kubernetes assets (kubeconfig, manifests) module "bootstrap" { - source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc" + source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=8ef2fe7c992a8c15d696bd3e3a97be713b025e64" cluster_name = var.cluster_name api_servers = [format("%s.%s", 
var.cluster_name, var.dns_zone)] diff --git a/digital-ocean/fedora-coreos/kubernetes/controllers.tf b/digital-ocean/fedora-coreos/kubernetes/controllers.tf index 5bf2c18a8..95d43bace 100644 --- a/digital-ocean/fedora-coreos/kubernetes/controllers.tf +++ b/digital-ocean/fedora-coreos/kubernetes/controllers.tf @@ -41,9 +41,10 @@ resource "digitalocean_droplet" "controllers" { size = var.controller_type # network - # TODO: Only official DigitalOcean images support IPv6 - ipv6 = false private_networking = true + vpc_uuid = digitalocean_vpc.network.id + # TODO: Only official DigitalOcean images support IPv6 + ipv6 = false user_data = data.ct_config.controller-ignitions.*.rendered[count.index] ssh_keys = var.ssh_fingerprints @@ -64,10 +65,10 @@ resource "digitalocean_tag" "controllers" { # Controller Ignition configs data "ct_config" "controller-ignitions" { - count = var.controller_count - content = data.template_file.controller-configs.*.rendered[count.index] - strict = true - snippets = var.controller_snippets + count = var.controller_count + content = data.template_file.controller-configs.*.rendered[count.index] + strict = true + snippets = var.controller_snippets } # Controller Fedora CoreOS configs diff --git a/digital-ocean/fedora-coreos/kubernetes/fcc/controller.yaml b/digital-ocean/fedora-coreos/kubernetes/fcc/controller.yaml index 6ccfb5ae0..7e404fc20 100644 --- a/digital-ocean/fedora-coreos/kubernetes/fcc/controller.yaml +++ b/digital-ocean/fedora-coreos/kubernetes/fcc/controller.yaml @@ -28,7 +28,7 @@ systemd: --network host \ --volume /var/lib/etcd:/var/lib/etcd:rw,Z \ --volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \ - quay.io/coreos/etcd:v3.4.7 + quay.io/coreos/etcd:v3.4.10 ExecStop=/usr/bin/podman stop etcd [Install] WantedBy=multi-user.target @@ -50,11 +50,12 @@ systemd: - name: kubelet.service contents: | [Unit] - Description=Kubelet via Hyperkube (System Container) + Description=Kubelet (System Container) Requires=afterburn.service After=afterburn.service 
Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.8 EnvironmentFile=/run/metadata/afterburn ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests @@ -81,10 +82,11 @@ systemd: --volume /var/log:/var/log \ --volume /var/run/lock:/var/run/lock:z \ --volume /opt/cni/bin:/opt/cni/bin:z \ - quay.io/poseidon/kubelet:v1.18.2 \ + $${KUBELET_IMAGE} \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ --cgroup-driver=systemd \ --cgroups-per-qos=true \ --enforce-node-allocatable=pods \ @@ -92,17 +94,15 @@ systemd: --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ --hostname-override=$${AFTERBURN_DIGITALOCEAN_IPV4_PRIVATE_0} \ - --kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ - --node-labels=node.kubernetes.io/master \ --node-labels=node.kubernetes.io/controller="true" \ --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ - --register-with-taints=node-role.kubernetes.io/master=:NoSchedule \ + --register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \ + --rotate-certificates \ --volume-plugin-dir=/var/lib/kubelet/volumeplugins ExecStop=-/usr/bin/podman stop kubelet Delegate=yes @@ -131,15 +131,17 @@ systemd: ExecStartPre=-/usr/bin/podman rm bootstrap ExecStart=/usr/bin/podman run --name bootstrap \ --network host \ - --volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,Z \ + --volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,z \ --volume /opt/bootstrap/assets:/assets:ro,Z \ --volume /opt/bootstrap/apply:/apply:ro,Z \ --entrypoint=/apply \ - quay.io/poseidon/kubelet:v1.18.2 + 
quay.io/poseidon/kubelet:v1.18.8 ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done ExecStartPost=-/usr/bin/podman stop bootstrap storage: directories: + - path: /var/lib/etcd + mode: 0700 - path: /etc/kubernetes - path: /opt/bootstrap files: @@ -158,11 +160,11 @@ storage: chmod -R 500 /etc/ssl/etcd mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/ mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/ - sudo mkdir -p /etc/kubernetes/manifests - sudo mv static-manifests/* /etc/kubernetes/manifests/ - sudo mkdir -p /opt/bootstrap/assets - sudo mv manifests /opt/bootstrap/assets/manifests - sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/ + mkdir -p /etc/kubernetes/manifests + mv static-manifests/* /etc/kubernetes/manifests/ + mkdir -p /opt/bootstrap/assets + mv manifests /opt/bootstrap/assets/manifests + mv manifests-networking/* /opt/bootstrap/assets/manifests/ rm -rf assets auth static-manifests tls manifests-networking - path: /opt/bootstrap/apply mode: 0544 @@ -182,6 +184,18 @@ storage: contents: inline: | fs.inotify.max_user_watches=16184 + - path: /etc/sysctl.d/reverse-path-filter.conf + contents: + inline: | + net.ipv4.conf.default.rp_filter=0 + net.ipv4.conf.*.rp_filter=0 + - path: /etc/systemd/network/50-flannel.link + contents: + inline: | + [Match] + OriginalName=flannel* + [Link] + MACAddressPolicy=none - path: /etc/systemd/system.conf.d/accounting.conf contents: inline: | diff --git a/digital-ocean/fedora-coreos/kubernetes/fcc/worker.yaml b/digital-ocean/fedora-coreos/kubernetes/fcc/worker.yaml index 8209e4e0b..d099fe29d 100644 --- a/digital-ocean/fedora-coreos/kubernetes/fcc/worker.yaml +++ b/digital-ocean/fedora-coreos/kubernetes/fcc/worker.yaml @@ -21,11 +21,12 @@ systemd: enabled: true contents: | [Unit] - Description=Kubelet via Hyperkube (System Container) + Description=Kubelet (System Container) Requires=afterburn.service After=afterburn.service Wants=rpc-statd.service [Service] + 
Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.8 EnvironmentFile=/run/metadata/afterburn ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests @@ -52,10 +53,11 @@ systemd: --volume /var/log:/var/log \ --volume /var/run/lock:/var/run/lock:z \ --volume /opt/cni/bin:/opt/cni/bin:z \ - quay.io/poseidon/kubelet:v1.18.2 \ + $${KUBELET_IMAGE} \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ --cgroup-driver=systemd \ --cgroups-per-qos=true \ --enforce-node-allocatable=pods \ @@ -63,15 +65,14 @@ systemd: --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ --hostname-override=$${AFTERBURN_DIGITALOCEAN_IPV4_PRIVATE_0} \ - --kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ --node-labels=node.kubernetes.io/node \ --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ + --rotate-certificates \ --volume-plugin-dir=/var/lib/kubelet/volumeplugins ExecStop=-/usr/bin/podman stop kubelet Delegate=yes @@ -97,7 +98,7 @@ systemd: Type=oneshot RemainAfterExit=true ExecStart=/bin/true - ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.2 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME' + ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.8 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME' [Install] WantedBy=multi-user.target storage: @@ -108,6 +109,18 @@ storage: contents: inline: | fs.inotify.max_user_watches=16184 + - path: /etc/sysctl.d/reverse-path-filter.conf 
+ contents: + inline: | + net.ipv4.conf.default.rp_filter=0 + net.ipv4.conf.*.rp_filter=0 + - path: /etc/systemd/network/50-flannel.link + contents: + inline: | + [Match] + OriginalName=flannel* + [Link] + MACAddressPolicy=none - path: /etc/systemd/system.conf.d/accounting.conf contents: inline: | diff --git a/digital-ocean/fedora-coreos/kubernetes/network.tf b/digital-ocean/fedora-coreos/kubernetes/network.tf index bc5434852..0d3438bbf 100644 --- a/digital-ocean/fedora-coreos/kubernetes/network.tf +++ b/digital-ocean/fedora-coreos/kubernetes/network.tf @@ -1,7 +1,22 @@ +# Network VPC +resource "digitalocean_vpc" "network" { + name = var.cluster_name + region = var.region + description = "Network for ${var.cluster_name} cluster" +} + resource "digitalocean_firewall" "rules" { name = var.cluster_name - tags = ["${var.cluster_name}-controller", "${var.cluster_name}-worker"] + tags = [ + digitalocean_tag.controllers.name, + digitalocean_tag.workers.name + ] + + inbound_rule { + protocol = "icmp" + source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name] + } # allow ssh, internal flannel, internal node-exporter, internal kubelet inbound_rule { @@ -10,12 +25,27 @@ resource "digitalocean_firewall" "rules" { source_addresses = ["0.0.0.0/0", "::/0"] } + # Cilium health + inbound_rule { + protocol = "tcp" + port_range = "4240" + source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name] + } + + # IANA vxlan (flannel, calico) inbound_rule { protocol = "udp" port_range = "4789" source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name] } + # Linux vxlan (Cilium) + inbound_rule { + protocol = "udp" + port_range = "8472" + source_tags = [digitalocean_tag.controllers.name, digitalocean_tag.workers.name] + } + # Allow Prometheus to scrape node-exporter inbound_rule { protocol = "tcp" @@ -30,6 +60,7 @@ resource "digitalocean_firewall" "rules" { source_tags = [digitalocean_tag.workers.name] } + # Kubelet 
inbound_rule { protocol = "tcp" port_range = "10250" @@ -59,7 +90,7 @@ resource "digitalocean_firewall" "rules" { resource "digitalocean_firewall" "controllers" { name = "${var.cluster_name}-controllers" - tags = ["${var.cluster_name}-controller"] + tags = [digitalocean_tag.controllers.name] # etcd inbound_rule { @@ -93,7 +124,7 @@ resource "digitalocean_firewall" "controllers" { resource "digitalocean_firewall" "workers" { name = "${var.cluster_name}-workers" - tags = ["${var.cluster_name}-worker"] + tags = [digitalocean_tag.workers.name] # allow HTTP/HTTPS ingress inbound_rule { diff --git a/digital-ocean/fedora-coreos/kubernetes/outputs.tf b/digital-ocean/fedora-coreos/kubernetes/outputs.tf index 429893c58..616eaf48a 100644 --- a/digital-ocean/fedora-coreos/kubernetes/outputs.tf +++ b/digital-ocean/fedora-coreos/kubernetes/outputs.tf @@ -2,6 +2,8 @@ output "kubeconfig-admin" { value = module.bootstrap.kubeconfig-admin } +# Outputs for Kubernetes Ingress + output "controllers_dns" { value = digitalocean_record.controllers[0].fqdn } @@ -45,3 +47,9 @@ output "worker_tag" { value = digitalocean_tag.workers.name } +# Outputs for custom load balancing + +output "vpc_id" { + description = "ID of the cluster VPC" + value = digitalocean_vpc.network.id +} diff --git a/digital-ocean/fedora-coreos/kubernetes/versions.tf b/digital-ocean/fedora-coreos/kubernetes/versions.tf index c0e31a276..807b39006 100644 --- a/digital-ocean/fedora-coreos/kubernetes/versions.tf +++ b/digital-ocean/fedora-coreos/kubernetes/versions.tf @@ -1,12 +1,20 @@ # Terraform version and plugin versions terraform { - required_version = "~> 0.12.6" + required_version = ">= 0.12.26, < 0.14.0" required_providers { - digitalocean = "~> 1.3" - ct = "~> 0.3" - template = "~> 2.1" - null = "~> 2.1" + template = "~> 2.1" + null = "~> 2.1" + + ct = { + source = "poseidon/ct" + version = "~> 0.6.1" + } + + digitalocean = { + source = "digitalocean/digitalocean" + version = "~> 1.20" + } } } diff --git 
a/digital-ocean/fedora-coreos/kubernetes/workers.tf b/digital-ocean/fedora-coreos/kubernetes/workers.tf index f00e5510e..f4db41af0 100644 --- a/digital-ocean/fedora-coreos/kubernetes/workers.tf +++ b/digital-ocean/fedora-coreos/kubernetes/workers.tf @@ -37,9 +37,10 @@ resource "digitalocean_droplet" "workers" { size = var.worker_type # network - # TODO: Only official DigitalOcean images support IPv6 - ipv6 = false private_networking = true + vpc_uuid = digitalocean_vpc.network.id + # TODO: Only official DigitalOcean images support IPv6 + ipv6 = false user_data = data.ct_config.worker-ignition.rendered ssh_keys = var.ssh_fingerprints @@ -60,9 +61,9 @@ resource "digitalocean_tag" "workers" { # Worker Ignition config data "ct_config" "worker-ignition" { - content = data.template_file.worker-config.rendered - strict = true - snippets = var.worker_snippets + content = data.template_file.worker-config.rendered + strict = true + snippets = var.worker_snippets } # Worker Fedora CoreOS config diff --git a/docs/advanced/customization.md b/docs/advanced/customization.md index 91552eb40..1c87aa94e 100644 --- a/docs/advanced/customization.md +++ b/docs/advanced/customization.md @@ -83,7 +83,7 @@ module "mercury" { } ``` -### Container Linux +### Flatcar Linux Define a Container Linux Config (CLC) ([config](https://github.com/coreos/container-linux-config-transpiler/blob/master/doc/configuration.md), [examples](https://github.com/coreos/container-linux-config-transpiler/blob/master/doc/examples.md)) in version control near your Terraform workspace directory (e.g. perhaps in a `snippets` subdirectory). You may organize snippets into multiple files, if desired. @@ -125,7 +125,7 @@ systemd: Environment="ETCD_LOG_PACKAGE_LEVELS=etcdserver=WARNING,security=DEBUG" ``` -Reference the CLC contents by location (e.g. `file("./custom-units.yaml")`). 
On [AWS](/cl/aws/#cluster), [Azure](/cl/azure/#cluster), [DigitalOcean](/cl/digital-ocean/#cluster), or [Google Cloud](/cl/google-cloud/#cluster) extend the `controller_snippets` or `worker_snippets` list variables. +Reference the CLC contents by location (e.g. `file("./custom-units.yaml")`). On [AWS](/flatcar-linux/aws/#cluster), [Azure](/flatcar-linux/azure/#cluster), [DigitalOcean](/flatcar-linux/digital-ocean/#cluster), or [Google Cloud](/flatcar-linux/google-cloud/#cluster) extend the `controller_snippets` or `worker_snippets` list variables. ```tf module "nemo" { @@ -145,7 +145,7 @@ module "nemo" { } ``` -On [Bare-Metal](/cl/bare-metal/#cluster), different CLCs may be used for each node (since hardware may be heterogeneous). Extend the `snippets` map variable by mapping a controller or worker name key to a list of snippets. +On [Bare-Metal](/flatcar-linux/bare-metal/#cluster), different CLCs may be used for each node (since hardware may be heterogeneous). Extend the `snippets` map variable by mapping a controller or worker name key to a list of snippets. ```tf module "mercury" { @@ -174,3 +174,34 @@ module "nemo" { To customize low-level Kubernetes control plane bootstrapping, see the [poseidon/terraform-render-bootstrap](https://github.com/poseidon/terraform-render-bootstrap) Terraform module. +## Kubelet + +Typhoon publishes Kubelet [container images](/topics/security/#container-images) to Quay.io (default) and to Dockerhub (in case of a Quay [outage](https://github.com/poseidon/typhoon/issues/735) or breach). Quay automated builds also provide the option for fully verifiable tagged images (`build-{short_sha}`). + +To set an alternative Kubelet image, use a snippet to set a systemd dropin. 
+ +``` +# host-image-override.yaml +variant: fcos <- remove for Flatcar Linux +version: 1.0.0 <- remove for Flatcar Linux +systemd: + units: + - name: kubelet.service + dropins: + - name: 10-image-override.conf + contents: | + [Service] + Environment=KUBELET_IMAGE=docker.io/psdn/kubelet:v1.18.3 +``` + +``` +module "nemo" { + ... + + worker_snippets = [ + file("./snippets/host-image-override.yaml") + ] + ... +} +``` + diff --git a/docs/advanced/worker-pools.md b/docs/advanced/worker-pools.md index 6f92646f1..4fce2d390 100644 --- a/docs/advanced/worker-pools.md +++ b/docs/advanced/worker-pools.md @@ -13,7 +13,7 @@ Internal Terraform Modules: ## AWS -Create a cluster following the AWS [tutorial](../cl/aws.md#cluster). Define a worker pool using the AWS internal `workers` module. +Create a cluster following the AWS [tutorial](../flatcar-linux/aws.md#cluster). Define a worker pool using the AWS internal `workers` module. ```tf module "tempest-worker-pool" { @@ -65,7 +65,8 @@ The AWS internal `workers` module supports a number of [variables](https://githu |:-----|:------------|:--------|:--------| | worker_count | Number of instances | 1 | 3 | | instance_type | EC2 instance type | "t3.small" | "t3.medium" | -| os_image | AMI channel for a Container Linux derivative | "flatcar-stable" | flatcar-stable, flatcar-beta, flatcar-alph, coreos-stable, coreos-beta, coreos-alpha | +| os_image | AMI channel for a Container Linux derivative | "flatcar-stable" | flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge | +| os_stream | Fedora CoreOS stream for compute instances | "stable" | "testing", "next" | | disk_size | Size of the EBS volume in GB | 40 | 100 | | disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 | | disk_iops | IOPS of the EBS volume | 0 (i.e. auto) | 400 | @@ -78,11 +79,11 @@ Check the list of valid [instance types](https://aws.amazon.com/ec2/instance-typ ## Azure -Create a cluster following the Azure [tutorial](../cl/azure.md#cluster).
Define a worker pool using the Azure internal `workers` module. +Create a cluster following the Azure [tutorial](../flatcar-linux/azure.md#cluster). Define a worker pool using the Azure internal `workers` module. ```tf module "ramius-worker-pool" { - source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes/workers?ref=v1.18.2" + source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes/workers?ref=v1.18.8" # Azure region = module.ramius.region @@ -134,7 +135,7 @@ The Azure internal `workers` module supports a number of [variables](https://git |:-----|:------------|:--------|:--------| | worker_count | Number of instances | 1 | 3 | | vm_type | Machine type for instances | "Standard_DS1_v2" | See below | -| os_image | Channel for a Container Linux derivative | "flatcar-stable" | flatcar-stable, flatcar-beta, coreos-stable, coreos-beta, coreos-alpha | +| os_image | Channel for a Container Linux derivative | "flatcar-stable" | flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge | | priority | Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | "Regular" | "Spot" | | snippets | Container Linux Config snippets | [] | [examples](/advanced/customization/) | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | @@ -144,11 +145,11 @@ Check the list of valid [machine types](https://azure.microsoft.com/en-us/pricin ## Google Cloud -Create a cluster following the Google Cloud [tutorial](../cl/google-cloud.md#cluster). Define a worker pool using the Google Cloud internal `workers` module. +Create a cluster following the Google Cloud [tutorial](../flatcar-linux/google-cloud.md#cluster). Define a worker pool using the Google Cloud internal `workers` module. 
```tf module "yavin-worker-pool" { - source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.18.2" + source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes/workers?ref=v1.18.8" # Google Cloud region = "europe-west2" @@ -179,11 +180,11 @@ Verify a managed instance group of workers joins the cluster within a few minute ``` $ kubectl get nodes NAME STATUS AGE VERSION -yavin-controller-0.c.example-com.internal Ready 6m v1.18.2 -yavin-worker-jrbf.c.example-com.internal Ready 5m v1.18.2 -yavin-worker-mzdm.c.example-com.internal Ready 5m v1.18.2 -yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.18.2 -yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.18.2 +yavin-controller-0.c.example-com.internal Ready 6m v1.18.8 +yavin-worker-jrbf.c.example-com.internal Ready 5m v1.18.8 +yavin-worker-mzdm.c.example-com.internal Ready 5m v1.18.8 +yavin-16x-worker-jrbf.c.example-com.internal Ready 3m v1.18.8 +yavin-16x-worker-mzdm.c.example-com.internal Ready 3m v1.18.8 ``` ### Variables @@ -199,7 +200,7 @@ The Google Cloud internal `workers` module supports a number of [variables](http | region | Region for the worker pool instances. May differ from the cluster's region | "europe-west2" | | network | Must be set to `network_name` output by cluster | module.cluster.network_name | | kubeconfig | Must be set to `kubeconfig` output by cluster | module.cluster.kubeconfig | -| os_image | Container Linux image for compute instances | "fedora-coreos-or-flatcar-image", coreos-stable, coreos-beta, coreos-alpha | +| os_image | Container Linux image for compute instances | "uploaded-flatcar-image" | | ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." | Check the list of regions [docs](https://cloud.google.com/compute/docs/regions-zones/regions-zones) or with `gcloud compute regions list`. 
@@ -210,6 +211,7 @@ Check the list of regions [docs](https://cloud.google.com/compute/docs/regions-z |:-----|:------------|:--------|:--------| | worker_count | Number of instances | 1 | 3 | | machine_type | Compute instance machine type | "n1-standard-1" | See below | +| os_stream | Fedora CoreOS stream for compute instances | "stable" | "testing", "next" | | disk_size | Size of the disk in GB | 40 | 100 | | preemptible | If true, Compute Engine will terminate instances randomly within 24 hours | false | true | | snippets | Container Linux Config snippets | [] | [examples](/advanced/customization/) | diff --git a/docs/architecture/digitalocean.md b/docs/architecture/digitalocean.md index ba8a21dec..5bb045563 100644 --- a/docs/architecture/digitalocean.md +++ b/docs/architecture/digitalocean.md @@ -30,6 +30,7 @@ Add a DigitalOcean load balancer to distribute IPv4 TCP traffic (HTTP/HTTPS Ingr resource "digitalocean_loadbalancer" "ingress" { name = "ingress" region = "fra1" + vpc_uuid = module.nemo.vpc_id droplet_tag = module.nemo.worker_tag healthcheck { diff --git a/docs/architecture/operating-systems.md b/docs/architecture/operating-systems.md index 276aff1cb..3ebdbd051 100644 --- a/docs/architecture/operating-systems.md +++ b/docs/architecture/operating-systems.md @@ -1,6 +1,6 @@ # Operating Systems -Typhoon supports [Fedora CoreOS](https://getfedora.org/coreos/), [Flatcar Linux](https://www.flatcar-linux.org/) and Container Linux (EOL in May 2020). These operating systems were chosen because they offer: +Typhoon supports [Fedora CoreOS](https://getfedora.org/coreos/) and [Flatcar Linux](https://www.flatcar-linux.org/). 
These operating systems were chosen because they offer: * Minimalism and focus on clustered operation * Automated and atomic operating system upgrades diff --git a/docs/fedora-coreos/aws.md b/docs/fedora-coreos/aws.md index cdcef4d01..cff2cac7e 100644 --- a/docs/fedora-coreos/aws.md +++ b/docs/fedora-coreos/aws.md @@ -1,6 +1,6 @@ # AWS -In this tutorial, we'll create a Kubernetes v1.18.2 cluster on AWS with Fedora CoreOS. +In this tutorial, we'll create a Kubernetes v1.18.8 cluster on AWS with Fedora CoreOS. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets. @@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se * AWS Account and IAM credentials * AWS Route53 DNS Zone (registered Domain Name or delegated subdomain) -* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally +* Terraform v0.13.0+ ## Terraform Setup -Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system. +Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system. ```sh $ terraform version -Terraform v0.12.21 -``` - -Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. - -```sh -wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0 +Terraform v0.13.0 ``` Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. 
`infra`). @@ -49,13 +41,23 @@ Configure the AWS provider to use your access key credentials in a `providers.tf ```tf provider "aws" { - version = "2.53.0" region = "eu-central-1" shared_credentials_file = "/home/user/.config/aws/credentials" } -provider "ct" { - version = "0.5.0" +provider "ct" {} + +terraform { + required_providers { + ct = { + source = "poseidon/ct" + version = "0.6.1" + } + aws = { + source = "hashicorp/aws" + version = "3.2.0" + } + } } ``` @@ -70,7 +72,7 @@ Define a Kubernetes cluster using the module `aws/fedora-coreos/kubernetes`. ```tf module "tempest" { - source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.18.2" + source = "git::https://github.com/poseidon/typhoon//aws/fedora-coreos/kubernetes?ref=v1.18.8" # AWS cluster_name = "tempest" @@ -143,9 +145,9 @@ List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/tempest-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION -ip-10-0-3-155 Ready 10m v1.18.2 -ip-10-0-26-65 Ready 10m v1.18.2 -ip-10-0-41-21 Ready 10m v1.18.2 +ip-10-0-3-155 Ready 10m v1.18.8 +ip-10-0-26-65 Ready 10m v1.18.8 +ip-10-0-41-21 Ready 10m v1.18.8 ``` List the pods. @@ -208,7 +210,7 @@ Reference the DNS zone id with `aws_route53_zone.zone-for-clusters.zone_id`. | worker_count | Number of workers | 1 | 3 | | controller_type | EC2 instance type for controllers | "t3.small" | See below | | worker_type | EC2 instance type for workers | "t3.small" | See below | -| os_image | AMI channel for Fedora CoreOS | not yet used | ? | +| os_stream | Fedora CoreOS stream for compute instances | "stable" | "testing", "next" | | disk_size | Size of the EBS volume in GB | 40 | 100 | | disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 | | disk_iops | IOPS of the EBS volume | 0 (i.e. auto) | 400 | @@ -216,7 +218,7 @@ Reference the DNS zone id with `aws_route53_zone.zone-for-clusters.zone_id`. 
| worker_price | Spot price in USD for worker instances or 0 to use on-demand instances | 0 | 0.10 | | controller_snippets | Controller Fedora CoreOS Config snippets | [] | [examples](/advanced/customization/) | | worker_snippets | Worker Fedora CoreOS Config snippets | [] | [examples](/advanced/customization/) | -| networking | Choice of networking provider | "calico" | "calico" or "flannel" | +| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" | | network_mtu | CNI interface MTU (calico only) | 1480 | 8981 | | host_cidr | CIDR IPv4 range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | diff --git a/docs/fedora-coreos/azure.md b/docs/fedora-coreos/azure.md index 719e3030f..0e091ea8c 100644 --- a/docs/fedora-coreos/azure.md +++ b/docs/fedora-coreos/azure.md @@ -1,6 +1,6 @@ # Azure -In this tutorial, we'll create a Kubernetes v1.18.2 cluster on Azure with Fedora CoreOS. +In this tutorial, we'll create a Kubernetes v1.18.8 cluster on Azure with Fedora CoreOS. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a resource group, virtual network, subnets, security groups, controller availability set, worker scale set, load balancer, and TLS assets. @@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se * Azure account * Azure DNS Zone (registered Domain Name or delegated subdomain) -* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally +* Terraform v0.13.0+ ## Terraform Setup -Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system. +Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system. 
```sh $ terraform version -Terraform v0.12.21 -``` - -Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. - -```sh -wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0 +Terraform v0.13.0 ``` Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`). @@ -47,11 +39,22 @@ Configure the Azure provider in a `providers.tf` file. ```tf provider "azurerm" { - version = "2.5.0" + features {} } -provider "ct" { - version = "0.5.0" +provider "ct" {} + +terraform { + required_providers { + ct = { + source = "poseidon/ct" + version = "0.6.1" + } + azurerm = { + source = "hashicorp/azurerm" + version = "2.23.0" + } + } } ``` @@ -67,7 +70,7 @@ Fedora CoreOS publishes images for Azure, but does not yet upload them. Azure al xz -d fedora-coreos-31.20200323.3.2-azure.x86_64.vhd.xz ``` -Create an Azure disk (note its ID) and create an Azure image from it (note its ID). +Create an Azure disk (note disk ID) and create an Azure image from it (note image ID). ``` az disk create --name fedora-coreos-31.20200323.3.2 -g GROUP --source https://BUCKET.blob.core.windows.net/fedora-coreos/fedora-coreos-31.20200323.3.2-azure.x86_64.vhd @@ -83,7 +86,7 @@ Define a Kubernetes cluster using the module `azure/fedora-coreos/kubernetes`. ```tf module "ramius" { - source = "git::https://github.com/poseidon/typhoon//azure/fedora-coreos/kubernetes?ref=v1.18.2" + source = "git::https://github.com/poseidon/typhoon//azure/fedora-coreos/kubernetes?ref=v1.18.8" # Azure cluster_name = "ramius" @@ -158,9 +161,9 @@ List nodes in the cluster. 
$ export KUBECONFIG=/home/user/.kube/configs/ramius-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION -ramius-controller-0 Ready 24m v1.18.2 -ramius-worker-000001 Ready 25m v1.18.2 -ramius-worker-000002 Ready 24m v1.18.2 +ramius-controller-0 Ready 24m v1.18.8 +ramius-worker-000001 Ready 25m v1.18.8 +ramius-worker-000002 Ready 24m v1.18.8 ``` List the pods. @@ -242,7 +245,7 @@ Reference the DNS zone with `azurerm_dns_zone.clusters.name` and its resource gr | worker_priority | Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | Regular | Spot | | controller_snippets | Controller Fedora CoreOS Config snippets | [] | [example](/advanced/customization/#usage) | | worker_snippets | Worker Fedora CoreOS Config snippets | [] | [example](/advanced/customization/#usage) | -| networking | Choice of networking provider | "calico" | "flannel" or "calico" | +| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" | | host_cidr | CIDR IPv4 range to assign to instances | "10.0.0.0/16" | "10.0.0.0/20" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | diff --git a/docs/fedora-coreos/bare-metal.md b/docs/fedora-coreos/bare-metal.md index f789db64a..68e887021 100644 --- a/docs/fedora-coreos/bare-metal.md +++ b/docs/fedora-coreos/bare-metal.md @@ -1,6 +1,6 @@ # Bare-Metal -In this tutorial, we'll network boot and provision a Kubernetes v1.18.2 cluster on bare-metal with Fedora CoreOS. +In this tutorial, we'll network boot and provision a Kubernetes v1.18.8 cluster on bare-metal with Fedora CoreOS. First, we'll deploy a [Matchbox](https://github.com/poseidon/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. 
On PXE boot, machines will install Fedora CoreOS to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition. @@ -12,7 +12,7 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se * PXE-enabled [network boot](https://coreos.com/matchbox/docs/latest/network-setup.html) environment (with HTTPS support) * Matchbox v0.6+ deployment with API enabled * Matchbox credentials `client.crt`, `client.key`, `ca.crt` -* Terraform v0.12.6+, [terraform-provider-matchbox](https://github.com/poseidon/terraform-provider-matchbox), and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally +* Terraform v0.13.0+ ## Machines @@ -107,27 +107,11 @@ Read about the [many ways](https://coreos.com/matchbox/docs/latest/network-setup ## Terraform Setup -Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system. +Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system. ```sh $ terraform version -Terraform v0.12.21 -``` - -Add the [terraform-provider-matchbox](https://github.com/poseidon/terraform-provider-matchbox) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. - -```sh -wget https://github.com/poseidon/terraform-provider-matchbox/releases/download/v0.3.0/terraform-provider-matchbox-v0.3.0-linux-amd64.tar.gz -tar xzf terraform-provider-matchbox-v0.3.0-linux-amd64.tar.gz -mv terraform-provider-matchbox-v0.3.0-linux-amd64/terraform-provider-matchbox ~/.terraform.d/plugins/terraform-provider-matchbox_v0.3.0 -``` - -Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. 
- -```sh -wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0 +Terraform v0.13.0 ``` Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`). @@ -142,15 +126,25 @@ Configure the Matchbox provider to use your Matchbox API endpoint and client cer ```tf provider "matchbox" { - version = "0.3.0" endpoint = "matchbox.example.com:8081" client_cert = file("~/.config/matchbox/client.crt") client_key = file("~/.config/matchbox/client.key") ca = file("~/.config/matchbox/ca.crt") } -provider "ct" { - version = "0.5.0" +provider "ct" {} + +terraform { + required_providers { + ct = { + source = "poseidon/ct" + version = "0.6.1" + } + matchbox = { + source = "poseidon/matchbox" + version = "0.4.1" + } + } } ``` @@ -160,7 +154,7 @@ Define a Kubernetes cluster using the module `bare-metal/fedora-coreos/kubernete ```tf module "mercury" { - source = "git::https://github.com/poseidon/typhoon//bare-metal/fedora-coreos/kubernetes?ref=v1.18.2" + source = "git::https://github.com/poseidon/typhoon//bare-metal/fedora-coreos/kubernetes?ref=v1.18.8" # bare-metal cluster_name = "mercury" @@ -289,9 +283,9 @@ List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/mercury-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION -node1.example.com Ready 10m v1.18.2 -node2.example.com Ready 10m v1.18.2 -node3.example.com Ready 10m v1.18.2 +node1.example.com Ready 10m v1.18.8 +node2.example.com Ready 10m v1.18.8 +node3.example.com Ready 10m v1.18.8 ``` List the pods. 
@@ -339,7 +333,7 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/bare-me |:-----|:------------|:--------|:--------| | cached_install | PXE boot and install from the Matchbox `/assets` cache. Admin MUST have downloaded Fedora CoreOS images into the cache | false | true | | install_disk | Disk device where Fedora CoreOS should be installed | "sda" (not "/dev/sda" like Container Linux) | "sdb" | -| networking | Choice of networking provider | "calico" | "calico" or "flannel" | +| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" | | network_mtu | CNI interface MTU (calico-only) | 1480 | - | | snippets | Map from machine names to lists of Fedora CoreOS Config snippets | {} | [examples](/advanced/customization/) | | network_ip_autodetection_method | Method to detect host IPv4 address (calico-only) | "first-found" | "can-reach=10.0.0.1" | diff --git a/docs/fedora-coreos/digitalocean.md b/docs/fedora-coreos/digitalocean.md index 81d4287c5..76bdb5fd4 100644 --- a/docs/fedora-coreos/digitalocean.md +++ b/docs/fedora-coreos/digitalocean.md @@ -1,6 +1,6 @@ -# Digital Ocean +# DigitalOcean -In this tutorial, we'll create a Kubernetes v1.18.2 cluster on DigitalOcean with Fedora CoreOS. +In this tutorial, we'll create a Kubernetes v1.18.8 cluster on DigitalOcean with Fedora CoreOS. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create controller droplets, worker droplets, DNS records, tags, and TLS assets. @@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se * Digital Ocean Account and Token * Digital Ocean Domain (registered Domain Name or delegated subdomain) -* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally +* Terraform v0.13.0+ ## Terraform Setup -Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system. 
+Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system. ```sh $ terraform version -Terraform v0.12.21 -``` - -Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. - -```sh -wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0 +Terraform v0.13.0 ``` Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`). @@ -50,12 +42,22 @@ Configure the DigitalOcean provider to use your token in a `providers.tf` file. ```tf provider "digitalocean" { - version = "1.15.1" token = "${chomp(file("~/.config/digital-ocean/token"))}" } -provider "ct" { - version = "0.5.0" +provider "ct" {} + +terraform { + required_providers { + ct = { + source = "poseidon/ct" + version = "0.6.1" + } + digitalocean = { + source = "digitalocean/digitalocean" + version = "1.22.1" + } + } } ``` @@ -79,7 +81,7 @@ Define a Kubernetes cluster using the module `digital-ocean/fedora-coreos/kubern ```tf module "nemo" { - source = "git::https://github.com/poseidon/typhoon//digital-ocean/fedora-coreos/kubernetes?ref=v1.18.2" + source = "git::https://github.com/poseidon/typhoon//digital-ocean/fedora-coreos/kubernetes?ref=v1.18.8" # Digital Ocean cluster_name = "nemo" @@ -153,9 +155,9 @@ List nodes in the cluster. 
$ export KUBECONFIG=/home/user/.kube/configs/nemo-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION -10.132.110.130 Ready 10m v1.18.2 -10.132.115.81 Ready 10m v1.18.2 -10.132.124.107 Ready 10m v1.18.2 +10.132.110.130 Ready 10m v1.18.8 +10.132.115.81 Ready 10m v1.18.8 +10.132.124.107 Ready 10m v1.18.8 ``` List the pods. @@ -238,7 +240,7 @@ Digital Ocean requires the SSH public key be uploaded to your account, so you ma | worker_type | Droplet type for workers | "s-1vcpu-2gb" | s-1vcpu-2gb, s-2vcpu-2gb, ... | | controller_snippets | Controller Fedora CoreOS Config snippets | [] | [example](/advanced/customization/) | | worker_snippets | Worker Fedora CoreOS Config snippets | [] | [example](/advanced/customization/) | -| networking | Choice of networking provider | "calico" | "flannel" or "calico" | +| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | diff --git a/docs/fedora-coreos/google-cloud.md b/docs/fedora-coreos/google-cloud.md index a635c6bf8..52dbf7a06 100644 --- a/docs/fedora-coreos/google-cloud.md +++ b/docs/fedora-coreos/google-cloud.md @@ -1,6 +1,6 @@ # Google Cloud -In this tutorial, we'll create a Kubernetes v1.18.2 cluster on Google Compute Engine with Fedora CoreOS. +In this tutorial, we'll create a Kubernetes v1.18.8 cluster on Google Compute Engine with Fedora CoreOS. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a network, firewall rules, health checks, controller instances, worker managed instance group, load balancers, and TLS assets. 
@@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se * Google Cloud Account and Service Account * Google Cloud DNS Zone (registered Domain Name or delegated subdomain) -* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally +* Terraform v0.13.0+ ## Terraform Setup -Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system. +Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system. ```sh $ terraform version -Terraform v0.12.21 -``` - -Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. - -```sh -wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0 +Terraform v0.13.0 ``` Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`). @@ -49,14 +41,24 @@ Configure the Google Cloud provider to use your service account key, project-id, ```tf provider "google" { - version = "3.12.0" project = "project-id" region = "us-central1" credentials = file("~/.config/google-cloud/terraform.json") } -provider "ct" { - version = "0.5.0" +provider "ct" {} + +terraform { + required_providers { + ct = { + source = "poseidon/ct" + version = "0.6.1" + } + google = { + source = "hashicorp/google" + version = "3.34.0" + } + } } ``` @@ -65,25 +67,6 @@ Additional configuration options are described in the `google` provider [docs](h !!! tip Regions are listed in [docs](https://cloud.google.com/compute/docs/regions-zones/regions-zones) or with `gcloud compute regions list`. 
A project may contain multiple clusters across different regions. -## Fedora CoreOS Images - -Fedora CoreOS publishes images for Google Cloud, but does not yet upload them. Google Cloud allows [custom boot images](https://cloud.google.com/compute/docs/images/import-existing-image) to be uploaded to a bucket and imported into your project. - -[Download](https://getfedora.org/coreos/download/) a Fedora CoreOS GCP gzipped tarball and upload it to a Google Cloud storage bucket. - -``` -gsutil list -gsutil cp fedora-coreos-31.20200323.3.2-gcp.x86_64.tar.gz gs://BUCKET -``` - -Create a Compute Engine image from the file. - -``` -gcloud compute images create fedora-coreos-31-20200323-3-2 --source-uri gs://BUCKET/fedora-coreos-31.20200323.3.2-gcp.x86_64.tar.gz -``` - -Set the [os_image](#variables) in the next step. - ## Cluster Define a Kubernetes cluster using the module `google-cloud/fedora-coreos/kubernetes`. @@ -99,7 +82,6 @@ module "yavin" { dns_zone_name = "example-zone" # configuration - os_image = "fedora-coreos-31-20200323-3-2" ssh_authorized_key = "ssh-rsa AAAAB3Nz..." # optional @@ -165,9 +147,9 @@ List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/yavin-config $ kubectl get nodes NAME ROLES STATUS AGE VERSION -yavin-controller-0.c.example-com.internal Ready 6m v1.18.2 -yavin-worker-jrbf.c.example-com.internal Ready 5m v1.18.2 -yavin-worker-mzdm.c.example-com.internal Ready 5m v1.18.2 +yavin-controller-0.c.example-com.internal Ready 6m v1.18.8 +yavin-worker-jrbf.c.example-com.internal Ready 5m v1.18.8 +yavin-worker-mzdm.c.example-com.internal Ready 5m v1.18.8 ``` List the pods. 
@@ -204,7 +186,6 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/google- | region | Google Cloud region | "us-central1" | | dns_zone | Google Cloud DNS zone | "google-cloud.example.com" | | dns_zone_name | Google Cloud DNS zone name | "example-zone" | -| os_image | Fedora CoreOS image for compute instances | "fedora-coreos-31-20200323-3-2" | | ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." | Check the list of valid [regions](https://cloud.google.com/compute/docs/regions-zones/regions-zones) and list Fedora CoreOS [images](https://cloud.google.com/compute/docs/images) with `gcloud compute images list | grep fedora-coreos`. @@ -234,11 +215,12 @@ resource "google_dns_managed_zone" "zone-for-clusters" { | worker_count | Number of workers | 1 | 3 | | controller_type | Machine type for controllers | "n1-standard-1" | See below | | worker_type | Machine type for workers | "n1-standard-1" | See below | +| os_stream | Fedora CoreOS stream for compute instances | "stable" | "stable", "testing", "next" | | disk_size | Size of the disk in GB | 40 | 100 | | worker_preemptible | If enabled, Compute Engine will terminate workers randomly within 24 hours | false | true | | controller_snippets | Controller Fedora CoreOS Config snippets | [] | [examples](/advanced/customization/) | | worker_snippets | Worker Fedora CoreOS Config snippets | [] | [examples](/advanced/customization/) | -| networking | Choice of networking provider | "calico" | "calico" or "flannel" | +| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | | worker_node_labels | List of initial worker node labels | [] | ["worker-pool=default"] | diff --git a/docs/cl/aws.md b/docs/flatcar-linux/aws.md similarity index 89% rename from 
docs/cl/aws.md rename to docs/flatcar-linux/aws.md index afc469cea..1c371b9b2 100644 --- a/docs/cl/aws.md +++ b/docs/flatcar-linux/aws.md @@ -1,6 +1,6 @@ # AWS -In this tutorial, we'll create a Kubernetes v1.18.2 cluster on AWS with CoreOS Container Linux or Flatcar Linux. +In this tutorial, we'll create a Kubernetes v1.18.8 cluster on AWS with CoreOS Container Linux or Flatcar Linux. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a VPC, gateway, subnets, security groups, controller instances, worker auto-scaling group, network load balancer, and TLS assets. @@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se * AWS Account and IAM credentials * AWS Route53 DNS Zone (registered Domain Name or delegated subdomain) -* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally +* Terraform v0.13.0+ ## Terraform Setup -Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system. +Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system. ```sh $ terraform version -Terraform v0.12.21 -``` - -Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. - -```sh -wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0 +Terraform v0.13.0 ``` Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`). 
@@ -49,13 +41,23 @@ Configure the AWS provider to use your access key credentials in a `providers.tf ```tf provider "aws" { - version = "2.53.0" region = "eu-central-1" shared_credentials_file = "/home/user/.config/aws/credentials" } -provider "ct" { - version = "0.5.0" +provider "ct" {} + +terraform { + required_providers { + ct = { + source = "poseidon/ct" + version = "0.6.1" + } + aws = { + source = "hashicorp/aws" + version = "3.2.0" + } + } } ``` @@ -70,7 +72,7 @@ Define a Kubernetes cluster using the module `aws/container-linux/kubernetes`. ```tf module "tempest" { - source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.18.2" + source = "git::https://github.com/poseidon/typhoon//aws/container-linux/kubernetes?ref=v1.18.8" # AWS cluster_name = "tempest" @@ -143,9 +145,9 @@ List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/tempest-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION -ip-10-0-3-155 Ready 10m v1.18.2 -ip-10-0-26-65 Ready 10m v1.18.2 -ip-10-0-41-21 Ready 10m v1.18.2 +ip-10-0-3-155 Ready 10m v1.18.8 +ip-10-0-26-65 Ready 10m v1.18.8 +ip-10-0-41-21 Ready 10m v1.18.8 ``` List the pods. @@ -208,7 +210,7 @@ Reference the DNS zone id with `aws_route53_zone.zone-for-clusters.zone_id`. | worker_count | Number of workers | 1 | 3 | | controller_type | EC2 instance type for controllers | "t3.small" | See below | | worker_type | EC2 instance type for workers | "t3.small" | See below | -| os_image | AMI channel for a Container Linux derivative | "flatcar-stable" | coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge | +| os_image | AMI channel for a Container Linux derivative | "flatcar-stable" | flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge | | disk_size | Size of the EBS volume in GB | 40 | 100 | | disk_type | Type of the EBS volume | "gp2" | standard, gp2, io1 | | disk_iops | IOPS of the EBS volume | 0 (i.e. 
auto) | 400 | @@ -216,7 +218,7 @@ Reference the DNS zone id with `aws_route53_zone.zone-for-clusters.zone_id`. | worker_price | Spot price in USD for worker instances or 0 to use on-demand instances | 0/null | 0.10 | | controller_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) | | worker_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) | -| networking | Choice of networking provider | "calico" | "calico" or "flannel" | +| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" | | network_mtu | CNI interface MTU (calico only) | 1480 | 8981 | | host_cidr | CIDR IPv4 range to assign to EC2 instances | "10.0.0.0/16" | "10.1.0.0/16" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | diff --git a/docs/cl/azure.md b/docs/flatcar-linux/azure.md similarity index 89% rename from docs/cl/azure.md rename to docs/flatcar-linux/azure.md index a721a06ba..af38aca60 100644 --- a/docs/cl/azure.md +++ b/docs/flatcar-linux/azure.md @@ -1,6 +1,6 @@ # Azure -In this tutorial, we'll create a Kubernetes v1.18.2 cluster on Azure with CoreOS Container Linux or Flatcar Linux. +In this tutorial, we'll create a Kubernetes v1.18.8 cluster on Azure with CoreOS Container Linux or Flatcar Linux. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a resource group, virtual network, subnets, security groups, controller availability set, worker scale set, load balancer, and TLS assets. 
@@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se * Azure account * Azure DNS Zone (registered Domain Name or delegated subdomain) -* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally +* Terraform v0.13.0+ ## Terraform Setup -Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system. +Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system. ```sh $ terraform version -Terraform v0.12.21 -``` - -Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. - -```sh -wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0 +Terraform v0.13.0 ``` Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`). @@ -47,23 +39,43 @@ Configure the Azure provider in a `providers.tf` file. ```tf provider "azurerm" { - version = "2.5.0" + features {} } -provider "ct" { - version = "0.5.0" +provider "ct" {} + +terraform { + required_providers { + ct = { + source = "poseidon/ct" + version = "0.6.1" + } + azurerm = { + source = "hashicorp/azurerm" + version = "2.23.0" + } + } } ``` Additional configuration options are described in the `azurerm` provider [docs](https://www.terraform.io/docs/providers/azurerm/). +## Flatcar Linux Images + +Flatcar Linux publishes images to the Azure Marketplace and requires accepting terms. 
+ +``` +az vm image terms show --publish kinvolk --offer flatcar-container-linux-free --plan stable +az vm image terms accept --publish kinvolk --offer flatcar-container-linux-free --plan stable +``` + ## Cluster Define a Kubernetes cluster using the module `azure/container-linux/kubernetes`. ```tf module "ramius" { - source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes?ref=v1.18.2" + source = "git::https://github.com/poseidon/typhoon//azure/container-linux/kubernetes?ref=v1.18.8" # Azure cluster_name = "ramius" @@ -146,9 +158,9 @@ List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/ramius-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION -ramius-controller-0 Ready 24m v1.18.2 -ramius-worker-000001 Ready 25m v1.18.2 -ramius-worker-000002 Ready 24m v1.18.2 +ramius-controller-0 Ready 24m v1.18.8 +ramius-worker-000001 Ready 25m v1.18.8 +ramius-worker-000002 Ready 24m v1.18.8 ``` List the pods. @@ -225,12 +237,12 @@ Reference the DNS zone with `azurerm_dns_zone.clusters.name` and its resource gr | worker_count | Number of workers | 1 | 3 | | controller_type | Machine type for controllers | "Standard_B2s" | See below | | worker_type | Machine type for workers | "Standard_DS1_v2" | See below | -| os_image | Channel for a Container Linux derivative | "flatcar-stable" | coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta | +| os_image | Channel for a Container Linux derivative | "flatcar-stable" | flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge | | disk_size | Size of the disk in GB | 40 | 100 | | worker_priority | Set priority to Spot to use reduced cost surplus capacity, with the tradeoff that instances can be deallocated at any time | Regular | Spot | | controller_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/#usage) | | worker_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/#usage) | -| 
networking | Choice of networking provider | "calico" | "flannel" or "calico" | +| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" | | host_cidr | CIDR IPv4 range to assign to instances | "10.0.0.0/16" | "10.0.0.0/20" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | diff --git a/docs/cl/bare-metal.md b/docs/flatcar-linux/bare-metal.md similarity index 89% rename from docs/cl/bare-metal.md rename to docs/flatcar-linux/bare-metal.md index e9b8ab1ab..2a5dbda98 100644 --- a/docs/cl/bare-metal.md +++ b/docs/flatcar-linux/bare-metal.md @@ -1,6 +1,6 @@ # Bare-Metal -In this tutorial, we'll network boot and provision a Kubernetes v1.18.2 cluster on bare-metal with CoreOS Container Linux or Flatcar Linux. +In this tutorial, we'll network boot and provision a Kubernetes v1.18.8 cluster on bare-metal with CoreOS Container Linux or Flatcar Linux. First, we'll deploy a [Matchbox](https://github.com/poseidon/matchbox) service and setup a network boot environment. Then, we'll declare a Kubernetes cluster using the Typhoon Terraform module and power on machines. On PXE boot, machines will install Container Linux to disk, reboot into the disk install, and provision themselves as Kubernetes controllers or workers via Ignition. 
@@ -12,7 +12,7 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se * PXE-enabled [network boot](https://coreos.com/matchbox/docs/latest/network-setup.html) environment (with HTTPS support) * Matchbox v0.6+ deployment with API enabled * Matchbox credentials `client.crt`, `client.key`, `ca.crt` -* Terraform v0.12.6+, [terraform-provider-matchbox](https://github.com/poseidon/terraform-provider-matchbox), and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally +* Terraform v0.13.0+ ## Machines @@ -107,27 +107,11 @@ Read about the [many ways](https://coreos.com/matchbox/docs/latest/network-setup ## Terraform Setup -Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system. +Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system. ```sh $ terraform version -Terraform v0.12.21 -``` - -Add the [terraform-provider-matchbox](https://github.com/poseidon/terraform-provider-matchbox) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. - -```sh -wget https://github.com/poseidon/terraform-provider-matchbox/releases/download/v0.3.0/terraform-provider-matchbox-v0.3.0-linux-amd64.tar.gz -tar xzf terraform-provider-matchbox-v0.3.0-linux-amd64.tar.gz -mv terraform-provider-matchbox-v0.3.0-linux-amd64/terraform-provider-matchbox ~/.terraform.d/plugins/terraform-provider-matchbox_v0.3.0 -``` - -Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. 
- -```sh -wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0 +Terraform v0.13.0 ``` Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`). @@ -142,15 +126,25 @@ Configure the Matchbox provider to use your Matchbox API endpoint and client cer ```tf provider "matchbox" { - version = "0.3.0" endpoint = "matchbox.example.com:8081" client_cert = file("~/.config/matchbox/client.crt") client_key = file("~/.config/matchbox/client.key") ca = file("~/.config/matchbox/ca.crt") } -provider "ct" { - version = "0.5.0" +provider "ct" {} + +terraform { + required_providers { + ct = { + source = "poseidon/ct" + version = "0.6.1" + } + matchbox = { + source = "poseidon/matchbox" + version = "0.4.1" + } + } } ``` @@ -160,7 +154,7 @@ Define a Kubernetes cluster using the module `bare-metal/container-linux/kuberne ```tf module "mercury" { - source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.18.2" + source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.18.8" # bare-metal cluster_name = "mercury" @@ -299,9 +293,9 @@ List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/mercury-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION -node1.example.com Ready 10m v1.18.2 -node2.example.com Ready 10m v1.18.2 -node3.example.com Ready 10m v1.18.2 +node1.example.com Ready 10m v1.18.8 +node2.example.com Ready 10m v1.18.8 +node3.example.com Ready 10m v1.18.8 ``` List the pods. 
@@ -336,7 +330,7 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/bare-me |:-----|:------------|:--------| | cluster_name | Unique cluster name | "mercury" | | matchbox_http_endpoint | Matchbox HTTP read-only endpoint | "http://matchbox.example.com:port" | -| os_channel | Channel for a Container Linux derivative | coreos-stable, coreos-beta, coreos-alpha, flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge | +| os_channel | Channel for a Container Linux derivative | flatcar-stable, flatcar-beta, flatcar-alpha, flatcar-edge | | os_version | Version for a Container Linux derivative to PXE and install | "2345.3.1" | | k8s_domain_name | FQDN resolving to the controller(s) nodes. Workers and kubectl will communicate with this endpoint | "myk8s.example.com" | | ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3Nz..." | @@ -350,7 +344,7 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/bare-me | download_protocol | Protocol iPXE uses to download the kernel and initrd. iPXE must be compiled with [crypto](https://ipxe.org/crypto) support for https. Unused if cached_install is true | "https" | "http" | | cached_install | PXE boot and install from the Matchbox `/assets` cache. 
Admin MUST have downloaded Container Linux or Flatcar images into the cache | false | true | | install_disk | Disk device where Container Linux should be installed | "/dev/sda" | "/dev/sdb" | -| networking | Choice of networking provider | "calico" | "calico" or "flannel" | +| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" | | network_mtu | CNI interface MTU (calico-only) | 1480 | - | | snippets | Map from machine names to lists of Container Linux Config snippets | {} | [examples](/advanced/customization/) | | network_ip_autodetection_method | Method to detect host IPv4 address (calico-only) | "first-found" | "can-reach=10.0.0.1" | diff --git a/docs/cl/digital-ocean.md b/docs/flatcar-linux/digitalocean.md similarity index 88% rename from docs/cl/digital-ocean.md rename to docs/flatcar-linux/digitalocean.md index 164ba9ed7..341868c9e 100644 --- a/docs/cl/digital-ocean.md +++ b/docs/flatcar-linux/digitalocean.md @@ -1,6 +1,6 @@ -# Digital Ocean +# DigitalOcean -In this tutorial, we'll create a Kubernetes v1.18.2 cluster on DigitalOcean with CoreOS Container Linux or Flatcar Linux. +In this tutorial, we'll create a Kubernetes v1.18.8 cluster on DigitalOcean with CoreOS Container Linux or Flatcar Linux. We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create controller droplets, worker droplets, DNS records, tags, and TLS assets. @@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se * Digital Ocean Account and Token * Digital Ocean Domain (registered Domain Name or delegated subdomain) -* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally +* Terraform v0.13.0+ ## Terraform Setup -Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system. +Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system. 
```sh $ terraform version -Terraform v0.12.21 -``` - -Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. - -```sh -wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0 +Terraform v0.13.0 ``` Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`). @@ -50,12 +42,22 @@ Configure the DigitalOcean provider to use your token in a `providers.tf` file. ```tf provider "digitalocean" { - version = "1.15.1" token = "${chomp(file("~/.config/digital-ocean/token"))}" } -provider "ct" { - version = "0.5.0" +provider "ct" {} + +terraform { + required_providers { + ct = { + source = "poseidon/ct" + version = "0.6.1" + } + digitalocean = { + source = "digitalocean/digitalocean" + version = "1.22.1" + } + } } ``` @@ -79,7 +81,7 @@ Define a Kubernetes cluster using the module `digital-ocean/container-linux/kube ```tf module "nemo" { - source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.18.2" + source = "git::https://github.com/poseidon/typhoon//digital-ocean/container-linux/kubernetes?ref=v1.18.8" # Digital Ocean cluster_name = "nemo" @@ -153,9 +155,9 @@ List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/nemo-config $ kubectl get nodes NAME STATUS ROLES AGE VERSION -10.132.110.130 Ready 10m v1.18.2 -10.132.115.81 Ready 10m v1.18.2 -10.132.124.107 Ready 10m v1.18.2 +10.132.110.130 Ready 10m v1.18.8 +10.132.115.81 Ready 10m v1.18.8 +10.132.124.107 Ready 10m v1.18.8 ``` List the pods. 
@@ -190,7 +192,7 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/digital | cluster_name | Unique cluster name (prepended to dns_zone) | "nemo" | | region | Digital Ocean region | "nyc1", "sfo2", "fra1", "tor1" | | dns_zone | Digital Ocean domain (i.e. DNS zone) | "do.example.com" | -| os_image | Container Linux image for instances | "custom-image-id", coreos-stable, coreos-beta, coreos-alpha | +| os_image | Container Linux image for instances | "uploaded-flatcar-image-id" | | ssh_fingerprints | SSH public key fingerprints | ["d7:9d..."] | #### DNS Zone @@ -238,7 +240,7 @@ Digital Ocean requires the SSH public key be uploaded to your account, so you ma | worker_type | Droplet type for workers | "s-1vcpu-2gb" | s-1vcpu-2gb, s-2vcpu-2gb, ... | | controller_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) | | worker_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) | -| networking | Choice of networking provider | "calico" | "flannel" or "calico" | +| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | diff --git a/docs/cl/google-cloud.md b/docs/flatcar-linux/google-cloud.md similarity index 91% rename from docs/cl/google-cloud.md rename to docs/flatcar-linux/google-cloud.md index 511c09c56..9f3a97c14 100644 --- a/docs/cl/google-cloud.md +++ b/docs/flatcar-linux/google-cloud.md @@ -1,6 +1,6 @@ # Google Cloud -In this tutorial, we'll create a Kubernetes v1.18.2 cluster on Google Compute Engine with CoreOS Container Linux or Flatcar Linux. +In this tutorial, we'll create a Kubernetes v1.18.8 cluster on Google Compute Engine with CoreOS Container Linux or Flatcar Linux.
We'll declare a Kubernetes cluster using the Typhoon Terraform module. Then apply the changes to create a network, firewall rules, health checks, controller instances, worker managed instance group, load balancers, and TLS assets. @@ -10,23 +10,15 @@ Controller hosts are provisioned to run an `etcd-member` peer and a `kubelet` se * Google Cloud Account and Service Account * Google Cloud DNS Zone (registered Domain Name or delegated subdomain) -* Terraform v0.12.6+ and [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) installed locally +* Terraform v0.13.0+ ## Terraform Setup -Install [Terraform](https://www.terraform.io/downloads.html) v0.12.6+ on your system. +Install [Terraform](https://www.terraform.io/downloads.html) v0.13.0+ on your system. ```sh $ terraform version -Terraform v0.12.21 -``` - -Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. - -```sh -wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.5.0/terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -tar xzf terraform-provider-ct-v0.5.0-linux-amd64.tar.gz -mv terraform-provider-ct-v0.5.0-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.5.0 +Terraform v0.13.0 ``` Read [concepts](/architecture/concepts/) to learn about Terraform, modules, and organizing resources. Change to your infrastructure repository (e.g. `infra`). 
@@ -49,14 +41,24 @@ Configure the Google Cloud provider to use your service account key, project-id, ```tf provider "google" { - version = "3.12.0" project = "project-id" region = "us-central1" credentials = file("~/.config/google-cloud/terraform.json") } -provider "ct" { - version = "0.5.0" +provider "ct" {} + +terraform { + required_providers { + ct = { + source = "poseidon/ct" + version = "0.6.1" + } + google = { + source = "hashicorp/google" + version = "3.34.0" + } + } } ``` @@ -90,7 +92,7 @@ Define a Kubernetes cluster using the module `google-cloud/container-linux/kuber ```tf module "yavin" { - source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.18.2" + source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.18.8" # Google Cloud cluster_name = "yavin" @@ -165,9 +167,9 @@ List nodes in the cluster. $ export KUBECONFIG=/home/user/.kube/configs/yavin-config $ kubectl get nodes NAME ROLES STATUS AGE VERSION -yavin-controller-0.c.example-com.internal Ready 6m v1.18.2 -yavin-worker-jrbf.c.example-com.internal Ready 5m v1.18.2 -yavin-worker-mzdm.c.example-com.internal Ready 5m v1.18.2 +yavin-controller-0.c.example-com.internal Ready 6m v1.18.8 +yavin-worker-jrbf.c.example-com.internal Ready 5m v1.18.8 +yavin-worker-mzdm.c.example-com.internal Ready 5m v1.18.8 ``` List the pods. @@ -204,7 +206,7 @@ Check the [variables.tf](https://github.com/poseidon/typhoon/blob/master/google- | region | Google Cloud region | "us-central1" | | dns_zone | Google Cloud DNS zone | "google-cloud.example.com" | | dns_zone_name | Google Cloud DNS zone name | "example-zone" | -| os_image | Container Linux image for compute instances | "flatcar-linux-2303-4-0", coreos-stable, coreos-beta, coreos-alpha | +| os_image | Container Linux image for compute instances | "flatcar-linux-2303-4-0" | | ssh_authorized_key | SSH public key for user 'core' | "ssh-rsa AAAAB3NZ..." 
| Check the list of valid [regions](https://cloud.google.com/compute/docs/regions-zones/regions-zones) and list Container Linux [images](https://cloud.google.com/compute/docs/images) with `gcloud compute images list | grep coreos`. @@ -238,7 +240,7 @@ resource "google_dns_managed_zone" "zone-for-clusters" { | worker_preemptible | If enabled, Compute Engine will terminate workers randomly within 24 hours | false | true | | controller_snippets | Controller Container Linux Config snippets | [] | [example](/advanced/customization/) | | worker_snippets | Worker Container Linux Config snippets | [] | [example](/advanced/customization/) | -| networking | Choice of networking provider | "calico" | "calico" or "flannel" | +| networking | Choice of networking provider | "calico" | "calico" or "cilium" or "flannel" | | pod_cidr | CIDR IPv4 range to assign to Kubernetes pods | "10.2.0.0/16" | "10.22.0.0/16" | | service_cidr | CIDR IPv4 range to assign to Kubernetes services | "10.3.0.0/16" | "10.3.0.0/24" | | worker_node_labels | List of initial worker node labels | [] | ["worker-pool=default"] | diff --git a/docs/index.md b/docs/index.md index 9d9fedc51..96a20f557 100644 --- a/docs/index.md +++ b/docs/index.md @@ -11,10 +11,10 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster ## Features -* Kubernetes v1.18.2 (upstream) -* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking -* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) -* Advanced features like [worker pools](advanced/worker-pools/), [preemptible](cl/google-cloud/#preemption) workers, and [snippets](advanced/customization/#container-linux) customization +* Kubernetes v1.18.8 (upstream) +* Single or multi-master, [Calico](https://www.projectcalico.org/) or 
[Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking +* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing +* Advanced features like [worker pools](advanced/worker-pools/), [preemptible](fedora-coreos/google-cloud/#preemption) workers, and [snippets](advanced/customization/#container-linux) customization * Ready for Ingress, Prometheus, Grafana, CSI, or other [addons](addons/overview/) ## Modules @@ -28,35 +28,24 @@ Typhoon is available for [Fedora CoreOS](https://getfedora.org/coreos/). | AWS | Fedora CoreOS | [aws/fedora-coreos/kubernetes](fedora-coreos/aws.md) | stable | | Azure | Fedora CoreOS | [azure/fedora-coreos/kubernetes](fedora-coreos/azure.md) | alpha | | Bare-Metal | Fedora CoreOS | [bare-metal/fedora-coreos/kubernetes](fedora-coreos/bare-metal.md) | beta | -| DigitalOcean | Fedora CoreOS | [digital-ocean/fedora-coreos/kubernetes](fedora-coreos/digitalocean.md) | alpha | -| Google Cloud | Fedora CoreOS | [google-cloud/fedora-coreos/kubernetes](google-cloud/fedora-coreos/kubernetes) | beta | +| DigitalOcean | Fedora CoreOS | [digital-ocean/fedora-coreos/kubernetes](fedora-coreos/digitalocean.md) | beta | +| Google Cloud | Fedora CoreOS | [google-cloud/fedora-coreos/kubernetes](fedora-coreos/google-cloud.md) | stable | -Typhoon is available for [Flatcar Container Linux](https://www.flatcar-linux.org/releases/). +Typhoon is available for [Flatcar Linux](https://www.flatcar-linux.org/releases/).
| Platform | Operating System | Terraform Module | Status | |---------------|------------------|------------------|--------| -| AWS | Flatcar Linux | [aws/container-linux/kubernetes](cl/aws.md) | stable | -| Azure | Flatcar Linux | [azure/container-linux/kubernetes](cl/azure.md) | alpha | -| Bare-Metal | Flatcar Linux | [bare-metal/container-linux/kubernetes](cl/bare-metal.md) | stable | -| DigitalOcean | Flatcar Linux | [digital-ocean/container-linux/kubernetes](cl/digital-ocean.md) | alpha | -| Google Cloud | Flatcar Linux | [google-cloud/container-linux/kubernetes](cl/google-cloud.md) | alpha | - -Typhoon is available for CoreOS Container Linux ([no updates](https://coreos.com/os/eol/) after May 2020). - -| Platform | Operating System | Terraform Module | Status | -|---------------|------------------|------------------|--------| -| AWS | Container Linux | [aws/container-linux/kubernetes](cl/aws.md) | stable | -| Azure | Container Linux | [azure/container-linux/kubernetes](cl/azure.md) | alpha | -| Bare-Metal | Container Linux | [bare-metal/container-linux/kubernetes](cl/bare-metal.md) | stable | -| Digital Ocean | Container Linux | [digital-ocean/container-linux/kubernetes](cl/digital-ocean.md) | beta | -| Google Cloud | Container Linux | [google-cloud/container-linux/kubernetes](cl/google-cloud.md) | stable | - +| AWS | Flatcar Linux | [aws/container-linux/kubernetes](flatcar-linux/aws.md) | stable | +| Azure | Flatcar Linux | [azure/container-linux/kubernetes](flatcar-linux/azure.md) | alpha | +| Bare-Metal | Flatcar Linux | [bare-metal/container-linux/kubernetes](flatcar-linux/bare-metal.md) | stable | +| DigitalOcean | Flatcar Linux | [digital-ocean/container-linux/kubernetes](flatcar-linux/digitalocean.md) | beta | +| Google Cloud | Flatcar Linux | [google-cloud/container-linux/kubernetes](flatcar-linux/google-cloud.md) | beta | ## Documentation * Architecture [concepts](architecture/concepts.md) and [operating-systems](architecture/operating-systems.md) * 
Fedora CoreOS tutorials for [AWS](fedora-coreos/aws.md), [Azure](fedora-coreos/azure.md), [Bare-Metal](fedora-coreos/bare-metal.md), [DigitalOcean](fedora-coreos/digitalocean.md), and [Google Cloud](fedora-coreos/google-cloud.md) -* Flatcar Linux tutorials for [AWS](cl/aws.md), [Azure](cl/azure.md), [Bare-Metal](cl/bare-metal.md), [DigitalOcean](cl/digital-ocean.md), and [Google Cloud](cl/google-cloud.md) +* Flatcar Linux tutorials for [AWS](flatcar-linux/aws.md), [Azure](flatcar-linux/azure.md), [Bare-Metal](flatcar-linux/bare-metal.md), [DigitalOcean](flatcar-linux/digitalocean.md), and [Google Cloud](flatcar-linux/google-cloud.md) ## Example @@ -64,7 +53,7 @@ Define a Kubernetes cluster by using the Terraform module for your chosen platfo ```tf module "yavin" { - source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.18.2" + source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.18.8" # Google Cloud cluster_name = "yavin" @@ -102,9 +91,9 @@ In 4-8 minutes (varies by platform), the cluster will be ready. This Google Clou $ export KUBECONFIG=/home/user/.kube/configs/yavin-config $ kubectl get nodes NAME ROLES STATUS AGE VERSION -yavin-controller-0.c.example-com.internal Ready 6m v1.18.2 -yavin-worker-jrbf.c.example-com.internal Ready 5m v1.18.2 -yavin-worker-mzdm.c.example-com.internal Ready 5m v1.18.2 +yavin-controller-0.c.example-com.internal Ready 6m v1.18.8 +yavin-worker-jrbf.c.example-com.internal Ready 5m v1.18.8 +yavin-worker-mzdm.c.example-com.internal Ready 5m v1.18.8 ``` List the pods. diff --git a/docs/topics/faq.md b/docs/topics/faq.md index 1a8eef263..3a8fa7a77 100644 --- a/docs/topics/faq.md +++ b/docs/topics/faq.md @@ -6,22 +6,13 @@ Typhoon provides a Terraform Module for each supported operating system and plat Formats rise and evolve. Typhoon may choose to adapt the format over time (with lots of forewarning). 
However, the authors have built several Kubernetes "distros" before and learned from mistakes - Terraform modules are the right format for now. -## Operating Systems - -Typhoon supports Container Linux and the Flatcar Linux derivative. These operating systems were chosen because they offer: - -* Minimalism and focus on clustered operation -* Automated and atomic operating system upgrades -* Declarative and immutable configuration -* Optimization for containerized applications - ## Get Help Ask questions on the IRC #typhoon channel on [freenode.net](http://freenode.net/). ## Security Issues -If you find security issues, please see [security disclosures](/topics/security.md#disclosures). +If you find security issues, please see [security disclosures](/topics/security/#disclosures). ## Maintainers diff --git a/docs/topics/hardware.md b/docs/topics/hardware.md index d95ad2275..6087c62b0 100644 --- a/docs/topics/hardware.md +++ b/docs/topics/hardware.md @@ -183,7 +183,7 @@ show ip route bgp ### Port Forwarding -Expose the [Ingress Controller](/addons/ingress.md#bare-metal) by adding `port-forward` rules that DNAT a port on the router's WAN interface to an internal IP and port. By convention, a public Ingress controller is assigned a fixed service IP (e.g. 10.3.0.12). +Expose the [Ingress Controller](/addons/ingress/#bare-metal) by adding `port-forward` rules that DNAT a port on the router's WAN interface to an internal IP and port. By convention, a public Ingress controller is assigned a fixed service IP (e.g. 10.3.0.12).
``` configure diff --git a/docs/topics/maintenance.md b/docs/topics/maintenance.md index b01a610f8..47dd67600 100644 --- a/docs/topics/maintenance.md +++ b/docs/topics/maintenance.md @@ -13,12 +13,12 @@ Typhoon provides tagged releases to allow clusters to be versioned using ordinar ``` module "yavin" { - source = "git::https://github.com/poseidon/typhoon//google-cloud/container-linux/kubernetes?ref=v1.8.6" + source = "git::https://github.com/poseidon/typhoon//google-cloud/fedora-coreos/kubernetes?ref=v1.18.8" ... } module "mercury" { - source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.18.2" + source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.18.8" ... } ``` @@ -74,11 +74,11 @@ Delete or comment the Terraform config for the cluster. Apply to delete old provisioning configs from Matchbox. ``` -$ terraform apply +$ terraform apply Apply complete! Resources: 0 added, 0 changed, 55 destroyed. ``` -Re-provision a new cluster by following the bare-metal [tutorial](../cl/bare-metal.md#cluster). +Re-provision a new cluster by following the bare-metal [tutorial](../fedora-coreos/bare-metal.md#cluster). ### Cloud @@ -102,7 +102,7 @@ Once you're confident in the new cluster, delete the Terraform config for the ol Apply to delete the cluster. ``` -$ terraform apply +$ terraform apply Apply complete! Resources: 0 added, 0 changed, 55 destroyed. ``` @@ -125,86 +125,18 @@ In certain scenarios, in-place edits can be useful for quickly rolling out secur Typhoon supports multi-controller clusters, so it is possible to upgrade a cluster by deleting and replacing nodes one by one. !!! warning - Typhoon does not support or document node replacement as an upgrade strategy. It limits Typhoon's ability to make infrastructure and architectural changes between tagged releases. 
- -### Terraform Plugins Directory - -Use the Terraform 3rd-party [plugin directory](https://www.terraform.io/docs/configuration/providers.html#third-party-plugins) `~/.terraform.d/plugins` to keep versioned copies of the `terraform-provider-ct` and `terraform-provider-matchbox` plugins. The plugin directory replaces the `~/.terraformrc` file to allow 3rd party plugins to be defined and versioned independently (rather than globally). - -``` -# ~/.terraformrc (DEPRECATED) -providers { - ct = "/usr/local/bin/terraform-provider-ct" - matchbox = "/usr/local/bin/terraform-provider-matchbox" -} -``` - -Migrate to using the Terraform plugin directory. Move `~/.terraformrc` to a backup location. - -``` -mv ~/.terraformrc ~/.terraform-backup -``` - -Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`. Download the **same version** of `terraform-provider-ct` you were using with `~/.terraformrc`, updating only be done as a followup and is **only** safe for v1.12.2+ clusters! - -```sh -wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.2.1/terraform-provider-ct-v0.2.1-linux-amd64.tar.gz -tar xzf terraform-provider-ct-v0.2.1-linux-amd64.tar.gz -mv terraform-provider-ct-v0.2.1-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.2.1 -``` - -If you use bare-metal, add the [terraform-provider-matchbox](https://github.com/poseidon/terraform-provider-matchbox) plugin binary for your system to `~/.terraform.d/plugins/`, noting the versioned name. - -```sh -wget https://github.com/poseidon/terraform-provider-matchbox/releases/download/v0.2.3/terraform-provider-matchbox-v0.2.3-linux-amd64.tar.gz -tar xzf terraform-provider-matchbox-v0.2.3-linux-amd64.tar.gz -mv terraform-provider-matchbox-v0.2.3-linux-amd64/terraform-provider-matchbox ~/.terraform.d/plugins/terraform-provider-matchbox_v0.2.3 -``` - -Binary names are versioned. 
This enables the ability to upgrade different plugins and have clusters pin different versions. - -``` -$ tree ~/.terraform.d/ -/home/user/.terraform.d/ -└── plugins - ├── terraform-provider-ct_v0.2.1 - └── terraform-provider-matchbox_v0.2.3 -``` - -In each Terraform working directory, set the version of each provider. - -``` -# providers.tf - -provider "matchbox" { - version = "0.2.3" - ... -} - -provider "ct" { - version = "0.2.1" -} -``` - -Run `terraform init` to ensure plugin version requirements are met. Verify `terraform plan` does not produce a diff, since the plugin versions should be the same as previously. - -``` -$ terraform init -$ terraform plan -``` + Typhoon does not support or document node replacement as an upgrade strategy. It limits Typhoon's ability to make infrastructure and architectural changes between tagged releases. ### Upgrade terraform-provider-ct -The [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin parses, validates, and converts Container Linux Configs into Ignition user-data for provisioning instances. Previously, updating the plugin re-provisioned controller nodes and was destructive to clusters. With Typhoon v1.12.2+, the plugin can be updated in-place and on apply, only workers will be replaced. - -First, [migrate](#terraform-plugins-directory) to the Terraform 3rd-party plugin directory to allow 3rd-party plugins to be defined and versioned independently (rather than globally). +The [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin parses, validates, and converts Container Linux Configs into Ignition user-data for provisioning instances. Since Typhoon v1.12.2+, the plugin can be updated in-place so that on apply, only workers will be replaced. Add the [terraform-provider-ct](https://github.com/poseidon/terraform-provider-ct) plugin binary for your system to `~/.terraform.d/plugins/`, noting the final name. 
```sh -wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.3.1/terraform-provider-ct-v0.3.1-linux-amd64.tar.gz -tar xzf terraform-provider-ct-v0.3.1-linux-amd64.tar.gz -mv terraform-provider-ct-v0.3.1-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.3.1 +wget https://github.com/poseidon/terraform-provider-ct/releases/download/v0.6.1/terraform-provider-ct-v0.6.1-linux-amd64.tar.gz +tar xzf terraform-provider-ct-v0.6.1-linux-amd64.tar.gz +mv terraform-provider-ct-v0.6.1-linux-amd64/terraform-provider-ct ~/.terraform.d/plugins/terraform-provider-ct_v0.6.1 ``` Binary names are versioned. This enables the ability to upgrade different plugins and have clusters pin different versions. @@ -215,17 +147,16 @@ $ tree ~/.terraform.d/ └── plugins ├── terraform-provider-ct_v0.2.1 ├── terraform-provider-ct_v0.3.0 - ├── terraform-provider-ct_v0.3.1 - └── terraform-provider-matchbox_v0.2.3 + ├── terraform-provider-ct_v0.6.1 + └── terraform-provider-matchbox_v0.4.1 ``` Update the version of the `ct` plugin in each Terraform working directory. Typhoon clusters managed in the working directory **must** be v1.12.2 or higher. -``` -# providers.tf +```tf provider "ct" { - version = "0.3.0" + version = "0.6.1" } ``` @@ -261,7 +192,7 @@ terraform apply # add kubeconfig to new workers terraform state list | grep null_resource -terraform taint -module digital-ocean-nemo null_resource.copy-worker-secrets[N] +terraform taint module.nemo.null_resource.copy-worker-secrets[N] terraform apply ``` @@ -271,161 +202,90 @@ Expect downtime. Google Cloud creates a new worker template and edits the worker instance group instantly. Manually terminate workers and replacement workers will use the user-data.
-## Terraform v0.12.x +## Terraform Versions -Terraform [v0.12](https://www.hashicorp.com/blog/announcing-terraform-0-12) introduced major changes to the provider plugin protocol and HCL language (first-class expressions, formal list and map types, nullable variables, variable constraints, and short-circuiting ternary operators). +Terraform [v0.13](https://www.hashicorp.com/blog/announcing-hashicorp-terraform-0-13) introduced major changes to the provider plugin system. Terraform `init` can automatically install both `hashicorp` and `poseidon` provider plugins, eliminating the need to manually install plugin binaries. -Typhoon modules have been adapted for Terraform v0.12. Provider plugins requirements now enforce v0.12 compatibility. However, some HCL language changes were breaking (list [type hint](https://www.terraform.io/upgrade-guides/0-12.html#referring-to-list-variables) workarounds in v0.11 now have new meaning). Typhoon cannot offer both v0.11 and v0.12 compatibility in the same release. Upcoming releases require upgrading Terraform to v0.12. +Typhoon modules have been updated for v0.13.x, but retain compatibility with v0.12.26+ to ease migration. Poseidon publishes [providers](/topics/security/#terraform-providers) to the Terraform Provider Registry for usage with v0.13+. | Typhoon Release | Terraform version | |-------------------|---------------------| -| v1.18.2 - ? | v0.12.x | -| v1.10.3 - v1.18.2 | v0.11.x | +| v1.18.8 - ? | v0.12.26+, v0.13.x | +| v1.15.0 - v1.18.8 | v0.12.x | +| v1.10.3 - v1.15.0 | v0.11.x | | v1.9.2 - v1.10.2 | v0.10.4+ or v0.11.x | | v1.7.3 - v1.9.1 | v0.10.x | | v1.6.4 - v1.7.2 | v0.9.x | -### New users +### New Workspace -New users can start with Terraform v0.12.x and follow the docs for Typhoon v1.18.2+ without issue. - -### Existing users - -Migrate from Terraform v0.11 to v0.12 either **in-place** (easier, riskier) or by **moving resources** (safer, tedious). 
- -Install [Terraform](https://www.terraform.io/downloads.html) v0.12.x on your system alongside Terraform v0.11.x. - -```shell -sudo ln -sf ~/Downloads/terraform-0.12.0/terraform /usr/local/bin/terraform12 -``` +With a new Terraform workspace, use Terraform v0.13.x and the updated Typhoon [tutorials](/fedora-coreos/aws/#provider). -!!! note - For example, `terraform` may refer Terraform v0.11.14, while `terraform12` is symlinked to Terraform v0.12.1. Once migration is complete, Terraform v0.11.x can be deleted and `terraform12` renamed. +### Existing Workspace -#### In-place +An existing Terraform workspace may already manage earlier Typhoon clusters created with Terraform v0.12.x. -For existing Typhoon v1.14.2 or v1.14.3 clusters, edit the Typhoon `ref` to first SHA that introduced Terraform v0.12 support (`3276bf587850218b8f967978a4bf2b05d5f440a2`). The aim is to minimize the diff and convert to using Terraform v0.12.x. For example: +First, upgrade `terraform-provider-ct` to v0.6.1 following the [guide](#upgrade-terraform-provider-ct) above. As usual, read about how `apply` affects existing cluster nodes when `ct` is upgraded. But `terraform-provider-ct` v0.6.1 is compatible with both Terraform v0.12 and v0.13, so do this first. ```tf - module "mercury" { -- source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=v1.14.3" -+ source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=3276bf587850218b8f967978a4bf2b05d5f440a2" - ... +provider "ct" { + version = "0.6.1" +} ``` -With Terraform v0.12, Typhoon clusters no longer require the `providers` block (unless you actually need to pass an [aliased provider](https://www.terraform.io/docs/configuration/providers.html#alias-multiple-provider-instances)). 
A regression in Terraform v0.11 made it neccessary to explicitly pass aliased providers in order for Typhoon to continue to enforce constraints (see [terraform#16824](https://github.com/hashicorp/terraform/issues/16824)). Terraform v0.12 resolves this issue. - -```tf - module "mercury" { - source = "git::https://github.com/poseidon/typhoon//bare-metal/container-linux/kubernetes?ref=3276bf587850218b8f967978a4bf2b05d5f440a2" +Next, create Typhoon clusters using the `ref` that introduced Terraform v0.13 forward compatibility (`v1.18.8`) or later. You will see a compatibility warning. Use blue/green cluster replacement to shift to these new clusters, then eliminate older clusters. -- providers = { -- local = "local.default" -- null = "null.default" -- template = "template.default" -- tls = "tls.default" -- } ``` - -Provider constrains ensure suitable plugin versions are used. Install new versions of `terraform-provider-ct` (v0.3.2+) and `terraform-provider-matchbox` (bare-metal only, v0.3.0+) according to the [changelog](https://github.com/poseidon/typhoon/blob/master/CHANGES.md#v1144) or tutorial docs. The `local`, `null`, `template`, and `tls` blocks in `providers.tf` are no longer needed. - -```tf - provider "matchbox" { -- version = "0.2.3" -+ version = "0.3.0" - endpoint = "matchbox.example.com:8081" - client_cert = "${file("~/.config/matchbox/client.crt")}" - client_key = "${file("~/.config/matchbox/client.key")}" - } - - provider "ct" { -- version = "0.3.2" -+ version = "0.3.3" - } -- --provider "local" { -- version = "~> 1.0" -- alias = "default" --} -- --provider "null" { -- version = "~> 1.0" -- alias = "default" --} -- --provider "template" { -- version = "~> 1.0" -- alias = "default" --} -- --provider "tls" { -- version = "~> 1.0" -- alias = "default" --} +module "nemo" { + source = "git::https://github.com/poseidon/typhoon//digital-ocean/fedora-coreos/kubernetes?ref=v1.18.8" + ... +} ``` -Within the Terraform config directory (i.e. 
working directory), initialize to fetch suitable provider plugins. +Install Terraform v0.13. Once all clusters in a workspace are on `v1.18.8` or above, you are ready to start using Terraform v0.13. -```shell -terraform12 init # using Terraform v0.12 binary, not v0.11 ``` - -Use the Terraform v0.12 upgrade subcommand to convert v0.11 syntax to v0.12. This _will_ edit resource definitions in `*.tf` files in the working directory. Start from a clean version control state. Inspect the changes. Resolve any "TODO" items. - -```shell -terraform12 0.12upgrade -git diff +terraform version +v0.13.0 ``` -Finally, plan. +Update `providers.tf` to match the Typhoon [tutorials](/fedora-coreos/aws/#provider) and use the new `required_providers` block. -```shell -terraform12 plan ``` - -Verify no changes are proposed and commit changes to version control. You've migrated to Terraform v0.12! Repeat for other config directories. Use the Terraform v0.12 binary going forward. +terraform init +terraform 0.13upgrade # sometimes helpful +``` !!! note - It is known that plan may propose re-creating `template_dir` resources. This is harmless. - -!!! error - If plan produced errors, seek to address them (they may be in non-Typhoon resources). If plan proposed a diff, you'll need to evaluate whether that's expected and safe to apply. In-place edits between Typhoon releases aren't supported (favoring blue/green replacement). The larger the version skew, the greater the risk. Use good judgement. If in doubt, abandon the generated changes, delete `.terraform` as [suggested](https://www.terraform.io/upgrade-guides/0-12.html#upgrading-to-terraform-0-12), and try the move resources approach. + You will see `Could not retrieve the list of available versions for provider -/ct: provider` -#### Moving Resources +In state files, existing clusters use Terraform v0.12 providers (e.g. `-/aws`). Pivot to Terraform v0.13 providers (e.g. `hashicorp/aws`) with the following commands, as applicable.
Repeat until `terraform init` no longer shows old-style providers. -Alternately, continue maintaining existing clusters using Terraform v0.11.x and existing Terraform configuration directory(ies). Create new Terraform directory(ies) and move resources there to be managed with Terraform v0.12. This approach allows resources to be migrated incrementally and ensures existing resources can always be managed (e.g. emergency patches). - -Create a new Terraform [config directory](/architecture/concepts/#organize) for *new* resources. - -```shell -mkdir infra2 -tree . -├── infraA <- existing Terraform v0.11.x configs -└── infraB <- new Terraform v0.12.x configs ``` +terraform state replace-provider -- -/aws hashicorp/aws +terraform state replace-provider -- -/azurerm hashicorp/azurerm +terraform state replace-provider -- -/google hashicorp/google -Define Typhoon clusters in the new config directory using Terraform v0.12 syntax. Follow the Typhoon v1.15.0+ docs (e.g. use `terraform12` in the `infraB` dir). See [AWS](/cl/aws), [Azure](/cl/azure), [Bare-Metal](/cl/bare-metal), [Digital Ocean](/cl/digital-ocean), or [Google-Cloud](/cl/google-cloud)) to create new clusters. Follow the usual [upgrade](/topics/maintenance/#upgrades) process to apply workloads and shift traffic. Later, switch back to the old config directory and deprovision clusters with Terraform v0.11. 
+terraform state replace-provider -- -/digitalocean digitalocean/digitalocean +terraform state replace-provider -- -/ct poseidon/ct +terraform state replace-provider -- -/matchbox poseidon/matchbox -```shell -terraform12 init -terraform12 plan -terraform12 apply +terraform state replace-provider -- -/local hashicorp/local +terraform state replace-provider -- -/null hashicorp/null +terraform state replace-provider -- -/random hashicorp/random +terraform state replace-provider -- -/template hashicorp/template +terraform state replace-provider -- -/tls hashicorp/tls ``` -Your Terraform configuration directory likely defines resources other than just Typhoon modules (e.g. application DNS records, firewall rules, etc.). While such migrations are outside Typhoon's scope, you'll probably want to move existing resource definitions into your new Terraform configuration directory. Use Terraform v0.12 to import the resource into the state associated with the new config directory (to avoid trying to recreate a resource that exists). Then with Terraform v0.11 in the old directory, remove the resource from the state (to avoid trying to delete the resource). Verify neither `plan` produces a diff. - -```sh -# move google_dns_record_set.some-app from infraA to infraB -cd infraA -terraform state list -terraform state show google_dns_record_set.some-app - -cd ../infraB -terraform12 import google_dns_record_set.some-app SOMEID -terraform12 plan +Finally, verify Terraform v0.13 plan shows no diff. -cd ../infraA -terraform state rm google_dns_record_set.some-app +``` terraform plan +No changes. Infrastructure is up-to-date. ``` +### v0.12.x + +Terraform [v0.12](https://www.hashicorp.com/blog/announcing-terraform-0-12) introduced major changes to the provider plugin protocol and HCL language (first-class expressions, formal list and map types, nullable variables, variable constraints, and short-circuiting ternary operators). + +Typhoon modules have been adapted for Terraform v0.12. 
Provider plugins requirements now enforce v0.12 compatibility. However, some HCL language changes were breaking (list [type hint](https://www.terraform.io/upgrade-guides/0-12.html#referring-to-list-variables) workarounds in v0.11 now have new meaning). Typhoon cannot offer both v0.11 and v0.12 compatibility in the same release. Upcoming releases require upgrading Terraform to v0.12. diff --git a/docs/topics/performance.md b/docs/topics/performance.md index d7aa60c04..364ac15e1 100644 --- a/docs/topics/performance.md +++ b/docs/topics/performance.md @@ -38,7 +38,7 @@ Network performance varies based on the platform and CNI plugin. `iperf` was use Notes: -* Calico and Flannel have comparable performance. Platform and configuration differences dominate. +* Calico, Cilium, and Flannel have comparable performance. Platform and configuration differences dominate. * Azure and DigitalOcean network performance can be quite variable or depend on machine type * Only [certain AWS EC2 instance types](http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/network_mtu.html#jumbo_frame_instances) allow jumbo frames. This is why the default MTU on AWS must be 1480. diff --git a/docs/topics/security.md b/docs/topics/security.md index fc4cbb420..2c1400030 100644 --- a/docs/topics/security.md +++ b/docs/topics/security.md @@ -7,8 +7,10 @@ Typhoon aims to be minimal and secure. We're running it ourselves after all. **Kubernetes** * etcd with peer-to-peer and client-auth TLS -* Generated kubelet TLS certificates and `kubeconfig` (365 days) -* [Role-Based Access Control](https://kubernetes.io/docs/admin/authorization/rbac/) is enabled. 
Apps must define RBAC policies +* Kubelet TLS bootstrap certificates (72 hours) +* Generated TLS certificate (365 days) for admin `kubeconfig` +* [NodeRestriction](https://kubernetes.io/docs/reference/access-authn-authz/node/) is enabled to limit Kubelet authorization +* [Role-Based Access Control](https://kubernetes.io/docs/admin/authorization/rbac/) is enabled. Apps must define RBAC policies for API access * Workloads run on worker nodes only, unless they tolerate the master taint * Kubernetes [Network Policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) and Calico [NetworkPolicy](https://docs.projectcalico.org/latest/reference/calicoctl/resources/networkpolicy) support [^1] @@ -18,6 +20,9 @@ Typhoon aims to be minimal and secure. We're running it ourselves after all. * Container Linux auto-updates are enabled * Hosts limit logins to SSH key-based auth (user "core") +* SELinux enforcing mode [^2] + +[^2]: SELinux is enforcing on Fedora CoreOS, permissive on Flatcar Linux. **Platform** @@ -47,9 +52,36 @@ Typhoon uses upstream container images (where possible) and upstream binaries. !!! note Kubernetes releases `kubelet` as a binary for distros to package, either as a DEB/RPM on traditional distros or as a container image for container-optimized operating systems. -Typhoon [packages](https://github.com/poseidon/kubelet) the upstream Kubelet and its dependencies as a [container image](https://quay.io/repository/poseidon/kubelet) for use in Typhoon. The upstream Kubelet binary is checksummed and packaged directly. Quay automated builds provide verifiability and confidence in image contents. +Typhoon [packages](https://github.com/poseidon/kubelet) the upstream Kubelet and its dependencies as a [container image](https://quay.io/repository/poseidon/kubelet). Builds fetch the upstream Kubelet binary and verify its checksum. + +The Kubelet image is published to Quay.io and Dockerhub.
+ +* [quay.io/poseidon/kubelet](https://quay.io/repository/poseidon/kubelet) (official) +* [docker.io/psdn/kubelet](https://hub.docker.com/r/psdn/kubelet) (fallback) + +Two tag styles indicate the build strategy used. + +* Typhoon internal infra publishes single and multi-arch images (e.g. `v1.18.4`, `v1.18.4-amd64`, `v1.18.4-arm64`, `v1.18.4-2-g23228e6-amd64`, `v1.18.4-2-g23228e6-arm64`) +* Quay and Dockerhub automated builds publish verifiable images (e.g. `build-SHA` on Quay, `build-TAG` on Dockerhub) + +The Typhoon-built Kubelet image is used as the official image. Automated builds provide an alternative image for those preferring to trust images built by Quay/Dockerhub (albeit lacking multi-arch). To use the fallback registry or an alternative tag, see [customization](/advanced/customization/#kubelet). + +### flannel-cni + +Typhoon packages the [flannel-cni](https://github.com/poseidon/flannel-cni) container image to provide security patches. + +* [quay.io/poseidon/flannel-cni](https://quay.io/repository/poseidon/flannel-cni) (official) + +## Terraform Providers + +Typhoon publishes Terraform providers to the Terraform Registry, GPG signed by 0x8F515AD1602065C8. + +| Name | Source | Registry | +|----------|--------|----------| +| ct | [github](https://github.com/poseidon/terraform-provider-ct) | [poseidon/ct](https://registry.terraform.io/providers/poseidon/ct/latest) | +| matchbox | [github](https://github.com/poseidon/terraform-provider-matchbox) | [poseidon/matchbox](https://registry.terraform.io/providers/poseidon/matchbox/latest) | ## Disclosures -If you find security issues, please email dghubble at gmail. If the issue lies in upstream Kubernetes, please inform upstream Kubernetes as well. +If you find security issues, please email `security@psdn.io`. If the issue lies in upstream Kubernetes, please inform upstream Kubernetes as well. 
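Note on the `docs/topics/security.md` changes above: the new Terraform Providers table lists `poseidon/ct` and `poseidon/matchbox` on the Terraform Registry. A minimal sketch of how a Terraform v0.13 workspace might pin both, modeled on the `required_providers` changes to `versions.tf` in this patch. The `ct` version matches the `0.6.1` used in the maintenance docs. The `matchbox` version is an assumption taken from the plugin tree shown there (`terraform-provider-matchbox_v0.4.1`) and would only matter for bare-metal clusters.

```tf
# Sketch only: pin Poseidon providers from the Terraform Registry (Terraform v0.13+).
terraform {
  required_providers {
    ct = {
      source  = "poseidon/ct"
      version = "0.6.1"
    }
    # Assumed version; only needed for bare-metal clusters.
    matchbox = {
      source  = "poseidon/matchbox"
      version = "0.4.1"
    }
  }
}
```

With a block along these lines, `terraform init` should be able to install both plugins from the registry, so binaries would not need to be placed under `~/.terraform.d/plugins/` by hand.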
diff --git a/google-cloud/container-linux/kubernetes/README.md b/google-cloud/container-linux/kubernetes/README.md index 1d46bb9f3..0aec57ecc 100644 --- a/google-cloud/container-linux/kubernetes/README.md +++ b/google-cloud/container-linux/kubernetes/README.md @@ -11,11 +11,11 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster ## Features -* Kubernetes v1.18.2 (upstream) -* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking +* Kubernetes v1.18.8 (upstream) +* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking * On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) * Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/cl/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization -* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/) +* Ready for Ingress, Prometheus, Grafana, CSI, and other optional [addons](https://typhoon.psdn.io/addons/overview/) ## Docs diff --git a/google-cloud/container-linux/kubernetes/bootstrap.tf b/google-cloud/container-linux/kubernetes/bootstrap.tf index a8729f733..e5dc239c6 100644 --- a/google-cloud/container-linux/kubernetes/bootstrap.tf +++ b/google-cloud/container-linux/kubernetes/bootstrap.tf @@ -1,6 +1,6 @@ # Kubernetes assets (kubeconfig, manifests) module "bootstrap" { - source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc" + source = 
"git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=8ef2fe7c992a8c15d696bd3e3a97be713b025e64" cluster_name = var.cluster_name api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)] diff --git a/google-cloud/container-linux/kubernetes/cl/controller.yaml b/google-cloud/container-linux/kubernetes/cl/controller.yaml index a07c23bb6..2464c2ace 100644 --- a/google-cloud/container-linux/kubernetes/cl/controller.yaml +++ b/google-cloud/container-linux/kubernetes/cl/controller.yaml @@ -52,6 +52,7 @@ systemd: Description=Kubelet via Hyperkube Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.8 ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests ExecStartPre=/bin/mkdir -p /opt/cni/bin @@ -90,7 +91,7 @@ systemd: --mount volume=var-log,target=/var/log \ --volume opt-cni-bin,kind=host,source=/opt/cni/bin \ --mount volume=opt-cni-bin,target=/opt/cni/bin \ - docker://quay.io/poseidon/kubelet:v1.18.2 -- \ + $${KUBELET_IMAGE} -- \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ @@ -132,7 +133,7 @@ systemd: --volume script,kind=host,source=/opt/bootstrap/apply \ --mount volume=script,target=/apply \ --insecure-options=image \ - docker://quay.io/poseidon/kubelet:v1.18.2 \ + docker://quay.io/poseidon/kubelet:v1.18.8 \ --net=host \ --dns=host \ --exec=/apply diff --git a/google-cloud/container-linux/kubernetes/controllers.tf b/google-cloud/container-linux/kubernetes/controllers.tf index 6ab09219d..58fb93121 100644 --- a/google-cloud/container-linux/kubernetes/controllers.tf +++ b/google-cloud/container-linux/kubernetes/controllers.tf @@ -65,10 +65,10 @@ resource "google_compute_instance" "controllers" { # Controller Ignition configs data "ct_config" "controller-ignitions" { - count = var.controller_count - content = data.template_file.controller-configs.*.rendered[count.index] - pretty_print = false - snippets = 
var.controller_snippets + count = var.controller_count + content = data.template_file.controller-configs.*.rendered[count.index] + strict = true + snippets = var.controller_snippets } # Controller Container Linux configs diff --git a/google-cloud/container-linux/kubernetes/network.tf b/google-cloud/container-linux/kubernetes/network.tf index bd7067d7b..28481c3b8 100644 --- a/google-cloud/container-linux/kubernetes/network.tf +++ b/google-cloud/container-linux/kubernetes/network.tf @@ -112,6 +112,32 @@ resource "google_compute_firewall" "internal-vxlan" { target_tags = ["${var.cluster_name}-controller", "${var.cluster_name}-worker"] } +# Cilium VXLAN +resource "google_compute_firewall" "internal-linux-vxlan" { + count = var.networking == "cilium" ? 1 : 0 + + name = "${var.cluster_name}-linux-vxlan" + network = google_compute_network.network.name + + allow { + protocol = "udp" + ports = [4789] + } + + # Cilium health + allow { + protocol = "icmp" + } + + allow { + protocol = "tcp" + ports = [4240] + } + + source_tags = ["${var.cluster_name}-controller", "${var.cluster_name}-worker"] + target_tags = ["${var.cluster_name}-controller", "${var.cluster_name}-worker"] +} + # Allow Prometheus to scrape node-exporter daemonset resource "google_compute_firewall" "internal-node-exporter" { name = "${var.cluster_name}-internal-node-exporter" diff --git a/google-cloud/container-linux/kubernetes/versions.tf b/google-cloud/container-linux/kubernetes/versions.tf index 26ea74cac..178e248e3 100644 --- a/google-cloud/container-linux/kubernetes/versions.tf +++ b/google-cloud/container-linux/kubernetes/versions.tf @@ -1,11 +1,15 @@ # Terraform version and plugin versions terraform { - required_version = "~> 0.12.6" + required_version = ">= 0.12.26, < 0.14.0" required_providers { google = ">= 2.19, < 4.0" - ct = "~> 0.3" template = "~> 2.1" null = "~> 2.1" + + ct = { + source = "poseidon/ct" + version = "~> 0.6.1" + } } } diff --git 
a/google-cloud/container-linux/kubernetes/workers/cl/worker.yaml b/google-cloud/container-linux/kubernetes/workers/cl/worker.yaml index 52a0cfae8..9137924fa 100644 --- a/google-cloud/container-linux/kubernetes/workers/cl/worker.yaml +++ b/google-cloud/container-linux/kubernetes/workers/cl/worker.yaml @@ -2,11 +2,11 @@ systemd: units: - name: docker.service - enable: true + enabled: true - name: locksmithd.service mask: true - name: wait-for-dns.service - enable: true + enabled: true contents: | [Unit] Description=Wait for DNS entries @@ -19,12 +19,13 @@ systemd: [Install] RequiredBy=kubelet.service - name: kubelet.service - enable: true + enabled: true contents: | [Unit] - Description=Kubelet via Hyperkube + Description=Kubelet Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=docker://quay.io/poseidon/kubelet:v1.18.8 ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests ExecStartPre=/bin/mkdir -p /opt/cni/bin @@ -63,18 +64,17 @@ systemd: --mount volume=var-log,target=/var/log \ --volume opt-cni-bin,kind=host,source=/opt/cni/bin \ --mount volume=opt-cni-bin,target=/opt/cni/bin \ - docker://quay.io/poseidon/kubelet:v1.18.2 -- \ + $${KUBELET_IMAGE} -- \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ --client-ca-file=/etc/kubernetes/ca.crt \ --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ - --kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ --node-labels=node.kubernetes.io/node \ %{~ for label in split(",", node_labels) ~} @@ -82,6 +82,7 @@ systemd: %{~ endfor ~} --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ + --rotate-certificates \ 
--volume-plugin-dir=/var/lib/kubelet/volumeplugins ExecStop=-/usr/bin/rkt stop --uuid-file=/var/cache/kubelet-pod.uuid Restart=always @@ -89,7 +90,7 @@ systemd: [Install] WantedBy=multi-user.target - name: delete-node.service - enable: true + enabled: true contents: | [Unit] Description=Waiting to delete Kubernetes node on shutdown @@ -110,6 +111,7 @@ storage: ${kubeconfig} - path: /etc/sysctl.d/max-user-watches.conf filesystem: root + mode: 0644 contents: inline: | fs.inotify.max_user_watches=16184 @@ -125,7 +127,7 @@ storage: --volume config,kind=host,source=/etc/kubernetes \ --mount volume=config,target=/etc/kubernetes \ --insecure-options=image \ - docker://quay.io/poseidon/kubelet:v1.18.2 \ + docker://quay.io/poseidon/kubelet:v1.18.8 \ --net=host \ --dns=host \ --exec=/usr/local/bin/kubectl -- --kubeconfig=/etc/kubernetes/kubeconfig delete node $(hostname) diff --git a/google-cloud/container-linux/kubernetes/workers/versions.tf b/google-cloud/container-linux/kubernetes/workers/versions.tf index ac97c6ac8..c0b899ee0 100644 --- a/google-cloud/container-linux/kubernetes/workers/versions.tf +++ b/google-cloud/container-linux/kubernetes/workers/versions.tf @@ -1,4 +1,14 @@ +# Terraform version and plugin versions terraform { - required_version = ">= 0.12" + required_version = ">= 0.12.26, < 0.14.0" + required_providers { + google = ">= 2.19, < 4.0" + template = "~> 2.1" + + ct = { + source = "poseidon/ct" + version = "~> 0.6.1" + } + } } diff --git a/google-cloud/container-linux/kubernetes/workers/workers.tf b/google-cloud/container-linux/kubernetes/workers/workers.tf index 7c130ed20..e635592ff 100644 --- a/google-cloud/container-linux/kubernetes/workers/workers.tf +++ b/google-cloud/container-linux/kubernetes/workers/workers.tf @@ -71,9 +71,9 @@ resource "google_compute_instance_template" "worker" { # Worker Ignition config data "ct_config" "worker-ignition" { - content = data.template_file.worker-config.rendered - pretty_print = false - snippets = var.snippets + 
content = data.template_file.worker-config.rendered + strict = true + snippets = var.snippets } # Worker Container Linux config diff --git a/google-cloud/fedora-coreos/kubernetes/README.md b/google-cloud/fedora-coreos/kubernetes/README.md index 6c319b0b2..581838194 100644 --- a/google-cloud/fedora-coreos/kubernetes/README.md +++ b/google-cloud/fedora-coreos/kubernetes/README.md @@ -11,11 +11,15 @@ Typhoon distributes upstream Kubernetes, architectural conventions, and cluster ## Features -* Kubernetes v1.18.2 (upstream) -* Single or multi-master, [Calico](https://www.projectcalico.org/) or [flannel](https://github.com/coreos/flannel) networking -* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/) +* Kubernetes v1.18.8 (upstream) +* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking +* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing +* Advanced features like [worker pools](https://typhoon.psdn.io/advanced/worker-pools/), [spot](https://typhoon.psdn.io/cl/aws/#spot) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/#container-linux) customization +* Kubernetes v1.18.6 (upstream) +* Single or multi-master, [Calico](https://www.projectcalico.org/) or [Cilium](https://github.com/cilium/cilium) or [flannel](https://github.com/coreos/flannel) networking +* On-cluster etcd with TLS, [RBAC](https://kubernetes.io/docs/admin/authorization/rbac/)-enabled, [network policy](https://kubernetes.io/docs/concepts/services-networking/network-policies/), SELinux enforcing * Advanced features like [worker 
pools](https://typhoon.psdn.io/advanced/worker-pools/), [preemptible](https://typhoon.psdn.io/fedora-coreos/google-cloud/#preemption) workers, and [snippets](https://typhoon.psdn.io/advanced/customization/) customization -* Ready for Ingress, Prometheus, Grafana, and other optional [addons](https://typhoon.psdn.io/addons/overview/) +* Ready for Ingress, Prometheus, Grafana, CSI, and other optional [addons](https://typhoon.psdn.io/addons/overview/) ## Docs diff --git a/google-cloud/fedora-coreos/kubernetes/bootstrap.tf b/google-cloud/fedora-coreos/kubernetes/bootstrap.tf index 168e55564..c0bea1cd3 100644 --- a/google-cloud/fedora-coreos/kubernetes/bootstrap.tf +++ b/google-cloud/fedora-coreos/kubernetes/bootstrap.tf @@ -1,6 +1,6 @@ # Kubernetes assets (kubeconfig, manifests) module "bootstrap" { - source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=14d0b2087962a0f2557c184f3f523548ce19bbdc" + source = "git::https://github.com/poseidon/terraform-render-bootstrap.git?ref=8ef2fe7c992a8c15d696bd3e3a97be713b025e64" cluster_name = var.cluster_name api_servers = [format("%s.%s", var.cluster_name, var.dns_zone)] diff --git a/google-cloud/fedora-coreos/kubernetes/controllers.tf b/google-cloud/fedora-coreos/kubernetes/controllers.tf index b2cde4341..cbf1b7f8c 100644 --- a/google-cloud/fedora-coreos/kubernetes/controllers.tf +++ b/google-cloud/fedora-coreos/kubernetes/controllers.tf @@ -42,7 +42,7 @@ resource "google_compute_instance" "controllers" { auto_delete = true initialize_params { - image = var.os_image + image = data.google_compute_image.fedora-coreos.self_link size = var.disk_size } } @@ -59,7 +59,10 @@ resource "google_compute_instance" "controllers" { tags = ["${var.cluster_name}-controller"] lifecycle { - ignore_changes = [metadata] + ignore_changes = [ + metadata, + boot_disk[0].initialize_params + ] } } diff --git a/google-cloud/fedora-coreos/kubernetes/fcc/controller.yaml b/google-cloud/fedora-coreos/kubernetes/fcc/controller.yaml index 
33416d2c3..5dbf6a4ee 100644 --- a/google-cloud/fedora-coreos/kubernetes/fcc/controller.yaml +++ b/google-cloud/fedora-coreos/kubernetes/fcc/controller.yaml @@ -28,7 +28,7 @@ systemd: --network host \ --volume /var/lib/etcd:/var/lib/etcd:rw,Z \ --volume /etc/ssl/etcd:/etc/ssl/certs:ro,Z \ - quay.io/coreos/etcd:v3.4.7 + quay.io/coreos/etcd:v3.4.10 ExecStop=/usr/bin/podman stop etcd [Install] WantedBy=multi-user.target @@ -51,9 +51,10 @@ systemd: enabled: true contents: | [Unit] - Description=Kubelet via Hyperkube (System Container) + Description=Kubelet (System Container) Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.8 ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests ExecStartPre=/bin/mkdir -p /opt/cni/bin @@ -79,10 +80,11 @@ systemd: --volume /var/log:/var/log \ --volume /var/run/lock:/var/run/lock:z \ --volume /opt/cni/bin:/opt/cni/bin:z \ - quay.io/poseidon/kubelet:v1.18.2 \ + $${KUBELET_IMAGE} \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ --cgroup-driver=systemd \ --cgroups-per-qos=true \ --enforce-node-allocatable=pods \ @@ -90,16 +92,14 @@ systemd: --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ - --kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ - --node-labels=node.kubernetes.io/master \ --node-labels=node.kubernetes.io/controller="true" \ --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ - --register-with-taints=node-role.kubernetes.io/master=:NoSchedule \ + --register-with-taints=node-role.kubernetes.io/controller=:NoSchedule \ + --rotate-certificates \ --volume-plugin-dir=/var/lib/kubelet/volumeplugins 
ExecStop=-/usr/bin/podman stop kubelet Delegate=yes @@ -119,15 +119,17 @@ systemd: ExecStartPre=-/usr/bin/podman rm bootstrap ExecStart=/usr/bin/podman run --name bootstrap \ --network host \ - --volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,Z \ + --volume /etc/kubernetes/bootstrap-secrets:/etc/kubernetes/secrets:ro,z \ --volume /opt/bootstrap/assets:/assets:ro,Z \ --volume /opt/bootstrap/apply:/apply:ro,Z \ --entrypoint=/apply \ - quay.io/poseidon/kubelet:v1.18.2 + quay.io/poseidon/kubelet:v1.18.8 ExecStartPost=/bin/touch /opt/bootstrap/bootstrap.done ExecStartPost=-/usr/bin/podman stop bootstrap storage: directories: + - path: /var/lib/etcd + mode: 0700 - path: /etc/kubernetes - path: /opt/bootstrap files: @@ -151,11 +153,11 @@ storage: chmod -R 500 /etc/ssl/etcd mv auth/kubeconfig /etc/kubernetes/bootstrap-secrets/ mv tls/k8s/* /etc/kubernetes/bootstrap-secrets/ - sudo mkdir -p /etc/kubernetes/manifests - sudo mv static-manifests/* /etc/kubernetes/manifests/ - sudo mkdir -p /opt/bootstrap/assets - sudo mv manifests /opt/bootstrap/assets/manifests - sudo mv manifests-networking/* /opt/bootstrap/assets/manifests/ + mkdir -p /etc/kubernetes/manifests + mv static-manifests/* /etc/kubernetes/manifests/ + mkdir -p /opt/bootstrap/assets + mv manifests /opt/bootstrap/assets/manifests + mv manifests-networking/* /opt/bootstrap/assets/manifests/ rm -rf assets auth static-manifests tls manifests-networking - path: /opt/bootstrap/apply mode: 0544 @@ -175,6 +177,18 @@ storage: contents: inline: | fs.inotify.max_user_watches=16184 + - path: /etc/sysctl.d/reverse-path-filter.conf + contents: + inline: | + net.ipv4.conf.default.rp_filter=0 + net.ipv4.conf.*.rp_filter=0 + - path: /etc/systemd/network/50-flannel.link + contents: + inline: | + [Match] + OriginalName=flannel* + [Link] + MACAddressPolicy=none - path: /etc/systemd/system.conf.d/accounting.conf contents: inline: | diff --git a/google-cloud/fedora-coreos/kubernetes/image.tf 
b/google-cloud/fedora-coreos/kubernetes/image.tf new file mode 100644 index 000000000..e35e274c4 --- /dev/null +++ b/google-cloud/fedora-coreos/kubernetes/image.tf @@ -0,0 +1,6 @@ + +# Fedora CoreOS most recent image from stream +data "google_compute_image" "fedora-coreos" { + project = "fedora-coreos-cloud" + family = "fedora-coreos-${var.os_stream}" +} diff --git a/google-cloud/fedora-coreos/kubernetes/network.tf b/google-cloud/fedora-coreos/kubernetes/network.tf index bd7067d7b..28481c3b8 100644 --- a/google-cloud/fedora-coreos/kubernetes/network.tf +++ b/google-cloud/fedora-coreos/kubernetes/network.tf @@ -112,6 +112,32 @@ resource "google_compute_firewall" "internal-vxlan" { target_tags = ["${var.cluster_name}-controller", "${var.cluster_name}-worker"] } +# Cilium VXLAN +resource "google_compute_firewall" "internal-linux-vxlan" { + count = var.networking == "cilium" ? 1 : 0 + + name = "${var.cluster_name}-linux-vxlan" + network = google_compute_network.network.name + + allow { + protocol = "udp" + ports = [4789] + } + + # Cilium health + allow { + protocol = "icmp" + } + + allow { + protocol = "tcp" + ports = [4240] + } + + source_tags = ["${var.cluster_name}-controller", "${var.cluster_name}-worker"] + target_tags = ["${var.cluster_name}-controller", "${var.cluster_name}-worker"] +} + # Allow Prometheus to scrape node-exporter daemonset resource "google_compute_firewall" "internal-node-exporter" { name = "${var.cluster_name}-internal-node-exporter" diff --git a/google-cloud/fedora-coreos/kubernetes/variables.tf b/google-cloud/fedora-coreos/kubernetes/variables.tf index c7be56d0a..74b59d41b 100644 --- a/google-cloud/fedora-coreos/kubernetes/variables.tf +++ b/google-cloud/fedora-coreos/kubernetes/variables.tf @@ -46,9 +46,10 @@ variable "worker_type" { default = "n1-standard-1" } -variable "os_image" { +variable "os_stream" { type = string - description = "Fedora CoreOS image for compute instances (e.g. 
fedora-coreos)" + description = "Fedora CoreOS stream for compute instances (e.g. stable, testing, next)" + default = "stable" } variable "disk_size" { diff --git a/google-cloud/fedora-coreos/kubernetes/versions.tf b/google-cloud/fedora-coreos/kubernetes/versions.tf index 26ea74cac..178e248e3 100644 --- a/google-cloud/fedora-coreos/kubernetes/versions.tf +++ b/google-cloud/fedora-coreos/kubernetes/versions.tf @@ -1,11 +1,15 @@ # Terraform version and plugin versions terraform { - required_version = "~> 0.12.6" + required_version = ">= 0.12.26, < 0.14.0" required_providers { google = ">= 2.19, < 4.0" - ct = "~> 0.3" template = "~> 2.1" null = "~> 2.1" + + ct = { + source = "poseidon/ct" + version = "~> 0.6.1" + } } } diff --git a/google-cloud/fedora-coreos/kubernetes/workers.tf b/google-cloud/fedora-coreos/kubernetes/workers.tf index 91a32bd0c..d35db25f5 100644 --- a/google-cloud/fedora-coreos/kubernetes/workers.tf +++ b/google-cloud/fedora-coreos/kubernetes/workers.tf @@ -8,7 +8,7 @@ module "workers" { network = google_compute_network.network.name worker_count = var.worker_count machine_type = var.worker_type - os_image = var.os_image + os_stream = var.os_stream disk_size = var.disk_size preemptible = var.worker_preemptible diff --git a/google-cloud/fedora-coreos/kubernetes/workers/fcc/worker.yaml b/google-cloud/fedora-coreos/kubernetes/workers/fcc/worker.yaml index 0501b8de9..efab8cde3 100644 --- a/google-cloud/fedora-coreos/kubernetes/workers/fcc/worker.yaml +++ b/google-cloud/fedora-coreos/kubernetes/workers/fcc/worker.yaml @@ -21,9 +21,10 @@ systemd: enabled: true contents: | [Unit] - Description=Kubelet via Hyperkube (System Container) + Description=Kubelet (System Container) Wants=rpc-statd.service [Service] + Environment=KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.8 ExecStartPre=/bin/mkdir -p /etc/kubernetes/cni/net.d ExecStartPre=/bin/mkdir -p /etc/kubernetes/manifests ExecStartPre=/bin/mkdir -p /opt/cni/bin @@ -49,10 +50,11 @@ systemd: --volume 
/var/log:/var/log \ --volume /var/run/lock:/var/run/lock:z \ --volume /opt/cni/bin:/opt/cni/bin:z \ - quay.io/poseidon/kubelet:v1.18.2 \ + $${KUBELET_IMAGE} \ --anonymous-auth=false \ --authentication-token-webhook \ --authorization-mode=Webhook \ + --bootstrap-kubeconfig=/etc/kubernetes/kubeconfig \ --cgroup-driver=systemd \ --cgroups-per-qos=true \ --enforce-node-allocatable=pods \ @@ -60,10 +62,8 @@ systemd: --cluster_dns=${cluster_dns_service_ip} \ --cluster_domain=${cluster_domain_suffix} \ --cni-conf-dir=/etc/kubernetes/cni/net.d \ - --exit-on-lock-contention \ --healthz-port=0 \ - --kubeconfig=/etc/kubernetes/kubeconfig \ - --lock-file=/var/run/lock/kubelet.lock \ + --kubeconfig=/var/lib/kubelet/kubeconfig \ --network-plugin=cni \ --node-labels=node.kubernetes.io/node \ %{~ for label in split(",", node_labels) ~} @@ -71,6 +71,7 @@ systemd: %{~ endfor ~} --pod-manifest-path=/etc/kubernetes/manifests \ --read-only-port=0 \ + --rotate-certificates \ --volume-plugin-dir=/var/lib/kubelet/volumeplugins ExecStop=-/usr/bin/podman stop kubelet Delegate=yes @@ -87,7 +88,7 @@ systemd: Type=oneshot RemainAfterExit=true ExecStart=/bin/true - ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.2 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME' + ExecStop=/bin/bash -c '/usr/bin/podman run --volume /etc/kubernetes:/etc/kubernetes:ro,z --entrypoint /usr/local/bin/kubectl quay.io/poseidon/kubelet:v1.18.8 --kubeconfig=/etc/kubernetes/kubeconfig delete node $HOSTNAME' [Install] WantedBy=multi-user.target storage: @@ -103,6 +104,18 @@ storage: contents: inline: | fs.inotify.max_user_watches=16184 + - path: /etc/sysctl.d/reverse-path-filter.conf + contents: + inline: | + net.ipv4.conf.default.rp_filter=0 + net.ipv4.conf.*.rp_filter=0 + - path: /etc/systemd/network/50-flannel.link + contents: + inline: | + [Match] + OriginalName=flannel* + [Link] + 
MACAddressPolicy=none - path: /etc/systemd/system.conf.d/accounting.conf contents: inline: | diff --git a/google-cloud/fedora-coreos/kubernetes/workers/image.tf b/google-cloud/fedora-coreos/kubernetes/workers/image.tf new file mode 100644 index 000000000..e35e274c4 --- /dev/null +++ b/google-cloud/fedora-coreos/kubernetes/workers/image.tf @@ -0,0 +1,6 @@ + +# Fedora CoreOS most recent image from stream +data "google_compute_image" "fedora-coreos" { + project = "fedora-coreos-cloud" + family = "fedora-coreos-${var.os_stream}" +} diff --git a/google-cloud/fedora-coreos/kubernetes/workers/variables.tf b/google-cloud/fedora-coreos/kubernetes/workers/variables.tf index 8f1ef933b..b3802388b 100644 --- a/google-cloud/fedora-coreos/kubernetes/workers/variables.tf +++ b/google-cloud/fedora-coreos/kubernetes/workers/variables.tf @@ -34,9 +34,10 @@ variable "machine_type" { default = "n1-standard-1" } -variable "os_image" { +variable "os_stream" { type = string - description = "Fedora CoreOS image for compute instanges (e.g. gcloud compute images list)" + description = "Fedora CoreOS stream for compute instances (e.g. 
stable, testing, next)" + default = "stable" } variable "disk_size" { diff --git a/google-cloud/fedora-coreos/kubernetes/workers/versions.tf b/google-cloud/fedora-coreos/kubernetes/workers/versions.tf index ac97c6ac8..c0b899ee0 100644 --- a/google-cloud/fedora-coreos/kubernetes/workers/versions.tf +++ b/google-cloud/fedora-coreos/kubernetes/workers/versions.tf @@ -1,4 +1,14 @@ +# Terraform version and plugin versions terraform { - required_version = ">= 0.12" + required_version = ">= 0.12.26, < 0.14.0" + required_providers { + google = ">= 2.19, < 4.0" + template = "~> 2.1" + + ct = { + source = "poseidon/ct" + version = "~> 0.6.1" + } + } } diff --git a/google-cloud/fedora-coreos/kubernetes/workers/workers.tf b/google-cloud/fedora-coreos/kubernetes/workers/workers.tf index 1d3d65894..3c36b1aa9 100644 --- a/google-cloud/fedora-coreos/kubernetes/workers/workers.tf +++ b/google-cloud/fedora-coreos/kubernetes/workers/workers.tf @@ -43,7 +43,7 @@ resource "google_compute_instance_template" "worker" { disk { auto_delete = true boot = true - source_image = var.os_image + source_image = data.google_compute_image.fedora-coreos.self_link disk_size_gb = var.disk_size } @@ -64,6 +64,9 @@ resource "google_compute_instance_template" "worker" { } lifecycle { + ignore_changes = [ + disk[0].source_image + ] # To update an Instance Template, Terraform should replace the existing resource create_before_destroy = true } diff --git a/mkdocs.yml b/mkdocs.yml index afca2372e..c5dc1f445 100644 --- a/mkdocs.yml +++ b/mkdocs.yml @@ -1,26 +1,28 @@ -site_name: 'Typhoon' -site_description: 'A minimal and free Kubernetes distribution' -site_author: 'Dalton Hubble' -repo_name: 'poseidon/typhoon' -repo_url: 'https://github.com/poseidon/typhoon' +site_name: Typhoon +site_description: A minimal and free Kubernetes distribution +site_author: Dalton Hubble +repo_name: poseidon/typhoon +repo_url: https://github.com/poseidon/typhoon theme: - name: 'material' - feature: - tabs: 'true' - palette: - 
primary: 'blue' - accent: 'pink' + name: material + features: + - tabs logo: 'img/spin.png' favicon: 'img/favicon.ico' + icon: + repo: fontawesome/brands/github-alt + palette: + primary: blue + accent: pink font: text: 'Roboto Slab' code: 'Roboto Mono' extra: social: - - type: 'github' - link: 'https://github.com/poseidon' - - type: 'twitter' - link: 'https://twitter.com/typhoon8s' + - icon: fontawesome/brands/github-alt + link: https://github.com/poseidon + - icon: fontawesome/brands/twitter + link: https://twitter.com/typhoon8s markdown_extensions: - admonition - codehilite @@ -55,15 +57,16 @@ nav: - 'Google Cloud': 'architecture/google-cloud.md' - 'Fedora CoreOS': - 'AWS': 'fedora-coreos/aws.md' + - 'Azure': 'fedora-coreos/azure.md' - 'Bare-Metal': 'fedora-coreos/bare-metal.md' - - 'Digital Ocean': 'fedora-coreos/digitalocean.md' + - 'DigitalOcean': 'fedora-coreos/digitalocean.md' - 'Google Cloud': 'fedora-coreos/google-cloud.md' - - 'Container Linux': - - 'AWS': 'cl/aws.md' - - 'Azure': 'cl/azure.md' - - 'Bare-Metal': 'cl/bare-metal.md' - - 'Digital Ocean': 'cl/digital-ocean.md' - - 'Google Cloud': 'cl/google-cloud.md' + - 'Flatcar Linux': + - 'AWS': 'flatcar-linux/aws.md' + - 'Azure': 'flatcar-linux/azure.md' + - 'Bare-Metal': 'flatcar-linux/bare-metal.md' + - 'DigitalOcean': 'flatcar-linux/digitalocean.md' + - 'Google Cloud': 'flatcar-linux/google-cloud.md' - 'Topics': - 'Maintenance': 'topics/maintenance.md' - 'Hardware': 'topics/hardware.md' diff --git a/output/bootstrap-manifests/bootstrap-apiserver.yaml b/output/bootstrap-manifests/bootstrap-apiserver.yaml deleted file mode 100644 index 422c7b529..000000000 --- a/output/bootstrap-manifests/bootstrap-apiserver.yaml +++ /dev/null @@ -1,59 +0,0 @@ -apiVersion: v1 -kind: Pod -metadata: - name: bootstrap-kube-apiserver - namespace: kube-system -spec: - containers: - - name: kube-apiserver - image: gcr.io/google_containers/hyperkube:v1.9.3 - command: - - /hyperkube - - apiserver - - 
--admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ValidatingAdmissionWebhook,ResourceQuota,DefaultTolerationSeconds,MutatingAdmissionWebhook - - --advertise-address=$(POD_IP) - - --allow-privileged=true - - --authorization-mode=RBAC - - --bind-address=0.0.0.0 - - --client-ca-file=/etc/kubernetes/secrets/ca.crt - - --etcd-cafile=/etc/kubernetes/secrets/etcd-client-ca.crt - - --etcd-certfile=/etc/kubernetes/secrets/etcd-client.crt - - --etcd-keyfile=/etc/kubernetes/secrets/etcd-client.key - - --etcd-servers=https://bastion-test-etcd0.k8s-playground.takescoop.com:2379,https://bastion-test-etcd1.k8s-playground.takescoop.com:2379 - - --insecure-port=0 - - --kubelet-client-certificate=/etc/kubernetes/secrets/apiserver.crt - - --kubelet-client-key=/etc/kubernetes/secrets/apiserver.key - - --secure-port=443 - - --service-account-key-file=/etc/kubernetes/secrets/service-account.pub - - --service-cluster-ip-range=10.3.0.0/16 - - --cloud-provider= - - --storage-backend=etcd3 - - --tls-ca-file=/etc/kubernetes/secrets/ca.crt - - --tls-cert-file=/etc/kubernetes/secrets/apiserver.crt - - --tls-private-key-file=/etc/kubernetes/secrets/apiserver.key - env: - - name: POD_IP - valueFrom: - fieldRef: - fieldPath: status.podIP - volumeMounts: - - mountPath: /etc/ssl/certs - name: ssl-certs-host - readOnly: true - - mountPath: /etc/kubernetes/secrets - name: secrets - readOnly: true - - mountPath: /var/lock - name: var-lock - readOnly: false - hostNetwork: true - volumes: - - name: secrets - hostPath: - path: /etc/kubernetes/bootstrap-secrets - - name: ssl-certs-host - hostPath: - path: /usr/share/ca-certificates - - name: var-lock - hostPath: - path: /var/lock diff --git a/output/bootstrap-manifests/bootstrap-controller-manager.yaml b/output/bootstrap-manifests/bootstrap-controller-manager.yaml deleted file mode 100644 index 705e75528..000000000 --- a/output/bootstrap-manifests/bootstrap-controller-manager.yaml +++ /dev/null @@ -1,36 +0,0 @@ 
-apiVersion: v1 -kind: Pod -metadata: - name: bootstrap-kube-controller-manager - namespace: kube-system -spec: - containers: - - name: kube-controller-manager - image: gcr.io/google_containers/hyperkube:v1.9.3 - command: - - ./hyperkube - - controller-manager - - --allocate-node-cidrs=true - - --cluster-cidr=10.2.0.0/16 - - --service-cluster-ip-range=10.3.0.0/16 - - --cloud-provider= - - --configure-cloud-routes=false - - --kubeconfig=/etc/kubernetes/kubeconfig - - --leader-elect=true - - --root-ca-file=/etc/kubernetes/bootstrap-secrets/ca.crt - - --service-account-private-key-file=/etc/kubernetes/bootstrap-secrets/service-account.key - volumeMounts: - - name: kubernetes - mountPath: /etc/kubernetes - readOnly: true - - name: ssl-host - mountPath: /etc/ssl/certs - readOnly: true - hostNetwork: true - volumes: - - name: kubernetes - hostPath: - path: /etc/kubernetes - - name: ssl-host - hostPath: - path: /usr/share/ca-certificates diff --git a/output/bootstrap-manifests/bootstrap-scheduler.yaml b/output/bootstrap-manifests/bootstrap-scheduler.yaml deleted file mode 100644 index 75481b1b7..000000000 --- a/output/bootstrap-manifests/bootstrap-scheduler.yaml +++ /dev/null @@ -1,23 +0,0 @@ -apiVersion: v1 -kind: Pod -metadata: - name: bootstrap-kube-scheduler - namespace: kube-system -spec: - containers: - - name: kube-scheduler - image: gcr.io/google_containers/hyperkube:v1.9.3 - command: - - ./hyperkube - - scheduler - - --kubeconfig=/etc/kubernetes/kubeconfig - - --leader-elect=true - volumeMounts: - - name: kubernetes - mountPath: /etc/kubernetes - readOnly: true - hostNetwork: true - volumes: - - name: kubernetes - hostPath: - path: /etc/kubernetes diff --git a/output/manifests-networking/bgpconfigurations-crd.yaml b/output/manifests-networking/bgpconfigurations-crd.yaml deleted file mode 100644 index c48ff4886..000000000 --- a/output/manifests-networking/bgpconfigurations-crd.yaml +++ /dev/null @@ -1,13 +0,0 @@ -apiVersion: apiextensions.k8s.io/v1beta1 
-description: Calico BGP Configuration -kind: CustomResourceDefinition -metadata: - name: bgpconfigurations.crd.projectcalico.org -spec: - scope: Cluster - group: crd.projectcalico.org - version: v1 - names: - kind: BGPConfiguration - plural: bgpconfigurations - singular: bgpconfiguration diff --git a/output/manifests-networking/bgppeers-crd.yaml b/output/manifests-networking/bgppeers-crd.yaml deleted file mode 100644 index d10f528ae..000000000 --- a/output/manifests-networking/bgppeers-crd.yaml +++ /dev/null @@ -1,13 +0,0 @@ -apiVersion: apiextensions.k8s.io/v1beta1 -description: Calico BGP Peers -kind: CustomResourceDefinition -metadata: - name: bgppeers.crd.projectcalico.org -spec: - scope: Cluster - group: crd.projectcalico.org - version: v1 - names: - kind: BGPPeer - plural: bgppeers - singular: bgppeer diff --git a/output/manifests-networking/calico-cluster-role-binding.yaml b/output/manifests-networking/calico-cluster-role-binding.yaml deleted file mode 100644 index f76449264..000000000 --- a/output/manifests-networking/calico-cluster-role-binding.yaml +++ /dev/null @@ -1,12 +0,0 @@ -apiVersion: rbac.authorization.k8s.io/v1 -kind: ClusterRoleBinding -metadata: - name: calico-node -roleRef: - apiGroup: rbac.authorization.k8s.io - kind: ClusterRole - name: calico-node -subjects: -- kind: ServiceAccount - name: calico-node - namespace: kube-system diff --git a/output/manifests-networking/calico-cluster-role.yaml b/output/manifests-networking/calico-cluster-role.yaml deleted file mode 100644 index 2d317b500..000000000 --- a/output/manifests-networking/calico-cluster-role.yaml +++ /dev/null @@ -1,68 +0,0 @@ -apiVersion: rbac.authorization.k8s.io/v1 -kind: ClusterRole -metadata: - name: calico-node -rules: - - apiGroups: [""] - resources: - - namespaces - verbs: - - get - - list - - watch - - apiGroups: [""] - resources: - - pods/status - verbs: - - update - - apiGroups: [""] - resources: - - pods - verbs: - - get - - list - - watch - - patch - - apiGroups: [""] - 
resources: - - services - verbs: - - get - - apiGroups: [""] - resources: - - endpoints - verbs: - - get - - apiGroups: [""] - resources: - - nodes - verbs: - - get - - list - - update - - watch - - apiGroups: ["extensions"] - resources: - - networkpolicies - verbs: - - get - - list - - watch - - apiGroups: ["crd.projectcalico.org"] - resources: - - globalfelixconfigs - - felixconfigurations - - bgppeers - - globalbgpconfigs - - bgpconfigurations - - ippools - - globalnetworkpolicies - - globalnetworksets - - networkpolicies - - clusterinformations - verbs: - - create - - get - - list - - update - - watch diff --git a/output/manifests-networking/calico-config.yaml b/output/manifests-networking/calico-config.yaml deleted file mode 100644 index 66be5c2af..000000000 --- a/output/manifests-networking/calico-config.yaml +++ /dev/null @@ -1,39 +0,0 @@ -apiVersion: v1 -kind: ConfigMap -metadata: - name: calico-config - namespace: kube-system -data: - typha_service_name: "none" - # The CNI network configuration to install on each node. 
- cni_network_config: |- - { - "name": "k8s-pod-network", - "cniVersion": "0.3.1", - "plugins": [ - { - "type": "calico", - "log_level": "info", - "datastore_type": "kubernetes", - "nodename": "__KUBERNETES_NODE_NAME__", - "mtu": 1480, - "ipam": { - "type": "host-local", - "subnet": "usePodCidr" - }, - "policy": { - "type": "k8s", - "k8s_auth_token": "__SERVICEACCOUNT_TOKEN__" - }, - "kubernetes": { - "k8s_api_root": "https://__KUBERNETES_SERVICE_HOST__:__KUBERNETES_SERVICE_PORT__", - "kubeconfig": "__KUBECONFIG_FILEPATH__" - } - }, - { - "type": "portmap", - "snat": true, - "capabilities": {"portMappings": true} - } - ] - } diff --git a/output/manifests-networking/calico-service-account.yaml b/output/manifests-networking/calico-service-account.yaml deleted file mode 100644 index f16b4b0e0..000000000 --- a/output/manifests-networking/calico-service-account.yaml +++ /dev/null @@ -1,5 +0,0 @@ -apiVersion: v1 -kind: ServiceAccount -metadata: - name: calico-node - namespace: kube-system diff --git a/output/manifests-networking/calico.yaml b/output/manifests-networking/calico.yaml deleted file mode 100644 index 50b45c38f..000000000 --- a/output/manifests-networking/calico.yaml +++ /dev/null @@ -1,146 +0,0 @@ -apiVersion: apps/v1 -kind: DaemonSet -metadata: - name: calico-node - namespace: kube-system - labels: - k8s-app: calico-node -spec: - selector: - matchLabels: - k8s-app: calico-node - updateStrategy: - type: RollingUpdate - rollingUpdate: - maxUnavailable: 1 - template: - metadata: - labels: - k8s-app: calico-node - spec: - hostNetwork: true - serviceAccountName: calico-node - tolerations: - - effect: NoSchedule - operator: Exists - - effect: NoExecute - operator: Exists - containers: - - name: calico-node - image: quay.io/calico/node:v3.0.2 - env: - # Use Kubernetes API as the backing datastore. - - name: DATASTORE_TYPE - value: "kubernetes" - # Enable felix info logging. 
- - name: FELIX_LOGSEVERITYSCREEN - value: "info" - # Cluster type to identify the deployment type - - name: CLUSTER_TYPE - value: "k8s,bgp" - # Disable file logging so `kubectl logs` works. - - name: CALICO_DISABLE_FILE_LOGGING - value: "true" - # Set Felix endpoint to host default action to ACCEPT. - - name: FELIX_DEFAULTENDPOINTTOHOSTACTION - value: "ACCEPT" - # Disable IPV6 on Kubernetes. - - name: FELIX_IPV6SUPPORT - value: "false" - # Set MTU for tunnel device used if ipip is enabled - - name: FELIX_IPINIPMTU - value: "1480" - # Wait for the datastore. - - name: WAIT_FOR_DATASTORE - value: "true" - # The Calico IPv4 pool CIDR (should match `--cluster-cidr`). - - name: CALICO_IPV4POOL_CIDR - value: "10.2.0.0/16" - # Enable IPIP - - name: CALICO_IPV4POOL_IPIP - value: "Always" - # Enable IP-in-IP within Felix. - - name: FELIX_IPINIPENABLED - value: "true" - # Typha support: controlled by the ConfigMap. - - name: FELIX_TYPHAK8SSERVICENAME - valueFrom: - configMapKeyRef: - name: calico-config - key: typha_service_name - # Set node name based on k8s nodeName. - - name: NODENAME - valueFrom: - fieldRef: - fieldPath: spec.nodeName - # Auto-detect the BGP IP address. - - name: IP - value: "autodetect" - - name: FELIX_HEALTHENABLED - value: "true" - securityContext: - privileged: true - resources: - requests: - cpu: 250m - livenessProbe: - httpGet: - path: /liveness - port: 9099 - periodSeconds: 10 - initialDelaySeconds: 10 - failureThreshold: 6 - readinessProbe: - httpGet: - path: /readiness - port: 9099 - periodSeconds: 10 - volumeMounts: - - mountPath: /lib/modules - name: lib-modules - readOnly: true - - mountPath: /var/run/calico - name: var-run-calico - readOnly: false - # Install Calico CNI binaries and CNI network config file on nodes - - name: install-cni - image: quay.io/calico/cni:v2.0.0 - command: ["/install-cni.sh"] - env: - # Name of the CNI config file to create on each node. 
- - name: CNI_CONF_NAME - value: "10-calico.conflist" - # Contents of the CNI config to create on each node. - - name: CNI_NETWORK_CONFIG - valueFrom: - configMapKeyRef: - name: calico-config - key: cni_network_config - # Set node name based on k8s nodeName - - name: KUBERNETES_NODE_NAME - valueFrom: - fieldRef: - fieldPath: spec.nodeName - - name: CNI_NET_DIR - value: "/etc/kubernetes/cni/net.d" - volumeMounts: - - mountPath: /host/opt/cni/bin - name: cni-bin-dir - - mountPath: /host/etc/cni/net.d - name: cni-net-dir - terminationGracePeriodSeconds: 0 - volumes: - # Used by calico/node - - name: lib-modules - hostPath: - path: /lib/modules - - name: var-run-calico - hostPath: - path: /var/run/calico - # Used by install-cni - - name: cni-bin-dir - hostPath: - path: /opt/cni/bin - - name: cni-net-dir - hostPath: - path: /etc/kubernetes/cni/net.d diff --git a/output/manifests-networking/clusterinformations-crd.yaml b/output/manifests-networking/clusterinformations-crd.yaml deleted file mode 100644 index 3fbc7d8dd..000000000 --- a/output/manifests-networking/clusterinformations-crd.yaml +++ /dev/null @@ -1,13 +0,0 @@ -apiVersion: apiextensions.k8s.io/v1beta1 -description: Calico Cluster Information -kind: CustomResourceDefinition -metadata: - name: clusterinformations.crd.projectcalico.org -spec: - scope: Cluster - group: crd.projectcalico.org - version: v1 - names: - kind: ClusterInformation - plural: clusterinformations - singular: clusterinformation diff --git a/output/manifests-networking/felixconfigurations-crd.yaml b/output/manifests-networking/felixconfigurations-crd.yaml deleted file mode 100644 index e9f1385a0..000000000 --- a/output/manifests-networking/felixconfigurations-crd.yaml +++ /dev/null @@ -1,13 +0,0 @@ -apiVersion: apiextensions.k8s.io/v1beta1 -description: Calico Felix Configuration -kind: CustomResourceDefinition -metadata: - name: felixconfigurations.crd.projectcalico.org -spec: - scope: Cluster - group: crd.projectcalico.org - version: v1 - 
names: - kind: FelixConfiguration - plural: felixconfigurations - singular: felixconfiguration diff --git a/output/manifests-networking/globalnetworkpolicies-crd.yaml b/output/manifests-networking/globalnetworkpolicies-crd.yaml deleted file mode 100644 index b28cc3544..000000000 --- a/output/manifests-networking/globalnetworkpolicies-crd.yaml +++ /dev/null @@ -1,13 +0,0 @@ -apiVersion: apiextensions.k8s.io/v1beta1 -description: Calico Global Network Policies -kind: CustomResourceDefinition -metadata: - name: globalnetworkpolicies.crd.projectcalico.org -spec: - scope: Cluster - group: crd.projectcalico.org - version: v1 - names: - kind: GlobalNetworkPolicy - plural: globalnetworkpolicies - singular: globalnetworkpolicy diff --git a/output/manifests-networking/globalnetworksets-crd.yaml b/output/manifests-networking/globalnetworksets-crd.yaml deleted file mode 100644 index cef0fe57c..000000000 --- a/output/manifests-networking/globalnetworksets-crd.yaml +++ /dev/null @@ -1,13 +0,0 @@ -apiVersion: apiextensions.k8s.io/v1beta1 -description: Calico Global Network Sets -kind: CustomResourceDefinition -metadata: - name: globalnetworksets.crd.projectcalico.org -spec: - scope: Cluster - group: crd.projectcalico.org - version: v1 - names: - kind: GlobalNetworkSet - plural: globalnetworksets - singular: globalnetworkset diff --git a/output/manifests-networking/ippools-crd.yaml b/output/manifests-networking/ippools-crd.yaml deleted file mode 100644 index 3bb6804b1..000000000 --- a/output/manifests-networking/ippools-crd.yaml +++ /dev/null @@ -1,13 +0,0 @@ -apiVersion: apiextensions.k8s.io/v1beta1 -description: Calico IP Pools -kind: CustomResourceDefinition -metadata: - name: ippools.crd.projectcalico.org -spec: - scope: Cluster - group: crd.projectcalico.org - version: v1 - names: - kind: IPPool - plural: ippools - singular: ippool diff --git a/output/manifests-networking/networkpolicies-crd.yaml b/output/manifests-networking/networkpolicies-crd.yaml deleted file mode 100644 
index 4d34ad01b..000000000 --- a/output/manifests-networking/networkpolicies-crd.yaml +++ /dev/null @@ -1,13 +0,0 @@ -apiVersion: apiextensions.k8s.io/v1beta1 -description: Calico Network Policies -kind: CustomResourceDefinition -metadata: - name: networkpolicies.crd.projectcalico.org -spec: - scope: Namespaced - group: crd.projectcalico.org - version: v1 - names: - kind: NetworkPolicy - plural: networkpolicies - singular: networkpolicy diff --git a/output/manifests/kube-apiserver-secret.yaml b/output/manifests/kube-apiserver-secret.yaml deleted file mode 100644 index 81abce2da..000000000 --- a/output/manifests/kube-apiserver-secret.yaml +++ /dev/null @@ -1,14 +0,0 @@ -apiVersion: v1 -kind: Secret -metadata: - name: kube-apiserver - namespace: kube-system -type: Opaque -data: - apiserver.key: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBMUFsTTZKRG9pTk5RT2ZCRXpEOVZVWmpRU1RFZEdLMkpmSFBRclJROGMyaGlMWm5rCmIyTUo5OGsyNVNybWppbWhPdC9FMHZKL1E3enErb3JmekNKd1AvZ2sxMGFQTmFyT1VQaUdSNzZrQU4vWitDTHgKS1J1Tm9EWWpZaGdZaG51bjB1djd3cXRjMEgxY282c05zbnlOTjJBUGhER1VuY2M5cncxUUdCRUN1YVRyQVVqYgorRnJLaTNLQ3RNVmdyemVCNGFUWkdqTE5nTGxOQStxbGVmbXVjcFIvdnh4Rzh5aGxwNU14cmFoLzVyS0ZBellGCnVQM05hamwxN1oxb0JYazFqaUFZS3E3SzkyYVp1aFYvbUo2WWVCbFVzMFRvb3NxNCtFRUFsM0w2ZW1OaWpMVGsKajNhNk5ZRHVRSGxlK2JNaW9Kc3UvQzhGS3I5cmhrekwvV1hMbVFJREFRQUJBb0lCQURXWnNFWnVNMG83V09GOAptbmVqWHZjRWtVcWZUc0twUThNaEo5Ukk5RXNjVFExSUJOWWZqQ3FHUkFsRWdnblgvamo2emkraE80aXRIaDE5CnM3dFB6VjV1WlNuQ1hYdHNsVUVrd2hVcTNSeVhlZXRmTWVWNVlLRHFicUZpZy9pakU3YWZEd0tUL1I5N1FVcmkKZDlEeDZXVGhOS3J2T2FsMDcyUHNFcDR5MXFTRTFVWDBrQUxRWXlVSFhVWlNvWU9Nckl3Y21HNXdteVM2TGZqcwo3UUFoUkdvakF1OGpXUGFEQSsvWDhKSzZsKzBWN3pKZHlBMGVKWUsrQitTLzhQNG5UMmJ1R042OUl4dWJOWTdyClNJMEgzS2RTSTlwaURZRVJCZWJzZVZ4ZTRKbzVkVXVLRUZBR01NME8vSHBRRkE0anNWWVlMUWpRT01hK1g4U2EKTVd3VkVRRUNnWUVBOEZWVFRISGRuQ3BzYmwyZ1ZZaWFpdnFCb1hsa2YzcnZNTWcxVFNxNTNUMXFJVnJRV1hoLwozMkpuZVVCQlBnN2dwY3Z1QURVYnh4alhtclczc2FPR0Z3SGlQOXNacEd5RTJUQStnM2JTc2hldm5jZ2hmakQ4CjcrTURDUHRwa2ppZ0RZNUlyTHlJ
ejlsQ3B4SGVtT28xOXJ6OEJDZk1GOVpBN2kvbzNFWUFmWGtDZ1lFQTRkdkIKakdINEExSjVwSytvVkdoc0NtZzVpYUl1T1d0ZVhZU2RTV1dxcFpCWDg4U1RnR3NSTVRFQjNtNFFzcnBTeFlPUgorR1hMUk43MjJxSnh0eGVLcXYyemJPZGJKUVJGWXZ6UzFoMC9RY3BMeDlDN1RKcW00SGFhRXhvTUpJcjV2MUptCnFGRHJwWXdDaTlVY2NjK0ZGMWM2amdEYXdGbEFTU2V4bllGZDF5RUNnWUVBdGx3SUdMbE5ybkdDVlR3MXJMRTYKa3JvQ0lzUTV2WUZLZlhscytHQ3pKMnl5V3h6TmV4WXo4UXg1OTBjS09reVBxVDVVR1ZReS81K1orWXBwR0NFOApYYmpRTkNQTUVUZEdsb1pFNlB3QVk2SVZYMk84QmtTbHFHQllyVGdYb3h2VVZuVGdNREhlbmRmOCswaFQzelBZClBxQ25tWCtaSFgwMVI0YVM2cEV6VGdFQ2dZQUw0RGcvSDIraERSY2tWN0FzTUFsdVNxaXIyZ0ZBTjZzUWs4YUoKYzNVVG01RmtXZk8vanVHcWluOGtxUGpyek94SlFtL01kZDNJVTBqN21nc01xNG81RDNuOXdmU0M3OFNPUGVrQQpKUzJNVWd6R0J1MnlTM2QyMmdXajkzeW45ejdHbHBpYlJSWCs4R3U1Mm96U1Z5MFNXeDNURmF4cTdNWjZra0crCm5Hekl3UUtCZ1FEdWEyOTg4MVY5ZkxQU2lRM01xcUR2OGgxTHdOSlNVTHVtT1RoMlpIWnRJMjBxZ3pTSWt0dFYKQ0dGbmllc3I1ekJnTUtEM3I2QUFrNWMzUVcrUEk4L0Y0UnRqczBRR3ltZlNIS1JxL2xDMVd3WUJtL2NNWUQzagpJSlNXSG1VdjRjWVBNb1J0VkYrbkJQMDBIS2dTc1BIbDI1MFdtdEZ6VWlvYmk1ZVVCMmszTWc9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo= - apiserver.crt: 
LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUQzakNDQXNhZ0F3SUJBZ0lSQVBGcVlYbmNrNnlPTUlkbm5laVdzYkF3RFFZSktvWklodmNOQVFFTEJRQXcKSlRFUk1BOEdBMVVFQ2hNSVltOXZkR3QxWW1VeEVEQU9CZ05WQkFNVEIydDFZbVV0WTJFd0hoY05NVGd3TXpFMQpNRFF4TmpFMVdoY05NVGt3TXpFMU1EUXhOakUxV2pBdk1SUXdFZ1lEVlFRS0V3dHJkV0psTFcxaGMzUmxjakVYCk1CVUdBMVVFQXhNT2EzVmlaUzFoY0dselpYSjJaWEl3Z2dFaU1BMEdDU3FHU0liM0RRRUJBUVVBQTRJQkR3QXcKZ2dFS0FvSUJBUURVQ1V6b2tPaUkwMUE1OEVUTVAxVlJtTkJKTVIwWXJZbDhjOUN0RkR4emFHSXRtZVJ2WXduMwp5VGJsS3VhT0thRTYzOFRTOG45RHZPcjZpdC9NSW5BLytDVFhSbzgxcXM1UStJWkh2cVFBMzluNEl2RXBHNDJnCk5pTmlHQmlHZTZmUzYvdkNxMXpRZlZ5anF3MnlmSTAzWUErRU1aU2R4ejJ2RFZBWUVRSzVwT3NCU052NFdzcUwKY29LMHhXQ3ZONEhocE5rYU1zMkF1VTBENnFWNSthNXlsSCsvSEViektHV25rekd0cUgvbXNvVUROZ1c0L2MxcQpPWFh0bldnRmVUV09JQmdxcnNyM1pwbTZGWCtZbnBoNEdWU3pST2lpeXJqNFFRQ1hjdnA2WTJLTXRPU1Bkcm8xCmdPNUFlVjc1c3lLZ215NzhMd1VxdjJ1R1RNdjlaY3VaQWdNQkFBR2pnZjR3Z2Zzd0RnWURWUjBQQVFIL0JBUUQKQWdXZ01CMEdBMVVkSlFRV01CUUdDQ3NHQVFVRkJ3TUJCZ2dyQmdFRkJRY0RBakFNQmdOVkhSTUJBZjhFQWpBQQpNQjhHQTFVZEl3UVlNQmFBRkprNTNjODZoazVRM0ttMVIyN1F2VnJUYjRvR01JR2FCZ05WSFJFRWdaSXdnWStDCktXSmhjM1JwYjI0dGRHVnpkQzVyT0hNdGNHeGhlV2R5YjNWdVpDNTBZV3RsYzJOdmIzQXVZMjl0Z2dwcmRXSmwKY201bGRHVnpnaEpyZFdKbGNtNWxkR1Z6TG1SbFptRjFiSFNDRm10MVltVnlibVYwWlhNdVpHVm1ZWFZzZEM1egpkbU9DSkd0MVltVnlibVYwWlhNdVpHVm1ZWFZzZEM1emRtTXVZMngxYzNSbGNpNXNiMk5oYkljRUNnTUFBVEFOCkJna3Foa2lHOXcwQkFRc0ZBQU9DQVFFQWVrdndDa0x6VWYyaCtHTzg1K2w4MXljRGcvcU4xczR5aU9saE1WdzIKdncxcnVRUHBGY3c1U3loRDVzeXpJQVZFdkhlby8reXRiZHB3azFGclFsS2dwdmo4dS9YTlRDcUliQVVsNDMvZQpFQStlMVpGcXBTK0o5TTVCSGJnNVJVNFRsaG1qVGZNRXpWS3hSanJLYnJMM2I0K29FMi80d3VKQWw5MU1UYk1mCjVVYXZoOWV3ZGNEUEJKb2FCK1c2MERSTlEwbisvYXQ0NlIyTUN4WXYvRDNxZlBZTDZPbkp1VTJ5OURzZDdHcWMKbkpJZTNqczhLUGczaXdxdEVlMDA3TUgrTng1eTJ5VFgwZGlCcUlpM3IxQk8xVGdxK25YSmtxdEdHV05wMERTNApZR1VkUCs1QVppSWtnZDd5STRONmZNVW03S0dhbDRPNFNQME5pNVk5Y3UwcjRBPT0KLS0tLS1FTkQgQ0VSVElGSUNBVEUtLS0tLQo= - service-account.pub: 
LS0tLS1CRUdJTiBQVUJMSUMgS0VZLS0tLS0KTUlJQklqQU5CZ2txaGtpRzl3MEJBUUVGQUFPQ0FROEFNSUlCQ2dLQ0FRRUEwaytNd3RPK2lKQzcyQWlaek0yRQpxQ1VZbEhPblpPMDNOMWJPS2pBQWZUaWN1bHdFQmFKV2tna0ZwdmRjci8wR1JDN0pMY0Zob0lwUnlzZ3p2dmE5CjFtVXBGRnV3WWJsT2VCMlhHTDQ4V0toZktaWTFDZ3JFNHpiRkpUdmQ4ZUdhVXdMdVdPOTlobEFSKzBMR2dsTC8KUDFtdnNNMmk3WFhhbUVkcWluekJRQWZWWHZROEUxRHZ0bmhZVndIdFNxWWhSN1lvMXozVlJFbFpwZEhjYjlvMApIUjE4aGppak9aT2xpSXowejJRdXhxcWtkaDlLTjNLRldRVUtTVVFFcjJ2UFV2R3VXdFh1TUU5bXYrSWV4TEs1CmRVeitsT1VPS0RnTEgvaStqeUQ2K1g2aGR3c0ZoYTVBT2V3bnVPd2tnTHZTQnllSUZNVG9KSEVtMStNZFBrTlkKQXdJREFRQUIKLS0tLS1FTkQgUFVCTElDIEtFWS0tLS0tCg== - ca.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURGakNDQWY2Z0F3SUJBZ0lRRTRlK0cySTZTL0VtTlZ3d05oRWJiVEFOQmdrcWhraUc5dzBCQVFzRkFEQWwKTVJFd0R3WURWUVFLRXdoaWIyOTBhM1ZpWlRFUU1BNEdBMVVFQXhNSGEzVmlaUzFqWVRBZUZ3MHhPREF6TVRVdwpOREUyTVRWYUZ3MHhPVEF6TVRVd05ERTJNVFZhTUNVeEVUQVBCZ05WQkFvVENHSnZiM1JyZFdKbE1SQXdEZ1lEClZRUURFd2RyZFdKbExXTmhNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQXZpVGsKbjNBSDJ6VE1WL3pGL1E4V0N2NWNBYWVYUmNacXBxd3ZQeDJZRGlFVGZPY1N0NFIyaHJNeXFQRDVVa1dEOTluOApteWJFeDdmMW9jM1IxdXM0Zm9NR1d2OXR2SUpKa0Q5ckkya0MrU1lGcWtCOEtlZCtBVktJRmoza3JUL3p6NEhwCnpJckdvMzhSQWpDWER2UHNRVWNhWDV0WWZpWk5OR3ptQjB1YW9ZYkNnUWM3aDB5MWRWbU5KQ0tHVmR6TDRWd1UKbEZLK0lSSW9NRGtYN0NXN2VQYkhORXdKck81VEwrczg1T2kyM2xkRFl2UXNkMmtReXB0aDFSMU1aSzIrdW1UcgpHRlpZOEE2RmpHZDVLS0gwQkZmc1pBRFRXTXJ3K3JwSnpXcmtSQTR4VHBNU3V5TTZVZDc2R29KNXNUampsbU1mCmw4ZFdUSGFUZlM4T2lUVmVOd0lEQVFBQm8wSXdRREFPQmdOVkhROEJBZjhFQkFNQ0FxUXdEd1lEVlIwVEFRSC8KQkFVd0F3RUIvekFkQmdOVkhRNEVGZ1FVbVRuZHp6cUdUbERjcWJWSGJ0QzlXdE52aWdZd0RRWUpLb1pJaHZjTgpBUUVMQlFBRGdnRUJBRzZETU5GbFlVZjVTOW1TVWVyOERENVYzMENZWXdkNWpXeTZFcGl4WmhTdzJsc0VkMGNPCmtKSnA4enpIUTZrdXAwMkRWNGRqbHZGWUlGaW9GbTFpYkl5WkIvNEQyN1Mzc0JMT3lwTExnR3FmczZsMXBjdFIKTEpHUnRDSmRGZGx1Y2pWZG0veCtRc1NsRUM1c21ucm85OTFDdEV3WTFvd3VSQ1VFdFd6Rk9VaGRmYnBXSXJSMQpXcyt2dkNOdkhFQlBjMmxzY2R3OU00S0hScUNtWmtOVlRuM0tXbzFYUkNnSERPdWVwREdDWXdHc1NZZy9LMmpTCmVSWU9HWHpGKyt5d0g1cGNtMXhBeVZaVXJTQ25HYlJxS0lYe
TlDN1lHcWtIb21hRUtJNWhNaVZDRDVrRGNMZm4KQ3JNaDAvUHpqQ3lwUWxGYXd3eW9BdWUyRU81YSsyc1drcW89Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K - etcd-client-ca.crt: LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSUREakNDQWZhZ0F3SUJBZ0lRU0hBVURjN3pmWFo4S3E0b3lOK3VRakFOQmdrcWhraUc5dzBCQVFzRkFEQWgKTVEwd0N3WURWUVFLRXdSbGRHTmtNUkF3RGdZRFZRUURFd2RsZEdOa0xXTmhNQjRYRFRFNE1ETXhOVEEwTVRZeApOVm9YRFRFNU1ETXhOVEEwTVRZeE5Wb3dJVEVOTUFzR0ExVUVDaE1FWlhSalpERVFNQTRHQTFVRUF4TUhaWFJqClpDMWpZVENDQVNJd0RRWUpLb1pJaHZjTkFRRUJCUUFEZ2dFUEFEQ0NBUW9DZ2dFQkFORENPaVloV250cWNtaHQKZE5BME90VTlLR09IL05UZHRxT2pUOHBkTjZXUzNiZFRHK3JKVEJlQU55SjRHRjlQRmtEVDlRLzA2clRaUVAzYwpOTkpDbUxzT2pLa1I2Nm9lTkY2Z2dtYVJnRm9jUjJWTTR2R0xpUXFqV2ZCK0tIQmdvU0NYdUU3TDM4dmw2QTR0CjBDREc0UzBnbWV1TVA3eG55Qjg1WWMwb3VNdndSVlE0SkQvQ3VUT2RGMTRsTXBuWTlaU21WYXdHd0gxOER1cEcKV2F3REE0czlTL3c0M1NscWVadFM4SG1HUjMxblZ3a20vMEJINVdhYlBqMng0cGZRWlJOYzJ6bGxzRDRvSENwSwpRNlR1cHUrNTN6aG16UVZpNTdDWjhvWEVOVHBML2dBc01GdTFndlRNTThMamNKM1BCNGZsRUdua2ZOMXowK2pLCjBwdGI0UHNDQXdFQUFhTkNNRUF3RGdZRFZSMFBBUUgvQkFRREFnS2tNQThHQTFVZEV3RUIvd1FGTUFNQkFmOHcKSFFZRFZSME9CQllFRkFGMG9CNHF5Sk42WDFpYmZzVU81SFlrbWxaOU1BMEdDU3FHU0liM0RRRUJDd1VBQTRJQgpBUUF3c1oyV3hYQzYrWmdmU1VFTG1FR2gxV3Z5cHVtMHBmUTMweVhncnZXK0pMVW8rY1QxSmFNaGpINDdTR3g0ClZ6ekl2dmliSGtSUEduSjQvQ3VEKzdKY05OTldDdGNSbTVXUXYySzJ5VTVvOGI2V0NSQWVhUGVWNGt4NW1NSWYKT2JqcTFlZWZEcUN2UU1nUU1xMXlXa3RXMHVJSmQzVGtxTVBZcFRNR3V5aHl1Q3JjaFlKU25CS3V2eldJampYYwp5bzl5WUJEVG53VHJFWG53ZEhnNjB5alUvUlB2YVE3MUNEZFlvZFQ4ZGFuekU5dCtHYWQ1bWRwNkVvbHZEdXJlCmp4b1Qzd1dlMjNTRG9kUlFidFpPd1VOUzhPeThLVWdOZFR4UWZ1YjNldjRCbEtXM0hIVmlrbkYvNGZ3SFErSzUKdWQvQlU5YUVHaWhPOXl2bm1jMHM0WC83Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K - etcd-client.crt: 
LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURzVENDQXBtZ0F3SUJBZ0lSQU5GSXB0OE1oZXFOckNDUlE3VkVBZVl3RFFZSktvWklodmNOQVFFTEJRQXcKSVRFTk1Bc0dBMVVFQ2hNRVpYUmpaREVRTUE0R0ExVUVBeE1IWlhSalpDMWpZVEFlRncweE9EQXpNVFV3TkRFNApNVEJhRncweE9UQXpNVFV3TkRFNE1UQmFNQ1V4RFRBTEJnTlZCQW9UQkdWMFkyUXhGREFTQmdOVkJBTVRDMlYwClkyUXRZMnhwWlc1ME1JSUJJakFOQmdrcWhraUc5dzBCQVFFRkFBT0NBUThBTUlJQkNnS0NBUUVBNkd3MDBBRVUKUWIrSFlPY1BwRTZXZ1NDRlk2eTM0d1BWcHNDMzdYbWUrdjg2Mk5qc2srMDBZNG44QkVIU0V5MDR6YTkrTmFWUwovc2g2R0tQS1ZlYjlDM041VFhuYVpJVThxanIrL0IyVnhOSkRtODMvNjIwUitFbDRjUHVjdGdtS0xuNmZnNnNXCnZQcFF5dTRBOHZXK0pzaUlIT09xQkhobWJORk9yNE1mRnlSOFpRbm9uakRxS0JpNUgvU0FFb0l0WWIzcmJxMEcKS21GKzhPek1DQ1BpazU5bzc4R0tMaVN2RFVDVzc1c2czdVNWNmpFNTlzSzY2bEdOYWFtZ2V1RFZyOVFUUHpMaApONlVSQmhhRHM4c1ExTW1nZVJtUmNVSTJ3dWdxd3Fydkhld3RVSlp6dkNsTUdnSjdzemlHRXRBMSs3UlpJUkRUCnF3MzFvMXZNeU1VNm93SURBUUFCbzRIZk1JSGNNQTRHQTFVZER3RUIvd1FFQXdJRm9EQWRCZ05WSFNVRUZqQVUKQmdnckJnRUZCUWNEQVFZSUt3WUJCUVVIQXdJd0RBWURWUjBUQVFIL0JBSXdBREFmQmdOVkhTTUVHREFXZ0JRQgpkS0FlS3NpVGVsOVltMzdGRHVSMkpKcFdmVEI4QmdOVkhSRUVkVEJ6Z2k5aVlYTjBhVzl1TFhSbGMzUXRaWFJqClpEQXVhemh6TFhCc1lYbG5jbTkxYm1RdWRHRnJaWE5qYjI5d0xtTnZiWUl2WW1GemRHbHZiaTEwWlhOMExXVjAKWTJReExtczRjeTF3YkdGNVozSnZkVzVrTG5SaGEyVnpZMjl2Y0M1amIyMkNDV3h2WTJGc2FHOXpkSWNFZndBQQpBVEFOQmdrcWhraUc5dzBCQVFzRkFBT0NBUUVBZTh6T1YwYnZhWngwSUU2QUFXeVBzMS9LaFBQbXBqTFovU3Q5CmFMdkVQY3JrSm5Cck1MK0NRS2ZyWEgydjRib3RuWlpOZHA0bXg0UlNxdUlyaHArM2M1RGNQVnVBaVlja1J0bUUKR0Vya204RGtQYitYR2dyL3RGTkloenp0dkV0T2l6ZmJFTndZZlFFUjZVdTFBYi8wWE9QNWFyY2lHQ0tRa3BvMAp2RHp0VGpBd2NXMXIyVW91OWt2aitBSnFiUHoveWdlRS8vUDV1c2lPcHQzU3FubHdYZUhha2Vpck9QQTA4b0hECmFUQjdDMUc1bmJMRlB4QTN6ZEZzbHorNmNuRXdLTGFLVHo4WTJLSHJLeDlHalVsUllRcFhvanBwY0l1bXhjaHkKT1dRZlIyd0RHVTE1L2xtL2gxazFxRzNDakhpaFVqWVBMTzBTVTZ3cGtUZCtpNXh6T1E9PQotLS0tLUVORCBDRVJUSUZJQ0FURS0tLS0tCg== - etcd-client.key: 
LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFcEFJQkFBS0NBUUVBNkd3MDBBRVVRYitIWU9jUHBFNldnU0NGWTZ5MzR3UFZwc0MzN1htZSt2ODYyTmpzCmsrMDBZNG44QkVIU0V5MDR6YTkrTmFWUy9zaDZHS1BLVmViOUMzTjVUWG5hWklVOHFqcisvQjJWeE5KRG04My8KNjIwUitFbDRjUHVjdGdtS0xuNmZnNnNXdlBwUXl1NEE4dlcrSnNpSUhPT3FCSGhtYk5GT3I0TWZGeVI4WlFubwpuakRxS0JpNUgvU0FFb0l0WWIzcmJxMEdLbUYrOE96TUNDUGlrNTlvNzhHS0xpU3ZEVUNXNzVzZzN1U1Y2akU1CjlzSzY2bEdOYWFtZ2V1RFZyOVFUUHpMaE42VVJCaGFEczhzUTFNbWdlUm1SY1VJMnd1Z3F3cXJ2SGV3dFVKWnoKdkNsTUdnSjdzemlHRXRBMSs3UlpJUkRUcXczMW8xdk15TVU2b3dJREFRQUJBb0lCQUh3eU5oWmdQVDdVNWJaMgpRZmwrdFJYVEZ2UW9PeXJueGFjUm5EY2Rva0psV0VDL3ljdFNHWWlIRjAvL0RBNkxQNnRKZDV1YStEcUZUaGtVCmpPNVNQQzErU3ZlSGdaZnRTbmw4aFB5Ym9vaEdBektpWlhxY0Vkb25DR0QzVXNwRFZyOTVraXQ5cE96ZXBZV0sKb0o4emlhU1h5NFFFYzdsbnpQT2c5UGI4amdTQ3liRm5KK3VtQlhUT2gzRTQ4eGsrYnRNUnNpaEx1VzNWczVMRgplVGE4T1dvMit5bEdHa2FEV29tNEFZMUQzb09tSmx2cE1sY0w5bHFOSzI0SjYxczRpK0g5Q2NBenV6S3dJU2RnClNOcDV1RGhUMjc4MnEyM3UrY1hOQmVMUDdSSU1ubFR1cFo5Mk9DREY1b1d1Y0tNVVJkN2loY0oxQ2YxTmJaZmEKTUlBRWw4RUNnWUVBNlN6Ny9CQ1JWSjNPc3hqbm82YzlKdjliZFNENUN3VWhraGhJRnNRaGVyLzdsdzBSZVRZVgpweFQvZnV2SVI4WG1xaGIrRkxpWjJJTzFzSHJrUG5oRHRxOUUyalpSVS9ZajBpd2JBUHUwTGU1NURyU2M3SktqClBVcG9xU2Rhb1NuakNkU2MxcU05R2VNbUltdVpJUjVtUjBOc3NJT3RjT2ZsNnI2ckprTVVzRXNDZ1lFQS95eGEKSFd3M2MyellHZHE0Q0R3QmdoNXRFVU5saTZ0cFFzZkxFbFlnd1l3N1hPd0JsNGRwbzduSE1oNEt1bVlRbDFXVwpES1piRkhpblZ3dnA4OUxzWG9YckdSL1M3NXBmR0FZWWcvVE5qNVFtQmlhM2tzdHI4N3ZnZUVlN05WajlsZnA5Cnd3RzhjeE95NjVLZUNFZmIvaFBGTmwyQnRIRm5pQmRzZlFzTUdBa0NnWUJ1RUh6VlM2QytGMHRWUU1FK2Y1ZWYKQzlSSTRvcUx5QjFEajlDZlptOERPUkh5Q0FvaWRBUWVmUXZwQmpUZ3BDcXdTUEFnS2M3ODQ1Ymt1ZTE1QzEyegpJdUpXT21PRFJXRTlPUEo2TVZXb2hMT0IzSUZpTGdsOXlkekRVNzgwNmNld2dUcVRHalNpUHBWbWsvR1JMMzlKCnppckUyek1JWTM0a28yRzRTdHUrSndLQmdRRExuVkYvSHVZVWRhcnUzb2R4RXFqRmNvL25jWmNxM3ltTVB5NzgKdjdzOWxpK2NVenBsOW9qR082MEdnZEJmc3FmVWlsZkVXazVkUkhXTFVSZHJGMGpEbUNya0RtL2IvNXVYNk8xUgpCbHV0RVROU1B6ekdwd25LSUlYYWxLcCt4RGI5b1RjUEQyaVhqd1Y3VXJCRnZVbC9NYmx4U3lYL25XcFd2eEl6CnFVZ0tPUUtCZ1FDQWJpcEhITmRkTEo1S01vdFd6ZW9hR2hTTHBK
bG5MaWtvS1h2QURQck1Vd1MwZWFNL2xRZ1kKRUp1OC9SbE5CdW9GcWloNHJQblhUWXRYUVhsbGVhRWNaSFVBRFgySXVUbHRpNHRET0QySWRNdHUxbnRIZ2VwKwpUdy8rT3loRmpua2dLS2UzNDM4UzJGU2hUOTFzWWcwNk9DNTFxaXQ4MkY3K1h5aEQybTNnVWc9PQotLS0tLUVORCBSU0EgUFJJVkFURSBLRVktLS0tLQo= diff --git a/output/manifests/kube-apiserver.yaml b/output/manifests/kube-apiserver.yaml deleted file mode 100644 index 96bf48176..000000000 --- a/output/manifests/kube-apiserver.yaml +++ /dev/null @@ -1,88 +0,0 @@ -apiVersion: apps/v1 -kind: DaemonSet -metadata: - name: kube-apiserver - namespace: kube-system - labels: - tier: control-plane - k8s-app: kube-apiserver -spec: - selector: - matchLabels: - tier: control-plane - k8s-app: kube-apiserver - template: - metadata: - labels: - tier: control-plane - k8s-app: kube-apiserver - annotations: - checkpointer.alpha.coreos.com/checkpoint: "true" - spec: - containers: - - name: kube-apiserver - image: gcr.io/google_containers/hyperkube:v1.9.3 - command: - - /hyperkube - - apiserver - - --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ValidatingAdmissionWebhook,ResourceQuota,DefaultTolerationSeconds,MutatingAdmissionWebhook - - --advertise-address=$(POD_IP) - - --allow-privileged=true - - --anonymous-auth=false - - --authorization-mode=RBAC - - --bind-address=0.0.0.0 - - --client-ca-file=/etc/kubernetes/secrets/ca.crt - - --cloud-provider= - - --etcd-cafile=/etc/kubernetes/secrets/etcd-client-ca.crt - - --etcd-certfile=/etc/kubernetes/secrets/etcd-client.crt - - --etcd-keyfile=/etc/kubernetes/secrets/etcd-client.key - - --etcd-servers=https://bastion-test-etcd0.k8s-playground.takescoop.com:2379,https://bastion-test-etcd1.k8s-playground.takescoop.com:2379 - - --insecure-port=0 - - --kubelet-client-certificate=/etc/kubernetes/secrets/apiserver.crt - - --kubelet-client-key=/etc/kubernetes/secrets/apiserver.key - - --secure-port=443 - - --service-account-key-file=/etc/kubernetes/secrets/service-account.pub - - --service-cluster-ip-range=10.3.0.0/16 - 
- --storage-backend=etcd3 - - --tls-ca-file=/etc/kubernetes/secrets/ca.crt - - --tls-cert-file=/etc/kubernetes/secrets/apiserver.crt - - --tls-private-key-file=/etc/kubernetes/secrets/apiserver.key - - --oidc-issuer-url=https://accounts.google.com - - --oidc-client-id=727680134807-s74j3n1h3nhmkpvg0j8v3ovnl04p27oo.apps.googleusercontent.com - - --oidc-username-claim=email - env: - - name: POD_IP - valueFrom: - fieldRef: - fieldPath: status.podIP - volumeMounts: - - mountPath: /etc/ssl/certs - name: ssl-certs-host - readOnly: true - - mountPath: /etc/kubernetes/secrets - name: secrets - readOnly: true - - mountPath: /var/lock - name: var-lock - readOnly: false - hostNetwork: true - nodeSelector: - node-role.kubernetes.io/master: "" - tolerations: - - key: node-role.kubernetes.io/master - operator: Exists - effect: NoSchedule - volumes: - - name: ssl-certs-host - hostPath: - path: /usr/share/ca-certificates - - name: secrets - secret: - secretName: kube-apiserver - - name: var-lock - hostPath: - path: /var/lock - updateStrategy: - rollingUpdate: - maxUnavailable: 1 - type: RollingUpdate diff --git a/output/manifests/kube-controller-manager-disruption.yaml b/output/manifests/kube-controller-manager-disruption.yaml deleted file mode 100644 index 1d1d02359..000000000 --- a/output/manifests/kube-controller-manager-disruption.yaml +++ /dev/null @@ -1,11 +0,0 @@ -apiVersion: policy/v1beta1 -kind: PodDisruptionBudget -metadata: - name: kube-controller-manager - namespace: kube-system -spec: - minAvailable: 1 - selector: - matchLabels: - tier: control-plane - k8s-app: kube-controller-manager diff --git a/output/manifests/kube-controller-manager-role-binding.yaml b/output/manifests/kube-controller-manager-role-binding.yaml deleted file mode 100644 index 267856a9a..000000000 --- a/output/manifests/kube-controller-manager-role-binding.yaml +++ /dev/null @@ -1,12 +0,0 @@ -apiVersion: rbac.authorization.k8s.io/v1 -kind: ClusterRoleBinding -metadata: - name: controller-manager 
-roleRef: - apiGroup: rbac.authorization.k8s.io - kind: ClusterRole - name: system:kube-controller-manager -subjects: -- kind: ServiceAccount - name: kube-controller-manager - namespace: kube-system diff --git a/output/manifests/kube-controller-manager-sa.yaml b/output/manifests/kube-controller-manager-sa.yaml deleted file mode 100644 index bb8f0aab9..000000000 --- a/output/manifests/kube-controller-manager-sa.yaml +++ /dev/null @@ -1,5 +0,0 @@ -apiVersion: v1 -kind: ServiceAccount -metadata: - namespace: kube-system - name: kube-controller-manager diff --git a/output/manifests/kube-controller-manager-secret.yaml b/output/manifests/kube-controller-manager-secret.yaml deleted file mode 100644 index 5d331ce9e..000000000 --- a/output/manifests/kube-controller-manager-secret.yaml +++ /dev/null @@ -1,9 +0,0 @@ -apiVersion: v1 -kind: Secret -metadata: - name: kube-controller-manager - namespace: kube-system -type: Opaque -data: - service-account.key: LS0tLS1CRUdJTiBSU0EgUFJJVkFURSBLRVktLS0tLQpNSUlFb2dJQkFBS0NBUUVBMGsrTXd0TytpSkM3MkFpWnpNMkVxQ1VZbEhPblpPMDNOMWJPS2pBQWZUaWN1bHdFCkJhSldrZ2tGcHZkY3IvMEdSQzdKTGNGaG9JcFJ5c2d6dnZhOTFtVXBGRnV3WWJsT2VCMlhHTDQ4V0toZktaWTEKQ2dyRTR6YkZKVHZkOGVHYVV3THVXTzk5aGxBUiswTEdnbEwvUDFtdnNNMmk3WFhhbUVkcWluekJRQWZWWHZROApFMUR2dG5oWVZ3SHRTcVloUjdZbzF6M1ZSRWxacGRIY2I5bzBIUjE4aGppak9aT2xpSXowejJRdXhxcWtkaDlLCk4zS0ZXUVVLU1VRRXIydlBVdkd1V3RYdU1FOW12K0lleExLNWRVeitsT1VPS0RnTEgvaStqeUQ2K1g2aGR3c0YKaGE1QU9ld251T3drZ0x2U0J5ZUlGTVRvSkhFbTErTWRQa05ZQXdJREFRQUJBb0lCQURUbDFWM2JySHpsQ3BwWAo3M2RYNmhudzJySGNOU3Bwa0EzWFE1dlEzdzZnQXF2TklTWFpvelN3R0QvYXovRmtEd052VVNLMUZUMHdEVXFYCitJdjd1OXdGTGNQMUcvUTRpOGdpaVRLc0JybTEvOW1SOGwxSVFDVjJUVGdFU3RyZ0I5VUJVN29DNHV1NWtBeEcKeTI5VU9PZFNRNktRMW40cnVvTzYwczFxZTZFQzRtdlMyS2lBekprODZOWitIdGRZeTdCV1pzQnBUdDVoMGhERgpQV1JsdFJ4UHNtTytMQm1Ed09VYytqTGN5UXlLR094V0R5NFdQSjZ3UEJzc1JUR05aSkp6WG9VdmRRQlJHMUZ2ClIxd3JFcEpKeDhIdER6aW1ldUh4VmNaSU1sSXRDL0lQcTJPMDdleVhGQW5hUlhncXkwMi9qa0Znc09WOTNIbWMKRzBQdWM0RUNnWUVBMlhmdUdtVDBKaFFTNTFwKysreVRWc
WdqQXpDVnN5bkNmc1JHNXZpZHVFcjFneE5wRERybAp0WHdveUdmeFRxcWFHS29NN0FJbVVMY3l2VE9iUlJLN2FNK1BJRE9KS1AvdVZCdWZvcExNOUJ6d3crb1pEYXlxCnVPUlZXOURLUlpNOEk0blM1TmdIbmliZHhVRkNVRjRkYm8rQkhFYUk5bmpHOFBqSVpOZExJWEVDZ1lFQTk1TDAKYTIyNzhJMXdKL0VKQ29EdHlFNnBOdENXWUxuckRJbVhLYjcvOUYyNnFYSU9oVXVXVzBGcDFrNmR2WHpjdHU3dgo5ZWV2emk0emlmamtNVVBRNnpTdHhML0duZzV0c0dPZEcydFNVTWtLd2NqWnhsK0NWVVV6a3BVZk9hNFV6TlYrCnZJeVpBTmpsUURseHptSGFoNTBBbnRxOG40Tk9QYlQ2TjYvdlZyTUNnWUFyelF3WUpOMUlEaU1BbGltZGREajQKNjBTaUQ5Y1hEd0l0cGpyaHFwR1ozUDgyTjJLaEkvdkFZaEdVeTlxK2pYNGNHYVFncFE0eWs3T1VpQ0J0K1NmbQpKR2dmaEVITUVFQmdrRy9HdnVxcEFHcytDcGloT0hYcVo1TUp1elFDYjNWZGN4VVhJcXZtSHMzc1BRaXVSMGFHClRrRWpBTkgxVXI0L0t0eXg4dXNmQVFLQmdDQjY0U0l1OVZjcjF5a0dVRjlXWnR1K3BpaVEyUW03bW9DOGxGNWYKdG9qQ3V6aDd4RGZzb0w4OEo3eDc5K25pTmJxeVFqMEt0bC9nWTlhWUZxZjM4N0xINkh4RmhMTTd4Vnc4MVdIQgpoTDBnZ3c3RllQekxqdmZNNm1VeXR6UUVDS3FPMzkrd3Vtb0lDcHVRYmNQYnhxWEFEVkxKODdFaHN1UVptREl5CkhMNU5Bb0dBZnNBemdlQjVHU3ZYRlp5NXZrS2JDYUpMMXNqMFlJakVpZS9YNW1LdmlMbmVpMXdLOGhVdVJodE0KNXQ0WnVlTlVhUndVWWErQzFjakgxeCtYaGE1K1c4NnVyYTVjT2ZDU3BqdWUrV1BOUTZJZXhZaGVsWHVvWTEwMgp4Sk5uSFhRdUQ0anF0QWlPQWVLS2diWG4rS0x1aDNkRnI3MVVIT2UwQS9NZVI4dFdxVFk9Ci0tLS0tRU5EIFJTQSBQUklWQVRFIEtFWS0tLS0tCg== - ca.crt: 
LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0tLS0tCk1JSURGakNDQWY2Z0F3SUJBZ0lRRTRlK0cySTZTL0VtTlZ3d05oRWJiVEFOQmdrcWhraUc5dzBCQVFzRkFEQWwKTVJFd0R3WURWUVFLRXdoaWIyOTBhM1ZpWlRFUU1BNEdBMVVFQXhNSGEzVmlaUzFqWVRBZUZ3MHhPREF6TVRVdwpOREUyTVRWYUZ3MHhPVEF6TVRVd05ERTJNVFZhTUNVeEVUQVBCZ05WQkFvVENHSnZiM1JyZFdKbE1SQXdEZ1lEClZRUURFd2RyZFdKbExXTmhNSUlCSWpBTkJna3Foa2lHOXcwQkFRRUZBQU9DQVE4QU1JSUJDZ0tDQVFFQXZpVGsKbjNBSDJ6VE1WL3pGL1E4V0N2NWNBYWVYUmNacXBxd3ZQeDJZRGlFVGZPY1N0NFIyaHJNeXFQRDVVa1dEOTluOApteWJFeDdmMW9jM1IxdXM0Zm9NR1d2OXR2SUpKa0Q5ckkya0MrU1lGcWtCOEtlZCtBVktJRmoza3JUL3p6NEhwCnpJckdvMzhSQWpDWER2UHNRVWNhWDV0WWZpWk5OR3ptQjB1YW9ZYkNnUWM3aDB5MWRWbU5KQ0tHVmR6TDRWd1UKbEZLK0lSSW9NRGtYN0NXN2VQYkhORXdKck81VEwrczg1T2kyM2xkRFl2UXNkMmtReXB0aDFSMU1aSzIrdW1UcgpHRlpZOEE2RmpHZDVLS0gwQkZmc1pBRFRXTXJ3K3JwSnpXcmtSQTR4VHBNU3V5TTZVZDc2R29KNXNUampsbU1mCmw4ZFdUSGFUZlM4T2lUVmVOd0lEQVFBQm8wSXdRREFPQmdOVkhROEJBZjhFQkFNQ0FxUXdEd1lEVlIwVEFRSC8KQkFVd0F3RUIvekFkQmdOVkhRNEVGZ1FVbVRuZHp6cUdUbERjcWJWSGJ0QzlXdE52aWdZd0RRWUpLb1pJaHZjTgpBUUVMQlFBRGdnRUJBRzZETU5GbFlVZjVTOW1TVWVyOERENVYzMENZWXdkNWpXeTZFcGl4WmhTdzJsc0VkMGNPCmtKSnA4enpIUTZrdXAwMkRWNGRqbHZGWUlGaW9GbTFpYkl5WkIvNEQyN1Mzc0JMT3lwTExnR3FmczZsMXBjdFIKTEpHUnRDSmRGZGx1Y2pWZG0veCtRc1NsRUM1c21ucm85OTFDdEV3WTFvd3VSQ1VFdFd6Rk9VaGRmYnBXSXJSMQpXcyt2dkNOdkhFQlBjMmxzY2R3OU00S0hScUNtWmtOVlRuM0tXbzFYUkNnSERPdWVwREdDWXdHc1NZZy9LMmpTCmVSWU9HWHpGKyt5d0g1cGNtMXhBeVZaVXJTQ25HYlJxS0lYeTlDN1lHcWtIb21hRUtJNWhNaVZDRDVrRGNMZm4KQ3JNaDAvUHpqQ3lwUWxGYXd3eW9BdWUyRU81YSsyc1drcW89Ci0tLS0tRU5EIENFUlRJRklDQVRFLS0tLS0K diff --git a/output/manifests/kube-controller-manager.yaml b/output/manifests/kube-controller-manager.yaml deleted file mode 100644 index 251e9fd9e..000000000 --- a/output/manifests/kube-controller-manager.yaml +++ /dev/null @@ -1,82 +0,0 @@ -apiVersion: apps/v1 -kind: Deployment -metadata: - name: kube-controller-manager - namespace: kube-system - labels: - tier: control-plane - k8s-app: kube-controller-manager -spec: - replicas: 2 - selector: - matchLabels: - tier: control-plane - 
k8s-app: kube-controller-manager - template: - metadata: - labels: - tier: control-plane - k8s-app: kube-controller-manager - spec: - affinity: - podAntiAffinity: - preferredDuringSchedulingIgnoredDuringExecution: - - weight: 100 - podAffinityTerm: - labelSelector: - matchExpressions: - - key: tier - operator: In - values: - - control-plane - - key: k8s-app - operator: In - values: - - kube-controller-manager - topologyKey: kubernetes.io/hostname - containers: - - name: kube-controller-manager - image: gcr.io/google_containers/hyperkube:v1.9.3 - command: - - ./hyperkube - - controller-manager - - --use-service-account-credentials - - --allocate-node-cidrs=true - - --cloud-provider= - - --cluster-cidr=10.2.0.0/16 - - --service-cluster-ip-range=10.3.0.0/16 - - --configure-cloud-routes=false - - --leader-elect=true - - --root-ca-file=/etc/kubernetes/secrets/ca.crt - - --service-account-private-key-file=/etc/kubernetes/secrets/service-account.key - livenessProbe: - httpGet: - path: /healthz - port: 10252 # Note: Using default port. Update if --port option is set differently. - initialDelaySeconds: 15 - timeoutSeconds: 15 - volumeMounts: - - name: secrets - mountPath: /etc/kubernetes/secrets - readOnly: true - - name: ssl-host - mountPath: /etc/ssl/certs - readOnly: true - nodeSelector: - node-role.kubernetes.io/master: "" - securityContext: - runAsNonRoot: true - runAsUser: 65534 - serviceAccountName: kube-controller-manager - tolerations: - - key: node-role.kubernetes.io/master - operator: Exists - effect: NoSchedule - volumes: - - name: secrets - secret: - secretName: kube-controller-manager - - name: ssl-host - hostPath: - path: /usr/share/ca-certificates - dnsPolicy: Default # Don't use cluster DNS. 
diff --git a/output/manifests/kube-dns-deployment.yaml b/output/manifests/kube-dns-deployment.yaml deleted file mode 100644 index 9874d51ba..000000000 --- a/output/manifests/kube-dns-deployment.yaml +++ /dev/null @@ -1,154 +0,0 @@ -apiVersion: apps/v1 -kind: Deployment -metadata: - name: kube-dns - namespace: kube-system - labels: - k8s-app: kube-dns - kubernetes.io/cluster-service: "true" - addonmanager.kubernetes.io/mode: Reconcile -spec: - # replicas: not specified here: - # 1. In order to make Addon Manager do not reconcile this replicas parameter. - # 2. Default is 1. - # 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on. - strategy: - rollingUpdate: - maxSurge: 10% - maxUnavailable: 0 - selector: - matchLabels: - k8s-app: kube-dns - template: - metadata: - labels: - k8s-app: kube-dns - spec: - nodeSelector: - node-role.kubernetes.io/master: "" - tolerations: - - key: node-role.kubernetes.io/master - operator: Exists - effect: NoSchedule - volumes: - - name: kube-dns-config - configMap: - name: kube-dns - optional: true - containers: - - name: kubedns - image: gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.8 - resources: - # TODO: Set memory limits when we've profiled the container for large - # clusters, then set request = limit to keep this container in - # guaranteed class. Currently, this container falls into the - # "burstable" category so the kubelet doesn't backoff from restarting it. - limits: - memory: 170Mi - requests: - cpu: 100m - memory: 70Mi - livenessProbe: - httpGet: - path: /healthcheck/kubedns - port: 10054 - scheme: HTTP - initialDelaySeconds: 60 - timeoutSeconds: 5 - successThreshold: 1 - failureThreshold: 5 - readinessProbe: - httpGet: - path: /readiness - port: 8081 - scheme: HTTP - # we poll on pod startup for the Kubernetes master service and - # only setup the /readiness HTTP server once that's available. - initialDelaySeconds: 3 - timeoutSeconds: 5 - args: - - --domain=cluster.local. 
- - --dns-port=10053 - - --config-dir=/kube-dns-config - - --v=2 - env: - - name: PROMETHEUS_PORT - value: "10055" - ports: - - containerPort: 10053 - name: dns-local - protocol: UDP - - containerPort: 10053 - name: dns-tcp-local - protocol: TCP - - containerPort: 10055 - name: metrics - protocol: TCP - volumeMounts: - - name: kube-dns-config - mountPath: /kube-dns-config - - name: dnsmasq - image: gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.8 - livenessProbe: - httpGet: - path: /healthcheck/dnsmasq - port: 10054 - scheme: HTTP - initialDelaySeconds: 60 - timeoutSeconds: 5 - successThreshold: 1 - failureThreshold: 5 - args: - - -v=2 - - -logtostderr - - -configDir=/etc/k8s/dns/dnsmasq-nanny - - -restartDnsmasq=true - - -- - - -k - - --cache-size=1000 - - --no-negcache - - --log-facility=- - - --server=/cluster.local/127.0.0.1#10053 - - --server=/in-addr.arpa/127.0.0.1#10053 - - --server=/ip6.arpa/127.0.0.1#10053 - ports: - - containerPort: 53 - name: dns - protocol: UDP - - containerPort: 53 - name: dns-tcp - protocol: TCP - # see: https://github.com/kubernetes/kubernetes/issues/29055 for details - resources: - requests: - cpu: 150m - memory: 20Mi - volumeMounts: - - name: kube-dns-config - mountPath: /etc/k8s/dns/dnsmasq-nanny - - name: sidecar - image: gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.8 - livenessProbe: - httpGet: - path: /metrics - port: 10054 - scheme: HTTP - initialDelaySeconds: 60 - timeoutSeconds: 5 - successThreshold: 1 - failureThreshold: 5 - args: - - --v=2 - - --logtostderr - - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local,5,SRV - - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local,5,SRV - ports: - - containerPort: 10054 - name: metrics - protocol: TCP - resources: - requests: - memory: 20Mi - cpu: 10m - dnsPolicy: Default # Don't use cluster DNS. 
-      serviceAccountName: kube-dns
diff --git a/output/manifests/kube-dns-sa.yaml b/output/manifests/kube-dns-sa.yaml
deleted file mode 100644
index 4e5a85660..000000000
--- a/output/manifests/kube-dns-sa.yaml
+++ /dev/null
@@ -1,5 +0,0 @@
-apiVersion: v1
-kind: ServiceAccount
-metadata:
-  name: kube-dns
-  namespace: kube-system
diff --git a/output/manifests/kube-dns-svc.yaml b/output/manifests/kube-dns-svc.yaml
deleted file mode 100644
index bbbebe22d..000000000
--- a/output/manifests/kube-dns-svc.yaml
+++ /dev/null
@@ -1,20 +0,0 @@
-apiVersion: v1
-kind: Service
-metadata:
-  name: kube-dns
-  namespace: kube-system
-  labels:
-    k8s-app: kube-dns
-    kubernetes.io/cluster-service: "true"
-    kubernetes.io/name: "KubeDNS"
-spec:
-  selector:
-    k8s-app: kube-dns
-  clusterIP: 10.3.0.10
-  ports:
-  - name: dns
-    port: 53
-    protocol: UDP
-  - name: dns-tcp
-    port: 53
-    protocol: TCP
diff --git a/output/manifests/kube-proxy-role-binding.yaml b/output/manifests/kube-proxy-role-binding.yaml
deleted file mode 100644
index 0ae0e12a3..000000000
--- a/output/manifests/kube-proxy-role-binding.yaml
+++ /dev/null
@@ -1,12 +0,0 @@
-apiVersion: rbac.authorization.k8s.io/v1
-kind: ClusterRoleBinding
-metadata:
-  name: kube-proxy
-roleRef:
-  apiGroup: rbac.authorization.k8s.io
-  kind: ClusterRole
-  name: system:node-proxier # Automatically created system role.
-subjects: -- kind: ServiceAccount - name: kube-proxy - namespace: kube-system diff --git a/output/manifests/kube-proxy-sa.yaml b/output/manifests/kube-proxy-sa.yaml deleted file mode 100644 index 651d76f30..000000000 --- a/output/manifests/kube-proxy-sa.yaml +++ /dev/null @@ -1,5 +0,0 @@ -apiVersion: v1 -kind: ServiceAccount -metadata: - namespace: kube-system - name: kube-proxy diff --git a/output/manifests/kube-proxy.yaml b/output/manifests/kube-proxy.yaml deleted file mode 100644 index c6a6cd408..000000000 --- a/output/manifests/kube-proxy.yaml +++ /dev/null @@ -1,67 +0,0 @@ -apiVersion: apps/v1 -kind: DaemonSet -metadata: - name: kube-proxy - namespace: kube-system - labels: - tier: node - k8s-app: kube-proxy -spec: - selector: - matchLabels: - tier: node - k8s-app: kube-proxy - template: - metadata: - labels: - tier: node - k8s-app: kube-proxy - spec: - containers: - - name: kube-proxy - image: gcr.io/google_containers/hyperkube:v1.9.3 - command: - - ./hyperkube - - proxy - - --cluster-cidr=10.2.0.0/16 - - --hostname-override=$(NODE_NAME) - - --kubeconfig=/etc/kubernetes/kubeconfig - - --proxy-mode=iptables - env: - - name: NODE_NAME - valueFrom: - fieldRef: - fieldPath: spec.nodeName - securityContext: - privileged: true - volumeMounts: - - mountPath: /lib/modules - name: lib-modules - readOnly: true - - mountPath: /etc/ssl/certs - name: ssl-certs-host - readOnly: true - - name: kubeconfig - mountPath: /etc/kubernetes - readOnly: true - hostNetwork: true - serviceAccountName: kube-proxy - tolerations: - - effect: NoSchedule - operator: Exists - - effect: NoExecute - operator: Exists - volumes: - - name: lib-modules - hostPath: - path: /lib/modules - - name: ssl-certs-host - hostPath: - path: /usr/share/ca-certificates - - name: kubeconfig - configMap: - name: kubeconfig-in-cluster - updateStrategy: - rollingUpdate: - maxUnavailable: 1 - type: RollingUpdate diff --git a/output/manifests/kube-scheduler-disruption.yaml 
b/output/manifests/kube-scheduler-disruption.yaml deleted file mode 100644 index 11af3faa2..000000000 --- a/output/manifests/kube-scheduler-disruption.yaml +++ /dev/null @@ -1,11 +0,0 @@ -apiVersion: policy/v1beta1 -kind: PodDisruptionBudget -metadata: - name: kube-scheduler - namespace: kube-system -spec: - minAvailable: 1 - selector: - matchLabels: - tier: control-plane - k8s-app: kube-scheduler diff --git a/output/manifests/kube-scheduler.yaml b/output/manifests/kube-scheduler.yaml deleted file mode 100644 index 780d61df5..000000000 --- a/output/manifests/kube-scheduler.yaml +++ /dev/null @@ -1,58 +0,0 @@ -apiVersion: apps/v1 -kind: Deployment -metadata: - name: kube-scheduler - namespace: kube-system - labels: - tier: control-plane - k8s-app: kube-scheduler -spec: - replicas: 2 - selector: - matchLabels: - tier: control-plane - k8s-app: kube-scheduler - template: - metadata: - labels: - tier: control-plane - k8s-app: kube-scheduler - spec: - affinity: - podAntiAffinity: - preferredDuringSchedulingIgnoredDuringExecution: - - weight: 100 - podAffinityTerm: - labelSelector: - matchExpressions: - - key: tier - operator: In - values: - - control-plane - - key: k8s-app - operator: In - values: - - kube-scheduler - topologyKey: kubernetes.io/hostname - containers: - - name: kube-scheduler - image: gcr.io/google_containers/hyperkube:v1.9.3 - command: - - ./hyperkube - - scheduler - - --leader-elect=true - livenessProbe: - httpGet: - path: /healthz - port: 10251 # Note: Using default port. Update if --port option is set differently. 
- initialDelaySeconds: 15 - timeoutSeconds: 15 - nodeSelector: - node-role.kubernetes.io/master: "" - securityContext: - runAsNonRoot: true - runAsUser: 65534 - tolerations: - - key: node-role.kubernetes.io/master - operator: Exists - effect: NoSchedule diff --git a/output/manifests/kube-system-rbac-role-binding.yaml b/output/manifests/kube-system-rbac-role-binding.yaml deleted file mode 100644 index 47623a36f..000000000 --- a/output/manifests/kube-system-rbac-role-binding.yaml +++ /dev/null @@ -1,12 +0,0 @@ -apiVersion: rbac.authorization.k8s.io/v1 -kind: ClusterRoleBinding -metadata: - name: system:default-sa -subjects: - - kind: ServiceAccount - name: default - namespace: kube-system -roleRef: - kind: ClusterRole - name: cluster-admin - apiGroup: rbac.authorization.k8s.io diff --git a/output/manifests/kubeconfig-in-cluster.yaml b/output/manifests/kubeconfig-in-cluster.yaml deleted file mode 100644 index 8e9ed3fb2..000000000 --- a/output/manifests/kubeconfig-in-cluster.yaml +++ /dev/null @@ -1,22 +0,0 @@ -apiVersion: v1 -kind: ConfigMap -metadata: - name: kubeconfig-in-cluster - namespace: kube-system -data: - kubeconfig: | - apiVersion: v1 - clusters: - - name: local - cluster: - server: https://bastion-test.k8s-playground.takescoop.com:443 - certificate-authority: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt - users: - - name: service-account - user: - # Use service account token - tokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token - contexts: - - context: - cluster: local - user: service-account diff --git a/output/manifests/pod-checkpointer-role-binding.yaml b/output/manifests/pod-checkpointer-role-binding.yaml deleted file mode 100644 index 97b12bab9..000000000 --- a/output/manifests/pod-checkpointer-role-binding.yaml +++ /dev/null @@ -1,13 +0,0 @@ -apiVersion: rbac.authorization.k8s.io/v1 -kind: RoleBinding -metadata: - name: pod-checkpointer - namespace: kube-system -roleRef: - apiGroup: rbac.authorization.k8s.io - kind: Role - name: 
pod-checkpointer -subjects: -- kind: ServiceAccount - name: pod-checkpointer - namespace: kube-system diff --git a/output/manifests/pod-checkpointer-role.yaml b/output/manifests/pod-checkpointer-role.yaml deleted file mode 100644 index 2a295e53c..000000000 --- a/output/manifests/pod-checkpointer-role.yaml +++ /dev/null @@ -1,12 +0,0 @@ -apiVersion: rbac.authorization.k8s.io/v1 -kind: Role -metadata: - name: pod-checkpointer - namespace: kube-system -rules: -- apiGroups: [""] # "" indicates the core API group - resources: ["pods"] - verbs: ["get", "watch", "list"] -- apiGroups: [""] # "" indicates the core API group - resources: ["secrets", "configmaps"] - verbs: ["get"] diff --git a/output/manifests/pod-checkpointer-sa.yaml b/output/manifests/pod-checkpointer-sa.yaml deleted file mode 100644 index e76928007..000000000 --- a/output/manifests/pod-checkpointer-sa.yaml +++ /dev/null @@ -1,5 +0,0 @@ -apiVersion: v1 -kind: ServiceAccount -metadata: - namespace: kube-system - name: pod-checkpointer diff --git a/output/manifests/pod-checkpointer.yaml b/output/manifests/pod-checkpointer.yaml deleted file mode 100644 index 10520391f..000000000 --- a/output/manifests/pod-checkpointer.yaml +++ /dev/null @@ -1,72 +0,0 @@ -apiVersion: apps/v1 -kind: DaemonSet -metadata: - name: pod-checkpointer - namespace: kube-system - labels: - tier: control-plane - k8s-app: pod-checkpointer -spec: - selector: - matchLabels: - tier: control-plane - k8s-app: pod-checkpointer - template: - metadata: - labels: - tier: control-plane - k8s-app: pod-checkpointer - annotations: - checkpointer.alpha.coreos.com/checkpoint: "true" - spec: - containers: - - name: pod-checkpointer - image: quay.io/coreos/pod-checkpointer:3cd08279c564e95c8b42a0b97c073522d4a6b965 - command: - - /checkpoint - - --lock-file=/var/run/lock/pod-checkpointer.lock - - --kubeconfig=/etc/checkpointer/kubeconfig - env: - - name: NODE_NAME - valueFrom: - fieldRef: - fieldPath: spec.nodeName - - name: POD_NAME - valueFrom: - fieldRef: 
-              fieldPath: metadata.name
-        - name: POD_NAMESPACE
-          valueFrom:
-            fieldRef:
-              fieldPath: metadata.namespace
-        imagePullPolicy: Always
-        volumeMounts:
-        - mountPath: /etc/checkpointer
-          name: kubeconfig
-        - mountPath: /etc/kubernetes
-          name: etc-kubernetes
-        - mountPath: /var/run
-          name: var-run
-      serviceAccountName: pod-checkpointer
-      hostNetwork: true
-      nodeSelector:
-        node-role.kubernetes.io/master: ""
-      restartPolicy: Always
-      tolerations:
-      - key: node-role.kubernetes.io/master
-        operator: Exists
-        effect: NoSchedule
-      volumes:
-      - name: kubeconfig
-        configMap:
-          name: kubeconfig-in-cluster
-      - name: etc-kubernetes
-        hostPath:
-          path: /etc/kubernetes
-      - name: var-run
-        hostPath:
-          path: /var/run
-  updateStrategy:
-    rollingUpdate:
-      maxUnavailable: 1
-    type: RollingUpdate
diff --git a/requirements.txt b/requirements.txt
index e8189b359..5afa3a9a8 100644
--- a/requirements.txt
+++ b/requirements.txt
@@ -1,4 +1,4 @@
-mkdocs==1.0.4
-mkdocs-material==4.6.3
-pygments==2.5.2
-pymdown-extensions==6.3.0
+mkdocs==1.1.2
+mkdocs-material==5.5.6
+pygments==2.6.1
+pymdown-extensions==7.1.0