
K8s CoreDns not starting on 32.20200629.3.0 (Pods on master can't talk to the apiserver) #574

Closed
teintuc opened this issue Jul 17, 2020 · 41 comments


@teintuc

teintuc commented Jul 17, 2020

Hi all,

I have a production Kubernetes (v1.17.4) cluster running on Fedora CoreOS 32.20200615.3.0 and all is fine.

When I try the latest release, 32.20200629.3.0, on my test cluster, some core services (like CoreDNS) can't contact the apiserver because the service address (10.3.0.1) times out.

I rolled the nodes back to 32.20200615.3.0 and all is good again,
which makes me think it comes from something new in the latest version.

Is there something that changed between those versions that could cause the problem?
Where should I look in order to fix it?

Here is the output of rpm-ostree status:

State: idle
Deployments:
  ostree://fedora:fedora/x86_64/coreos/stable
                   Version: 32.20200629.3.0 (2020-07-10T17:58:03Z)
                    Commit: 6df95bdb2fe2d36e091d4d18e3844fa84ce4b80ea3bd0947db5d7a286ff41890
              GPGSignature: Valid signature by 97A1AE57C3A2372CCA3A4ABA6C13026D12C944D0

● ostree://fedora:fedora/x86_64/coreos/stable
                   Version: 32.20200629.3.0 (2020-07-10T17:58:03Z)
                    Commit: 6df95bdb2fe2d36e091d4d18e3844fa84ce4b80ea3bd0947db5d7a286ff41890
              GPGSignature: Valid signature by 97A1AE57C3A2372CCA3A4ABA6C13026D12C944D0

Thanks

@lucab
Contributor

lucab commented Jul 17, 2020

@teintuc thanks for the report. You can see the whole list of package differences on the website; 46 packages were upgraded in this release.
To the best of my knowledge, we didn't receive any related breakage report while this set of packages was soaking in testing/next.

@cgwalters
Member

Of those changes, it looks to me like by far the most likely to be involved are iptables and the kernel, and I'd put more money on the latter. Can you try using e.g. rpm-ostree override replace to downgrade the kernel on a node starting from the newer version, to bisect this?

@teintuc
Author

teintuc commented Jul 17, 2020

I was thinking about iptables but it seems OK.
I hadn't thought about the kernel. I'll try to downgrade it (once I find out how to do it :) )

Thanks

@dustymabe
Member

dustymabe commented Jul 17, 2020

@teintuc - can you easily recreate these nodes? If so, let's do our experiment on one that is disposable. You'll want to grab the kernel RPMs from 32.20200615.3.0 (looks like those are here), download them to the node that is showing the problem (on 32.20200629.3.0), and then:

rpm-ostree override replace ./kernel-5.6.18-300.fc32.x86_64.rpm ./kernel-core-5.6.18-300.fc32.x86_64.rpm ./kernel-modules-5.6.18-300.fc32.x86_64.rpm --reboot

I'd recommend doing this on a disposable node because you may end up losing your 32.20200615.3.0 deployment when performing these operations.
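If the experiment turns out to be a dead end, the override can be undone and the node returned to the stock tree. A minimal sketch, assuming no other active overrides on the node:

sudo rpm-ostree override reset --all --reboot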

@teintuc
Author

teintuc commented Jul 17, 2020

Thanks for the tips.

Yes, I can do this on a disposable node :)
I'm already running nodes on 32.20200629.3.0.

@teintuc
Author

teintuc commented Jul 17, 2020

OK, I followed your instructions and got the following error:

 Problem: conflicting requests
  - nothing provides kernel-modules-uname-r = 5.6.18-300.fc32.x86_64 needed by kernel-5.6.18-300.fc32.x86_64

So after looking in the kernel repo you gave me, I added the kernel-modules-5.6.18-300.fc32.x86_64.rpm package.
Then I get this error:
error: Loading pkgcache branch rpmostree/pkg/kernel/5.6.18-300.fc32.x86_64: Failed to find metadata key rpmostree.sepolicy (signature s)

I'm looking to make it work :)

@cgwalters
Member

Dusty meant ./kernel-modules-5.6.18-300.fc32.x86_64.rpm and not kernel-debug-modules.

error: Loading pkgcache branch rpmostree/pkg/kernel/5.6.18-300.fc32.x86_64: Failed to find metadata key rpmostree.sepolicy (signature s)

Hmm...did you disable SELinux? That...isn't very well tested in rpm-ostree today.

@teintuc
Author

teintuc commented Jul 17, 2020

I tried without the debug module. I also, as you said, had to enable SELinux to make it work.
Once all my nodes rebooted:

$ uname -a
Linux ip-10-102-145-247 5.6.18-300.fc32.x86_64 #1 SMP Wed Jun 10 21:38:25 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
$ cat /etc/os-release
NAME=Fedora
VERSION="32.20200629.3.0 (CoreOS)"
ID=fedora
VERSION_ID=32
VERSION_CODENAME=""
PLATFORM_ID="platform:f32"
PRETTY_NAME="Fedora CoreOS 32.20200629.3.0"
ANSI_COLOR="0;34"
LOGO=fedora-logo-icon
CPE_NAME="cpe:/o:fedoraproject:fedora:32"
HOME_URL="https://getfedora.org/coreos/"
DOCUMENTATION_URL="https://docs.fedoraproject.org/en-US/fedora-coreos/"
SUPPORT_URL="https://github.com/coreos/fedora-coreos-tracker/"
BUG_REPORT_URL="https://github.com/coreos/fedora-coreos-tracker/"
REDHAT_BUGZILLA_PRODUCT="Fedora"
REDHAT_BUGZILLA_PRODUCT_VERSION=32
REDHAT_SUPPORT_PRODUCT="Fedora"
REDHAT_SUPPORT_PRODUCT_VERSION=32
PRIVACY_POLICY_URL="https://fedoraproject.org/wiki/Legal:PrivacyPolicy"
VARIANT="CoreOS"
VARIANT_ID=coreos
OSTREE_VERSION='32.20200629.3.0'

I still get the same error on my Kubernetes cluster: timeout on 10.3.0.1.

I'm starting to think my problem comes from iptables. Since it was updated in the latest release, the coincidence is too big to ignore.
I could do the same test by downgrading the iptables packages.

@dustymabe
Member

Dusty meant ./kernel-modules-5.6.18-300.fc32.x86_64.rpm and not kernel-debug-modules.

oops.. copy/pasta.. fixed in my original comment now.

@dustymabe
Member

Hey @teintuc - did you take a closer look at iptables or find another reason for the regression?

@teintuc
Author

teintuc commented Jul 21, 2020

Hi, thanks for asking :)

I haven't had much time to take another look at it.
I found some Kubernetes-related rules in the iptables KUBE-SERVICES chain, like:

    0     0 REJECT     udp  --  any    any     anywhere             ip-10-3-0-10.ec2.internal  /* kube-system/kube-dns:dns has no endpoints */ udp dpt:domain reject-with icmp-port-unreachable

But I don't know what is causing it. I'll keep digging.
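For what it's worth, kube-proxy installs that "has no endpoints" REJECT rule whenever a Service has no ready endpoints, so it is a symptom of the CoreDNS pods failing readiness rather than a cause. A minimal sketch for inspecting both tables:

sudo iptables -L KUBE-SERVICES -n -v          # filter table: REJECT rules for endpoint-less services
sudo iptables -t nat -L KUBE-SERVICES -n -v   # nat table: normal service DNAT dispatch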

@dustymabe
Member

dustymabe commented Jul 21, 2020

Are you using docker/containerd or cri-o as the container runtime for kube? We have noticed in the past that running multiple container runtimes at the same time can cause issues with the firewall.

@teintuc
Author

teintuc commented Jul 21, 2020

We are using docker.

But there is something interesting: I spun up some Kubernetes workers, not only masters. It seems that pods hosted on masters get the problem; in my case, CoreDNS.

@DirkTheDaring

@teintuc are you by any chance using flannel? The same issue showed up in my environment:
#583

@teintuc
Author

teintuc commented Jul 27, 2020

@DirkTheDaring We are not using flannel. We use the AWS VPC CNI.
But your problem looks a lot like mine: I can't reach CoreDNS to resolve kubernetes.default (it times out).

@DirkTheDaring

DirkTheDaring commented Jul 27, 2020

Then this goes back to the original question from my engineers: what did CoreOS change at the low level of the network? It seems that Calico, which uses tunnels, is not affected. Flannel uses VXLAN, and it seems that the packets only sometimes reach the DNS server (old CoreOS version from the beginning of June) or it doesn't work at all (latest end-of-June version of CoreOS). We can note that on the bare-metal image I am using, the network device naming has changed from eth0 to ens18. But even a careful comparison of all firewall rules did not show any obvious difference. Whatever it is, it happens somewhere after the packets are encapsulated for the network transfer (flannel), and then the transfer is either rejected or let through by some unknown mechanism.

So far my understanding...

@teintuc
Author

teintuc commented Jul 27, 2020

Thanks for your analysis. I don't have much more to say.
I still have some more digging to do (and I will be on vacation soon).

@DirkTheDaring

I have one request before you leave: would you please be so kind as to change the title of your report so that it better reflects the gist of what we are discussing here? This might help attract other people with the same issue.

@teintuc teintuc changed the title Need help with last release 32.20200629.3.0 K8s CoreDns not starting on 32.20200629.3.0 (Pods on master can't talk to the apiserver) Jul 27, 2020
@teintuc
Author

teintuc commented Jul 27, 2020

@DirkTheDaring Done

@dghubble
Member

dghubble commented Jul 29, 2020

Let's be precise. The problem appears in 32.20200629.3.0 and is observed as a lack of pod-to-pod connectivity across node boundaries, when using the flannel CNI provider.

  • Pod to pod traffic for pods on the same node is ok
  • Calico (IPIP or vxlan) and Cilium (vxlan) are not affected
  • Pod outgoing masquerade traffic is OK (remember, internal DNS is a cluster service connection)
  • Connectivity problems are consistent, not random (double-check your pods aren't on the same node, and pin the host OS version)

We can monitor traffic to identify where something may go wrong. With Pod A and Pod B on different nodes, Pod A pinging Pod B can be observed on Pod B's host flannel.1 interface, but the traffic does not reach cni0, despite the route table reporting that's the interface to route to next. This shows traffic traverses the vxlan tunnel fine, matching the observation that other vxlan-based providers are not affected. But since the traffic does not make it to cni0, Pod B never sees the ping request. From Pod B's host, Pod B can be reached.

On the prior FCOS version, you can monitor pod-to-pod connections in the same way and see ICMP packets reach flannel.1 and cni0, and Pod B replies, of course.
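For anyone reproducing this, a minimal sketch of the capture described above (interface names as in a default flannel setup; run on Pod B's host while Pod A pings Pod B):

sudo tcpdump -ni flannel.1 icmp    # echo requests arrive here on both good and bad builds
sudo tcpdump -ni cni0 icmp         # on the bad build, requests never show up here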

OS Experiments

The problem can be observed both in new 32.20200629.3.0 nodes (new NIC names) and in nodes that upgrade to 32.20200629.3.0 or beyond (legacy NIC names), suggesting the NIC name migration is not a factor.

  1. Starting with a good 32.20200615.3.0 node and incrementally overriding RPM packages to those used in 32.20200629.3.0 (rebooting each time), changing the iptables, moby-engine, or kernel packages did not cause the node to exhibit the problem behavior. Further experiments start to run into conflicts.
  2. Starting with a bad 32.20200629.3.0 node, rolling back the kernel RPMs did not convert it into a working node.
  3. It's possible the change lies in OS configuration rather than specific RPM packages, but I don't see a good way to diff those. Perhaps we start looking into any intermediate published images (from testing or next between those times).

I found no device sysctl changes of note and no iptables rule alterations either.

We probably want to stop describing this in terms of troubleshooting cluster service connections (kube-apiserver, CoreDNS); that's likely to draw random Google hits. But when pod-to-pod connectivity is broken at a specific point, those problems follow as symptoms that just happen to be the first thing anyone sees.

@dghubble
Member

FCOS next 32.20200625.1.0 is also affected, so that may help reduce the change set.

@dustymabe
Member

I'd love to dig in and help find the problem. I can even do some custom builds and share them if people have a theory they want to test and don't have a build environment set up.

  1. It's possible the change lies in OS configuration rather than specific RPM packages, but I don't see a good way to diff those. Perhaps we start looking into any intermediate published images (from testing or next between those times).

Yep. That's what I typically do. You'll get the most granularity if you traverse the testing-devel stream. I'd suggest that you identify a node to test against and disable zincati on it. Then bisect over the commits in the stream (use the builds browser to find commit hashes) to find the last good and first bad.

Example:

commit=45ebfc671e478822bac3b1b99b47a5be6683e3fe859649c241f0fc4f1ea326a1 # 32.20200614.20.0
sudo rpm-ostree rebase fedora-compose:$commit --reboot

# check if it's good or bad (bisect)

commit=b6bf1636c8d04d6ced39ca9cd4ee3b7a7759aabb6b416cc877c25f15b2ad2a28 # 32.20200625.20.1
sudo rpm-ostree rebase fedora-compose:$commit --reboot
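
For the "disable zincati" step, a minimal sketch, assuming the stock zincati.service unit on FCOS:

sudo systemctl disable --now zincati.service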

I used to have this whole process automated for Atomic Host. I should probably try to revive that.

@DirkTheDaring

DirkTheDaring commented Jul 29, 2020

@dghubble Cilium is also not working in my test environment.
Currently I have the following hypothesis:
Calico does kernel-space packet encapsulation (IPIP), while other CNIs like flannel (VXLAN) and Cilium do userspace packet encapsulation.
So I put out the following "testable" claim: all CNIs which do userspace encapsulation are broken in the latest builds.

And to test, just check on every node whether nslookup kubernetes.default succeeds. If it is successful on all nodes, it works (as with any CNI before the broken builds). In the latest builds, the test passes only for Calico.
A note on the test setup: at least 1 master and 3 nodes. Usually it passes on nodes where CoreDNS is running, but it then fails on nodes where "only" nodelocaldns resides. Which begs the question of whether it might be MTU-related, as that only applies, AFAIK, when a real network transfer happens.
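A minimal sketch of that per-node test, using a throwaway busybox pod pinned to a node (the node name node2 is a placeholder; repeat for each node in the cluster):

kubectl run dnstest --image=busybox:1.28 --restart=Never \
  --overrides='{"apiVersion":"v1","spec":{"nodeName":"node2"}}' \
  --command -- nslookup kubernetes.default
kubectl logs dnstest   # expect the service IP, or a timeout on affected nodes
kubectl delete pod dnstest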

@dghubble
Member

Thanks @dustymabe! In the testing-devel stream, the problem is absent in 32.20200624.20.0 and arises in 32.20200624.20.1 (9650ef3690346e22b1045223699bd7a428323a31642fcc9165dc15410d52c4c1). Rolling back fixes it again. Rolling forward to that commit causes it again.

@DirkTheDaring For reference, I have v1.18.6 clusters with Cilium v1.8.1/v1.8.2 and FCOS 32.20200715.3.0 nodes which don't show the traffic issue or application lookup problems. Perhaps the underlying cause of the issue you see is different. Also, Cilium status has full-mesh connectivity checks you can perform, so it can do much better than querying CoreDNS or other indirect measures of connectivity.

@DirkTheDaring

DirkTheDaring commented Jul 29, 2020

@dghubble sorry if I am persistent, but I had two people here claiming that everything is fine with their Cilium, and I could prove it happens there too (though it is not triggered very often).
Would you please be so kind as to ease my mind and run one query on a cluster which has been up for at least 3 days? (This was on a cluster running for 5 days... and it had 5 entries... so it happens about once a day. If you switch to Calico, same workload -> no entries.)

kubectl -n kube-system logs -lk8s-app=nodelocaldns --max-log-requests 20 |grep plugin/errors
[ERROR] plugin/errors: 2 221.70.168.192.in-addr.arpa. PTR: read tcp 192.168.0.3:51226->192.168.0.3:53: i/o timeout

And by the way, Typhoon looks cool 👍

@dghubble
Member

dghubble commented Jul 29, 2020

The OS commits correspond to these changes. FCOS had been preventing systemd-udev's default.link from being added.

FCOS testing-devel 32.20200624.20.1 is the first build to add /usr/lib/systemd/network/99-default.link.

[Match]
OriginalName=*

[Link]
NamePolicy=keep kernel database onboard slot path
AlternativeNamesPolicy=database onboard slot path
MACAddressPolicy=persistent

A nice flannel report describes flannel's wish to pick its own MAC address for flannel.1, but this conflicts with systemd's pick. "This results in all cross node traffic being dropped at layer 2 on the destination node due to incorrect destination vtep mac". This matches the traffic problem observed here, and neighbor nodes' ARP tables do indeed show a differing address; this all checks out.
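One way to check for the mismatch described above (a sketch; flannel records the VTEP MAC it chose in a node annotation, while peers program the MAC they observe into their neighbor/FDB tables; the node name node2 is a placeholder):

ip -d link show flannel.1    # the MAC the link actually has after systemd touches it
kubectl get node node2 -o jsonpath='{.metadata.annotations.flannel\.alpha\.coreos\.com/backend-data}'
bridge fdb show dev flannel.1    # on a peer node: the VTEP MAC it expects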

Solution

So flannel users can add a flannel-specific link configuration (kindly mentioned in that thread). Adapted here for Ignition:

variant: fcos
version: 1.0.0
storage:
  files:
    - path: /etc/systemd/network/10-flannel.link
      mode: 0644
      contents:
        inline: |
          [Match]
          OriginalName=flannel*
          [Link]
          MACAddressPolicy=none
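
On an already-provisioned node, the same effect can be had without reprovisioning by writing the file directly and rebooting, so the .link file is applied when flannel.1 is recreated. A minimal sketch:

sudo tee /etc/systemd/network/10-flannel.link <<'EOF'
[Match]
OriginalName=flannel*
[Link]
MACAddressPolicy=none
EOF
sudo systemctl reboot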

PS: If you use Typhoon with flannel (not the default):

module "mycluster" {
  ...
  networking = "flannel"
  controller_snippets = [
    file("./snippets/flannel.yaml"),
  ]
  worker_snippets = [
    file("./snippets/flannel.yaml"),
  ]
}

Followup

Flatcar shipped a flannel.link for the same reason. It's not clear to me that the OS should be involved in such CNI-specific needs, since users can do this themselves. But I mention it for completeness.

@teintuc could you retitle this?

@dghubble
Member

@DirkTheDaring I don't run nodelocaldns, but CoreDNS logs (12 days) on Cilium clusters show no errors, and graphs look fine too (else I'd be paged). If that is helpful to you.

@lucab
Contributor

lucab commented Jul 29, 2020

@dghubble thanks for the followup! It looks like in Container Linux we had a network unit to mark flannel interfaces as unmanaged: https://github.com/coreos/coreos-overlay/blob/6cbf3556370722f11612a872a7d223e686c672c0/app-admin/flannel-wrapper/files/50-flannel.network. I guess that may have had the same effect of preventing systemd from modifying the interface.

@dghubble
Member

Seems so. I could see it being nice to have built in (the odds of a random person calling their link flannel* seem low), but it also maybe sets a bad precedent if future projects come along wanting the same. Either way seems OK to me!

@DirkTheDaring

@dghubble Thanks for confirming that Cilium works.
@dghubble Thanks also for tracking down the problem. I have set up a fresh cluster with flannel and the fix above. I am happy to report that it works; I cannot reproduce any of the DNS problems.

On a personal note:
It seems that there are now a lot of "cooks" fiddling with the network:

  • systemd
  • NetworkManager
  • flannel (or any other CNI)

Is it really necessary that systemd is involved?
The change is also not obvious to anybody without finding this issue on GitHub.

@dustymabe
Member

@dghubble +1000 for the great detective work! This is unfortunate fallout from #484.

Will try to discuss this during the meeting today. Please come if you're free.

@dustymabe dustymabe added the meeting topics for meetings label Jul 29, 2020
@dustymabe
Member

We discussed this in the meeting yesterday.

The two options we discussed to solve this problem are:

  • including a config snippet in the OS to handle this
  • reaching out to upstreams to include it when they deliver the software

The first is easy, but can be problematic for a few reasons:

  • we set a precedent of baking in configs for different projects (i.e., what about other projects similar to flannel that need something equivalent? Do we need to bake in a config for each?)
  • if the config ever needs to be changed, we need to update it. The upstream project itself should be making those decisions.

In the meeting we decided to wait a week and see if more information surfaces. If there are a few upstreams we can target that will capture most flannel users, it would be nice to make the changes there, like:

  • flannel rpm in Fedora (if package layering)
  • flannel config via typhoon
  • flannel config via the upstream docs

@DirkTheDaring

@dustymabe we configure flannel (download and install) with kubespray.

Maybe it makes sense to open a ticket in the kubespray project so that they deploy a fix similar to Typhoon's.

@dustymabe
Member

Maybe it makes sense to open a ticket in the kubespray project so that they deploy a fix similar to Typhoon's.

That would be great, at least to have the conversation. In the meeting (as summarized above) we weren't super opposed to the first option, but obviously including it upstream means the fix applies to other platforms as well (not just FCOS). Do you mind opening a ticket?

@DirkTheDaring

@dustymabe I already opened one:
kubernetes-sigs/kubespray#6451

dghubble added a commit to poseidon/typhoon that referenced this issue Aug 2, 2020
* Fedora CoreOS now ships systemd-udev's `default.link` while
Flannel relies on being able to pick its own MAC address for
the `flannel.1` link for tunneled traffic to reach cni0 on
the destination side, without being dropped
* This change first appeared in FCOS testing-devel 32.20200624.20.1
and is the behavior going forward in FCOS since it was added
to align FCOS network naming / configs with the rest of Fedora
and address issues related to the default being missing
* Flatcar Linux (and Container Linux) has a specific flannel.link
configuration builtin, so it was not affected
* coreos/fedora-coreos-tracker#574 (comment)

Note: Typhoon's recommended and default CNI provider is Calico,
unless `networking` is set to flannel directly.
@dghubble
Member

dghubble commented Aug 2, 2020

Flannel effectively has a host requirement on how the flannel link is brought up, so I think this is best declared via Ignition. For myself (and Typhoon users), I'm going with Ignition inclusion in poseidon/typhoon#795. No changes needed in FCOS.

While Flannel is often run as a DaemonSet on container-optimized OSes, and it's possible to (ab)use its init container for this, Ignition seems the better option in my comparison. Going via the DaemonSet mutates systemd network files after boot, requires privilege, may require a networking restart, and it's worrying to mount systemd files without strict cause.

@cgwalters
Member

One thing I've floated in the past is that we really want a clean mechanism to tell things NetworkManager-and-below (including systemd) "don't touch this interface". For a long time, NetworkManager even tried to do DHCP on veth pairs created by container runtimes, for example.

So rather than matching on name, something like an extended attribute on the kernel network object. Then when flannel creates the interface, it would atomically add that flag, and everything in the OS would know to ignore it.

@spaced

spaced commented Aug 11, 2020

Disclaimer: I did not read each comment, but:
I had a similar issue using the latest FCOS: the podman package comes with a CNI plugin, kubespray installed a Calico CNI, and the result was strange behavior with DNS and k8s network policies.
Make sure you have only ONE CNI plugin installed (check /etc/cni/net.d).
I removed the podman network with the podman network rm podman command.

TL;DR: the base image comes with a configured CNI plugin (for podman), which may cause issues if you use other CNI plugins without uninstalling it.
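A quick check along those lines (a sketch; the exact config file names vary by CNI):

ls /etc/cni/net.d/                 # expect exactly one CNI config here
sudo podman network rm podman      # remove podman's default CNI network, as above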

@dghubble
Member

I think this issue should be closed. The problem centered on the link behavior change following 32.20200629.3.0, the behavior matches other distros pulling in the systemd defaults, and the solution is specifying flannel's link needs.

An issue with a broad title (like "pods can't talk") will start attracting unrelated reports.

@lucab lucab removed the meeting topics for meetings label Aug 12, 2020
beyondbill added a commit to TakeScoop/typhoon that referenced this issue Aug 21, 2020
* Update Grafana from v6.4.4 to v6.5.0

* https://grafana.com/docs/guides/whats-new-in-v6-5/

* Update Grafana from v6.5.0 to v6.5.1

* https://github.com/grafana/grafana/releases/tag/v6.5.1

* Fix DigitalOcean controller and worker ipv4/ipv6 outputs (#594)

* Fix controller and worker ipv4/ipv4 outputs to be lists of strings
* With Terraform v0.11 syntax, an enclosing list was required to coerce the
output to be a list of strings
* With Terraform v0.12 syntax, the enclosing list shouldn't be needed

* Update mkdocs-material from v4.5.0 to v4.5.1

* Introduce cluster creation without local writes to asset_dir

* Allow generated assets (TLS materials, manifests) to be
securely distributed to controller node(s) via file provisioner
(i.e. ssh-agent) as an assets bundle file, rather than relying
on assets being locally rendered to disk in an asset_dir and
then securely distributed
* Change `asset_dir` from required to optional. Left unset,
asset_dir defaults to "" and no assets will be written to
files on the machine that runs terraform apply
* Enhancement: Managed cluster assets are kept only in Terraform
state, which supports different backends (GCS, S3, etcd, etc) and
optional encryption. terraform apply accesses state, runs in-memory,
and distributes sensitive materials to controllers without making
use of local disk (simplifies use in CI systems)
* Enhancement: Improve asset unpack and layout process to position
etcd certificates and control plane certificates more cleanly,
without unneeded secret materials

Details:

* Terraform file provisioner support for distributing directories of
contents (with unknown structure) has been limited to reading from a
local directory, meaning local writes to asset_dir were required.
https://github.com/poseidon/typhoon/issues/585 discusses the problem
and newer or upcoming Terraform features that might help.
* Observation: Terraform provisioner support for single files works
well, but iteration isn't viable. We're also constrained to Terraform
language features on the apply side (no extra plugins, no shelling out)
and CoreOS / Fedora tools on the receive side.
* Take a map representation of the contents that would have been splayed
out in asset_dir and pack/encode them into a single file format devised
for easy unpacking. Use an awk one-liner on the receive side to unpack.
In pratice, this has worked well and its rather nice that a single
assets file is transferred by file provisioner (all or none)

Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/162

* Add/update docs for asset_dir and kubeconfig usage

* Original tutorials favored including the platform (e.g.
google-cloud) in modules (e.g. google-cloud-yavin). Prefer
naming conventions where each module / cluster has a simple
name (e.g. yavin) since the platform is usually redundant
* Retain the example cluster naming themes per platform

* Reduce apiserver metrics cardinality and extraneous labels

* Stop mapping node labels to targets discovered via Kubernetes
nodes (e.g. etcd, kubelet, cadvisor). It is rarely useful to
store node labels (e.g. kubernetes.io/os=linux) on these metrics
* kube-apiserver's apiserver_request_duration_seconds_bucket metric
has a high cardinality that includes labels for the API group, verb,
scope, resource, and component for each object type, including for
each CRD. This one metric has ~10k time series in a typical cluster
(btw 10-40% of total)
* Removing the apiserver request duration outright would make latency
alerts a NoOp and break a Grafana apiserver panel. Instead, drop series
that have a "group" label. Effectively, only request durations for
core Kubernetes APIs will be kept (e.g. cardinality won't grow with
each CRD added). This reduces the metric to ~2k unique series

* Reduce kube-controller-manager pod eviction timeout from 5m to 1m

* Reduce time to delete pods on unready nodes from 5m to 1m
* Present since v1.13.3, but mistakenly removed in v1.16.0 static
pod control plane migration

Related:

* https://github.com/poseidon/terraform-render-bootstrap/pull/148
* https://github.com/poseidon/terraform-render-bootstrap/pull/164

* Update Kubernetes from v1.16.3 to v1.17.0

* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.17.md/#v1170

* Update systemd services for the v0.17.x hyperkube

* Binary asset locations within the upstream hyperkube image
changed https://github.com/kubernetes/kubernetes/pull/84662
* Fix Container Linux and Flatcar Linux kubelet.service
(rkt-fly with fairly dated CoreOS kubelet-wrapper)
* Fix Fedora CoreOS kubelet.service (podman)
* Fix Fedora CoreOS bootstrap.service
* Fix delete-node kubectl usage for workers where nodes may
delete themselves on shutdown (e.g. preemptible instances)

* Update Calico from v3.10.1 to v3.10.2

* https://docs.projectcalico.org/v3.10/release-notes/

* Update CHANGES and tutorial notes for release

* Update recommended Terraform and provider plugin versions
* Update the rough count of resources created per cluster
since its not been refreshed in a while (will vary based
on cluster options)

* Fix minor example typo in README

* Update mkdocs-material from v4.5.1 to v4.6.0

* Update Grafana from v6.5.1 to v6.5.2

* https://github.com/grafana/grafana/releases/tag/v6.5.2

* Update kube-state-metrics from v1.8.0 to v1.9.0-rc.1

* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.0-rc.1
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.0-rc.0

* Add Kubelet kubeconfig output for DigitalOcean

* Allow the raw kubelet kubeconfig to be consumed via
Terraform output

* Update kube-state-metrics from v1.9.0-rc.1 to v1.9.0

* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.0
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.0-rc.1
* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.0-rc.0

* Update CoreDNS from v1.6.5 to v1.6.6

* https://coredns.io/2019/12/11/coredns-1.6.6-release/

* Update Prometheus from v2.14.0 to v2.15.0

* https://github.com/prometheus/prometheus/releases/tag/v2.15.0

* Update Prometheus from v2.15.0 to v2.15.1

* https://github.com/prometheus/prometheus/releases/tag/v2.15.1

* Update Calico from v3.10.2 to v3.11.1

* https://docs.projectcalico.org/v3.11/release-notes/

* Rename CLC files and favor Terraform list index syntax

* Rename Container Linux Config (CLC) files to *.yaml to align
with Fedora CoreOS Config (FCC) files and for syntax highlighting
* Replace common uses of Terraform `element` (which wraps around)
with `list[index]` syntax to surface index errors

* Inline Container Linux kubelet.service, deprecate kubelet-wrapper

* Change kubelet.service on Container Linux nodes to ExecStart Kubelet
inline to replace the use of the host OS kubelet-wrapper script
* Express rkt run flags and volume mounts in a clear, uniform way to
make the Kubelet service easier to audit, manage, and understand
* Eliminate reliance on a Container Linux kubelet-wrapper script
* Typhoon for Fedora CoreOS developed a kubelet.service that similarly
uses an inline ExecStart (except with podman instead of rkt) and a
more minimal set of volume mounts. Adopt the volume improvements:
  * Change Kubelet /etc/kubernetes volume to read-only
  * Change Kubelet /etc/resolv.conf volume to read-only
  * Remove unneeded /var/lib/cni volume mount

Background:

* kubelet-wrapper was added in CoreOS around the time of Kubernetes v1.0
to simplify running a CoreOS-built hyperkube ACI image via rkt-fly. The
script defaults are no longer ideal (e.g. rkt's notion of trust dates
back to quay.io ACI image serving and signing, which informed the OCI
standard images we use today, though they still lack rkt's signing ideas).
* Shipping kubelet-wrapper was regretted at CoreOS, but remains in the
distro for compatibility. The script is not updated to track hyperkube
changes, but it is stable and kubelet.env overrides bridge most gaps
* Typhoon Container Linux nodes have used kubelet-wrapper to rkt/rkt-fly
run the Kubelet via the official k8s.gcr.io hyperkube image using overrides
(new image registry, new image format, restart handling, new mounts, new
entrypoint in v1.17).
* Observation: Most of what it takes to run a Kubelet container is defined
in Typhoon, not in kubelet-wrapper. The wrapper's value is now undermined
by having to workaround its dated defaults. Typhoon may be better served
defining Kubelet.service explicitly
* Typhoon for Fedora CoreOS developed a kubelet.service without the use
of a host OS kubelet-wrapper which is both clearer and eliminated some
volume mounts

* Disable Kubelet 127.0.0.1.10248 healthz endpoint

* Kubelet runs a healthz server listening on 127.0.0.1:10248
by default. Its unused by Typhoon and can be disabled
* https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet/

* Enable kube-proxy metrics and allow Prometheus scrapes

* Configure kube-proxy --metrics-bind-address=0.0.0.0 (default
127.0.0.1) to serve metrics on 0.0.0.0:10249
* Add firewall rules to allow Prometheus (resides on a worker) to
scrape kube-proxy service endpoints on controllers or workers
* Add a clusterIP: None service for kube-proxy endpoint discovery

* Reduce Prometheus addon's node-exporter tolerations

* Change node-exporter DaemonSet tolerations from tolerating
all possible NoSchedule taints to tolerating the master taint
and the not ready taint (we'd like metrics regardless)
* Users who add custom node taints must add their custom taints
to the addon node-exporter DaemonSet. As an addon, its expected
users copy and manipulate manifests out-of-band in their own
systems

* Ensure /etc/kubernetes exists following Kubelet inlining

* Inlining the Kubelet service removed the need for the
kubelet.env file declared in Ignition. However, on some
platforms, this removed the guarantee that /etc/kubernetes
exists. Bare-Metal and DigitalOcean distribute the kubelet
kubeconfig through Terraform file provisioner (scp) and
place it in (now missing) /etc/kubernetes
* https://github.com/poseidon/typhoon/pull/606
* Fix bare-metal and DigitalOcean Ignition to ensure the
desired directory exists following first boot from disk
* Cloud platforms with worker pools distribute the kubeconfig
through Ignition user data (no impact or need)

* Update Prometheus from v2.15.1 to v2.15.2

* https://github.com/prometheus/prometheus/releases/tag/v2.15.2

* Allow terraform-provider-google v3.x plugin versions

* Typhoon Google Cloud is compatible with `terraform-provider-google`
v3.x releases
* No v3.x specific features are used, so v2.19+ provider versions are
still allowed, to ease migrations

* Update kube-state-metrics from v1.9.0 to v1.9.1

* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.1

* Remove unneeded Kubelet /var/run mount on Fedora CoreOS

* /var/run symlinks to /run (already mounted)

* Fix bare-metal instruction for watching install to disk

* Original instructions were to watch install to disk by SSH'ing
via port 2222 following Typhoon v1.10.1. Restore that message,
since the version number in the instruction was incorrectly bumped
on each release

* Update AWS Fedora CoreOS AMI filter for fedora-coreos-31

* Select the most recent fedora-coreos-31 AMI on AWS, instead
of the most recent fedora-coreos-30 AMI (Nov 27, 2019)
* Evaluated with fedora-coreos-31.20200108.2.0-hvm

* Update Kubernetes from v1.17.0 to v1.17.1

* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.17.md#v1171

* Update Calico from v3.11.1 to v3.11.2

* https://docs.projectcalico.org/v3.11/release-notes/

* Fix link in maintenance docs

* Also a fix version mention, since Terraform v0.12 was
added in Typhoon v1.15.0

* Update Grafana from v6.5.2 to v6.5.3

* https://github.com/grafana/grafana/releases/tag/v6.5.3

* Update kube-state-metrics from v1.9.1 to v1.9.2

* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.2

* Update bare-metal Fedora CoreOS image location

* Use Fedora CoreOS production download streams (change)
* Use live PXE kernel and initramfs images
* https://getfedora.org/coreos/download/
* Update docs example to use public images (cache is still
recommended at large scale) and stable stream

* Update nginx-ingress from v0.26.1 to v0.27.1

* Change runAsUser from 33 to 101 for new alpine-based image
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.27.0
* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.27.1

* Update Kubernetes from v1.17.1 to v1.17.2

* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.17.md#v1172

* Update kube-state-metrics from v1.9.2 to v1.9.3

* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.3

* Promote Fedora CoreOS from preview to alpha in docs

* Add an announcement to the website as well

* Fix minor typo in announcement date

* Update Grafana from v6.5.3 to v6.6.0

* https://github.com/grafana/grafana/releases/tag/v6.6.0

* Update nginx-ingress from v0.27.1 to v0.28.0

* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.28.0

* Add module for Fedora CoreOS on Google Cloud

* Add Typhoon Fedora CoreOS on Google Cloud as alpha
* Add docs on uploading the Fedora CoreOS GCP gzipped tarball to
Google Cloud storage to create a boot disk image

* Update kube-state-metrics from v1.9.3 to v1.9.4

* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.4

* Update Calico from v3.11.2 to v3.12.0

* https://docs.projectcalico.org/release-notes/#v3120
* Remove reverse packet filter override, since Calico no
longer relies on the setting
* https://github.com/coreos/fedora-coreos-tracker/issues/219
* https://github.com/projectcalico/felix/pull/2189

* Update Grafana from v6.6.0 to v6.6.1

* https://github.com/grafana/grafana/releases/tag/v6.6.1

* Update docs generation packages

* Update mkdocs-material from v4.6.0 to v4.6.2

* Add guide for Typhoon with Flatcar Linux on Google Cloud

* Add docs on manually uploading a Flatcar Linux GCE/GCP gzipped
tarball image as a Compute Engine image for use with the Typhoon
container-linux module
* Set status of Flatcar Linux on Google Cloud to alpha

* Update Fedora CoreOS kernel arguments to align with upstream

* Align bare-metal kernel arguments with upstream docs
* Add missing initrd argument which can cause issues if
not present. Fix #638
* Add tty0 and ttyS0 consoles (matches Container Linux)
* Remove unused coreos.inst=yes

Related: https://docs.fedoraproject.org/en-US/fedora-coreos/bare-metal/

* Update Kubernetes from v1.17.2 to v1.17.3

* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.17.md#v1173

* Set docker log driver to json-file on Fedora CoreOS

* Fix the last minor issue for Fedora CoreOS clusters to pass CNCF's
Kubernetes conformance tests
* Kubelet supports a seldom used feature `kubectl logs --limit-bytes=N`
to trim a log stream to a desired length. Kubelet handles this in the
CRI driver. The Kubelet docker shim only supports the limit bytes
feature when Docker is configured with the default `json-file` logging
driver
* CNCF conformance tests started requiring limit-bytes be supported,
indirectly forcing the log driver choice until either the Kubelet or
the conformance tests are fixed
* Fedora CoreOS defaults Docker to use `journald` (desired). For now,
as a workaround to offer conformant clusters, the log driver can
be set back to `json-file`. RHEL CoreOS likely won't have noticed the
non-conformance since its using crio runtime
* https://github.com/kubernetes/kubernetes/issues/86367

Note: When upstream has a fix, the aim is to drop the docker config
override and use the journald default

* Promote Fedora CoreOS AWS/bare-metal to beta

* Remove alpha warnings from docs headers

* Update recommended Terraform versions and providers

* Sync the documented Terraform versions and provider
plugin versions to those that are actively used/tested
by the author

* Update CHANGELOG sections and links

* Add guide for Typhoon with Flatcar Linux on DigitalOcean

* Add docs on manually uploading a Flatcar Linux DigitalOcean
bin image as a custom image and using a data reference
* Set status of Flatcar Linux on DigitalOcean to alpha
* IPv6 is not supported for DigitalOcean custom images

* Update Prometheus from v1.15.2 to v1.16.0

* https://github.com/prometheus/prometheus/releases/tag/v2.16.0

* Change Kubelet /var/lib/calico mount to read-only (#643)

* Kubelet only requires read access to /var/lib/calico

Signed-off-by: Suraj Deshmukh <surajd.service@gmail.com>

* Update CoreDNS from v1.6.6 to v1.6.7

* https://coredns.io/2020/01/28/coredns-1.6.7-release/

* Update mkdocs-material from v4.6.2 to v4.6.3

* Fix worker_node_labels for initial Fedora CoreOS

* Add Terraform strip markers to consume beginning and
trailing whitespace in templated Kubelet arguments for
podman (Fedora CoreOS only)
* Fix initial `worker_node_labels` being quietly ignored
on Fedora CoreOS cloud platforms that offer the feature
* Close https://github.com/poseidon/typhoon/issues/650

* Update Grafana from v6.6.1 to v6.6.2

* https://github.com/grafana/grafana/releases/tag/v6.6.2

* Update kube-state-metrics from v1.9.4 to v1.9.5

* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.5

* Update nginx-ingress from v0.28.0 to v0.29.0

* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.29.0

* Update nginx-ingress from v0.29.0 to v0.30.0

* https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.30.0

* Update node-exporter from v0.18.1 to v1.0.0-rc.0

* Update mdadm alert rule; node-exporter adds `state` label to
`node_md_disks` and removes `node_md_disks_active`
* https://github.com/prometheus/node_exporter/releases/tag/v1.0.0-rc.0

* Use a route table with separate (rather than inline) routes

* Allow users to extend the route table using a data reference
and adding route resources (e.g. unusual peering setups)
* Note: Internally connecting AWS clusters can reduce cross-cloud
flexibility and inhibits blue-green cluster patterns. It is not
recommended

* Update etcd from v3.4.3 to v3.4.4

* https://github.com/etcd-io/etcd/releases/tag/v3.4.4

* Add automatic worker deletion on Fedora CoreOS clouds

* On clouds where workers can scale down or be preempted
(AWS, GCP, Azure), shutdown runs delete-node.service to
remove a node a prevent NotReady nodes from lingering
* Add the delete-node.service that wasn't carried over
from Container Linux and port it to use podman

* Change Container Linux etcd-member to fetch with docker://

* Quay has historically generated ACI signatures for images to
facilitate rkt's notions of verification (it allowed authors to
actually sign images, though `--trust-keys-from-https` is in use
since etcd and most authors don't sign images). OCI standardization
didn't adopt verification ideas and checking signatures has fallen
out of favor.
* Fix an issue where Quay no longer seems to be generating ACI
signatures for new images (e.g. quay.io/coreos/etcd:v.3.4.4)
* Don't be alarmed by rkt `--insecure-options=image`. It refers
to disabling image signature checking (i.e. docker pull doesn't
check signatures either)
* System containers for Kubelet and bootstrap have transitioned
to the docker:// transport, so there is precedent and this brings
all the system containers on Container Linux controllers into
alignment

* Refresh Prometheus alerts and Grafana dashboards

* Add 2 min wait before KubeNodeUnreachable to be less
noisy on premeptible clusters
* Add a BlackboxProbeFailure alert for any failing probes
for services annotated `prometheus.io/probe: true`

* Upgrade terraform-provider-azurerm to v2.0+

* Add support for `terraform-provider-azurerm` v2.0+. Require
`terraform-provider-azurerm` v2.0+ and drop v1.x support since
the Azure provider major release is not backwards compatible
* Use Azure's new Linux VM and Linux VM Scale Set resources
* Change controller's Azure disk caching to None
* Associate subnets (in addition to NICs) with security groups
(aesthetic)
* If set, change `worker_priority` from `Low` to `Spot` (action required)

Related:

* https://www.terraform.io/docs/providers/azurerm/guides/2.0-upgrade-guide.html

* Accept initial worker node labels and taints map on bare-metal

* Add `worker_node_labels` map from node name to a list of initial
node label strings
* Add `worker_node_taints` map from node name to a list of initial
node taint strings
* Unlike cloud platforms, bare-metal node labels and taints
are defined via a map from node name to list of labels/taints.
Bare-metal clusters may have heterogeneous hardware so per node
labels and taints are accepted
* Only worker node names are allowed. Workloads are not scheduled
on controller nodes so altering their labels/taints isn't suitable

```
module "mercury" {
  ...

  worker_node_labels = {
    "node2" = ["role=special"]
  }

  worker_node_taints = {
    "node2" = ["role=special:NoSchedule"]
  }
}
```

Related: https://github.com/poseidon/typhoon/issues/429

* Add support for Flatcar Linux on Azure

* Accept `os_image` "flatcar-stable" and "flatcar-beta" to
use Kinvolk's Flatcar Linux images from the Azure Marketplace

Note: Flatcar Linux Azure Marketplace images require terms be
accepted before use

* Update Calico from v3.12.0 to v3.13.1

* https://docs.projectcalico.org/v3.13/release-notes/

* Update Kubernetes from v1.17.3 to v1.17.4

* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.17.md#v1174

* Update recommended Terraform versions and providers

* Sync the documented Terraform versions and provider
plugin versions to those that are actively used/tested
by the author

* Remove Container Linux Update Operator (CLUO) addon

* Stop providing example manifests for the Container Linux
Update Operator (CLUO)
* CLUO requires patches to support Kubernetes v1.16+, but the
project and push access is rather unowned
* CLUO hasn't been in active use in our clusters and won't be
relevant beyond Container Linux. Not to say folks can't patch
it and run it on their own. Examples just aren't provided here

Related: https://github.com/coreos/container-linux-update-operator/pull/197

* Promote Fedora CoreOS AWS and Google Cloud

* Promote Fedora CoreOS AWS to stable
* Promote Fedora CoreOS GCP to beta

* Update etcd from v3.4.4 to v3.4.5

* https://github.com/etcd-io/etcd/releases/tag/v3.4.5

* Update Prometheus from v2.16.0 to v2.17.0-rc.3

* https://github.com/prometheus/prometheus/releases/tag/v2.17.0-rc.3

* Update Grafana from v6.6.2 to v6.7.1

* https://github.com/grafana/grafana/releases/tag/v6.7.1

* Switch from upstream hyperkube image to individual images

* Kubernetes plans to stop releasing the hyperkube container image
* Upstream will continue to publish `kube-apiserver`, `kube-controller-manager`,
`kube-scheduler`, and `kube-proxy` container images to `k8s.gcr.io`
* Upstream will publish Kubelet only as a binary for distros to package,
either as a DEB/RPM on traditional distros or a container image on
container-optimized operating systems
* Typhoon will package the upstream Kubelet (checksummed) and its
dependencies as a container image for use on CoreOS Container Linux,
Flatcar Linux, and Fedora CoreOS
* Update the Typhoon container image security policy to list
`quay.io/poseidon/kubelet`as an official distributed artifact

Hyperkube: https://github.com/kubernetes/kubernetes/pull/88676
Kubelet Container Image: https://github.com/poseidon/kubelet
Kubelet Quay Repo: https://quay.io/repository/poseidon/kubelet

* Fix image tag for Container Linux AWS workers

* #669 left one reference to the original SHA tagged image
before the v1.17.4 image tag was applied

* Update Prometheus from v2.17.0-rc.3 to v2.17.0

* https://github.com/prometheus/prometheus/releases/tag/v2.17.0

* Rename DigitalOcean image variable to os_image

* Rename variable `image` to `os_image` to match the naming
used for the same purpose on other supported platforms (e.g.
AWS, Azure, GCP)

* Deprecate asset_dir variable and remove docs

* Remove docs for the `asset_dir` variable and deprecate
it in CHANGES. It will be removed in an upcoming release
* Typhoon v1.17.0 introduced a new mechanism for managing
and distributing generated assets that stopped relying on
writing out to disk. `asset_dir` became optional and
defaulted to being unset / off (recommended)

* Add Fedora CoreOS to issue template and docs

* Update several Container Linux references to start
referring to Flatcar Linux
* Update docs and mentions of Fedora CoreOS

* Update Kubernetes from v1.17.4 to v1.18.0

* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md

* Update docs from Kubernetes v1.17.4 to v1.18.0

* Set docker log driver to journald on Fedora CoreOS

* Before Kubernetes v1.18.0, Kubelet only supported kubectl
`--limit-bytes` with the Docker `json-file` log driver so
the Fedora CoreOS default was overridden for conformance.
See https://github.com/poseidon/typhoon/pull/642
* Kubelet v1.18+ implemented support for other docker log
drivers, so the Fedora CoreOS default `journald` can be
used again

Rel: https://github.com/kubernetes/kubernetes/issues/86367

* Update Prometheus from v2.17.0 to v2.17.1

* https://github.com/prometheus/prometheus/releases/tag/v2.17.1

* Add CoreOS Container Linux EOL recommendation to CHANGES

* Recommend that users who have not yet tried Fedora CoreOS or
Flatcar Linux do so. Likely, Container Linux will reach EOL
and platform support / stability ratings will be in a mixed
state. Nevertheless, folks should migrate by September.

* Fix delete-node.service kubectl service exec's

* Fix delete-node service that runs on worker (cloud-only)
shutdown to delete a Kubernetes node. Regressed in #669
(unreleased)
* Use rkt `--exec` to invoke kubectl binary in the kubelet
image
* Use podman `--entrypoint` to invoke the kubectl binary in
the kubelet image

* Fix Fedora CoreOS AMI to filter for stable images

* Fix issue observed in us-east-1 where AMI filters chose the
latest testing channel release, rather than the stable chanel
* Fedora CoreOS AMI filter selects the latest image with a
matching name, x86_64, and hvm, excluding dev images. Add a
filter for "Fedora CoreOS stable", which seems to be the only
distinguishing metadata indicating the channel

* Add support for Fedora CoreOS snippets

* Refresh snippets customization docs
* Requires terraform-provider-ct v0.5+

* Allow bootstrap re-apply for Fedora CoreOS GCP

* Problem: Fedora CoreOS images are manually uploaded to GCP. When a
cluster is created with a stale image, Zincati immediately checks
for the latest stable image, fetches, and reboots. In practice,
this can unfortunately occur exactly during the initial cluster
bootstrap phase.

* Recommended: Upload the latest Fedora CoreOS image regularly
* Mitigation: Allow a failed bootstrap.service run (which won't touch
the done ConditionalPathExists) to be re-run by running `terraforma apply`
again. Add a known issue to CHANGES
* Update docs to show the current Fedora CoreOS stable version to
reduce likelihood users see this issue

 Longer term ideas:

* Ideal: Fedora CoreOS publishes a stable channel. Instances will always
boot with the latest image in a channel. The problem disappears since
it works the same way AWS does
* Timer: Consider some timer-based approach to have zincati delay any
system reboots for the first ~30 min of a machine's life. Possibly just
configured on the controller node https://github.com/coreos/zincati/pull/251
* External coordination: For Container Linux, locksmith filled a similar
role and was disabled to allow CLUO to coordinate reboots. By running
atop Kubernetes, it was not possible for the reboot to occur before
cluster bootstrap
* Rely on https://github.com/coreos/zincati/issues/115 to delay the
reboot since bootstrap involves an SSH session
* Use path-based activation of zincati on controllers and set that
path at the end of the bootstrap process

Rel: https://github.com/coreos/fedora-coreos-tracker/issues/239

* Change default kube-system DaemonSet tolerations

* Change kube-proxy, flannel, and calico-node DaemonSet
tolerations to tolerate `node.kubernetes.io/not-ready`
and `node-role.kubernetes.io/master` (i.e. controllers)
explicitly, rather than tolerating all taints
* kube-system DaemonSets will no longer tolerate custom
node taints by default. Instead, custom node taints must
be enumerated to opt-in to scheduling/executing the
kube-system DaemonSets
* Consider setting the daemonset_tolerations variable
of terraform-render-bootstrap at a later date

Background: Tolerating all taints ruled out use-cases
where certain nodes might legitimately need to keep
kube-proxy or CNI networking disabled
Related: https://github.com/poseidon/terraform-render-bootstrap/pull/179

* Fix bootstrap regression when networking="flannel"

* Fix bootstrap error for missing `manifests-networking/crd*yaml`
when `networking = "flannel"`
* Cleanup manifest-networking directory left during bootstrap
* Regressed in v1.18.0 changes for Calico https://github.com/poseidon/typhoon/pull/675

* Rename Container Linux snippets variable for consistency

* Rename controller_clc_snippets to controller_snippets (cloud platforms)
* Rename worker_clc_snippets to worker_snippets (cloud platforms)
* Rename clc_snippets to snippets (bare-metal)

* Update flannel from v0.11.0 to v0.12.0

* https://github.com/coreos/flannel/releases/tag/v0.12.0

* Fix UDP outbound and clock sync timeouts on Azure workers

* Add "lb" outbound rule for worker TCP _and_ UDP traffic
* Fix Azure worker nodes clock synchronization being inactive
due to timeouts reaching the CoreOS / Flatcar NTP pool
* Fix Azure worker nodes not providing outbount UDP connectivity

Background:

Azure provides VMs outbound connectivity either by having a public
IP or via an SNAT masquerade feature bundled with their virtual
load balancing abstraction (in contrast with, say, a NAT gateway).

Azure worker nodes have only a private IP, but are associated with
the cluster load balancer's backend pool and ingress frontend IP.
Outbound traffic uses SNAT with this frontend IP. A subtle detail
with Azure SNAT seems to be that since both inbound lb_rule's are
TCP only, outbound UDP traffic isn't SNAT'd (highlights the reasons
Azure shouldn't have conflated inbound load balancing with outbound
SNAT concepts). However, adding a separate outbound rule and
disabling outbound SNAT on our ingress lb_rule's we can tell Azure
to continue load balancing as before, and support outbound SNAT for
worker traffic of both the TCP and UDP protocol.

Fixes clock synchronization timeouts:

```
systemd-timesyncd[786]: Timed out waiting for reply from
45.79.36.123:123 (3.flatcar.pool.ntp.org)
```

Azure controller nodes have their own public IP, so controllers (and
etcd) nodes have not had clock synchronization or outbound UDP issues

* Fix terraform fmt

* Refresh Prometheus rules/alerts and Grafana dashboards

* Refresh upstream Prometheus rules and alerts and Grafana
dashboards
* All Loki recording rules for convenience

* Update Grafana from v6.7.1 to v6.7.2

* https://github.com/grafana/grafana/releases/tag/v6.7.2

* Update etcd from v3.4.5 to v3.4.7

* https://github.com/etcd-io/etcd/releases/tag/v3.4.7
* https://github.com/etcd-io/etcd/releases/tag/v3.4.6

* Update Kubernetes from v1.18.0 to v1.18.1

* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1181

* Add support for Fedora CoreOS on DigitalOcean

* Add `digital-ocean/fedora-coreos/kubernetes` module
* DigitalOcean custom uploaded images do not permit
droplet IPv6 networking

* Update CHANGES for v1.18.1 release

* Change order of modules in the README

* Fix docs TOC to include Fedora CoreOS DigitalOcean

* Change `container-linux` module preference to Flatcar Linux

* No change to Fedora CoreOS modules
* For Container Linux AWS and Azure, change the `os_image` default
from coreos-stable to flatcar-stable
* For Container Linux GCP and DigitalOcean, change `os_image` to
be required since users should upload a Flatcar Linux image and
set the variable
* For Container Linux bare-metal, recommend users change the
`os_channel` to Flatcar Linux. No actual module change.

* Add support for Fedora CoreOS on Azure

* Add `azure/fedora-coreos/kubernetes` module

* Fix Fedora CoreOS Azure MTU with Calico

* With Calico VXLAN on Fedora CoreOS the 1450 MTU should
be used

* Update Kubernetes from v1.18.1 to v1.18.2

* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#changelog-since-v1181

* Remove temporary workaround for v1.18.0 apply issue

* In v1.18.0, kubectl apply would fail to apply manifests if any
single manifest was unable to validate. For example, if a CRD and
CR were defined in the same directory, apply would fail since the
CR would be invalid as the CRD wouldn't exist
* Typhoon temporary workaround was to separate CNI CRD manifests
and explicitly apply them first. No longer needed in v1.18.1+
* Kubernetes v1.18.1 restored the prior behavior where kubectl apply
applies as many valid manifests as it can. In the example above, the
CRD would be applied and the CR could be applied if the kubectl apply
was re-run (allowing for apply loops).
* Upstream fix: https://github.com/kubernetes/kubernetes/pull/89864

* Revert Flatcar Linux Azure to manual upload images

* Initial support for Flatcar Linux on Azure used the Flatcar
Linux Azure Marketplace images (e.g. `flatcar-stable`) in
https://github.com/poseidon/typhoon/pull/664
* Flatcar Linux Azure Marketplace images have some unresolved
items https://github.com/poseidon/typhoon/issues/703
* Until the Marketplace items are resolved, revert to requiring
Flatcar Linux's images be manually uploaded (like GCP and
DigitalOcean)

* Fix bootstrap mount to use shared volume SELinux label

* Race: During initial bootstrap, static control plane pods
could hang with "Permission denied" reading bootstrap secrets. A
manual fix involved restarting the Kubelet, which relabeled mounts.
The race had no effect on subsequent reboots.
* bootstrap.service runs podman with a private unshared mount
of /etc/kubernetes/bootstrap-secrets which uses an SELinux MCS
label with a category pair. However, bootstrap-secrets should
be shared since it's mounted by the Docker-run kube-apiserver,
kube-scheduler, and kube-controller-manager pods. Restarting Kubelet
was a manual fix because Kubelet relabels all /etc/kubernetes
* Fix bootstrap Pod to use the shared volume label, which leaves
bootstrap-secrets files with SELinux level s0 without MCS
* Also allow failed bootstrap.service to be re-applied. This was
missing on bare-metal and AWS

* Fix race condition creating DigitalOcean firewall rules

* DigitalOcean firewall rules should reference Terraform tag
resources rather than using tag strings. Otherwise, terraform
apply can fail (needs a rerun) if a tag has not yet been created

* Update Prometheus from v2.17.1 to v2.17.2

* https://github.com/prometheus/prometheus/releases/tag/v2.17.2

* Remove extraneous sudo from layout asset unpacking

* Update Calico from v3.13.1 to v3.13.3

* https://docs.projectcalico.org/v3.13/release-notes/

* Enable Kubelet TLS bootstrap and NodeRestriction

* Enable bootstrap token authentication on kube-apiserver
* Generate the bootstrap.kubernetes.io/token Secret that
may be used as a bootstrap token
* Generate a bootstrap kubeconfig (with a bootstrap token)
to be securely distributed to nodes. Each Kubelet will use
the bootstrap kubeconfig to authenticate to kube-apiserver
as `system:bootstrappers` and send a node-unique CSR for
kube-controller-manager to automatically approve to issue
a Kubelet certificate and kubeconfig (expires in 72 hours)
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the `system:node-bootstrapper`
ClusterRole
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the csr nodeclient ClusterRole
* Add ClusterRoleBinding for bootstrap token subjects
(`system:bootstrappers`) to have the csr selfnodeclient ClusterRole
* Enable NodeRestriction admission controller to limit the
scope of Node or Pod objects a Kubelet can modify to those of
the node itself
* Ability for a Kubelet to delete its Node object is retained
as preemptible nodes or those in auto-scaling instance groups
need to be able to remove themselves on shutdown. This need
continues to have precedence over any risk of a node deleting
itself maliciously

Security notes:

1. Issued Kubelet certificates authenticate as user `system:node:NAME`
and group `system:nodes` and are limited in their authorization
to perform API operations by Node authorization and NodeRestriction
admission. Previously, a Kubelet's authorization was broader. This
is the primary security motivation.

2. The bootstrap kubeconfig credential has the same sensitivity
as the previous generated TLS client-certificate kubeconfig.
It must be distributed securely to nodes. Its compromise still
allows an attacker to obtain a Kubelet kubeconfig

3. Bootstrapping Kubelet kubeconfigs with a limited lifetime offers
a slight security improvement.
  * An attacker who obtains the kubeconfig can likely obtain the
  bootstrap kubeconfig as well, gaining the ability to renew
  their access
  * A compromised bootstrap kubeconfig could plausibly be handled
  by replacing the bootstrap token Secret, distributing the token
  to new nodes, and expiration. Whereas a compromised TLS-client
  certificate kubeconfig can't be revoked (no CRL). However,
  replacing a bootstrap token can be impractical in real cluster
  environments, so the limited lifetime is mostly a theoretical
  benefit.
  * Cluster CSR objects are visible via kubectl which is nice

4. Bootstrapping node-unique Kubelet kubeconfigs means Kubelet
clients have more identity information, which can improve the
utility of audits and future features

Rel: https://kubernetes.io/docs/reference/command-line-tools-reference/kubelet-tls-bootstrapping/
Rel: https://github.com/poseidon/terraform-render-bootstrap/pull/185

* Add Fedora CoreOS Azure docs to site navigation

* Fix missing Fedora CoreOS Azure docs

* Update recommended Terraform provider versions

* Sync the Terraform provider plugin versions to those
actively used and tested by the author
* Fix terraform fmt

* Use Terraform element wrap-around for AWS controllers subnet_id (#714)

* Fix Terraform plan error when controller_count exceeds available AWS zones (e.g. 5 controllers)
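
For context, Terraform's `element()` wraps around modulo the list length, unlike direct index syntax; a minimal sketch (resource names are illustrative):

```
# With 3 zones/subnets and 5 controllers, count.index 3 wraps back to
# subnet 0 instead of failing the plan
subnet_id = element(aws_subnet.public.*.id, count.index)
```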

* Update Grafana from v6.7.2 to v7.0.0-beta1

* https://github.com/grafana/grafana/releases/tag/v7.0.0-beta1

* Update Prometheus from v2.17.2 to v2.18.0-rc.1

* https://github.com/prometheus/prometheus/releases/tag/v2.18.0-rc.1

* Update nginx-ingress from v0.30.0 to v0.32.0

* Add support for IngressClass and RBAC authorization
* Since our nginx ingress controller example uses the flag
`--ingress-class=public`, add an IngressClass to go along
with it

Rel: https://kubernetes.io/docs/concepts/services-networking/ingress/#ingress-class

* Update Prometheus from v2.18.0-rc.1 to v2.18.0

* https://github.com/prometheus/prometheus/releases/tag/v2.18.0

* Update Prometheus from v2.18.0 to v2.18.1

* https://github.com/prometheus/prometheus/releases/tag/v2.18.1

* Update Grafana from v7.0.0-beta1 to v7.0.0-beta2

* https://github.com/grafana/grafana/releases/tag/v7.0.0-beta2

* Use Fedora CoreOS image streams on Google Cloud

* Add `os_stream` variable to set a Fedora CoreOS stream
to `stable` (default), `testing`, or `next`
* Deprecate `os_image` variable. Remove docs about uploading
Fedora CoreOS images manually, this is no longer needed
* https://docs.fedoraproject.org/en-US/fedora-coreos/update-streams/

Rel: https://github.com/coreos/fedora-coreos-docs/pull/70
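
A minimal usage sketch, assuming a Google Cloud cluster module (other required fields elided):

```
module "cluster" {
  # ...
  # Fedora CoreOS image stream: "stable" (default), "testing", or "next"
  os_stream = "testing"
}
```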

* Fix Calico install-cni crash loop on Pod restarts

* Set a consistent MCS level/range for Calico install-cni
* Note: Rebooting a node was a workaround, because Kubelet
relabels /etc/kubernetes(/cni/net.d)

Background:

* On SELinux enforcing systems, the Calico CNI install-cni
container ran with default SELinux context and a random MCS
pair. install-cni places CNI configs by first creating a
temporary file and then moving it into place, which means
the file's MCS categories depend on the container's SELinux
context.
* A calico-node Pod restart creates a new install-cni container
with a different MCS pair that cannot access the earlier
written file (it places configs every time), causing the
init container to error and calico-node to crash loop
* https://github.com/projectcalico/cni-plugin/issues/874

```
mv: inter-device move failed: '/calico.conf.tmp' to
'/host/etc/cni/net.d/10-calico.conflist'; unable to remove target:
Permission denied
Failed to mv files. This may be caused by selinux configuration on the
host, or something else.
```

Note, this isn't a host SELinux configuration issue.

Related:

* https://github.com/poseidon/terraform-render-bootstrap/pull/186

* Update Calico from v3.13.3 to v3.14.0

* https://docs.projectcalico.org/v3.14/release-notes/

* Update Grafana from v7.0.0-beta2 to v7.0.0-beta.3

* https://github.com/grafana/grafana/releases/tag/v7.0.0-beta3

* Support Fedora CoreOS OS image streams on AWS

* Add `os_stream` variable to set the stream to stable (default),
testing, or next
* Remove unused os_image variable on Fedora CoreOS AWS

* Highlight SELinux enforcing mode in features

* Restore use of Flatcar Linux Azure Marketplace image

* Switch Flatcar Linux Azure to use the Marketplace image
from Kinvolk (offer `flatcar-container-linux-free`)
* Accepting Azure Marketplace terms is still necessary,
update docs to show accepting the free offer rather than
BYOL

* Upstream Flatcar: https://github.com/flatcar-linux/Flatcar/issues/82
* Typhoon: https://github.com/poseidon/typhoon/issues/703

* Update Grafana from v7.0.0-beta3 to v7.0.0

* https://github.com/grafana/grafana/releases/tag/7.0.0

* Update kube-state-metrics from v1.9.5 to v1.9.6

* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.6

* Update node-exporter from v1.0.0-rc.0 to v1.0.0-rc.1

* https://github.com/prometheus/node_exporter/releases/tag/v1.0.0-rc.1

* Rollback Grafana to v7.0.0-beta3, v7.0.0 image is missing

* Grafana hasn't published the v7.0.0 image yet

* Use new Azure subnet to set address_prefixes list

* Update Azure subnet `address_prefix` to the `address_prefixes` list
* Fix warning that `address_prefix` is deprecated
* Require `terraform-provider-azurerm` v2.8.0+ (action required)

Rel: https://github.com/terraform-providers/terraform-provider-azurerm/pull/6493
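
For illustration, the corresponding `azurerm_subnet` change looks roughly like this (CIDR is a placeholder):

```
resource "azurerm_subnet" "worker" {
  # ...
  # terraform-provider-azurerm v2.8.0+ list form, replacing the
  # deprecated singular address_prefix
  address_prefixes = ["10.0.1.0/24"]
}
```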

* Update Grafana from v7.0.0-beta2 to v7.0.0

* https://grafana.com/docs/grafana/latest/guides/whats-new-in-v7-0/

* Update etcd from v3.4.7 to v3.4.8

* https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.4.md#v348-2020-05-18

* Fix Fedora CoreOS on GCP proposing controller recreate

* With Fedora CoreOS image stream support (#727), the latest
resolved image will change over the lifecycle of a cluster.
* Fix issue where an image diff proposed replacing a Fedora
CoreOS controller on GCP, introduced in #727 (unreleased)
* Also ignore image diffs to the GCP managed instance group
of workers. This aligns with worker AMI diffs being ignored
on AWS and similar on Azure, since workers update themselves.

Background:

* Controller nodes should strictly not be recreated by Terraform;
they are stateful (etcd) and should not be replaced
* Across cloud platforms, OS image diffs are ignored since both
Flatcar Linux and Fedora CoreOS nodes update themselves. For
workers, user-data or disk size diffs (where relevant) are allowed
to recreate worker templates/configs since these are considered
to be user-initiated declarations that a reprovision should be done

* Set Kubelet image via kubelet.service KUBELET_IMAGE

* Write the systemd kubelet.service to use `KUBELET_IMAGE`
as the Kubelet. This provides a nice way to use systemd
dropins to temporarily override the image (e.g. during a
registry outage)

Note: Only Typhoon Kubelet images and registries are supported.
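
A sketch of such an override, delivered as a snippet with a kubelet.service dropin (the Fedora CoreOS Config below is illustrative and the image tag is an assumption; quay.io/poseidon/kubelet is one of the registries noted later in these notes):

```
module "cluster" {
  # ...
  # Hypothetical snippet adding a kubelet.service dropin that overrides
  # the Kubelet image (tag is illustrative)
  worker_snippets = [<<-EOF
    variant: fcos
    version: 1.0.0
    systemd:
      units:
        - name: kubelet.service
          dropins:
            - name: 10-kubelet-image.conf
              contents: |
                [Service]
                Environment="KUBELET_IMAGE=quay.io/poseidon/kubelet:v1.18.3"
    EOF
  ]
}
```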

* Update Kubernetes from v1.18.2 to v1.18.3

* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md

* Upgrade docs packages and refresh content

* Promote DigitalOcean from alpha to beta for Fedora
CoreOS and Flatcar Linux
* Upgrade mkdocs-material and PyPI packages for docs
* Replace docs mentions of Container Linux with Flatcar
Linux and move docs/cl to docs/flatcar-linux
* Deprecate CoreOS Container Linux support. It's still
usable for some time, but start removing docs

* Update etcd from v3.4.8 to v3.4.9

* https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.4.md#v349-2020-05-20

* Update recommended Terraform provider versions

* Sync Terraform provider plugin versions to those actively
used internally
* Fix terraform fmt

* Update node-exporter from v1.0.0-rc.1 to v1.0.0

* https://github.com/prometheus/node_exporter/releases/tag/v1.0.0

* Update Grafana from v7.0.0 to v7.0.1

* https://github.com/grafana/grafana/releases/tag/v7.0.1

* Update mkdocs-material from v5.2.0 to v5.2.2

* https://github.com/squidfunk/mkdocs-material/releases/tag/5.2.2

* Update Github issue template to use drop-downs (#747)

* Create a stricter bug report template
* Highlight topics that are not accepted in issues: operation, support, debugging, advice, or Kubernetes concepts
* Add a section to strongly suggest bug reports link a PR or describe a solution. This may be able to weed out topics that aren't focused bug reports

* Update the fallback issue template

* Even "blank" issues need to fill out the fallback
template

* Update Calico from v3.14.0 to v3.14.1

* https://docs.projectcalico.org/v3.14/release-notes/

* Change Kubelet container image publishing

* Build Kubelet container images internally and publish
to Quay and Dockerhub (new) as an alternative in case of
registry outage or breach
* Use our infra to provide single and multi-arch (default)
Kubelet images for possible future use
* Docs: Show how to use alternative Kubelet images via
snippets and a systemd dropin (builds on #737)

Changes:

* Update docs with changes to Kubelet image building
* If you prefer to trust images built by Quay/Dockerhub,
automated image builds are still available with unique
tags (albeit with some limitations):
  * Quay automated builds are tagged `build-{short_sha}`
  (limit: only amd64)
  * Dockerhub automated builds are tagged `build-{tag}`
  and `build-master` (limit: only amd64, no shas)

Links:

* Kubelet: https://github.com/poseidon/kubelet
* Docs: https://typhoon.psdn.io/topics/security/#container-images
* Registries:
  * quay.io/poseidon/kubelet
  * docker.io/psdn/kubelet

* Tweak minor style elements of issue templates

* Update kube-state-metrics from v1.9.6 to v1.9.7

* https://github.com/kubernetes/kube-state-metrics/releases/tag/v1.9.7

* Update Grafana from v7.0.1 to v7.0.3

* https://github.com/grafana/grafana/releases/tag/v7.0.2
* https://github.com/grafana/grafana/releases/tag/v7.0.3

* Update Prometheus from v2.18.1 to v2.19.0-rc.0

* https://github.com/prometheus/prometheus/releases/tag/v2.19.0-rc.0

* Fix Fedora CoreOS docs for selecting a stream

* The Fedora CoreOS image `os_stream` (stable, testing, or next)
has been configurable since v1.18.3
* Remove mention of outdated `os_image` variable

* Update security disclosure contact email

* Use security@psdn.io across github.com/poseidon projects

* Use strict mode for Container Linux Configs

* Enable terraform-provider-ct `strict` mode for parsing
Container Linux Configs and snippets
* Fix Container Linux Config systemd unit syntax `enable`
(old) to `enabled`
* Align with Fedora CoreOS which uses strict mode already

* Update Prometheus from v2.19.0-rc.0 to v2.19.0

* https://github.com/prometheus/prometheus/releases/tag/v2.19.0

* Remove unused Kubelet cert / key Terraform state

* Generated Kubelet TLS certificate and key are no longer
used or distributed to machines since Kubelet TLS bootstrap
is used instead. Remove the certificate and key from state

* Remove unused Kubelet lock-file and exit-on-lock-contention

* Kubelet `--lock-file` and `--exit-on-lock-contention` date
back to usage of bootkube and at one point running Kubelet
in a "self-hosted" style whereby an on-host Kubelet (rkt)
started pods, but then a Kubelet DaemonSet was scheduled
and able to take over (hence self-hosted). `lock-file` and
`exit-on-lock-contention` flags supported this pivot. The
pattern has been out of favor (in bootkube too) for years
because of the complexity of dueling Kubelets
* Typhoon runs Kubelet as a container via an on-host systemd
unit using podman (Fedora CoreOS) or rkt (Flatcar Linux). In
fact, Typhoon no longer uses bootkube or control plane pivot
(let alone Kubelet pivot) and uses static pods since v1.16.0
* https://github.com/poseidon/typhoon/pull/536

* Update node-exporter from v1.0.0 to v1.0.1

* https://github.com/prometheus/node_exporter/releases/tag/v1.0.1

* Update mkdocs packages for website

* Fix typo in DigitalOcean docs title

* Update nginx-ingress from v0.32.0 to v0.33.0

* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-0.33.0

* Update Kubernetes from v1.18.3 to v1.18.4

* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1184

* Update recommended Terraform provider versions

* Sync Terraform provider plugin versions with those
used internally

* Rename controller node label and NoSchedule taint

* Remove node label `node.kubernetes.io/master` from controller nodes
* Use `node.kubernetes.io/controller` (present since v1.9.5,
[#160](https://github.com/poseidon/typhoon/pull/160)) to node select controllers
* Rename controller NoSchedule taint from `node-role.kubernetes.io/master` to
`node-role.kubernetes.io/controller`
* Tolerate the new taint name for workloads that may run on controller nodes
and stop tolerating `node-role.kubernetes.io/master` taint

* Fix Kubelet starting before hostname set on FCOS AWS

* Fedora CoreOS `kubelet.service` can start before the hostname
is set. Kubelet reads the hostname to determine the node name to
register. If the hostname is read as localhost, the Kubelet will
keep trying to register as localhost (the problem)
* This race manifests as a node that appears NotReady: the Kubelet
keeps trying to register as localhost, while the host itself (by then)
has an AWS-provided hostname. Restarting kubelet.service is a
manual fix so the Kubelet re-reads the hostname
* This race could only be shown on AWS, not on Google Cloud or
Azure despite attempts. Bare-metal and DigitalOcean differ and
use hostname-override (e.g. afterburn) so they're not affected
* Wait for nodes to have a non-localhost hostname in the oneshot
that awaits /etc/resolv.conf. Typhoon has no valid cases for a
node hostname being localhost (not even single-node clusters)

Related Openshift: https://github.com/openshift/machine-config-operator/pull/1813
Close https://github.com/poseidon/typhoon/issues/765

* Reduce Calico MTU on Fedora CoreOS Azure

* Change the Calico VXLAN interface MTU from 1450 to 1410
* VXLAN on Azure should support MTU 1450. However, performance
measurements have repeatedly shown that 1410 is needed for
expected performance. Flatcar Linux has the same MTU 1410
override and note
* FCOS 31.20200323.3.2 was known to perform fine with 1450, but
now in 31.20200517.3.0 the right value seems to be 1410

* Add experimental Cilium CNI provider

* Accept experimental CNI `networking` mode "cilium"
* Run Cilium v1.8.0-rc4 with overlay vxlan tunnels and a
minimal set of features. We're interested in:
  * IPAM: Divide pod_cidr into /24 subnets per node
  * CNI networking pod-to-pod, pod-to-external
  * BPF masquerade
  * NetworkPolicy as defined by Kubernetes (no L7 Policy)
* Continue using kube-proxy with Cilium probe mode
* Firewall changes:
  * Require UDP 8472 for vxlan (Linux kernel default) between nodes
  * Optional ICMP echo(8) between nodes for host reachability
    (health)
  * Optional TCP 4240 between nodes for endpoint reachability (health)

Known Issues:

* Containers with `hostPort` don't listen on all host addresses,
these workloads must use `hostNetwork` for now
https://github.com/cilium/cilium/issues/12116
* Erroneous warning on Fedora CoreOS
https://github.com/cilium/cilium/issues/10256

Note: This is experimental. It is not listed in docs and may be
changed or removed without a deprecation notice

Related:

* https://github.com/poseidon/terraform-render-bootstrap/pull/192
* https://github.com/cilium/cilium/issues/12217
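
As a sketch, the vxlan firewall requirement above might look like the following on AWS (resource names and security group wiring are illustrative, not the module's actual rules):

```
resource "aws_security_group_rule" "worker-vxlan" {
  security_group_id = aws_security_group.worker.id

  type      = "ingress"
  protocol  = "udp"
  from_port = 8472
  to_port   = 8472
  self      = true # vxlan between nodes
}
```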

* Update Cilium from v1.8.0-rc4 to v1.8.0

* https://github.com/cilium/cilium/releases/tag/v1.8.0

* Update Prometheus from v2.19.0 to v2.19.1

* https://github.com/prometheus/prometheus/releases/tag/v2.19.1

* Update Grafana from v7.0.3 to v7.0.4

* https://github.com/grafana/grafana/releases/tag/v7.0.4

* Update mkdocs-material from v5.3.0 to v5.3.3

* Update Calico from v3.14.1 to v3.15.0

* https://docs.projectcalico.org/v3.15/release-notes/

* Update Kubernetes from v1.18.4 to v1.18.5

* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1185

* Update Prometheus from v2.19.1 to v2.19.2

* https://github.com/prometheus/prometheus/releases/tag/v2.19.2

* Update recommended Terraform provider versions

* Sync Terraform provider plugin versions with those
used internally

* Revert "Update Prometheus from v2.19.1 to v2.19.2"

* Prometheus has not published the v2.19.2 image yet
* This reverts commit 81b6f54169119702c3cc6a3ecabca77f8646b444.

* Isolate each DigitalOcean cluster in its own VPC

* DigitalOcean introduced Virtual Private Cloud (VPC) support
to match other clouds and enhance the prior "private networking"
feature. Before, droplets belonging to different clusters (but
residing in the same region) could reach one another (although
Typhoon firewall rules prohibit this). Now, droplets in a VPC
reside in their own network
* https://www.digitalocean.com/docs/networking/vpc/
* Create droplet instances in a VPC per cluster. This matches the
design of Typhoon AWS, Azure, and GCP.
* Require `terraform-provider-digitalocean` v1.16.0+ (action required)
* Output `vpc_id` for use with an attached DigitalOcean
loadbalancer
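
Roughly, the per-cluster network now looks like this (name, region, and droplet fields are placeholders):

```
resource "digitalocean_vpc" "network" {
  name   = "my-cluster"
  region = "nyc3"
}

resource "digitalocean_droplet" "controller" {
  # ...
  vpc_uuid = digitalocean_vpc.network.id
}
```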

* Remove os_image variable on Google Cloud Fedora CoreOS

* In v1.18.3, the `os_stream` variable was added to select
a Fedora CoreOS image stream (stable, testing, next) on
AWS and Google Cloud (which publish official streams)
* Remove `os_image` variable deprecated in v1.18.3. Manually
uploaded images are no longer needed

* Fix terraform fmt in firewall rules

* Promote Fedora CoreOS on Google Cloud to stable status

* Allow using Flatcar Linux edge on Azure

* Set Kubelet cgroup driver to systemd when Flatcar Linux edge
is chosen

Note: Typhoon module status assumes use of the stable variant of
an OS channel/stream. It's possible to use earlier variants and
those are sometimes tested or developed against, but stable is
the recommendation

* Remove CoreOS Container Linux image names from docs

* Remove coreos-stable, coreos-beta, and coreos-alpha channel
references from docs
* CoreOS Container Linux is end of life (see changelog)

* Update Grafana from v7.0.4 to v7.0.5

* https://github.com/grafana/grafana/releases/tag/v7.0.5

* Update Cilium from v1.8.0 to v1.8.1

* https://github.com/cilium/cilium/releases/tag/v1.8.1

* Update Prometheus from v2.19.1 to v2.19.2

* https://github.com/prometheus/prometheus/releases/tag/v2.19.2

* Update Grafana from v7.0.5 to v7.0.6

* https://github.com/grafana/grafana/releases/tag/v7.0.6

* Update mkdocs-material from v5.3.3 to v5.4.0

* Update Kubernetes from v1.18.5 to v1.18.6

* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1186
* https://github.com/poseidon/terraform-render-bootstrap/pull/201

* Update ingress-nginx from v0.33.0 to v0.34.1

* Switch to ingress-nginx controller images from us.gcr.io (eu, asia
can also be used if desired)
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.34.1
* https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v0.34.0

* Update recommended Terraform provider versions

* Sync Terraform provider plugin versions with those
used internally

* Show Cilium as a CNI provider option in docs

* Start to show Cilium as a CNI option
* https://github.com/cilium/cilium

* Update Grafana from v7.0.6 to v7.1.0

* https://github.com/grafana/grafana/releases/tag/v7.1.0

* Update etcd from v3.4.9 to v3.4.10

* https://github.com/etcd-io/etcd/releases/tag/v3.4.10

* Declare etcd data directory permissions

* Set etcd data directory /var/lib/etcd permissions to 700
* On Flatcar Linux, /var/lib/etcd is pre-existing and Ignition
v2 doesn't overwrite the directory. Update the Container Linux
config, but add the manual chmod workaround to bootstrap for
Flatcar Linux users
* https://github.com/etcd-io/etcd/blob/master/CHANGELOG-3.4.md#v3410-2020-07-16
* https://github.com/etcd-io/etcd/pull/11798

* Update CoreDNS from v1.6.7 to v1.7.0

* https://coredns.io/2020/06/15/coredns-1.7.0-release/
* Update Grafana dashboard with revised metrics names

* Update Cilium from v1.8.1 to v1.8.2

* https://github.com/cilium/cilium/releases/tag/v1.8.2

* Fix some links in docs (#788)

* Update Grafana from v7.1.0 to v7.1.1

* https://github.com/grafana/grafana/releases/tag/v7.1.1

* Update Prometheus from v2.19.2 to v2.20.0

* https://github.com/prometheus/prometheus/releases/tag/v2.20.0

* Migrate to Fedora CoreOS

* bastion only needs base fcos config

* Revert "bastion only needs base fcos config"

This reverts commit 2984f3e70be6f8ecdc6a96f70cf809b260c919d6.

* support custom bastion snippet as a variable

* Fix flannel support on Fedora CoreOS

* Fedora CoreOS now ships systemd-udev's `default.link` while
Flannel relies on being able to pick its own MAC address for
the `flannel.1` link for tunneled traffic to reach cni0 on
the destination side, without being dropped
* This change first appeared in FCOS testing-devel 32.20200624.20.1
and is the behavior going forward in FCOS since it was added
to align FCOS network naming / configs with the rest of Fedora
and address issues related to the default being missing
* Flatcar Linux (and Container Linux) has a specific flannel.link
configuration built in, so it was not affected
* https://github.com/coreos/fedora-coreos-tracker/issues/574#issuecomment-665487296

Note: Typhoon's recommended and default CNI provider is Calico,
unless `networking` is set to flannel directly.

* Relax terraform-provider-matchbox version constraint

* Allow use of terraform-provider-matchbox v0.3+ (which
allows v0.3.0 <= version < v1.0) for any pre-1.0 release
* Before, the requirement was v0.3.0 <= version < v0.4.0

* Update from coreos/flannel-cni to poseidon/flannel-cni

* Update CNI plugins from v0.6.0 to v0.8.6 to fix several CVEs
* Update the base image to alpine:3.12
* Use `flannel-cni` as an init container and remove sleep
* https://github.com/poseidon/terraform-render-bootstrap/pull/205
* https://github.com/poseidon/flannel-cni
* https://quay.io/repository/poseidon/flannel-cni

Background

* Switch from github.com/coreos/flannel-cni v0.3.0 which was last
published by me in 2017 and is no longer accessible to me to maintain
or patch
* Port to the poseidon/flannel-cni rewrite, which releases v0.4.0
to continue the prior release numbering

* Update mkdocs-material from v5.4.0 to v5.5.1

* use scoop's fork of terraform-render-bootstrap

* policy arn

* Revert "policy arn"

This reverts commit 4579af8bda4a5720e791c56fa02c14ecb767a537.

* workers and controllers need to stay private

* fedora coreos 32

* fedora coreos 32

* Support Fedora CoreOS OS image streams on AWS

* fix mistakes in resolving merge conflicts

* add new security components

* fix json format

* Update Grafana from v7.1.1 to v7.1.3

* https://github.com/grafana/grafana/releases/tag/v7.1.3
* https://github.com/grafana/grafana/releases/tag/v7.1.2

* Allow terraform-provider-aws v3.0+ plugin

* Typhoon AWS is compatible with terraform-provider-aws v3.x releases
* Continue to allow v2.23+, no v3.x specific features are used
* Set required provider versions in the worker module, since
it can be used independently

Related:

* https://github.com/terraform-providers/terraform-provider-aws/releases/tag/v3.0.0

* Update recommended Terraform provider versions

* Sync Terraform provider plugin versions to those used
internally

* fix ssl cert mounts

* Migrate from Terraform v0.12.x to v0.13.x

* Recommend Terraform v0.13.x
* Support automatic install of poseidon's provider plugins
* Update tutorial docs for Terraform v0.13.x
* Add migration guide for Terraform v0.13.x (best-effort)
* Require Terraform v0.12.26+ (migration compatibility)
* Require `terraform-provider-ct` v0.6.1
* Require `terraform-provider-matchbox` v0.4.1
* Require `terraform-provider-digitalocean` v1.20+

Related:

* https://www.hashicorp.com/blog/announcing-hashicorp-terraform-0-13/
* https://www.terraform.io/upgrade-guides/0-13.html
* https://registry.terraform.io/providers/poseidon/ct/latest
* https://registry.terraform.io/providers/poseidon/matchbox/latest
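
Under Terraform v0.13, poseidon's providers install automatically once declared with a `source`; a sketch matching the required versions above:

```
terraform {
  required_providers {
    ct = {
      source  = "poseidon/ct"
      version = "0.6.1"
    }
    matchbox = {
      source  = "poseidon/matchbox"
      version = "0.4.1"
    }
  }
}
```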

* apiserver nlb should be internal

* update terraform-render-bootstrap with latest upstream

* Update Terraform migration guide SHA

* Mention the first master branch SHA that introduced Terraform
v0.13 forward compatibility
* Link the migration guide on Github until a release is available
and website docs are published

* Update Kubernetes from v1.18.6 to v1.18.8

* https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.18.md#v1188

* Update recommended Terraform provider versions

* Sync Terraform provider plugin versions to those used
internally
* Update mkdocs-material from v5.5.1 to v5.5.6
* Fix minor details in docs

* try relabeling /etc/kubernetes/bootstrap-secrets by explicitly mounting to kubelet

* relabeling does not need explicitly mounting to kubelet

* need to update the type label of bootstrap-secret in the newest typhoon

* update terraform-render-bootstrap with latest upstream

* rm unnecessary volume mounts on etcd

* rm output/

Co-authored-by: Dalton Hubble <dghubble@gmail.com>
Co-authored-by: Arve Knudsen <arve.knudsen@gmail.com>
Co-authored-by: Suraj Deshmukh <surajd.service@gmail.com>
Co-authored-by: Ben Drucker <bvdrucker@gmail.com>
Co-authored-by: Eldon <github@eldondev.com>
@mleoca

mleoca commented Sep 9, 2020

I've tried the flannel solution with the unmanaged link flag on the ENIs created by the aws-k8s-cni plugin, but it does not seem to work there, or I'm not doing it right.

@teintuc
Author

teintuc commented Sep 9, 2020

Hi all,

Sorry for the late late reply!
My colleague came to the same conclusions and found a solution that works for us.
You need to drop a `.link` file into /etc/systemd/network/ with:

```
[Match]
OriginalName=eni* ens*

[Link]
MACAddressPolicy=none
```

We tested it on Fedora CoreOS 32.20200809.3.0

Thanks

@teintuc teintuc closed this as completed Sep 9, 2020
Snaipe pushed a commit to aristanetworks/monsoon that referenced this issue Apr 13, 2023