
Weave-net always renders one node unhealthy in the cluster #3243

Closed
hazim1093 opened this issue Feb 13, 2018 · 16 comments

Comments

@hazim1093

What you expected to happen?

We are using Weave Net in a Kubernetes cluster created via kops, with 3 masters and 3 nodes.
I was expecting all nodes to be healthy with no network issues.

What happened?

  • Scenario 1: 1 node in the cluster randomly can't reach the internet.
  • Scenario 2: 1 node in the cluster becomes completely unhealthy and unreachable. (We can't even ssh into the node anymore.)

In both scenarios, other nodes seem to work absolutely fine.

How to reproduce it?

  • Scenario 1 appears randomly when the load on the cluster is increased.
  • Scenario 2 is easy to reproduce for us: just destroy all nodes in the cluster at once. New nodes spin up and the cluster becomes ready and functional, but after some time one node becomes completely unhealthy and unreachable. (Our guess is that the node's iptables rules have been messed up, but we can't even ssh into it anymore to actually check.)

Anything else we need to know?

Cloud Provider: AWS
Kubernetes configuration: Kops -> 3 masters and 3 nodes

Versions:

  • weave version: 2.0.1, 2.0.5, and even 2.2.0 (upgraded to it manually)
  • Distribution: CoreOS
  • uname -r : 4.13.16-coreos-r2
  • Kubernetes Version: 1.8.7
  • docker version:
Client:
 Version:      17.09.0-ce
 API version:  1.32
 Go version:   go1.8.4
 Git commit:   afdb6d4
 Built:        Tue Sep 26 22:24:58 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.09.0-ce
 API version:  1.32 (minimum version 1.12)
 Go version:   go1.8.4
 Git commit:   afdb6d4
 Built:        Tue Sep 26 22:24:58 2017
 OS/Arch:      linux/amd64
 Experimental: false

Logs:

Weave pod logs:

The weave pod on that node was unreachable, so we got logs from another pod:
https://gist.github.com/hazim1093/c7aba837dc94d9b66e24f227f9cd4d6f
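For reference, when the weave pod on a node is reachable, its logs and router status can usually be pulled directly; the pod name below is a placeholder:

# Logs of the weave container in the weave-net pod running on the node
kubectl logs -n kube-system weave-net-xxxxx -c weave
# Status reported by the weave router itself (peers, connections, IPAM)
kubectl exec -n kube-system weave-net-xxxxx -c weave -- /home/weave/weave --local status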

Node status in Kubernetes Dashboard:

screen shot 2018-02-13 at 08 58 31

What we can see in Weave scope:

weavescope1

weavescope2

NOTE

The node seen as Unmanaged in the images above is the faulty one.

Network:

  • For Scenario 2, as the node was completely unreachable, we could not check the iptables changes or gather any logs of this sort. Any suggestions for that?
  • For Scenario 1, I compared the iptables of the faulty node with the nodes that were working fine and found these entries missing from the sudo iptables-save output of the faulty node (a quick way to capture such a comparison is sketched after the rule list):
-A WEAVE -s 100.96.0.0/11 -d 224.0.0.0/4 -j RETURN
-A WEAVE ! -s 100.96.0.0/11 -d 100.96.0.0/11 -j MASQUERADE
-A WEAVE -s 100.96.0.0/11 ! -d 100.96.0.0/11 -j MASQUERADE
-A FORWARD -o weave -m state --state NEW -j NFLOG --nflog-group 86
-A FORWARD -o weave -j DROP
-A FORWARD -i weave ! -o weave -j ACCEPT
-A FORWARD -o weave -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
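As a rough sketch of how to capture that comparison (file paths are illustrative): dump and sort the rules on each node, copy the dumps to one machine, and diff them:

# On each node: dump the current rules, sorted so the diff is stable
sudo iptables-save | sort > /tmp/iptables-$(hostname).txt
# On one machine, after copying both dumps over: compare healthy vs faulty
diff /tmp/iptables-healthy.txt /tmp/iptables-faulty.txt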

We were first facing this issue with Weave Net 2.0.1 and then 2.0.5, so we upgraded to 2.2.0, but Scenario 2 is still easily reproducible for us by just killing all nodes at once.

@leth
Contributor

leth commented Feb 13, 2018

Could you perhaps try to get the weave logs from the unreachable node via the AWS console?

@leth
Contributor

leth commented Feb 13, 2018

  • When you say "1 node in the cluster randomly can't reach the internet." does that mean from the host, from a pod, or both?
  • Is there anything interesting in journalctl -r on the unreachable node?
  • Do those nodes run any other software which might change iptables rules?

To debug this further, perhaps run some tcpdump from/to the unreachable node, from another host, from another pod, to the Internet, etc. and check where traffic is lost.
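A minimal sketch of such a capture, assuming eth0 is the node's NIC and that weave's default control port (TCP 6783) is in use:

# On the suspect node: watch pings arriving from another host
sudo tcpdump -ni eth0 icmp
# Watch the weave bridge while a pod on this node tries to reach the internet
sudo tcpdump -ni weave
# Check whether weave control traffic from the other peers reaches this node
sudo tcpdump -ni eth0 tcp port 6783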

@rasheedamir

rasheedamir commented Feb 13, 2018

@leth many thanks for the quick help

So, we saw that the ASG removed the bad node and brought up a new node, and that new node was healthy. We killed all 3 nodes again and then one became unhealthy; here are the logs from AWS:

Actions -> Instance Settings -> Get System Log

https://gist.github.com/rasheedamir/394fd3bfc4771c7f106571876912f6e5
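(For reference, the same console output can also be fetched without the web UI, assuming the AWS CLI is configured; the instance ID below is a placeholder:)

aws ec2 get-console-output --instance-id i-0123456789abcdef0 --output text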

Please let me know if there are other logs we can grab from somewhere.

@hazim1093
Author

@leth
We're not able to reproduce the scenario where the node just can't reach the internet but we can still ssh into it. We definitely weren't able to reach the internet from pods; I'm not sure about connectivity from the host itself, I can't remember.

But in the other scenario, we can't even ssh into the node to check journalctl or tcpdump.

We aren't running any other software which could change iptables rules.

@rasheedamir

So, we tried again and this is the behavior:

Nodes become healthy, and just after a short while one of them becomes unhealthy and ssh to it stops working; please have a look at this screenshot:

screen shot 2018-02-13 at 15 10 59

@rasheedamir

So, after a short while it became healthy again:

screen shot 2018-02-13 at 15 22 54

Here are the logs from AWS console

https://gist.github.com/rasheedamir/5f6b552057270d3745ca0c5e55a35cbf

@rasheedamir

@leth here is a bit of journalctl -r output from the unhealthy node which became healthy again:

Feb 13 14:18:43 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:43.373621    1111 aws.go:1051] Could not determine public DNS from AWS metadata.
Feb 13 14:18:42 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:42.878547    1111 server.go:779] GET /pods/: (562.208µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:18:41 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:41.883877    1111 server.go:779] GET /pods/: (602.184µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:18:40 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:40.878443    1111 server.go:779] GET /pods/: (598.786µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:18:39 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:39.879243    1111 server.go:779] GET /pods/: (592.581µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:18:38 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:38.883205    1111 server.go:779] GET /pods/: (582.06µs) 200 [[Go-http-client/1.1] 127.0.0.1:33
Feb 13 14:18:37 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:37.878500    1111 server.go:779] GET /pods/: (537.807µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:18:36 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:36.878468    1111 server.go:779] GET /pods/: (572.363µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:18:35 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:35.883510    1111 server.go:779] GET /pods/: (560.084µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:18:34 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:34.878582    1111 server.go:779] GET /pods/: (544.012µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:18:33 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:33.879614    1111 server.go:779] GET /pods/: (584.533µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:18:33 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:33.347335    1111 aws.go:1051] Could not determine public DNS from AWS metadata.
Feb 13 14:18:32 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:32.884051    1111 server.go:779] GET /pods/: (592.681µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:18:31 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:31.878378    1111 server.go:779] GET /pods/: (591.307µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:18:30 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:18:30.878656    1111 server.go:779] GET /pods/: (571.458µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:19:10 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:19:10.878846    1111 server.go:779] GET /pods/: (587.083µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:19:10 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:19:10.648149    1111 fs.go:603] killing cmd [find /var/lib/docker/overlay/f8fae0893fe5814e3b2f7d1
Feb 13 14:19:10 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:19:10.639175    1111 fs.go:603] killing cmd [find /var/lib/docker/overlay/84536d29fba73373eb99d32
Feb 13 14:19:09 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:19:09.878432    1111 server.go:779] GET /pods/: (609.56µs) 200 [[Go-http-client/1.1] 127.0.0.1:33
Feb 13 14:19:08 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:19:08.878740    1111 server.go:779] GET /pods/: (567.377µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:19:07 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:19:07.879136    1111 server.go:779] GET /pods/: (644.831µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:19:06 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:19:06.878718    1111 server.go:779] GET /pods/: (575.434µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:19:05 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:19:05.880107    1111 server.go:779] GET /pods/: (650.48µs) 200 [[Go-http-client/1.1] 127.0.0.1:33
Feb 13 14:19:05 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: weave-cni: unable to release IP address: 400 Bad Request: Delete: no addresses for cb0b7ef561315880f6c32a8
Feb 13 14:19:05 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: W0213 14:19:05.561116    1111 cni.go:265] CNI failed to retrieve network namespace path: Cannot find netwo
Feb 13 14:19:05 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: weave-cni: unable to release IP address: 400 Bad Request: Delete: no addresses for 0ae52d18cd31ac69daaab87
Feb 13 14:19:05 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: W0213 14:19:05.526045    1111 cni.go:265] CNI failed to retrieve network namespace path: Cannot find netwo
Feb 13 14:19:05 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: weave-cni: unable to release IP address: 400 Bad Request: Delete: no addresses for 8a16a2568b6979e34d60e12
Feb 13 14:19:05 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: W0213 14:19:05.494106    1111 cni.go:265] CNI failed to retrieve network namespace path: Cannot find netwo
Feb 13 14:19:05 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: weave-cni: unable to release IP address: 400 Bad Request: Delete: no addresses for 101b26cc6de57602d0b0df2
Feb 13 14:19:05 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: W0213 14:19:05.461932    1111 cni.go:265] CNI failed to retrieve network namespace path: Cannot find netwo
Feb 13 14:19:04 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:19:04.881705    1111 server.go:779] GET /pods/: (626.402µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:19:03 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:19:03.878605    1111 server.go:779] GET /pods/: (689.387µs) 200 [[Go-http-client/1.1] 127.0.0.1:3
Feb 13 14:19:03 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:19:03.422929    1111 aws.go:1051] Could not determine public DNS from AWS metadata.

@leth
Contributor

leth commented Feb 13, 2018

It looks like that log is truncated at a certain width; could you re-post it? Thanks!

@rasheedamir

Ah sorry @leth here is the full file; I hope it doesn't have any secrets :)

log.txt

@rade
Member

rade commented Feb 13, 2018

Nodes become healthy, and just after a short while one of them becomes unhealthy
after a short while it became healthy again

What instance type are you using? The above are hallmarks of throttling.
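(If these are burstable t2 instances, one way to check for CPU-credit throttling is the CPUCreditBalance CloudWatch metric; the instance ID and time window below are placeholders:)

aws cloudwatch get-metric-statistics \
  --namespace AWS/EC2 --metric-name CPUCreditBalance \
  --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
  --start-time 2018-02-13T13:00:00Z --end-time 2018-02-13T16:00:00Z \
  --period 300 --statistics Minimum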

@rasheedamir

@rade we have:

t2.xlarge for nodes & t2.medium for master

@rade
Member

rade commented Feb 13, 2018

ok. not throttling then.

@rasheedamir

@leth during this time slot the node wasn't healthy.

Also, if you look, there is a slot of about 10 minutes where nothing was logged ... not sure if that means something.

Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel: active_anon:3896614 inactive_anon:8298 isolated_anon:0
                                                                  active_file:8071 inactive_file:7839 isolated_file:32
                                                                  unevictable:0 dirty:25 writeback:0 unstable:0
                                                                  slab_reclaimable:26920 slab_unreclaimable:68295
                                                                  mapped:26334 shmem:22794 pagetables:17648 bounce:0
                                                                  free:33033 free_pcp:1112 free_cma:0
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel: Mem-Info:
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel: R13: 0000000778b00218 R14: 000000077837b9b8 R15: 00007fc85c00a000
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel: R10: 0000000784d96d38 R11: 00000006c5224470 R12: 0000000000000000
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel: RBP: 0000000784d96d38 R08: 0000000784d96c30 R09: 0000000784d96f90
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel: RDX: 00000007c0105950 RSI: 0000000784d96d38 RDI: 0000000784d97008
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel: RAX: 0000000784d96fe8 RBX: 0000000000000000 RCX: 0000000000000000
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel: RSP: 002b:00007fc863029790 EFLAGS: 00010246
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel: RIP: 0033:0x7fc84d0b8890
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel:  page_fault+0x28/0x30
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel:  do_page_fault+0x22/0x30
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel:  ? exit_to_usermode_loop+0x90/0xb0
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel:  __do_page_fault+0x24c/0x4d0
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel:  handle_mm_fault+0xce/0x1d0
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel:  __handle_mm_fault+0xc3c/0x10a0
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel:  alloc_pages_vma+0x7f/0x190
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel:  __alloc_pages_nodemask+0x24a/0x270
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal kernel:  __alloc_pages_slowpath+0x858/0xe00
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal systemd-journald[10826]: Missed 7 kernel messages
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal systemd-journald[10826]: System journal (/var/log/journal/a3e63195cec04c36bd6127d63e0d8c2c) is 16.0M, max 4.0G, 3.9G free.
Feb 13 14:17:02 ip-100-0-6-53.eu-west-1.compute.internal systemd-journald[10826]: Journal started
Feb 13 14:07:53 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:07:37.521683    1111 prober.go:106] Readiness probe for "external-ingress-dbf949d7-ljfj7_external-ingress(36b0bb0a-10c5-11e8-b492-02a4e16f28ae):external-ingress-controller" failed (failure): Get http://10.35.0.18:18080/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Feb 13 14:07:53 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:07:37.521633    1111 prober.go:106] Liveness probe for "weave-net-6vrtm_kube-system(89be1a23-10c5-11e8-b492-02a4e16f28ae):weave" failed (failure): Get http://127.0.0.1:6784/status: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Feb 13 14:07:53 ip-100-0-6-53.eu-west-1.compute.internal kubelet[1111]: I0213 14:07:37.521573    1111 prober.go:106] Readiness probe for "external-ingress-dbf949d7-wz745_external-ingress(43b56434-10c5-11e8-b492-02a4e16f28ae):external-ingress-controller" failed (failure): Get http://10.35.0.8:18080/healthz: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
Feb 13 14:07:53 ip-100-0-6-53.eu-west-1.compute.internal kernel: 0 pages hwpoisoned
Feb 13 14:07:53 ip-100-0-6-53.eu-west-1.compute.internal kernel: 85413 pages reserved
Feb 13 14:07:53 ip-100-0-6-53.eu-west-1.compute.internal kernel: 0 pages HighMem/MovableOnly
Feb 13 14:07:53 ip-100-0-6-53.eu-west-1.compute.internal kernel: 4194205 pages RAM
Feb 13 14:07:53 ip-100-0-6-53.eu-west-1.compute.internal kernel: Total swap = 0kB
Feb 13 14:07:53 ip-100-0-6-53.eu-west-1.compute.internal kernel: Free swap  = 0kB
Feb 13 14:07:53 ip-100-0-6-53.eu-west-1.compute.internal kernel: Swap cache stats: add 0, delete 0, find 0/0
Feb 13 14:07:53 ip-100-0-6-53.eu-west-1.compute.internal kernel: 0 pages in swap cache
Feb 13 14:07:53 ip-100-0-6-53.eu-west-1.compute.internal kernel: 40614 total pagecache pages

@brb
Contributor

brb commented Feb 13, 2018

Scenario 1 seems to be an instance of #3133, which should be fixed in the 2.2.0 release.

Scenario 2: Try creating VMs with more memory. From the dmesg log above you can see that the kernel is struggling to allocate physical memory for OS processes, so the system might become unresponsive because of this.
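As a mitigation sketch, assuming the root cause is pods consuming all of the node's memory: reserving memory for system daemons and setting a hard eviction threshold lets the kubelet evict pods before the kernel gets into this state (with kops these settings typically go into the cluster spec's kubelet configuration; the values below are illustrative):

# Illustrative kubelet flags; exact values depend on the instance size
kubelet \
  --system-reserved=memory=1Gi \
  --kube-reserved=memory=1Gi \
  --eviction-hard='memory.available<500Mi'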

@rasheedamir

Thanks for the response @brb

Wondering why k8s schedules more pods than a node can handle?

Also, we have seen that in some cases the node is removed with this message:

Description: Terminating EC2 instance: i-08ef3e6776bf0f872
Cause: At 2018-02-13T11:59:19Z an instance was taken out of service in response to a EC2 instance status checks failure.

Does it give any indication?

@brb
Contributor

brb commented Feb 14, 2018

Wondering why k8s schedules more pods than a node can handle?

Not sure. I think it's better to ask the Kubernetes community (via a GitHub issue / Slack).

Does it give any indication?

EC2 status checks might fail for many reasons. Anyway, I would suggest consulting the AWS docs.

@brb brb closed this as completed Feb 14, 2018
@brb brb added this to the n/a milestone Feb 14, 2018