Peer connection attempts reported as failed #3398

erik-stephens · 2018-09-07T17:23:16Z

What you expected to happen?

Peer connection attempts to self not to be reported as failed.

Background

These alarmed us at first. After some research, the takeaway is that they are noise. However, we also saw a lot of issues about peers not being able to join shortly after CONN_LIMIT. We don't scale nodes much, so this is less of a concern for us, but we still thought we should pay attention to this metric just in case. Our workaround is simply to treat N failed connections for N nodes in the cluster as normal.

Some Ideas

Adding a new state just for this might be a no-go.
Adding special handling in the metrics reporting to ignore when Info ~ "cannot connect to ourself" might be too brittle.
Patch something in the kubernetes add-on so that a node doesn't try to connect to itself. The hope is that something in kubernetes land can better inform weave which peers to try.

If core devs provide a bit of guidance, we might be able to submit a PR.

Versions:

weave 2.4.0
kubernetes 1.11

Logs:

kubectl -n kube-system exec weave-net-fqv5t -c weave -- /home/weave/weave --local report | jq -r '.Router.Connections[] | select(.State != "established")'
{
  "Address": "10.210.1.78:6783",
  "Outbound": true,
  "State": "failed",
  "Info": "cannot connect to ourself, retry: never",
  "Attrs": null
}
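As a rough illustration of the second idea above (ignoring self-connections when counting failures), here is a Python sketch that filters a report like this one. The function name `unexpected_failures`, the prefix match on `Info`, and the second sample entry are assumptions for illustration only, not part of weave. Matching on the message text is exactly the brittleness mentioned earlier.

```python
import json

# Assumed marker, taken from the "Info" field in the report above.
SELF_CONNECT_MARKER = "cannot connect to ourself"

def unexpected_failures(connections):
    """Return failed connections that are not attempts to connect to self."""
    return [
        c for c in connections
        if c.get("State") == "failed"
        and not (c.get("Info") or "").startswith(SELF_CONNECT_MARKER)
    ]

# First entry is copied from the report above; the second is hypothetical.
report = json.loads("""
{"Router": {"Connections": [
  {"Address": "10.210.1.78:6783", "Outbound": true, "State": "failed",
   "Info": "cannot connect to ourself, retry: never", "Attrs": null},
  {"Address": "10.210.1.79:6783", "Outbound": true, "State": "failed",
   "Info": "hypothetical other failure", "Attrs": null}
]}}
""")

remaining = unexpected_failures(report["Router"]["Connections"])
print([c["Address"] for c in remaining])
```

With a filter like this, only the failures that are not self-connections would feed an alert.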


murali-reddy · 2018-09-21T08:51:40Z

If core devs provide a bit of guidance, we might be able to submit a PR.

@erik-stephens Apologies for the delay in responding. Please do go ahead and submit a PR.

Please take a look at the kube-utils program which, when called, returns the list of the current set of Kubernetes node IPs. launch.sh passes that list as an argument when it launches the main weave program.

So try excluding the node's own IP from the list returned by kube-utils.
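A minimal sketch of that suggestion, written in Python for brevity (kube-utils itself is a Go program). The helper name `exclude_self` and the sample addresses other than 10.210.1.78 are hypothetical, and how kube-utils would learn its own node IP is not covered in this thread.

```python
def exclude_self(node_ips, self_ip):
    """Return the peer list without this node's own IP, preserving order."""
    return [ip for ip in node_ips if ip != self_ip]

# Hypothetical cluster; 10.210.1.78 is the address from the report above.
all_nodes = ["10.210.1.77", "10.210.1.78", "10.210.1.79"]
peers = exclude_self(all_nodes, "10.210.1.78")

# Space-separated list of the remaining peers.
print(" ".join(peers))
```

Filtering at the source means the router never attempts the self-connection, so no "failed" entry is created in the first place.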

brb added feature [component/kube] labels Sep 21, 2018

murali-reddy added a commit that referenced this issue Nov 21, 2018

prevent kubernetes node connecting to self by excluding the node IP from the list of the peers passed to weaver Fixes #3398

7a0e725

murali-reddy mentioned this issue Nov 21, 2018

prevent kubernetes node connecting to self #3454

Merged

murali-reddy added a commit that referenced this issue Nov 21, 2018

prevent kubernetes node connecting to self by excluding the node IP from the list of the peers passed to weaver Fixes #3398

024cfbe

murali-reddy added a commit that referenced this issue Nov 21, 2018

prevent kubernetes node connecting to self by excluding the node IP from the list of the peers passed to weaver Fixes #3398

81b1746

bboreham closed this as completed in #3454 Jan 3, 2019

bboreham added this to the 2.6 milestone Nov 4, 2019
