GCE: ingress only shows the first backend's healthiness in backends annotation #35

Closed
bowei opened this issue Oct 11, 2017 · 21 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@bowei
Member

bowei commented Oct 11, 2017

From @MrHohn on September 20, 2017 1:43

From kubernetes/enhancements#27 (comment).

We attach a backends annotation to the ingress object after LB creation:

  ...
  backends:		{"k8s-be-30910--7b4223ab4c1af15d":"UNHEALTHY"}

And from the implementation:
https://github.com/kubernetes/ingress/blob/937cde666e533e4f70087207910d6135c672340a/controllers/gce/backends/backends.go#L437-L452

Using only the first backend's healthiness to represent the healthiness for all backends seems incorrect.
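For illustration, a minimal sketch (hypothetical helper, not the actual controller code) of what aggregating across all instance health states might look like, instead of echoing the first one:

```go
package main

import "fmt"

// aggregateHealth is a hypothetical helper: instead of returning the first
// instance's state, it reports "HEALTHY" only when every instance is
// healthy, "UNHEALTHY" as soon as any instance is unhealthy, and
// "UNKNOWN" otherwise.
func aggregateHealth(states []string) string {
	if len(states) == 0 {
		return "UNKNOWN"
	}
	result := "HEALTHY"
	for _, s := range states {
		switch s {
		case "UNHEALTHY":
			return "UNHEALTHY"
		case "UNKNOWN":
			result = "UNKNOWN"
		}
	}
	return result
}

func main() {
	fmt.Println(aggregateHealth([]string{"HEALTHY", "UNHEALTHY"})) // UNHEALTHY
	fmt.Println(aggregateHealth([]string{"HEALTHY", "HEALTHY"}))   // HEALTHY
}
```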

cc @freehan

Copied from original issue: kubernetes/ingress-nginx#1395

@bowei
Member Author

bowei commented Oct 11, 2017

From @yastij on September 20, 2017 14:50

@MrHohn - anyone working on this one ? if not I can send a PR

@bowei
Member Author

bowei commented Oct 11, 2017

From @MrHohn on September 20, 2017 20:57

@yastij Nope, though I'm not quite sure how we should present the backends' healthiness in the annotation --- for a huge cluster, we might have too many backends (nodes), and it seems unwise to append all of them to the annotation...

cc @nicksardo

@bowei
Member Author

bowei commented Oct 11, 2017

From @yastij on September 20, 2017 21:13

Maybe state healthy when all the backends are healthy (checking all backends), and unhealthy when some aren't (specifying which ones aren't healthy)

@bowei
Member Author

bowei commented Oct 11, 2017

From @MrHohn on September 20, 2017 21:18

Maybe state healthy when all the backends are healthy (checking all backends), and unhealthy when some aren't (specifying which ones aren't healthy)

Yeah, that sort of makes sense, though for the externalTrafficPolicy=Local case, some of the backends (nodes) may intentionally fail the LB healthcheck so that traffic will only go to nodes that contain backend pods. Showing these as unhealthy may scare users, even though they aren't actually unhealthy :(

@bowei
Member Author

bowei commented Oct 11, 2017

From @yastij on September 20, 2017 21:59

@MrHohn - we can detect this case, no? If it is set to Local, we can ignore the unhealthy status?
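A sketch of that detection (shouldIgnoreUnhealthy is a hypothetical helper, and it assumes the controller has the backing Service object at hand):

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// shouldIgnoreUnhealthy is a hypothetical helper: with
// externalTrafficPolicy=Local, nodes without local endpoints are expected
// to fail the LB health check, so "UNHEALTHY" on those nodes is by design
// and could be excluded from the aggregate shown in the annotation.
func shouldIgnoreUnhealthy(svc *v1.Service) bool {
	return svc.Spec.ExternalTrafficPolicy == v1.ServiceExternalTrafficPolicyTypeLocal
}

func main() {
	svc := &v1.Service{}
	svc.Spec.ExternalTrafficPolicy = v1.ServiceExternalTrafficPolicyTypeLocal
	fmt.Println(shouldIgnoreUnhealthy(svc)) // true
}
```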

@bowei
Member Author

bowei commented Oct 11, 2017

From @nicksardo on September 25, 2017 23:29

I do not know if people use externalTrafficPolicy=Local with ingress (I've never tried it), and it's not something we document with ingress. It may technically work, but I don't know how well it works in production with rolling updates and other edge cases. Although if we wanted to support that case, another option is to correlate the instance status with the pods' location (node).
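A rough sketch of that correlation (hypothetical helper; assumes the controller can list the pods backing the Service):

```go
package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
)

// nodesWithBackendPods is a hypothetical helper: it returns the set of node
// names hosting at least one running backend pod, so a failing health check
// on a node outside this set can be treated as expected rather than being
// reported as UNHEALTHY in the annotation.
func nodesWithBackendPods(pods []v1.Pod) map[string]bool {
	nodes := map[string]bool{}
	for _, p := range pods {
		if p.Spec.NodeName != "" && p.Status.Phase == v1.PodRunning {
			nodes[p.Spec.NodeName] = true
		}
	}
	return nodes
}

func main() {
	pods := []v1.Pod{
		{Spec: v1.PodSpec{NodeName: "node-a"}, Status: v1.PodStatus{Phase: v1.PodRunning}},
		{Spec: v1.PodSpec{NodeName: "node-b"}, Status: v1.PodStatus{Phase: v1.PodPending}},
	}
	fmt.Println(nodesWithBackendPods(pods)) // map[node-a:true]
}
```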

I agree that this annotation is not accurate. Even if it shows a correct status, the annotation is only refreshed on every sync (which may be 10 minutes or longer if there are a lot of ingress objects). My question is whether this annotation is worth keeping. Wouldn't users be better off looking at the GCP Console for backend status? Do users have daemons which poll this annotation and perform alerts? If the only case we're concerned about is bad healthcheck configuration breaking all backends, couldn't we create an alert saying "All backends are unhealthy - please investigate"?

cc @csbell @nikhiljindal

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

Prevent issues from auto-closing with an /lifecycle frozen comment.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or @fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 9, 2018
@yastij
Member

yastij commented Jan 9, 2018

@bowei @MrHohn @nicksardo - is this still open ?

@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Feb 10, 2018
@yastij
Member

yastij commented Feb 11, 2018

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Feb 11, 2018
@bowei
Member Author

bowei commented Feb 12, 2018

thanks -- let's keep this one open

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 13, 2018
@fejta-bot

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten
/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jun 12, 2018
@yastij
Member

yastij commented Jun 12, 2018

/remove-lifecycle

@fejta-bot

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@nicksardo nicksardo reopened this Jul 12, 2018
@nicksardo nicksardo added kind/bug Categorizes issue or PR as related to a bug. and removed lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. labels Jul 16, 2018
@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 14, 2018
@bowei
Member Author

bowei commented Nov 6, 2018

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Nov 6, 2018
@ashi009

ashi009 commented Aug 15, 2019

Any update on this?

In the kube-proxy world, showing the first backend's health status is probably OK, as traffic can hop between nodes when accessed via a node port. Each backend group will report health regardless of whether there is a backend in that zone at all.

However, with NEG backends, it's possible to have zones with no corresponding backends at all. In that case, the health status for those backend groups will be constantly unknown, and the console looks broken even though the ingress works fine:

[screenshot: GCP console showing backend groups with unknown health status]

For this reason, I think this should be fixed.
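A sketch of how the aggregation might skip empty zones (hypothetical types; the real data would come from the NEG health status reported by GCE):

```go
package main

import "fmt"

// negHealth is a hypothetical per-zone summary: the NEG's endpoint count in
// that zone and the health state reported for it.
type negHealth struct {
	zone      string
	endpoints int
	state     string // "HEALTHY", "UNHEALTHY" or "UNKNOWN"
}

// aggregateNEGHealth skips zones whose NEGs have no endpoints, so an empty
// zone's perpetual "UNKNOWN" does not make a working ingress look broken.
func aggregateNEGHealth(negs []negHealth) string {
	result := "HEALTHY"
	seen := false
	for _, n := range negs {
		if n.endpoints == 0 {
			continue // empty NEG: "UNKNOWN" here is expected, not a failure
		}
		seen = true
		switch n.state {
		case "UNHEALTHY":
			return "UNHEALTHY"
		case "UNKNOWN":
			result = "UNKNOWN"
		}
	}
	if !seen {
		return "UNKNOWN"
	}
	return result
}

func main() {
	negs := []negHealth{
		{zone: "us-central1-a", endpoints: 2, state: "HEALTHY"},
		{zone: "us-central1-b", endpoints: 0, state: "UNKNOWN"}, // no backends in this zone
	}
	fmt.Println(aggregateNEGHealth(negs)) // HEALTHY
}
```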

@bowei
Member Author

bowei commented Aug 15, 2019

@freehan -- can we put this in the backlog? It looks like a self-contained item.

@ashi009

ashi009 commented Aug 15, 2019

FTR: the GKE console issue is tracked at https://issuetracker.google.com/issues/130748827.

@swetharepakula
Member

This has been fixed with #936.
