Sometimes port 80 goes to the tcp-router healthcheck #1199
Comments
A first attempt at reproduction: minikube, diego, SA, using go-env for the app. In this attempt, access via http works:
Given the non-deterministic nature, this is not yet a reproduction of the failure. That said, I decided to look at the logs. Pulling them and grepping, I see
The app routes are announced via route-emitter, for the gorouter to see. First thought: is there no proper announcement of the route in the bad case? @jandubois When you say
Is that for different kubecf deployments, or for different app deployments on the same kubecf?
I believe the problem is not gorouter; it's that we expose the TCP router health check service at the Kubernetes level on port 80, so which service gets used when you contact port 80 is non-deterministic (see kubecf/chart/templates/ingress.yaml, lines 204 to 207 at 7960f1e).
I believe you'll only hit this when not using ingress.
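To illustrate the kind of collision being described, here is a hypothetical pair of Service definitions. This is not the actual content of chart/templates/ingress.yaml; the service names, selectors, and the use of externalIPs are assumptions for illustration only. The point is that two Services claiming port 80 on the same externally reachable address leave it up to kube which one wins.

```yaml
# Hypothetical sketch, not the actual kubecf chart contents: two Services
# both claim port 80 on the same external address, so which one receives
# traffic to <node-ip>:80 is effectively a race.
apiVersion: v1
kind: Service
metadata:
  name: router-public                   # assumed name for the gorouter's public service
  namespace: kubecf
spec:
  selector:
    app.kubernetes.io/name: router      # assumed selector
  externalIPs: ["192.168.99.100"]       # e.g. the minikube node IP (assumption)
  ports:
  - name: http
    port: 80
    targetPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: tcp-router-public
  namespace: kubecf
spec:
  selector:
    app.kubernetes.io/name: tcp-router  # assumed selector
  externalIPs: ["192.168.99.100"]       # same address, same port: collision
  ports:
  - name: healthcheck                   # assumed port name
    port: 80
    targetPort: 80
```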
@mook-as Non-deterministic at what level? Just tried in my current setup (minikube, no ingress, go-env app) ... A series of 40 curls to the app all returned the expected output. Next, I re-deployed the app 10 times and checked whether that was enough to see the issue ... It wasn't. It seems whatever happens is locked in as part of the kubecf deployment, i.e. I will now have to re-deploy kubecf several times as part of checking.
@andreas-kupries I believe that once one service has bound, it'll stay bound (so you will consistently see one or the other, until you re-deploy KubeCF).
Hm. A race condition then: which service is up first, or at least, which is seen and bound first by kube.
Definitely a race condition. @mook-as proposed
With the cluster I had up and working (i.e. http access OK), I then did:
So, whichever of the services has the luck to be bound by kube first on startup is where access to port 80 is dispatched. With the above, we currently have a workaround to fix the issue when it happens:
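A minimal sketch of such a workaround, assuming (as described in the issue body below) that the stray entry is the port-80 declaration on the tcp-router-public service; this just removes that port non-interactively instead of via kubectl edit:

```sh
# Workaround sketch: drop the port-80 (healthcheck) entry from the
# tcp-router-public service so that only the gorouter's service is left
# claiming port 80. Assumes jq is installed and the namespace is "kubecf".
kubectl -n kubecf get svc tcp-router-public -o json \
  | jq 'del(.spec.ports[] | select(.port == 80))' \
  | kubectl replace -f -
```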
A proper fix would be to not declare the healthcheck port in the first place. However, IIRC, we need the port declared for some platforms to work correctly, i.e. to not break TCP routing entirely. IIRC it is AWS which needs the healthcheck port so that the AWS load balancer can detect availability of the service and thus open it to the public. IOW, without it AWS would not open/start the load balancer, leaving TCP routing offline. Is there a way for a Helm chart to detect the kind of Kubernetes platform it is deployed to? ping @jandubois @mook-as
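For what it's worth, Helm itself has no reliable way to detect the underlying platform; a common alternative (only a sketch here, and the flag name below is invented, not an existing kubecf value) is to gate the healthcheck port behind a chart value that operators on AWS/EKS enable explicitly:

```yaml
# Sketch of a values-gated port in a Helm template fragment; the value
# .Values.features.tcp_router_healthcheck is hypothetical.
ports:
- name: tcp-router
  port: 8080                 # illustrative port only
  targetPort: 8080
{{- if .Values.features.tcp_router_healthcheck }}
- name: healthcheck          # only declared when explicitly enabled, e.g. on AWS
  port: 80
  targetPort: 80
{{- end }}
```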
I believe that we need a healthcheck when using a load balancer, at least on EKS / AKS / whatever? But not when we're using |
Hm.
The two exposed services are:
Another thing to consider: we have seen this only for minikube deployments, right? ping @jandubois @mook-as
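To compare deployments, the following should list every service in the kubecf namespace that declares port 80, along with any external IPs it is exposed on (assumes jq is available):

```sh
# Which services claim port 80, and on which external IPs?
kubectl -n kubecf get svc -o json \
  | jq -r '.items[]
           | select(any(.spec.ports[]?; .port == 80))
           | "\(.metadata.name)\t\(.spec.externalIPs // [] | join(","))"'
```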
Describe the bug
Sometimes I set up a kubecf on minikube, log in, and push an app. Everything seems to work, but accessing the app over plain http returns a 503:
Accessing the app via https works:
@mook-as provided some feedback:
So I ran
kubectl edit svc -n kubecf tcp-router-public
and removed this section:
Afterwards the app worked normally on port 80 as well.
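For reference, the removed section is the port-80 entry in that service's spec.ports list; it will look roughly like the snippet below (the port name and targetPort are assumptions, so check the actual service definition before deleting):

```yaml
# Rough shape of the entry removed from tcp-router-public's spec.ports;
# the name and targetPort here are assumptions.
- name: healthcheck
  port: 80
  protocol: TCP
  targetPort: 80
```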
I've experienced this issue 4 or 5 times over the last couple of days, so maybe 30-40% of the time.