
Create routes on OpenShift and ingresses on Kubernetes #50

Merged (4 commits) on Apr 17, 2020

Conversation

@amisevsk (Collaborator) commented on Apr 9, 2020:

What does this PR do?

Change WorkspaceRoutings controller to always create routes on OpenShift and ingresses on Kubernetes.

One of the goals in this PR was to eliminate the need for ingress.global.domain when running on OpenShift by not specifying hostnames when creating routes. However, OpenShift's automatically-generated route hostnames are <route-name>-<namespace>.<routing-suffix>, which means they're frequently invalid (e.g. when deployed in the namespace che-workspace-controller, the generated hostname is too long). As a result, I've renamed ingress.global.domain to cluster.routing.suffix and added setting the value via the makefile.
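For context on why those generated hosts go invalid: the first DNS label of a generated host is <route-name>-<namespace>, and DNS labels are limited to 63 characters, so a long namespace overflows the limit for typical route names. A minimal illustration (hypothetical helper, not code from this PR):

	// The first DNS label of an auto-generated OpenShift route host is
	// "<route-name>-<namespace>"; DNS labels max out at 63 characters,
	// so a long namespace makes the generated host invalid.
	func generatedHostLabelIsValid(routeName, namespace string) bool {
		return len(routeName+"-"+namespace) <= 63
	}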

Note: the controller does contain logic to automatically detect the routing suffix on OpenShift; however, I have seen it fail or fall out of sync in the past, so the separate option is still useful.

Is it tested? How?

Tested on crc with the usual matrix of settings. I have not yet tested on minikube, but will before merging.

}
}
}
return "", fmt.Errorf("could not get URL for endpoint %s", endpoint.Name)
@sleshchenko (Member) commented on the code above:

Is it really an error? While the workspace is starting, I get quite a lot of spam:
(screenshot: Screenshot_20200414_152116)

Could we return only those endpoints which are already exposed, and schedule a requeue? The others (without an ingress/route) could be skipped, or returned with status Exposing (not sure whether we have endpoint statuses now).
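A minimal sketch of that idea (hypothetical names, not code from this PR): collect URLs only for endpoints that already have an exposed route/ingress, and report whether anything is still pending so the caller can requeue instead of erroring:

	// Gather URLs for already-exposed endpoints; endpoints without a URL
	// yet are skipped rather than treated as failures.
	func collectExposedURLs(endpoints []Endpoint, urlFor func(Endpoint) (string, error)) (map[string]string, bool) {
		urls := map[string]string{}
		allExposed := true
		for _, endpoint := range endpoints {
			url, err := urlFor(endpoint)
			if err != nil || url == "" {
				allExposed = false // still being exposed; retry later
				continue
			}
			urls[endpoint.Name] = url
		}
		return urls, allExposed
	}

When allExposed is false, the caller would return reconcile.Result{RequeueAfter: ...} rather than an error.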

@amisevsk (Collaborator, author) replied:

Makes sense -- where are you running the operator? It may be a route/ingress exposure issue (i.e. it takes longer than expected). I don't see this on crc.

I'll add this logic to the PR.

@amisevsk (Collaborator, author) replied:

Looking into this a bit more, I'm not sure what's causing the issue; we only return that error when we cannot find any ingress with the appropriate label. If the URL field is empty, we just return an empty string.

@sleshchenko (Member) replied:

I tried it on local crc.

@sleshchenko (Member) replied:

Looking into this a bit more, I'm not sure what's causing the issue; we only return that error when we cannot find any ingress with the appropriate label

Exactly. Now I see the same error, but only once:
(screenshot: route-not-found)

So, instead of failing the reconcile loop, it would be better to mark the endpoint as problematic (without a host) and retry later. I think it's worth a dedicated PR, because it requires introducing some phases/conditions for endpoints.

@amisevsk (Collaborator, author) replied:

@sleshchenko I think the error you describe initially (could not get URL...) is a failure; we should never reach that point in the reconcile and not have routes/ingresses available. The previous step, prior to attempting this matching, is to make sure routes/ingresses are in sync between the cluster and spec. Not being able to match the two means that we have an endpoint (which defines the spec) that does not have a route/ingress associated with it. Until we have a concrete case where this can occur, it should mark the workspace as failed.

The latter failure, already exists, is a familiar issue -- I haven't gotten around to implementing a fix, but we shouldn't just log that error and continue. It only means the state of the cluster has changed since we started our reconcile loop. The correct solution there would be to requeue when errors.IsConflict(err) or errors.IsAlreadyExists(err).
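A sketch of that handling (the error checks are the real k8s.io/apimachinery/pkg/api/errors helpers; the wrapper function is hypothetical):

	import (
		k8sErrors "k8s.io/apimachinery/pkg/api/errors"
		"sigs.k8s.io/controller-runtime/pkg/reconcile"
	)

	// A conflict or already-exists error just means the cluster changed under
	// us mid-reconcile, so requeue quietly instead of reporting a failure.
	func resultForSyncError(err error) (reconcile.Result, error) {
		if k8sErrors.IsConflict(err) || k8sErrors.IsAlreadyExists(err) {
			return reconcile.Result{Requeue: true}, nil
		}
		return reconcile.Result{}, err
	}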

@amisevsk (Collaborator, author) replied:

Exactly. Now I see the same error, but only once

This is a different error message, to be clear.

@sleshchenko (Member) replied:

This is a different error message, to be clear.

My fault. But it seems there were two attempts to create the ingress, which could indicate an issue in the reconcile loop, because I did not create such a route by hand.

Will test more precisely. BTW, it's not a blocker.

@sleshchenko (Member) replied:

Yeah, it's the already exists issue, and it seems routes are not created immediately:
(screenshot: Screenshot_20200416_151950)

A RequeueAfter of 200ms solves the issue for me; 100ms is not enough...

	routesInSync, clusterRoutes, err := r.syncRoutes(instance, routes)
	if err != nil || !routesInSync {
		reqLogger.Info("Routes not in sync")
		return reconcile.Result{RequeueAfter: 100 * time.Millisecond}, err
	}

But I'm not sure it's really a good solution, since the right delay depends on the infrastructure; I'm OK with leaving this error propagated for the time being.

@amisevsk (Collaborator, author) replied:

The main thing stopping us from implementing this is that it will require a fair bit of boilerplate error checking; it might also be improved by eventually filtering reconciles (i.e. determining whether changes are necessary before reconciling).
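For the reconcile-filtering part, controller-runtime predicates are the usual mechanism; a rough sketch (field names assume a recent controller-runtime; older versions expose MetaOld/MetaNew on the event instead):

	import (
		"sigs.k8s.io/controller-runtime/pkg/event"
		"sigs.k8s.io/controller-runtime/pkg/predicate"
	)

	// Skip update events that don't bump the object's generation (i.e.
	// status-only writes), so we only reconcile on actual spec changes.
	var specChanged = predicate.Funcs{
		UpdateFunc: func(e event.UpdateEvent) bool {
			return e.ObjectOld.GetGeneration() != e.ObjectNew.GetGeneration()
		},
	}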

@amisevsk (Collaborator, author) commented:

Added:

  • Use annotations instead of labels for storing the endpoint name, and store the original endpoint name rather than the DNS-sanitized version
  • Fix the makefile not resetting the registry ingress hostname when deploying to k8s
  • Don't try to sync routes when running on k8s, since we cannot list Routes there
  • Add some validation to the controller config: fail startup if we're not on OpenShift but openshift-oauth is the default routing class (see the sketch after this list)
  • Set the workspace routing phase to "Preparing" when we can't get route/ingress URLs
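A rough sketch of that validation (type and method names are hypothetical):

	// Fail startup when the default routing class requires OpenShift but the
	// controller is running on plain Kubernetes.
	func (c *ControllerConfig) Validate(isOpenShift bool) error {
		if !isOpenShift && c.GetDefaultRoutingClass() == "openshift-oauth" {
			return fmt.Errorf("default routing class 'openshift-oauth' is only supported on OpenShift")
		}
		return nil
	}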

@sleshchenko (Member) left a review:

👍 Tested on crc and it works fine, except for one error logged during workspace start.

The comments I left can be addressed/discussed in dedicated issues.

Makefile: review thread resolved
@@ -252,7 +260,7 @@ func fillOpenShiftRouteSuffixIfNecessary(nonCachedClient client.Client, configMa
host := testRoute.Spec.Host
if host != "" {
prefixToRemove := "che-workspace-controller-test-route-" + configMap.Namespace + "."
@sleshchenko (Member) commented on the diff above:

It may fail if the default route host generation changes... So maybe we should detect the hostname only when it's missing from the configmap?

Also, che-workspace-controller-test-route-che-workspace-controller is close to the 63-character DNS label limit :) Consider making the test route name shorter.

@amisevsk (Collaborator, author) commented on Apr 15, 2020:

It's legacy functionality I didn't even know existed until I worked on this PR; I'm inclined to remove it entirely if there are doubts about its usefulness. So far, the concerns are:

  • It depends on the deployed namespace being che-workspace-controller, and will fail if we change that
  • It will fail if how default hostnames are computed changes
  • It only works on OpenShift

This points towards "we shouldn't support this at the moment". Executing it only when the entry in the configmap is blank just moves all those failure cases down the line, except it makes it look like we're changing config requirements rather than the cluster changing how it resolves hostnames.

Regarding

Also, che-workspace-controller-test-route-che-workspace-controller is close to the 63-character DNS label limit :) Consider making the test route name shorter.

This is actually not a problem; OpenShift will generously generate an invalid hostname and only tell you it's a problem later :) :

$ oc get route che-plugin-registry-abcdefghijklmnopqrstuvwxyz -o yaml | grep host:
  host: che-plugin-registry-abcdefghijklmnopqrstuvwxyz-che-workspace-controller.apps-crc.testing
      message: 'host name validation errors: spec.host: Invalid value: "che-plugin-registry-abcdefghijklmnopqrstuvwxyz-che-workspace-controller.apps-crc.testing":
    host: che-plugin-registry-abcdefghijklmnopqrstuvwxyz-che-workspace-controller.apps-crc.testing

However, it might make sense to try and use .status.ingress[].routerCanonicalHostname:

$ oc get route che-plugin-registry-abcdefghijklmnopqrstuvwxyz -o yaml | yq '.status.ingress[].routerCanonicalHostname'
"apps-crc.testing"

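A sketch of reading that field (the route/v1 types are real; the helper is hypothetical). Note that the status is filled in asynchronously by the router, so callers would need to retry while it is still empty:

	import routev1 "github.com/openshift/api/route/v1"

	// Pull the router's canonical hostname from the route's status, if the
	// router has populated it yet.
	func routerCanonicalHostname(route *routev1.Route) (string, bool) {
		for _, ingress := range route.Status.Ingress {
			if ingress.RouterCanonicalHostname != "" {
				return ingress.RouterCanonicalHostname, true
			}
		}
		return "", false
	}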
@sleshchenko (Member) replied:

This is actually not a problem; OpenShift will generously generate an invalid hostname and only tell you it's a problem later :)

:-D It's true, and it may be strange, but it helps us in this case.

@amisevsk (Collaborator, author) replied:

I looked into using the status field, but it's not populated immediately. For now I'm leaving the functionality in place, but we should test its reliability in the future and potentially remove or improve it.

@amisevsk force-pushed the use-routes-on-openshift branch 2 times, most recently from 1f21471 to a3b95fe, on April 15, 2020 at 19:01
Change solvers behavior in workspaceroutings controller to always create
routes when running on OpenShift and ingresses otherwise, regardless of
routingClass.
Add Validate function to controller config to be used during set-up.
Currently only fails if default routing class is openshift-oauth on a
kubernetes cluster

Signed-off-by: Angel Misevski <amisevsk@redhat.com>
- Use preparing state when not all ingresses/routes have URL set on
cluster
- Improve error message when we can't get a URL for an endpoint

Signed-off-by: Angel Misevski <amisevsk@redhat.com>
Signed-off-by: Angel Misevski <amisevsk@redhat.com>