
Che operator timing out when creating CheCluster in OpenShift #20487

Closed
jawnsy opened this issue Sep 18, 2021 · 6 comments
Labels
area/che-operator: Issues and PRs related to Eclipse Che Kubernetes Operator
kind/bug: Outline of a bug - must adhere to the bug report template.
lifecycle/stale: Denotes an issue or PR has remained open with no activity and has become stale.
severity/P2: Has a minor but important impact to the usage or development of the system.

Comments


jawnsy commented Sep 18, 2021

Describe the bug

After creating a CheCluster, the controller seems to hang indefinitely. The che-operator container reports the following in its logs:

time="2021-09-18T12:15:10Z" level=info msg="Running exec for 'create Keycloak DB, user, privileges' in the pod 'postgres-7f797d9448-vrx8m'"
time="2021-09-18T12:15:10Z" level=error msg="Error running exec: Internal error occurred: failed calling webhook \"validate-exec.devworkspace-controller.svc\": Post \"https://devworkspace-webhookserver.devworkspace-controller.svc:443/validate?timeout=10s\": service \"devworkspace-webhookserver\" not found, command: [/bin/bash -c OUT=$(psql postgres -tAc \"SELECT 1 FROM pg_roles WHERE rolname='keycloak'\"); if [ $OUT -eq 1 ]; then echo \"DB exists\"; exit 0; fi && psql -c \"CREATE USER keycloak WITH PASSWORD 'oEuaiDBmcqxM'\" && psql -c \"CREATE DATABASE keycloak\" && psql -c \"GRANT ALL PRIVILEGES ON DATABASE keycloak TO keycloak\" && psql -c \"ALTER USER ${POSTGRESQL_USER} WITH SUPERUSER\"]"
time="2021-09-18T12:15:10Z" level=error msg="Internal error occurred: failed calling webhook \"validate-exec.devworkspace-controller.svc\": Post \"https://devworkspace-webhookserver.devworkspace-controller.svc:443/validate?timeout=10s\": service \"devworkspace-webhookserver\" not found"
{"level":"error","ts":1631967310.996203,"logger":"controller","msg":"Reconciler error","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster","name":"eclipse-che","namespace":"eclipse-che","error":"Internal error occurred: failed calling webhook \"validate-exec.devworkspace-controller.svc\": Post \"https://devworkspace-webhookserver.devworkspace-controller.svc:443/validate?timeout=10s\": service \"devworkspace-webhookserver\" not found","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/che-operator/vendor/github.com/go-logr/zapr/zapr.go:132\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/che-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:246\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/che-operator/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:218\nsigs.k8s.io/controller-runtime/pkg/internal/controller...
$ oc version
Client Version: 4.7.0-0.okd-2021-06-19-191547
Server Version: 4.7.0-0.okd-2021-08-22-163618
Kubernetes Version: v1.20.0-1093+4593a24e8fd58d-dirty
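
The failing call references a validating webhook backed by a Service in the devworkspace-controller namespace. As a diagnostic sketch (assuming cluster-admin access and the default devworkspace-controller namespace), one way to check whether that Service and the webhook configurations that reference it exist:

$ oc get service devworkspace-webhookserver -n devworkspace-controller
$ oc get validatingwebhookconfigurations | grep devworkspace
$ oc get mutatingwebhookconfigurations | grep devworkspace

If the webhook configurations are present but the Service is missing, pod exec calls that match the webhook rules will likely fail with the "service not found" error shown in the log above.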

Che version

7.36@latest

Steps to reproduce

  1. Install the Che Operator from OperatorHub (version 7.36.1)
  2. Create a CheCluster with default settings

Expected behavior

Following the installation instructions, creating a CheCluster with default settings should result in a working installation.

Runtime

OpenShift

Screenshots

No response

Installation method

OperatorHub

Environment

Linux

Eclipse Che Logs

chectl-logs.tar.gz

Additional context

No response

@jawnsy jawnsy added the kind/bug Outline of a bug - must adhere to the bug report template. label Sep 18, 2021
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Sep 18, 2021

tolusha commented Sep 20, 2021

Possible duplicate: #19243
@jawnsy Did you enable devWorkspace in the CheCluster?

/cc @sleshchenko


jawnsy commented Sep 20, 2021

@tolusha Thanks for the quick reply! I was following the instructions in the documentation, which mention using the default settings, so I didn't enable DevWorkspace. I'm pretty new to Che, so I apologize if this was documented somewhere else.

I did try again with the DevWorkspace operator and image puller enabled, but the installation still never completes.

Here are my CheCluster settings:

apiVersion: org.eclipse.che/v1
kind: CheCluster
metadata:
  name: eclipse-che
  namespace: eclipse-che
spec:
  auth:
    identityProviderURL: ''
    identityProviderRealm: ''
    oAuthSecret: ''
    identityProviderPassword: ''
    oAuthClientName: ''
    initialOpenShiftOAuthUser: true
    identityProviderClientId: ''
    identityProviderAdminUserName: ''
    externalIdentityProvider: false
    openShiftoAuth: true
  database:
    chePostgresDb: ''
    chePostgresHostName: ''
    chePostgresPassword: ''
    chePostgresPort: ''
    chePostgresUser: ''
    externalDb: false
  devWorkspace:
    enable: true
  imagePuller:
    enable: true
  metrics:
    enable: true
  server:
    proxyURL: ''
    cheClusterRoles: ''
    proxyPassword: ''
    nonProxyHosts: ''
    proxyPort: ''
    tlsSupport: true
    allowUserDefinedWorkspaceNamespaces: false
    serverTrustStoreConfigMapName: ''
    proxyUser: ''
    cheWorkspaceClusterRole: ''
    workspaceNamespaceDefault: <username>-che
    serverExposureStrategy: ''
    gitSelfSignedCert: false
    cheFlavor: ''
  storage:
    postgresPVCStorageClassName: ''
    preCreateSubPaths: true
    pvcClaimSize: 10Gi
    pvcStrategy: common
    workspacePVCStorageClassName: ''

After enabling devWorkspace, the che-operator pod instead goes into CrashLoopBackOff:

      state:
        waiting:
          reason: CrashLoopBackOff
          message: >-
            back-off 5m0s restarting failed container=che-operator
            pod=che-operator-77f76d4cdb-sv9lt_eclipse-che(b7dcbe8a-12bf-428a-81ea-f7d331c3d649)

The logs for that pod don't show much:

{"level":"info","ts":1632141423.2957177,"msg":"Binary info ","Go version":"go1.15.14"}
{"level":"info","ts":1632141423.2957582,"msg":"Binary info ","OS":"linux","Arch":"amd64"}
{"level":"info","ts":1632141423.295763,"msg":"Address ","Metrics":":60000"}
{"level":"info","ts":1632141423.2957666,"msg":"Address ","Probe":":6789"}
{"level":"info","ts":1632141423.2957697,"msg":"Operator is running on ","Infrastructure":"OpenShift v4.x"}
I0920 12:37:04.348632       1 request.go:655] Throttling request took 1.046983251s, request: GET:https://172.30.0.1:443/apis/scheduling.k8s.io/v1?timeout=32s
{"level":"info","ts":1632141427.4579651,"logger":"controller-runtime.metrics","msg":"metrics server is starting to listen","addr":":60000"}
time="2021-09-20T12:37:11Z" level=info msg="Use 'terminationGracePeriodSeconds' 20 sec. from operator deployment."
{"level":"info","ts":1632141431.6351597,"logger":"setup","msg":"starting manager"}
time="2021-09-20T12:37:11Z" level=info msg="Set up process signal handler"
I0920 12:37:11.635625       1 leaderelection.go:243] attempting to acquire leader lease eclipse-che/e79b08a4.org.eclipse.che...
{"level":"info","ts":1632141431.6356351,"logger":"controller-runtime.manager","msg":"starting metrics server","path":"/metrics"}
I0920 12:37:27.969148       1 leaderelection.go:253] successfully acquired lease eclipse-che/e79b08a4.org.eclipse.che
{"level":"info","ts":1632141447.9693384,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster","source":"kind source: /, Kind="}
{"level":"info","ts":1632141447.9693425,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster","source":"kind source: /, Kind="}
{"level":"info","ts":1632141447.9694028,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheClusterBackup","controller":"checlusterbackup-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1632141447.9693723,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheClusterRestore","controller":"checlusterrestore-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1632141448.0699866,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster","source":"kind source: /, Kind="}
{"level":"info","ts":1632141448.0701144,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster","source":"kind source: /, Kind="}
{"level":"info","ts":1632141448.0701604,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheClusterRestore","controller":"checlusterrestore-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1632141448.0702782,"logger":"controller","msg":"Starting Controller","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheClusterRestore","controller":"checlusterrestore-controller"}
{"level":"info","ts":1632141448.0702825,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster","source":"kind source: /, Kind="}
{"level":"info","ts":1632141448.0704453,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheClusterBackup","controller":"checlusterbackup-controller","source":"kind source: /, Kind="}
{"level":"info","ts":1632141448.070481,"logger":"controller","msg":"Starting Controller","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheClusterBackup","controller":"checlusterbackup-controller"}
{"level":"info","ts":1632141448.1704342,"logger":"controller","msg":"Starting workers","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheClusterRestore","controller":"checlusterrestore-controller","worker count":1}
{"level":"info","ts":1632141448.1705747,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster","source":"kind source: /, Kind="}
{"level":"info","ts":1632141448.1708722,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster","source":"kind source: /, Kind="}
{"level":"info","ts":1632141448.271135,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster","source":"kind source: /, Kind="}
{"level":"info","ts":1632141450.8724709,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster","source":"kind source: /, Kind="}
{"level":"info","ts":1632141451.0732284,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster","source":"kind source: /, Kind="}
{"level":"info","ts":1632141451.5738478,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster","source":"kind source: /, Kind="}
{"level":"info","ts":1632141451.676314,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster","source":"kind source: /, Kind="}
{"level":"info","ts":1632141453.246946,"logger":"controller","msg":"Starting EventSource","reconcilerGroup":"org.eclipse.che","reconcilerKind":"CheCluster","controller":"checluster","source":"kind source: /, Kind="}

@svor svor added status/info-needed More information is needed before the issue can move into the “analyzing” state for engineering. area/che-operator Issues and PRs related to Eclipse Che Kubernetes Operator severity/P2 Has a minor but important impact to the usage or development of the system. and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels Sep 20, 2021

tolusha commented Sep 20, 2021

Which channel did you use to install eclipse-che from? I think it was the tech-preview channel.

[screenshot: channel]
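
For reference, the installed channel can be read directly from the Subscription (a sketch assuming the Subscription is named eclipse-che in the eclipse-che namespace, as with the OperatorHub install):

$ oc get subscriptions.operators.coreos.com eclipse-che -n eclipse-che -o jsonpath='{.spec.channel}{"\n"}'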


jawnsy commented Sep 20, 2021

I believe I installed from the stable channel; here's my Subscription:

apiVersion: operators.coreos.com/v1alpha1
kind: Subscription
metadata:
  name: eclipse-che
  namespace: eclipse-che
  labels:
    operators.coreos.com/eclipse-che.eclipse-che: ''
spec:
  channel: stable
  installPlanApproval: Automatic
  name: eclipse-che
  source: community-operators
  sourceNamespace: openshift-marketplace
  startingCSV: eclipse-che.v7.36.1
status:
  installplan:
    apiVersion: operators.coreos.com/v1alpha1
    kind: InstallPlan
    name: install-9j75x
    uuid: 458b5123-90ff-4a87-8709-be12cea4d3dd
  lastUpdated: '2021-09-20T12:04:19Z'
  installedCSV: eclipse-che.v7.36.1
  currentCSV: eclipse-che.v7.36.1
  installPlanRef:
    apiVersion: operators.coreos.com/v1alpha1
    kind: InstallPlan
    name: install-9j75x
    namespace: eclipse-che
    resourceVersion: '56385216'
    uid: 458b5123-90ff-4a87-8709-be12cea4d3dd
  state: AtLatestKnown
  catalogHealth:
    - catalogSourceRef:
        apiVersion: operators.coreos.com/v1alpha1
        kind: CatalogSource
        name: community-operators
        namespace: openshift-marketplace
        resourceVersion: '56378334'
        uid: 52a13556-2e10-41fe-8aac-cba654585fbf
      healthy: true
      lastUpdated: '2021-09-20T12:04:05Z'
  conditions:
    - lastTransitionTime: '2021-09-20T12:04:05Z'
      message: all available catalogsources are healthy
      reason: AllCatalogSourcesHealthy
      status: 'False'
      type: CatalogSourcesUnhealthy
  installPlanGeneration: 2

@tolusha tolusha removed the status/info-needed More information is needed before the issue can move into the “analyzing” state for engineering. label Sep 21, 2021
@tolusha
Copy link
Contributor

tolusha commented Sep 21, 2021

I wasn't able to reproduce the issue.
It is probably a matter of the sequence of actions.
So, I recommend the following (a sketch of both steps is shown below):

  1. Remove DevWorkspace using the make uninstall command from the DevWorkspace Operator repository
  2. Redeploy Eclipse Che with spec.devWorkspace.enable: false (if you use the stable channel)
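
A minimal sketch of those two steps (assuming the DevWorkspace Operator was deployed from a clone of its repository, and that the CheCluster is named eclipse-che in the eclipse-che namespace, as in the YAML above):

# 1. Uninstall DevWorkspace from a clone of the DevWorkspace Operator repository
$ git clone https://github.com/devfile/devworkspace-operator.git
$ cd devworkspace-operator
$ make uninstall

# 2. Disable DevWorkspace on the existing CheCluster so the operator redeploys without it
$ oc patch checluster eclipse-che -n eclipse-che --type merge -p '{"spec":{"devWorkspace":{"enable":false}}}'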


che-bot commented Mar 20, 2022

Issues go stale after 180 days of inactivity. lifecycle/stale issues rot after an additional 7 days of inactivity and eventually close.

Mark the issue as fresh with /remove-lifecycle stale in a new comment.

If this issue is safe to close now please do so.

Moderators: Add lifecycle/frozen label to avoid stale mode.

@che-bot che-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 20, 2022
@che-bot che-bot closed this as completed Mar 27, 2022