
Eclipse Che - volume mount error while launching more than 5 workspaces at a time #19355

Closed
5 of 7 tasks
andr-azeez opened this issue Mar 22, 2021 · 7 comments
Labels
area/che-server kind/bug Outline of a bug - must adhere to the bug report template. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. severity/P2 Has a minor but important impact to the usage or development of the system.

Comments

andr-azeez commented Mar 22, 2021

Describe the bug

Logged in 10 different users at the same time and launched 10 workspaces (one per user) simultaneously. 3-5 users are able to launch their workspaces successfully; the remaining users get a timeout error during volume mount, and some users' workspaces keep loading with nothing initialized in the log window.

Che version

  • latest

Advanced configuration:

(screenshot attached)

Runtime

  • kubernetes (include output of kubectl version)
    • kubectl version - Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.4", GitCommit:"e87da0bd6e03ec3fea7933c4b5263d151aafd07c", GitTreeState:"clean", BuildDate:"2021-02-18T16:12:00Z", GoVersion:"go1.15.8", Compiler:"gc", Platform:"linux/amd64"}
      Server Version: version.Info{Major:"1", Minor:"18+", GitVersion:"v1.18.12-gke.1210", GitCommit:"199a41188dc0ca5d6d95b1cc7e8ba96e05f9dd0a", GitTreeState:"clean", BuildDate:"2021-02-05T18:03:16Z", GoVersion:"go1.13.15b4", Compiler:"gc", Platform:"linux/amd64"}

Installation method

  • chectl
    • chectl installation command - sudo chectl server:deploy --installer=helm --platform=k8s --domain=cl-ide.domain.com --multiuser --installer=operator
    • chectl version - chectl/7.27.1 linux-x64 node-v12.21.0

Environment

  • my computer
    • Linux
  • Cloud
    • GCE
      • Google Kubernetes Engine (GKE) - with 2 nodes running
      • Node Configuration - 2-core, 8 GB memory machine with a 30 GB hard disk
      • Auto scaling is enabled

Screenshots

Screenshot from 2021-03-22 16-57-43

Steps to reproduce

  • Log in 10 users at a time.
  • Launch 10 different workspaces at a time.

Expected behavior

It should be able to launch more than 50 workspaces at a time.

@andr-azeez andr-azeez added the kind/bug Outline of a bug - must adhere to the bug report template. label Mar 22, 2021
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Mar 22, 2021
@ericwill ericwill added severity/P2 Has a minor but important impact to the usage or development of the system. area/che-server and removed status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. labels Mar 22, 2021
@vadirajspringpeople

I am facing this issue as well; requesting a fix ASAP.

@andr-azeez (Author)

The same test was repeated with upgraded machines:
Machine size: 8-core vCPU / 16 GB memory
Nodes: 3
Auto Scaling: Enabled

8 users were able to log in successfully; 2 users got the errors shown in the screenshots.

(two screenshots attached)

Is there a concurrency/queuing issue, or does something have to be updated in the advanced configuration?

sleshchenko (Member) commented Mar 26, 2021

I believe there is not much Che can do about it.
I assume K8s behaves badly when one K8s namespace has many configmaps that then have to be mounted into pods.
Che could merge all the data into one configmap, but then you would hit the same issue with more running workspaces.
The only solution is the per-user namespace strategy: workspaceNamespaceDefault: <username>-che.
Note also that running all workspaces in one namespace is going to be fully deprecated soon (#19365)
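For readers following along, the per-user namespace strategy mentioned above might be expressed like the following in a Helm values file (or the equivalent server section of the CheCluster CR). The exact placement of the property depends on the installer and Che version, so treat this as a sketch rather than a verified configuration:

```yaml
# Sketch: per-user workspace namespaces (assumption: property sits under
# the "server" section for this installer/version).
server:
  # <username> is a placeholder Che substitutes with each user's name,
  # so every user's workspaces land in their own namespace.
  workspaceNamespaceDefault: <username>-che
```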

andr-azeez (Author) commented Mar 26, 2021

Tried with the following config:

server:
  workspaceNamespaceDefault: <username>-che
storage:
  pvcClaimSize: 1Gi
  pvcStrategy: common

I am facing the same issue again while launching 10 workspaces at a time: ~6 workspaces launch successfully, and the rest fail.

PVC attachment takes a long time while launching a workspace. Is there any way to pre-attach the PVC for all the workspaces?

sleshchenko (Member) commented Mar 26, 2021

Well, the issue you faced is not about the PVC but about the configmap cache, which is K8s internals.
It's a pity that the namespace strategy did not help.

The last thing you can try to work around the issue is to remove FailedMount from the unrecoverable events; in the CheCluster CR it should look like the following:

spec:
  server:
    customCheProperties:
      CHE_INFRA_KUBERNETES_WORKSPACE__UNRECOVERABLE__EVENTS: FailedScheduling,MountVolume.SetUpfailed,Failed to pull image,FailedCreate,ReplicaSetCreateError

Then the workspace won't fail immediately after the warning happens, and maybe it will get further.

You may find a solution for your cluster if you search for "kubernetes failed to sync configmap cache: timed out waiting for the condition"; I see some issues about it already created on GitHub.

andr-azeez (Author) commented Mar 31, 2021

Changed the configuration as follows:

  server:
    customCheProperties:
      CHE_INFRA_KUBERNETES_PVC_JOBS_MEMORYLIMIT: 756Mi
      CHE_INFRA_KUBERNETES_WORKSPACE__UNRECOVERABLE__EVENTS: FailedScheduling,MountVolume.SetUpfailed,Failed to pull image,FailedCreate,ReplicaSetCreateError

Then tested with 20 users, launching 20 workspaces of random stacks.
Most of the workspaces hit the "kubernetes failed to sync configmap cache: timed out" warning but were still able to launch successfully.
Roughly 15-17 workspaces launched successfully, and about 4-5 workspaces got a workspace launch timeout error.

The next step is to test launching 50 workspaces simultaneously; I will post the results here once the test is done.
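One more knob that might help with the remaining launch-timeout failures (my own suggestion, not something proposed in the thread): Che's workspace start timeout is itself configurable, so raising it can give slow PVC attachment and configmap syncing time to complete instead of failing the launch. The property name below follows the same CHE_INFRA_* double-underscore convention as the properties already used above; verify it against your Che version before applying:

```yaml
# Sketch: raise the workspace start timeout (minutes).
# Assumption: CHE_INFRA_KUBERNETES_WORKSPACE__START__TIMEOUT__MIN maps to
# the che.infra.kubernetes.workspace_start_timeout_min server property.
server:
  customCheProperties:
    CHE_INFRA_KUBERNETES_WORKSPACE__START__TIMEOUT__MIN: "15"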

che-bot (Contributor) commented Oct 12, 2021

Issues go stale after 180 days of inactivity. lifecycle/stale issues rot after an additional 7 days of inactivity and eventually close.

Mark the issue as fresh with /remove-lifecycle stale in a new comment.

If this issue is safe to close now please do so.

Moderators: Add lifecycle/frozen label to avoid stale mode.

@che-bot che-bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Oct 12, 2021
@che-bot che-bot closed this as completed Nov 4, 2021