Identify the reason of slow workspace startup #14600

Closed
skabashnyuk opened this issue Sep 19, 2019 · 1 comment
Labels
kind/task Internal things, technical debt, and to-do tasks to be performed. severity/P1 Has a major impact to usage or development of the system.
Milestone

Comments

@skabashnyuk
Contributor

Is your task related to a problem? Please describe.

When many users start workspaces at the same time, for example right after an announcement, we can observe that some workspaces start very slowly.
[Image: Screenshot at 15 51 21]

Describe the solution you'd like

Identify the bottlenecks, then propose a solution or create an issue to remove them.

Describe alternatives you've considered

n/a

Additional context

n/a

@skabashnyuk skabashnyuk added kind/task Internal things, technical debt, and to-do tasks to be performed. team/platform labels Sep 19, 2019
@skabashnyuk skabashnyuk added this to the 7.3.0 milestone Sep 19, 2019
@skabashnyuk skabashnyuk added the severity/P1 Has a major impact to usage or development of the system. label Sep 19, 2019
@skabashnyuk
Contributor Author

skabashnyuk commented Oct 14, 2019

WorkspaceSharedPool

One of the factors that affects UX during workspace start is the configuration of WorkspaceSharedPool. By default, its size is derived from the value at https://github.com/eclipse/che/blob/eb66fee86ffeaa87ce27465edda32845374102c2/wsmaster/che-core-api-workspace/src/main/java/org/eclipse/che/api/workspace/server/WorkspaceSharedPool.java#L67 as visible to the Java process.
[Image: Screenshot at 13 00 50]
[Image: Screenshot at 13 01 20]
These diagrams show that the pool is configured with two threads, which is not enough to handle a massive load. Since a workspace start took about 10-15 sec in Che 6 and takes 40-50 sec or more in Che 7, the configuration of this pool becomes even more critical.
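To make the effect of the pool size concrete, here is a back-of-the-envelope sketch. It assumes that each workspace start occupies one pool thread for its whole duration, so starts are processed in waves of `poolSize`. The class and method names, the 20 concurrent starts, and the per-start durations (12 s and 45 s, picked from the ranges above) are illustrative assumptions, not taken from the Che code base.

```java
public class PoolSizingSketch {

  /**
   * Rough worst-case wait in seconds for the last workspace in a burst.
   * Assumption: every start holds one pool thread until it finishes,
   * so the burst is processed in ceil(concurrentStarts / poolSize) waves.
   */
  static long worstCaseSeconds(int concurrentStarts, int poolSize, long secondsPerStart) {
    long waves = (concurrentStarts + poolSize - 1) / poolSize; // ceiling division
    return waves * secondsPerStart;
  }

  public static void main(String[] args) {
    // Number of processors the JVM reports; inside a container this can be
    // much lower than the number of cores on the host.
    int cores = Runtime.getRuntime().availableProcessors();
    System.out.println("Processors visible to this JVM: " + cores);

    // Hypothetical burst of 20 users with a pool of two threads.
    System.out.println("Che 6-like start (12 s): " + worstCaseSeconds(20, 2, 12) + " s");
    System.out.println("Che 7-like start (45 s): " + worstCaseSeconds(20, 2, 45) + " s");

    // The same burst with a pool of 20 threads finishes in a single wave.
    System.out.println("Che 7-like start, 20 threads: " + worstCaseSeconds(20, 20, 45) + " s");
  }
}
```

Under these assumptions the same two-thread pool that was tolerable with short starts makes the last user in the burst wait several minutes once each start takes longer.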

WaitMachinesStart

This metric should not be related to WorkspaceSharedPool, since the start is already in progress. It means that doStartMachine, aka the start of Pods and PodTemplateSpec, has already been initiated in https://github.com/eclipse/che/blob/eb66fee86ffeaa87ce27465edda32845374102c2/infrastructures/kubernetes/src/main/java/org/eclipse/che/workspace/infrastructure/kubernetes/KubernetesInternalRuntime.java#L661 and we are just waiting for them to start. Due to the lack of traces/logs/metrics, there is nothing more I can say. This may be a natural process rather than a bug or regression, for example if the user adds some customization that takes a long time or is impossible to deploy.

doStartMachine

The metric by itself doesn't contain any suspicious parts. Usually, it completes in less than a second.
https://github.com/eclipse/che/blob/eb66fee86ffeaa87ce27465edda32845374102c2/infrastructures/kubernetes/src/main/java/org/eclipse/che/workspace/infrastructure/kubernetes/KubernetesInternalRuntime.java#L661-L685 I have nothing that can explain the 60-sec spikes in the execution time of this operation. A potential suspect is the OkHttpClient from KubernetesClientFactory: https://github.com/eclipse/che/blob/eb66fee86ffeaa87ce27465edda32845374102c2/infrastructures/kubernetes/src/main/java/org/eclipse/che/workspace/infrastructure/kubernetes/KubernetesClientFactory.java#L65-L74
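Since the missing traces/logs/metrics are what blocks further analysis, one option is to time the individual calls made inside the operation and flag those that exceed a threshold. The following is only a minimal sketch using the JDK; `SlowCallSketch`, `timed` and `isSlow` are hypothetical names, and the supplier stands in for a real Kubernetes client call that is not shown here.

```java
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

public class SlowCallSketch {

  /** Result of an operation together with how long it took. */
  static final class Timed<T> {
    final T value;
    final long elapsedMillis;

    Timed(T value, long elapsedMillis) {
      this.value = value;
      this.elapsedMillis = elapsedMillis;
    }
  }

  /** Runs the operation and measures its wall-clock duration. */
  static <T> Timed<T> timed(Supplier<T> operation) {
    long start = System.nanoTime();
    T value = operation.get();
    long elapsed = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    return new Timed<>(value, elapsed);
  }

  /** True when the measured duration exceeds the given threshold. */
  static boolean isSlow(long elapsedMillis, long thresholdMillis) {
    return elapsedMillis > thresholdMillis;
  }

  public static void main(String[] args) {
    // Placeholder for a client call such as creating a Pod (assumption).
    Timed<String> result = timed(() -> "pod-created");

    // Threshold of one second, based on the "less than a second" expectation.
    if (isSlow(result.elapsedMillis, 1_000)) {
      System.out.println("Slow call: " + result.elapsedMillis + " ms");
    } else {
      System.out.println("Call finished in " + result.elapsedMillis + " ms");
    }
  }
}
```

Wrapping each client call this way would show whether the 60-sec spikes come from a single request or from the operation as a whole.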

Action items
