[devworkspace] Investigate the scale-to-zero after a period of inactivity #16683

Closed
sleshchenko opened this issue Apr 21, 2020 · 6 comments · Fixed by devfile/devworkspace-operator#84
Labels
engine/devworkspace Issues related to Che configured to use the devworkspace controller as workspace engine. kind/enhancement A feature request - must adhere to the feature request template.

Comments

@sleshchenko
Member

Is your enhancement related to a problem? Please describe.

CloudShell needs to scale to zero after a period of inactivity.
We need to figure out which option is the best, or at least good enough, to go with.

Here are some options, but the implementer should feel free to look for others:

  1. CloudShell is responsible for tracking activity and scaling itself to zero.
    Pros:
  • does not seem difficult to implement;
    Cons:
  • the workspace service account needs permission to update the workspace, so that the workspace can stop itself if the user never logs in or the most recently used token has expired;
  • in the future, every editor (Theia, for example) would have to implement scale-to-zero itself;
  2. CloudShell tracks activity and reports it to a dedicated activity manager (possibly embedded in the controller). The activity manager stops the workspace after the inactivity period.
    Pros:
  • each editor only has to report activity as it happens, and the scale-to-zero mechanism is reused;
    Cons:
  • it seems to be the more difficult option: it adds a new component with a REST API (not sure whether that is typical for operators), which possibly needs a database.
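
To make the second option a bit more concrete, here is a minimal in-memory sketch of what the bookkeeping in such an activity manager might look like. Everything in it is an assumption for illustration: the `ActivityManager` class, its method names, and the idea of keying by a workspace ID are hypothetical, and the REST API, persistence, and the actual call that stops a workspace are deliberately left out.

```python
import time
from typing import Callable, Dict, List


class ActivityManager:
    """Hypothetical bookkeeping for the 'dedicated activity manager' option."""

    def __init__(self, idle_timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._idle_timeout_s = idle_timeout_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_seen: Dict[str, float] = {}

    def report_activity(self, workspace_id: str) -> None:
        """Called whenever an editor reports activity for a workspace."""
        self._last_seen[workspace_id] = self._clock()

    def idle_workspaces(self) -> List[str]:
        """Return the workspaces whose last activity is older than the timeout."""
        now = self._clock()
        return sorted(
            ws for ws, seen in self._last_seen.items()
            if now - seen >= self._idle_timeout_s
        )

    def forget(self, workspace_id: str) -> None:
        """Drop a workspace once it has been stopped."""
        self._last_seen.pop(workspace_id, None)
```

A controller loop could poll `idle_workspaces()` periodically and stop whatever it returns. Note that a workspace that never reports anything is never tracked here, so the "user does not log in at all" case would need the manager to register workspaces at creation time.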
@sleshchenko sleshchenko added kind/enhancement A feature request - must adhere to the feature request template. engine/devworkspace Issues related to Che configured to use the devworkspace controller as workspace engine. labels Apr 21, 2020
@l0rd l0rd mentioned this issue Apr 21, 2020
38 tasks
@che-bot che-bot added the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Apr 21, 2020
@vzhukovs vzhukovs removed the status/need-triage An issue that needs to be prioritized by the curator responsible for the triage. See https://github. label Apr 21, 2020
@amisevsk
Contributor

Another con for option 2 is that it requires workspaces to be aware of the operator that controls them. Currently, once a workspace is provisioned, it does not care about the existence of the workspace controller at all -- it is more or less entirely self-contained. If we require workspaces to report back to the controller in some way, it will complicate both the workspaces and the controller significantly.

Are we sure that this is a controller feature? I would expect the workspace creator to be responsible for shutting down their workspace (cf. a user-defined deployment, for example). Similarly, for cases like cloud shell, I would expect the console to shut down cloud shells when they are no longer needed.

In terms of implementing, I think the cleanest way to do it would be to separate concerns between the workspace and the controller:

  • Activity monitor runs in workspace (perhaps as a small container?)
  • Workspace deployment has a health check for the activity monitor
  • Controller detects when health check fails (activity timeout reached) and scales down deployment.
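
The three steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the `ActivityMonitor` class, the `make_health_handler` and `serve` helpers, the port, and the choice of HTTP 200/503 as the health signal are all hypothetical, and how the controller would consume the failing check is left open.

```python
import time
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable


class ActivityMonitor:
    """Hypothetical in-workspace monitor: healthy while there was recent activity."""

    def __init__(self, idle_timeout_s: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._idle_timeout_s = idle_timeout_s
        self._clock = clock
        self._last_activity = clock()  # starting the workspace counts as activity

    def touch(self) -> None:
        """Record user activity (for example a keystroke in the terminal)."""
        self._last_activity = self._clock()

    def is_active(self) -> bool:
        """False once the idle timeout has been reached."""
        return self._clock() - self._last_activity < self._idle_timeout_s


def make_health_handler(monitor: ActivityMonitor):
    """Build a handler class that answers 200 while active and 503 when idle."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self) -> None:
            self.send_response(200 if monitor.is_active() else 503)
            self.end_headers()

        def log_message(self, format, *args) -> None:
            pass  # keep probe requests out of the container log

    return HealthHandler


def serve(monitor: ActivityMonitor, port: int = 8080) -> None:
    """Blocking helper that exposes the health check; the port is an assumption."""
    HTTPServer(("", port), make_health_handler(monitor)).serve_forever()
```

With this split the workspace never talks to the controller; it only exposes a signal that whoever owns the deployment can observe.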

Note this issue would depend on #16696

@amisevsk
Contributor

One convenient way to implement activity monitoring, if we go down the route of customizing the oauth-proxy to suit our needs, would be to do the activity monitoring there, since all user requests are processed there already. This would have the benefit of requiring no additional changes to editors, etc.

@sleshchenko
Member Author

> if we go down the route of customizing the oauth-proxy to suit our needs, would be to do the activity monitoring there, since all user requests are processed there already.

I assume it might be possible, but I'm not sure it's that easy with WebSocket connections... Typically a WebSocket connection keeps exchanging ping/pong messages while the tab is open, even when there is no activity. And the OpenShift OAuth Proxy only makes sure that the HTTP upgrade request is authorized; the messages themselves are not validated, and I'm not sure whether that is possible. Putting it here for further investigation; I like the idea in general.
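
For the further investigation, here is a small sketch of how ping/pong traffic could be told apart from real messages, assuming a customized proxy were able to look at individual WebSocket frames (which, as noted, the stock proxy does not do). It relies only on the frame layout from RFC 6455, where the low four bits of the first byte are the opcode and opcodes 0x8 to 0xF are control frames (close, ping, pong). The function names are hypothetical.

```python
def frame_opcode(first_byte: int) -> int:
    """Extract the 4-bit opcode from the first byte of a WebSocket frame."""
    return first_byte & 0x0F


def counts_as_activity(first_byte: int) -> bool:
    """Treat data frames as activity and ignore control frames.

    Per RFC 6455, control frames (close 0x8, ping 0x9, pong 0xA) have the
    high bit of the opcode set; text, binary and continuation frames do not.
    """
    return (frame_opcode(first_byte) & 0x8) == 0
```

This would not catch application-level keepalives that an editor sends as ordinary text or binary frames, so it is only a partial answer to the idle-tab problem.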

@metlos
Contributor

metlos commented Apr 22, 2020

Not sure if it would be applicable in this particular case, but wouldn't a general-purpose scale-to-zero solution, like https://github.com/deislabs/osiris, be more appropriate for the users?

@amisevsk
Contributor

@metlos It depends on how osiris is implemented -- what metric is used to decide that a pod is "idle"? I can imagine that, for example, editing text in Theia without compiling would appear "idle", whereas leaving a cryptocurrency miner running in the background would appear "busy".

@amisevsk
Contributor

In the context of our current work on including cloud terminals in the console directly, shell access will be provided by exec directly -- this pushes the burden of activity monitoring/scale-to-zero onto the workspace creator (the OpenShift console in this case).


5 participants