Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ws-daemon] Refactor enforcing of CPU limits #7797

Closed
wants to merge 5 commits into from
Closed

Conversation

aledbf
Copy link
Member

@aledbf aledbf commented Jan 24, 2022

Description

  • Remove jiffies concept
  • Rely on cpu.cfs_period_us

How to test

  • Open a workspace
  • Download cpuburn wget https://cdn.pmylund.com/files/tools/cpuburn/linux/cpuburn-1.0-amd64.tar.gz && tar zxpvf cpuburn-1.0-amd64.tar.gz
  • Start CPU burn ./cpuburn/cpuburn
  • After 30s check ws-daemon logs
  • The log should contain an output similar to the next one
{"containerID":"0f9f1e5553d6510e9a65376976760f3f33f6dffd7cee8d22dc516f5363f0741c","currentQuota":-1,"instanceId":"974f65d2-9e4d-4c75-8825-126e8811605b","level":"info","limit":500,"message":"set new CPU limit","quota":500000,"serviceContext":{"service":"ws-daemon","version":"commit-ee12972d9d496abec7d93a48989384ce8f05338f"},"severity":"INFO","time":"2022-01-24T23:11:12Z","userId":"","workspaceId":"gitpodio-templatetypescr-z3dr46y2j9j"}
{"containerID":"0f9f1e5553d6510e9a65376976760f3f33f6dffd7cee8d22dc516f5363f0741c","currentQuota":500000,"instanceId":"974f65d2-9e4d-4c75-8825-126e8811605b","level":"info","limit":400,"message":"set new CPU limit","quota":400000,"serviceContext":{"service":"ws-daemon","version":"commit-ee12972d9d496abec7d93a48989384ce8f05338f"},"severity":"INFO","time":"2022-01-24T23:11:42Z","userId":"","workspaceId":"gitpodio-templatetypescr-z3dr46y2j9j"}
{"containerID":"0f9f1e5553d6510e9a65376976760f3f33f6dffd7cee8d22dc516f5363f0741c","currentQuota":400000,"instanceId":"974f65d2-9e4d-4c75-8825-126e8811605b","level":"info","limit":200,"message":"set new CPU limit","quota":200000,"serviceContext":{"service":"ws-daemon","version":"commit-ee12972d9d496abec7d93a48989384ce8f05338f"},"severity":"INFO","time":"2022-01-24T23:12:12Z","userId":"","workspaceId":"gitpodio-templatetypescr-z3dr46y2j9j"}
{"containerID":"0f9f1e5553d6510e9a65376976760f3f33f6dffd7cee8d22dc516f5363f0741c","instanceId":"3214228e-a0eb-4b6a-91eb-c34bb7747028","level":"debug","message":"the quota has not changed","quota":200000,"serviceContext":{"service":"ws-daemon","version":"commit-b7b4c81e4c98efecca85740b0e07a2ebf79261f8"},"severity":"DEBUG","time":"2022-01-25T00:21:39Z","userId":"","workspaceId":"gitpodio-templatetypescr-loh4kwdzby1"}

Release Notes

[ws-daemon] Refactor enforcing of CPU limits

Documentation

@roboquat
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please ask for approval from aledbf after the PR has been reviewed.

No associated issue. Update pull-request body to add a reference to an issue, or get approval with /approve no-issue

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@codecov
Copy link

codecov bot commented Jan 24, 2022

Codecov Report

Merging #7797 (5459604) into main (b89990c) will increase coverage by 3.53%.
The diff coverage is 56.52%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main    #7797      +/-   ##
==========================================
+ Coverage   10.86%   14.39%   +3.53%     
==========================================
  Files          18       44      +26     
  Lines        1022     4370    +3348     
==========================================
+ Hits          111      629     +518     
- Misses        909     3676    +2767     
- Partials        2       65      +63     
Flag Coverage Δ
components-gitpod-cli-app 10.86% <ø> (ø)
components-ws-daemon-app 20.89% <56.52%> (?)
components-ws-daemon-lib 20.89% <56.52%> (?)
installer-raw-app 5.63% <ø> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
components/ws-daemon/pkg/resources/dispatch.go 0.00% <0.00%> (ø)
components/ws-daemon/pkg/resources/controller.go 25.89% <53.33%> (ø)
components/ws-daemon/pkg/resources/limiter.go 56.41% <100.00%> (ø)
components/ws-daemon/pkg/content/config.go 33.33% <0.00%> (ø)
components/ws-daemon/pkg/content/initializer.go 0.00% <0.00%> (ø)
installer/pkg/common/objects.go 0.00% <0.00%> (ø)
components/ws-daemon/pkg/internal/session/store.go 19.38% <0.00%> (ø)
...staller/pkg/components/ws-manager/networkpolicy.go 0.00% <0.00%> (ø)
... and 19 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c866212...5459604. Read the comment docs.

@aledbf aledbf marked this pull request as ready for review January 24, 2022 20:32
@aledbf aledbf requested a review from csweichel January 24, 2022 20:32
@roboquat roboquat added size/M and removed size/L labels Jan 24, 2022
@aledbf aledbf force-pushed the aledbf/quota branch 4 times, most recently from b7b4c81 to cff7aa4 Compare January 25, 2022 00:21
@aledbf aledbf force-pushed the aledbf/quota branch 2 times, most recently from 50e9f1a to 59ff214 Compare January 25, 2022 09:44
@aledbf aledbf requested a review from csweichel January 25, 2022 10:59
Copy link
Contributor

@csweichel csweichel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something is off. My workspace dropped within 16 seconds from the 500 mCPU bucket to the 400 mCPU bucket:

{"containerID":"3076281461de03fd31482f809bdf57b6b60c5cf09412ddddd0269c8d926bb52f","currentQuota":-1,"instanceId":"d9152bba-23fa-4340-8649-e3fae2188473","level":"info","limit":500,"message":"set new CPU limit","quota":500000,"serviceContext":{"service":"ws-daemon","version":"commit-59ff2145fae57917e41f0711a8fea880d4ccb719"},"severity":"INFO","time":"2022-01-25T10:56:55Z","userId":"","workspaceId":"gitpodio-templatetypescr-0phi5h3mz8b"}
{"containerID":"3076281461de03fd31482f809bdf57b6b60c5cf09412ddddd0269c8d926bb52f","currentQuota":500000,"instanceId":"d9152bba-23fa-4340-8649-e3fae2188473","level":"info","limit":400,"message":"set new CPU limit","quota":400000,"serviceContext":{"service":"ws-daemon","version":"commit-59ff2145fae57917e41f0711a8fea880d4ccb719"},"severity":"INFO","time":"2022-01-25T11:04:55Z","userId":"","workspaceId":"gitpodio-templatetypescr-0phi5h3mz8b"}

@aledbf aledbf force-pushed the aledbf/quota branch 3 times, most recently from ef988b8 to 26eeabe Compare January 25, 2022 13:44
@sagor999 sagor999 self-assigned this Feb 1, 2022
@sagor999
Copy link
Contributor

sagor999 commented Feb 3, 2022

@csweichel this seems to be working correctly.
I just ran a test. Updated ws-daemon configmap with budget map that allocated 600s of work per bucket.
Then ran workspace and ran cpu burn in there.

gov.log.WithField("diff", diff).WithField("sample", sample).WithField("prev", prev).WithField("budget", bdgtSpent).Info("updating cpu expenditure")

for each 10 seconds, it was logging 50seconds of work done (5 cpu each doing 10s of work)
after 600s of work done, new cpu limit was applied, limiting to 4 cpu. now for each 10 seconds it was reporting 40 seconds of work. Then next and last limit, for each 10s was reporting 20s of work.
Could also see similar limiting when running htop in that workspace.

Could you try again?
if it works for you, I will probably just update logging to be less verbose (switch it to debug). Unless there is something else missing in this PR?

Also I assume this is by design(?), but limits persists per workspace. So if I stop workspace and start a new one, I start fresh.

@sagor999
Copy link
Contributor

sagor999 commented Feb 4, 2022

Closing this PR as it is superseded by this one: #8036

@sagor999 sagor999 closed this Feb 4, 2022
@aledbf aledbf deleted the aledbf/quota branch August 13, 2022 13:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
release-note size/M team: workspace Issue belongs to the Workspace team
Projects
No open projects
Archived in project
Development

Successfully merging this pull request may close these issues.

4 participants