Increase limits_cpu for the short_running_workers #615

@majamassarini majamassarini commented Nov 27, 2024

I have been monitoring the hard time limit exceptions for a while. It seems that when we have CPU spikes, in both the short-running and the long-running workers, we sometimes get a hard time limit exception for the Celery tasks.

The spikes go above the requested limits (as you can see in today's picture), mainly for the short-running workers.

We get exceptions during the spikes even when a single worker's CPU usage is below limits_cpu, so I am wondering: does this limit apply to the sum of all our replicas, or to each single replica?

I would say it is the limit for all the replicas together, in which case we need to increase limits_cpu much more for both the long-running and the short-running pods (I would set limits_cpu to 1G for both kinds of workers in this case).
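
For reference, here is a minimal sketch of how the two readings of the limit differ. It assumes limits_cpu ends up as the container's `resources.limits.cpu` in the pod spec; in Kubernetes that field is enforced per container, so each replica gets its own limit, and a cap on the sum would have to come from a namespace ResourceQuota. The replica names and usage values are hypothetical, not taken from the screenshots.

```python
from decimal import Decimal


def parse_cpu(quantity: str) -> Decimal:
    """Convert a Kubernetes-style CPU quantity ("500m", "1", "0.5") to cores."""
    q = quantity.strip()
    if q.endswith("m"):
        # "m" means millicores: 1000m == 1 core
        return Decimal(q[:-1]) / 1000
    return Decimal(q)


def replicas_over_limit(usage: dict[str, str], limit: str) -> list[str]:
    """Per-replica reading: names of replicas whose own usage exceeds the limit."""
    limit_cores = parse_cpu(limit)
    return [name for name, used in usage.items() if parse_cpu(used) > limit_cores]


def total_usage(usage: dict[str, str]) -> Decimal:
    """Aggregate reading: CPU used by all replicas together, in cores."""
    return sum((parse_cpu(used) for used in usage.values()), Decimal(0))


# Hypothetical numbers for three short-running workers.
usage = {
    "short-running-0": "350m",
    "short-running-1": "420m",
    "short-running-2": "380m",
}
limit = "400m"

print("over the per-replica limit:", replicas_over_limit(usage, limit))
print("sum over all replicas (cores):", total_usage(usage))
```

With these made-up values only one replica exceeds the limit on its own, while the sum is far above it, which is the distinction the question above is about.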

Screenshot from 2024-11-27 10-30-09
Screenshot from 2024-11-27 10-46-15

One of the highest CPU requests (for just a single short-running worker) that I recorded:
Screenshot from 2024-11-22 09-03-10

It seems that when we have spikes
above limits_cpu, we sometimes get a hard time limit
exception for the Celery tasks.
@majamassarini majamassarini force-pushed the increase-cpu-limits-for-short-running-workers branch from 5943e59 to 1f61425 Compare November 27, 2024 10:09

@majamassarini majamassarini added the mergeit When set, zuul will gate and merge the PR. label Nov 28, 2024

Build succeeded (gate pipeline).
https://softwarefactory-project.io/zuul/t/packit-service/buildset/92777f60c7d44fc7a27a76410173c183

✔️ pre-commit SUCCESS in 1m 40s

@softwarefactory-project-zuul softwarefactory-project-zuul bot merged commit c093640 into packit:main Nov 28, 2024
3 of 4 checks passed