Show CPU/Memory/Disk usage for jobs #42

Closed
lucyb opened this issue Nov 22, 2023 · 1 comment

lucyb commented Nov 22, 2023

Having this data in Job Server and also within Grafana will allow us to:
(1) proactively manage user experience and therefore reduce copilot support burden
(2) scale the service to prevent critical resource exhaustion and the resulting firefighting and sadness

We believe that Job Runner can provide this information in the data it sends up to Job Server. Job Server can then store this data with the rest of the information about a Job.

We don't yet know what sort of aggregation to do to provide an accurate measure of CPU/memory/disk, so we should start with something easy to do and build up from there. Once it's stored in the Job Server database it will be easy to show in Grafana.
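As a starting point for "something easy", here is a minimal sketch of one possible aggregation: the mean and peak of periodic samples taken while a job runs. Everything in it is an assumption for illustration. The names `UsageSummary` and `summarise`, the sample format, and the field names are hypothetical and are not taken from Job Runner or Job Server. Disk usage could be handled the same way.

```python
from dataclasses import dataclass, asdict
from typing import Optional, Sequence, Tuple


@dataclass
class UsageSummary:
    """Hypothetical per-job summary that could be sent alongside other job data."""

    cpu_percent_mean: float
    cpu_percent_peak: float
    memory_mb_mean: float
    memory_mb_peak: float


def summarise(samples: Sequence[Tuple[float, float]]) -> Optional[UsageSummary]:
    """Reduce (cpu_percent, memory_mb) samples to mean and peak values.

    Returns None when no samples were collected, so callers can tell
    "no data" apart from "zero usage".
    """
    if not samples:
        return None
    cpu = [cpu_percent for cpu_percent, _ in samples]
    mem = [memory_mb for _, memory_mb in samples]
    return UsageSummary(
        cpu_percent_mean=sum(cpu) / len(cpu),
        cpu_percent_peak=max(cpu),
        memory_mb_mean=sum(mem) / len(mem),
        memory_mb_peak=max(mem),
    )


# Example: two samples from an assumed polling loop.
summary = summarise([(10.0, 100.0), (30.0, 300.0)])
payload = asdict(summary)  # plain dict, ready to serialise as JSON
```

Mean and peak are cheap to compute and store as a handful of columns, so they can be replaced with something more accurate later without having to keep the raw samples.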

Note: surfacing this information in Job Server is out of scope for this ticket.

@bloodearnest

The first part of this is done in opensafely-core/job-runner#691

There's a separate ticket in opensafely-core/job-server#3867
