Added
Support Slurm 24.11 and Slurm REST API v0.0.40 (#366 → #400).
agent:
Return RacksDB infrastructure name and a boolean to indicate if metrics feature is enabled in /info endpoint, in addition to the cluster name.
Add optional /metrics endpoint with various Slurm metrics in OpenMetrics format designed to be scraped by Prometheus or compatible (#274).
Add possibility to query metrics from Prometheus database with /v<version>/metrics/<metric> endpoint.
Add possibility to filter jobs which are allocated a specific node with node query parameter on /v<version>/jobs endpoint.
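As a rough illustration of the two agent query endpoints listed above, the following Python sketch builds the corresponding request URLs. The base URL, version string, node name and metric name are hypothetical placeholders rather than values taken from Slurm-web, and the helper functions are not part of the project.

```python
from urllib.parse import quote, urlencode


def jobs_url(base, version, node=None):
    """Build the /v<version>/jobs URL, optionally filtered by node.

    Hypothetical helper: only the endpoint path and the `node` query
    parameter come from the changelog.
    """
    url = f"{base.rstrip('/')}/v{version}/jobs"
    if node is not None:
        # Restrict the result to jobs allocated on this node
        url += "?" + urlencode({"node": node})
    return url


def metric_url(base, version, metric):
    """Build the /v<version>/metrics/<metric> URL (hypothetical helper)."""
    # Percent-encode the metric name so it stays a single path segment
    return f"{base.rstrip('/')}/v{version}/metrics/{quote(metric, safe='')}"
```

For example, `jobs_url("http://localhost:5012", "4.0.0", node="cn001")` yields `http://localhost:5012/v4.0.0/jobs?node=cn001`, where the host, port, version and node name are all assumed values.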
gateway:
Return RacksDB infrastructure name and boolean metrics feature flag of every cluster in /clusters endpoint.
Return optional Markdown login service message as rendered HTML page with /messages/login endpoint.
Proxy metrics requests to agent through /api/agents/<cluster>/metrics/<metric> endpoint.
frontend:
Request RacksDB with the infrastructure name provided by the gateway (#348).
Display time limit of running jobs in job details page (#352).
Display service message below login form if defined (#253).
Add dependency on Chart.js and its Luxon adapter to draw charts with timeseries metrics.
Display charts of resources (nodes/cores) status and jobs queue in dashboard page based on metrics from Prometheus (#275).
Display list of jobs which have resources allocated on the node in node details page (#292).
Display hash near all jobs fields in job details page to generate link to highlight specific field (#251).
Represent terminated jobs with a colored bullet in the job status badge: green for completed (i.e. successful) jobs, red for failed jobs and dark orange for timeout jobs (#354).
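The color scheme of the terminated job bullets can be summarized as a simple lookup table. This is only a Python sketch of the mapping described in the entry above: the frontend itself is not written in Python, and the state keys and the `bullet_color` helper are assumptions made for illustration.

```python
# Colors as named in the changelog entry; the keys are assumed to match
# Slurm job state names.
TERMINATED_JOB_COLORS = {
    "COMPLETED": "green",
    "FAILED": "red",
    "TIMEOUT": "dark orange",
}


def bullet_color(state):
    """Return the bullet color for a terminated job state, or None
    when the state has no dedicated color in this sketch."""
    return TERMINATED_JOB_COLORS.get(state.upper())
```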
conf:
Add racksdb > infrastructure parameter for the agent.
Add metrics > enabled parameter for the agent.
Add metrics > restrict parameter for the agent.
Add metrics > host parameter for the agent.
Add metrics > job parameter for the agent.
Add ui > templates, message_template, message_login parameters for the gateway.
Select alloc_cpus and alloc_idle_cpus nodes fields on slurmrestd /slurm/*/nodes and /slurm/*/node/<node> endpoints.
Select nodes jobs field on slurmrestd /slurm/*/jobs endpoint.
Introduce service message template.
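To illustrate the `section > parameter` notation used for the new agent settings above, here is a minimal Python sketch that reads two of them from an INI-style snippet. It assumes an INI layout and uses the standard library `configparser` purely for demonstration; it is not the loader Slurm-web uses, and the values shown (`mycluster`, `yes`) are hypothetical.

```python
import configparser

# Hypothetical agent configuration excerpt; values are placeholders.
SAMPLE_AGENT_CONF = """
[racksdb]
infrastructure = mycluster

[metrics]
enabled = yes
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_AGENT_CONF)

# racksdb > infrastructure: name of the RacksDB infrastructure
infrastructure = parser.get("racksdb", "infrastructure")

# metrics > enabled: boolean flag controlling the metrics feature
metrics_enabled = parser.getboolean("metrics", "enabled")
```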
show-conf: Introduce slurm-web-show-conf utility to dump current configuration settings of gateway and agent components with their origin, which can either be configuration definition file or site override (#349).
docs:
Mention metrics optional feature in quickstart guide.
Mention metrics export and charts feature in overview page.
Mention possible Prometheus integration in architecture page.
Mention login service message feature in overview page.
Mention jobs badges to visualize job status in overview page.
Add page to document Service Messages configuration.
Mention support of Fedora 41.
pkgs:
Introduce gateway Python extra package.
Add requirement on markdown external library for gateway extra package.
Add dependency on prometheus-client for the agent.
Add direct dependency on ClusterShell for the agent.
Changed
agent: Bump minimal required Slurm version from 23.02.0 to 23.11.0.
gateway: Change error message when unable to parse agent info fields.
docs:
Update configuration reference documentation.
Update dashboard screenshot in overview page with example of resource chart.
Replace mention of Slurm REST API version v0.0.39 by v0.0.40.
Mention requirement of Slurm >= 23.11 and dropped support of Slurm 23.02.
conf:
Convert [cache] > password agent parameter from string to password type.
Convert [ldap] > bind_password gateway parameter from string to password type.
Bump [slurmrestd] > version default value from 0.0.39 to 0.0.40 in agent configuration for compatibility with Slurm 24.11.
pkgs:
Add requirement on RFL.core >= 1.1.0.
Add requirement on RFL.settings >= 1.1.1.
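The minimal Slurm version bump in this section amounts to a version comparison. The sketch below shows one way such a check could look in Python; the function names are hypothetical and this is not necessarily how the agent verifies the Slurm version.

```python
# Minimal required Slurm version, as stated in the Changed section.
MINIMAL_SLURM_VERSION = (23, 11, 0)


def parse_version(text):
    """Convert a dotted version such as '24.11.0' into a tuple of ints.

    Assumes purely numeric components; suffixes like '-rc1' are not
    handled in this sketch.
    """
    return tuple(int(part) for part in text.split("."))


def is_supported(text):
    """Return True when the given Slurm version meets the minimum."""
    return parse_version(text) >= MINIMAL_SLURM_VERSION
```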
Fixed
agent:
Fix retrieval of terminated jobs that are only available in the accounting service, with an option to ignore 404 responses for specific slurmrestd requests.
Fix compatibility issue with Requests >= 2.32.2 (#350).
Return HTTP/404 not found with a meaningful error message when requesting a nonexistent node.
gateway:
Catch generic requests.exceptions.RequestException when retrieving information from agents to avoid AttributeError with more specific exceptions on old versions of the Requests library (#391).
Catch JSONDecodeError from simplejson external library and json standard library module not managed by Requests < 2.27.
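The JSON decoding fix above can be pictured with the following Python sketch, which catches the decode error from both the standard library `json` module and, when installed, the external `simplejson` library. The `parse_agent_info` helper and its error message are hypothetical and only illustrate the general pattern, not the gateway's actual code.

```python
import json

try:
    # simplejson is an optional external library; fall back gracefully
    import simplejson
except ImportError:
    simplejson = None

# Gather the decode error classes from whichever libraries are available
JSON_DECODE_ERRORS = (json.JSONDecodeError,)
if simplejson is not None:
    JSON_DECODE_ERRORS += (simplejson.JSONDecodeError,)


def parse_agent_info(body):
    """Hypothetical helper: return (data, error) instead of raising
    when the response body is not valid JSON."""
    try:
        return json.loads(body), None
    except JSON_DECODE_ERRORS as err:
        return None, f"unable to parse agent response: {err}"
```

Returning an error string rather than propagating the exception keeps the caller independent of which JSON library produced the failure.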
frontend:
Fix notifications not visible when the browser is not scrolled to the top of the page (#367).