KEP-3122: Expose Flavors in LocalQueue Status. #3181

Merged
merged 9 commits into from
Oct 14, 2024
192 changes: 192 additions & 0 deletions keps/3122-expose-flavors-in-localqueue-status/README.md
@@ -0,0 +1,192 @@
# KEP-3122: Expose Flavors in LocalQueue Status

<!-- toc -->
- [Summary](#summary)
- [Motivation](#motivation)
  - [Goals](#goals)
  - [Non-Goals](#non-goals)
- [Proposal](#proposal)
  - [User Stories (Optional)](#user-stories-optional)
    - [Story 1](#story-1)
  - [Notes/Constraints/Caveats (Optional)](#notesconstraintscaveats-optional)
  - [Risks and Mitigations](#risks-and-mitigations)
- [Design Details](#design-details)
  - [API](#api)
  - [Implementation overview](#implementation-overview)
  - [Future works](#future-works)
  - [Test Plan](#test-plan)
    - [Prerequisite testing updates](#prerequisite-testing-updates)
    - [Unit Tests](#unit-tests)
    - [Integration tests](#integration-tests)
  - [Graduation Criteria](#graduation-criteria)
- [Implementation History](#implementation-history)
- [Drawbacks](#drawbacks)
- [Alternatives](#alternatives)
<!-- /toc -->

## Summary

This KEP introduces a new status field in LocalQueue, allowing users to see
all currently available ResourceFlavors in the LocalQueue.

## Motivation

Currently, users without RBAC access to ResourceFlavors cannot view the list
of available flavors. Depending on the RBAC rules, users might also lack read
access to ClusterQueues. Providing users with information about the available
flavors is useful, as it gives them an idea of the capabilities provided by
a LocalQueue (e.g., a flavor might include newer GPUs).

### Goals

- Allow users to see all currently available ResourceFlavors in
the LocalQueue.

### Non-Goals

- Verify that the ResourceFlavors exist and show only the existing flavors.
- Take stopPolicies into account when populating the LocalQueue flavors status field.

## Proposal

Introduce a new status field `flavors` in LocalQueue
that will be updated when ClusterQueue flavors are modified.

### User Stories (Optional)

#### Story 1

As a user, I want to see the list of all ResourceFlavors available in each LocalQueue.
Due to the RBAC configuration for ClusterQueues in my cluster, I cannot inspect the
ClusterQueue objects directly, only LocalQueues.

### Notes/Constraints/Caveats (Optional)

### Risks and Mitigations

Risk: The LocalQueue status object grows because up to 16 resource flavors
are added to it.

Mitigation: The number of listed flavors is capped at 16, so this
additional field does not significantly impact performance.

## Design Details

### API

Create `Flavor` API object:

```go
type Flavor struct {
	// name of the flavor.
	Name ResourceFlavorReference `json:"name"`

	// resources used in the flavor.
	// +listType=set
	// +kubebuilder:validation:MaxItems=16
	// +optional
	Resources []string `json:"resources,omitempty"`

	// nodeLabels are labels that associate the ResourceFlavor with Nodes that
	// have the same labels.
	// +mapType=atomic
	// +kubebuilder:validation:MaxProperties=8
	// +optional
	NodeLabels map[string]string `json:"nodeLabels,omitempty"`

	// nodeTaints are taints that the nodes associated with this ResourceFlavor
	// have.
	// +listType=atomic
	// +kubebuilder:validation:MaxItems=8
	// +optional
	NodeTaints []corev1.Taint `json:"nodeTaints,omitempty"`
}
```

Modify `LocalQueueStatus` API object:

```go
// LocalQueueStatus defines the observed state of LocalQueue
type LocalQueueStatus struct {
	...
	// flavors lists all currently available ResourceFlavors in the specified ClusterQueue.
	//
	// +listType=map
	// +listMapKey=name
	// +kubebuilder:validation:MaxItems=16
	// +optional
	Flavors []Flavor `json:"flavors,omitempty"`
}
```

### Implementation overview

Modify `LocalQueueUsageStats` object:

```go
type LocalQueueUsageStats struct {
	...
	Flavors []kueue.ResourceFlavorReference
}
```

Collect the available `Flavors` from `cqImpl.ResourceGroups` in the
`cache.LocalQueueUsage(...)` method, and write them to the `Flavors` status field
in `UpdateStatusIfChanged(...)` during each LocalQueue reconcile, so that the
status is updated only when it has changed.

### Future works

In the future, we can also add the `availableCoveredResources` field. This field would
allow batch users to understand which resources may be available for the LocalQueue.


### Test Plan

<!--
**Note:** *Not required until targeted at a release.*
The goal is to ensure that we don't accept enhancements with inadequate testing.

All code is expected to have adequate tests (eventually with coverage
expectations). Please adhere to the [Kubernetes testing guidelines][testing-guidelines]
when drafting this test plan.

[testing-guidelines]: https://git.k8s.io/community/contributors/devel/sig-testing/testing.md
-->

[x] I/we understand the owners of the involved components may require updates to
existing tests to make this code solid enough prior to committing the changes necessary
to implement this enhancement.

#### Prerequisite testing updates

<!--
Based on reviewers feedback describe what additional tests need to be added prior
implementing this enhancement to ensure the enhancements have also solid foundations.
-->

None.

#### Unit Tests

Existing unit tests should be updated to test whether the new data is correctly
passed and applied to the CRD.

#### Integration tests

Existing integration tests should be updated to test whether the new data is correctly
passed and applied to the CRD.

Additionally, add an integration test case to check that Flavors are updated after
the ClusterQueue is forcefully removed while ignoring validations.

### Graduation Criteria

We will graduate this feature to stable together with the whole LocalQueue API.

## Implementation History

2024-10-02 KEP

## Drawbacks

## Alternatives
31 changes: 31 additions & 0 deletions keps/3122-expose-flavors-in-localqueue-status/kep.yaml
@@ -0,0 +1,31 @@
title: Expose Flavors in LocalQueue Status
kep-number: 3122
authors:
- "@mbobrovskyi"
status: implementable
creation-date: 2024-10-02
reviewers:
- "@mimowo"
- "@alculquicondor"
- "@tenzen-y"
approvers:
- "@mimowo"
- "@alculquicondor"
- "@tenzen-y"

# The target maturity stage in the current dev cycle for this KEP.
stage: beta

# The most recent milestone for which work toward delivery of this KEP has been
# done. This can be the current (upcoming) milestone, if it is being actively
# worked on.
latest-milestone: "v0.9"

# The milestone at which this feature was, or is targeted to be, at each stage.
milestone:
  beta: "v0.9"

# List the feature gate name and the components for which it must be enabled
feature-gates:
- name: ExposeFlavorsInLocalQueue
disable-supported: true