Introduce `ResourceGroup` API #98

stefanprodan · 2024-09-24T17:45:27Z

ResourceGroup is a declarative API for generating a group of Kubernetes objects based on a matrix of input values and a set of templated resources.

The ResourceGroup API offers a high-level abstraction for defining and managing Flux resources and related Kubernetes objects as a single unit. It is designed to reduce the complexity of Kustomize overlays by providing a compact way
of defining different configurations for a set of workloads per tenant and/or environment.

Use cases:

Application definition: Bundle a set of Kubernetes resources (Flux HelmRelease, OCIRepository, Alert, Provider, Receiver, ImagePolicy) into a single deployable unit.
Dependency management: Define dependencies between apps to ensure that the resources are applied in the correct order. The dependencies are more flexible than in Flux: they can refer to other ResourceGroups, CRDs, or any other Kubernetes object.
Multi-instance provisioning: Generate multiple instances of the same application with different configurations.
Multi-cluster provisioning: Generate multiple instances of the same application for each target cluster that are deployed by Flux from a management cluster.
Multi-tenancy provisioning: Generate a set of resources (Namespace, ServiceAccount, RoleBinding) for each tenant with specific roles and permissions.

Example

The following example shows a ResourceGroup that generates an application instance consisting of a Flux HelmRelease and OCIRepository for each tenant with a specific version and replica count.

apiVersion: fluxcd.controlplane.io/v1
kind: ResourceGroup
metadata:
  name: podinfo
  namespace: default
  annotations:
    fluxcd.controlplane.io/reconcile: "enabled"
    fluxcd.controlplane.io/reconcileEvery: "30m"
    fluxcd.controlplane.io/reconcileTimeout: "5m"
spec:
  commonMetadata:
    labels:
      app.kubernetes.io/name: podinfo
  inputs:
    - tenant: "team1"
      version: "6.7.x"
      replicas: "2"
    - tenant: "team2"
      version: "6.6.x"
      replicas: "3"
  resources:
    - apiVersion: source.toolkit.fluxcd.io/v1beta2
      kind: OCIRepository
      metadata:
        name: podinfo-<< inputs.tenant >>
        namespace: default
      spec:
        interval: 10m
        url: oci://ghcr.io/stefanprodan/charts/podinfo
        ref:
          semver: << inputs.version | quote >>
    - apiVersion: helm.toolkit.fluxcd.io/v2
      kind: HelmRelease
      metadata:
        name: podinfo-<< inputs.tenant >>
        namespace: default
      spec:
        interval: 1h
        releaseName: podinfo-<< inputs.tenant >>
        chartRef:
          kind: OCIRepository
          name: podinfo-<< inputs.tenant >>
        values:
          replicaCount: << inputs.replicas | int >>

Writing a ResourceGroup spec

As with all other Kubernetes config, a ResourceGroup needs apiVersion,
kind, and metadata fields. The name of a ResourceGroup object must be a valid DNS subdomain name.
A ResourceGroup also needs a .spec section.

Inputs configuration

The .spec.inputs field is optional and specifies a list of input values
to be used in the resource templates.

Example inputs:

spec:
  inputs:
   - tenant: team1
     version: "6.7.x"
     replicas: "2"
   - tenant: team2
     version: "6.6.x"
     replicas: "3"

Each entry in the list is a map of string key-value pairs, where each key is an input name
that can be referenced in the resource templates using the << inputs.name >> syntax.

Resources configuration

The .spec.resources field is optional and specifies the list of Kubernetes resources
to be generated and reconciled on the cluster.

Example of plain resources without any templating:

spec:
  resources:
   - apiVersion: v1
     kind: Namespace
     metadata:
      name: apps
   - apiVersion: v1
     kind: ServiceAccount
     metadata:
      name: flux
      namespace: apps

Templating resources

The resources can be templated using the << inputs.name >> syntax. The templating engine
is based on Go text template. The << >> delimiters are used instead of {{ }} to avoid
conflicts with Helm templating and allow ResourceGroups to be included in Helm charts.

Example of templated resources:

spec:
  inputs:
   - tenant: team1
     role: admin
   - tenant: team2
     role: cluster-admin
  resources:
   - apiVersion: v1
     kind: Namespace
     metadata:
      name: << inputs.tenant >>
   - apiVersion: v1
     kind: ServiceAccount
     metadata:
      name: flux
      namespace: << inputs.tenant >>
   - apiVersion: rbac.authorization.k8s.io/v1
     kind: RoleBinding
     metadata:
      name: flux
      namespace: << inputs.tenant >>
     subjects:
      - kind: ServiceAccount
        name: flux
        namespace: << inputs.tenant >>
     roleRef:
      kind: ClusterRole
      name: << inputs.role >>
      apiGroup: rbac.authorization.k8s.io

The above example will generate a Namespace, ServiceAccount and RoleBinding for each tenant
with the specified role.
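To illustrate why the << >> delimiters matter, here is a minimal sketch of a ResourceGroup shipped as a template inside a Helm chart. The chart layout and the .Values.tenants list of strings are hypothetical, used only for illustration. Helm renders the {{ }} expressions at install time and leaves the << >> expressions untouched, so they reach the cluster as-is and are evaluated later by the flux-operator.

```yaml
# templates/resourcegroup.yaml in a hypothetical Helm chart.
# Assumes values.yaml defines `tenants` as a list of strings, e.g. [team1, team2].
apiVersion: fluxcd.controlplane.io/v1
kind: ResourceGroup
metadata:
  name: {{ .Release.Name }}-tenants
  namespace: {{ .Release.Namespace }}
spec:
  inputs:
    # Rendered by Helm: one input entry per tenant in the chart values.
    {{- range .Values.tenants }}
    - tenant: {{ . | quote }}
    {{- end }}
  resources:
    # Not touched by Helm: evaluated by the flux-operator for each input.
    - apiVersion: v1
      kind: Namespace
      metadata:
        name: << inputs.tenant >>
```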

Templating functions

The templating engine supports slim-sprig functions.

It is recommended to use the quote function when templating strings to avoid issues with
special characters e.g. << inputs.version | quote >>.

When templating integers, use the int function to convert the string to an integer
e.g. << inputs.replicas | int >>.

When templating booleans, use the bool function to convert the string to a boolean
e.g. << inputs.enabled | bool >>.

When using integer or boolean inputs as metadata label values, use the quote function to convert
the value to a string e.g. << inputs.enabled | quote >>.
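The conversion rules above can be combined in a single resource. The following is a sketch only: the example.com/autoscaling label key, the enabled input, and the hpa.enabled chart value are assumptions chosen for illustration, and the functions are used as described in this section.

```yaml
spec:
  inputs:
    - tenant: team1
      version: "6.7.x"
      replicas: "2"
      enabled: "true"
  resources:
    - apiVersion: helm.toolkit.fluxcd.io/v2
      kind: HelmRelease
      metadata:
        name: podinfo-<< inputs.tenant >>
        namespace: default
        labels:
          # Label values must be strings, so quote is used here instead of bool.
          example.com/autoscaling: << inputs.enabled | quote >>
      spec:
        interval: 1h
        chartRef:
          kind: OCIRepository
          name: podinfo
        values:
          # The chart expects an integer and a boolean, so the strings are converted.
          replicaCount: << inputs.replicas | int >>
          hpa:
            enabled: << inputs.enabled | bool >>
```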

When using multi-line strings containing YAML, use the nindent function to properly format the string
e.g.:

spec:
  inputs:
    - tenant: team1
      layerSelector: |
        mediaType: "application/vnd.cncf.helm.chart.content.v1.tar+gzip"
        operation: copy
  resources:
    - apiVersion: source.toolkit.fluxcd.io/v1beta2
      kind: OCIRepository
      metadata:
        name: << inputs.tenant >>
      spec:
        layerSelector: << inputs.layerSelector | nindent 4 >>

Resource deduplication

The flux-operator deduplicates resources based on the
apiVersion, kind, metadata.name and metadata.namespace fields.

This allows defining shared resources that are applied only once, regardless of the number of inputs.

Example of a shared Flux source:

spec:
  inputs:
    - tenant: "team1"
      replicas: "2"
    - tenant: "team2"
      replicas: "3"
  resources:
    - apiVersion: source.toolkit.fluxcd.io/v1beta2
      kind: OCIRepository
      metadata:
        name: podinfo
        namespace: default
      spec:
        interval: 10m
        url: oci://ghcr.io/stefanprodan/charts/podinfo
        ref:
          semver: '*'
    - apiVersion: helm.toolkit.fluxcd.io/v2
      kind: HelmRelease
      metadata:
        name: podinfo-<< inputs.tenant >>
        namespace: default
      spec:
        interval: 1h
        releaseName: podinfo-<< inputs.tenant >>
        chartRef:
          kind: OCIRepository
          name: podinfo
        values:
          replicaCount: << inputs.replicas | int >>

In the above example, the OCIRepository resource is created only once
and is referenced by all HelmRelease resources.

Common metadata

The .spec.commonMetadata field is optional and specifies common metadata to be applied to all resources.

It has two optional fields:

labels: A map used for setting labels
on an object. Any existing label will be overridden if it matches with a key in
this map.
annotations: A map used for setting annotations
on an object. Any existing annotation will be overridden if it matches with a key
in this map.

Example common metadata:

spec:
  commonMetadata:
    labels:
      app.kubernetes.io/name: podinfo
    annotations:
      fluxcd.controlplane.io/prune: disabled

In the above example, all resources generated by the ResourceGroup
will not be pruned by the garbage collection process as the fluxcd.controlplane.io/prune
annotation is set to disabled.

Dependency management

.spec.dependsOn is an optional list used to refer to Kubernetes
objects that the ResourceGroup depends on. If specified, then the ResourceGroup
is reconciled after the referred objects exist in the cluster.

A dependency is a reference to a Kubernetes object with the following fields:

apiVersion: The API version of the referred object (required).
kind: The kind of the referred object (required).
name: The name of the referred object (required).
namespace: The namespace of the referred object (optional).
ready: A boolean indicating if the referred object must have the Ready status condition set to True (optional, default is false).

Example of conditional reconciliation based on the existence of CustomResourceDefinitions
and the readiness of a ResourceGroup:

spec:
  dependsOn:
    - apiVersion: apiextensions.k8s.io/v1
      kind: CustomResourceDefinition
      name: helmreleases.helm.toolkit.fluxcd.io
    - apiVersion: apiextensions.k8s.io/v1
      kind: CustomResourceDefinition
      name: servicemonitors.monitoring.coreos.com
    - apiVersion: fluxcd.controlplane.io/v1
      kind: ResourceGroup
      name: cluster-addons
      namespace: flux-system
      ready: true

Note that it is recommended to define dependencies on CustomResourceDefinitions if the ResourceGroup
deploys Flux HelmReleases which contain custom resources.

When the dependencies are not met, the flux-operator will reevaluate the requirements
every five seconds and reconcile the ResourceGroup when the dependencies are satisfied.
Failed dependencies are reported in the ResourceGroup Ready status condition,
in log messages and Kubernetes events.
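As a sketch of ordering between two ResourceGroups, the second group below is reconciled only after the first one reports Ready. The names tenants and apps are hypothetical, and only fields described in this section are used.

```yaml
apiVersion: fluxcd.controlplane.io/v1
kind: ResourceGroup
metadata:
  name: tenants
  namespace: flux-system
spec:
  resources:
    - apiVersion: v1
      kind: Namespace
      metadata:
        name: apps
---
apiVersion: fluxcd.controlplane.io/v1
kind: ResourceGroup
metadata:
  name: apps
  namespace: flux-system
spec:
  dependsOn:
    # Wait until the tenants group exists and has Ready=True.
    - apiVersion: fluxcd.controlplane.io/v1
      kind: ResourceGroup
      name: tenants
      namespace: flux-system
      ready: true
  resources:
    - apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: flux
        namespace: apps
```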

Reconciliation configuration

The reconciliation behaviour of a ResourceGroup can be configured using the following annotations:

fluxcd.controlplane.io/reconcile: Enable or disable the reconciliation loop. Default is enabled, set to disabled to pause the reconciliation.
fluxcd.controlplane.io/reconcileEvery: Set the reconciliation interval used for drift detection and correction. Default is 1h.
fluxcd.controlplane.io/reconcileTimeout: Set the reconciliation timeout including health checks. Default is 5m.
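For instance, a ResourceGroup could be paused during a maintenance window with the annotations below. This is a metadata fragment only; the spec is omitted and the 10m interval is an arbitrary illustrative value.

```yaml
apiVersion: fluxcd.controlplane.io/v1
kind: ResourceGroup
metadata:
  name: podinfo
  namespace: default
  annotations:
    # Pause the reconciliation loop; set back to "enabled" to resume.
    fluxcd.controlplane.io/reconcile: "disabled"
    # Drift detection interval used once reconciliation is enabled again.
    fluxcd.controlplane.io/reconcileEvery: "10m"
```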

Health check configuration

The .spec.wait field is optional and instructs the flux-operator to perform
a health check on all applied resources and wait for them to become ready. The health
check is enabled by default and can be disabled by setting the .spec.wait field to false.

The health check is performed for the following resource types:

Kubernetes built-in kinds: Deployment, DaemonSet, StatefulSet,
PersistentVolumeClaim, Service, Ingress, CustomResourceDefinition.
Flux kinds: HelmRelease, OCIRepository, Kustomization, GitRepository, etc.
Custom resources that are compatible with kstatus.

By default, the wait timeout is 5m and can be changed with the
fluxcd.controlplane.io/reconcileTimeout annotation, set on the ResourceGroup object.
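The two options above might look as follows. This is a sketch with hypothetical names: the first group keeps the health check and extends the timeout to 10m, while the second disables the health check entirely.

```yaml
apiVersion: fluxcd.controlplane.io/v1
kind: ResourceGroup
metadata:
  name: slow-app
  namespace: default
  annotations:
    # Allow up to 10 minutes for the apply and the health checks.
    fluxcd.controlplane.io/reconcileTimeout: "10m"
spec:
  wait: true
  resources:
    - apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: slow-app
        namespace: default
---
apiVersion: fluxcd.controlplane.io/v1
kind: ResourceGroup
metadata:
  name: fire-and-forget
  namespace: default
spec:
  # Apply the resources without waiting for them to become ready.
  wait: false
  resources:
    - apiVersion: v1
      kind: ServiceAccount
      metadata:
        name: fire-and-forget
        namespace: default
```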

Role-based access control

The .spec.serviceAccountName field is optional and specifies the name of the
Kubernetes ServiceAccount used by the flux-operator to reconcile the ResourceGroup.
The ServiceAccount must exist in the same namespace as the ResourceGroup
and must have the necessary permissions to create, update and delete
the resources defined in the ResourceGroup.

On multi-tenant clusters, it is recommended to use a dedicated ServiceAccount per tenant namespace
with the minimum required permissions. To enforce a ServiceAccount for all ResourceGroups,
the --default-service-account=flux-operator flag can be set in the flux-operator container arguments.
With this flag set, only the ResourceGroups created in the same namespace as the flux-operator
will run with cluster-admin permissions.
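A minimal sketch of a tenant setup, assuming a hypothetical team1 namespace that already exists: a dedicated ServiceAccount is bound to the built-in Kubernetes admin ClusterRole within that namespace, and the ResourceGroup is reconciled under that account. The resource names are illustrative only.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: flux
  namespace: team1
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: flux
  namespace: team1
subjects:
  - kind: ServiceAccount
    name: flux
    namespace: team1
roleRef:
  # A RoleBinding to a ClusterRole grants the permissions in this namespace only.
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: admin
---
apiVersion: fluxcd.controlplane.io/v1
kind: ResourceGroup
metadata:
  name: podinfo
  namespace: team1
spec:
  # Must exist in the same namespace as the ResourceGroup.
  serviceAccountName: flux
  resources:
    - apiVersion: v1
      kind: ConfigMap
      metadata:
        name: podinfo-config
        namespace: team1
      data:
        env: dev
```

With this setup, a resource targeting a namespace other than team1 would be expected to fail reconciliation, since the ServiceAccount has no permissions there.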

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>

stefanprodan added 4 commits September 24, 2024 20:38

Add ResourceGroup API

228af28

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>

Implement ResourceGroup template builder

0bbeff7

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>

Implement ResourceGroup reconciler

98b1a39

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>

Add ResourceGroup sample manifest

fee8134

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>

stefanprodan added the enhancement (New feature or request) and area/api (API related issues and pull requests) labels Sep 24, 2024

stefanprodan force-pushed the resource-group branch 2 times, most recently from 8b96f89 to 43fa4fc on September 24, 2024 19:35

stefanprodan added 2 commits September 24, 2024 22:59

Add ResourceGroup to e2e tests

38d1040

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>

Add ResourceGroup API docs

7a2a6fc

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>

stefanprodan force-pushed the resource-group branch 2 times, most recently from 1fa1070 to 738205c on September 25, 2024 12:50

Implement dependency management for ResourceGroups

2631b60

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>

stefanprodan force-pushed the resource-group branch from 738205c to 2631b60 on September 25, 2024 12:51

Implement ServiceAccount impersonation for ResourceGroups

bc20588

Signed-off-by: Stefan Prodan <stefan.prodan@gmail.com>

stefanprodan force-pushed the resource-group branch from 54aaa3c to bc20588 on September 26, 2024 11:06

stefanprodan marked this pull request as ready for review September 27, 2024 17:10
