Tracking issue for improving resources status in CAPO #2290

EmilienM · 2024-11-27T19:20:14Z

This is a tracking issue for the CAPO effort to improve resource status reporting. It is linked to the ongoing CAPI effort: kubernetes-sigs/cluster-api#10897.

High level required changes with the new CAPI contract

Most of these changes will be required in the v1beta2 API contract (tentative Apr 2025).

`OpenStackCluster`

The following changes are planned for the contract for the OpenStackCluster resource:

- Disambiguate the usage of the ready term by renaming fields used for the initial provisioning workflow:
  - Rename status.ready to status.initialization.provisioned.
- Remove failureReason and failureMessage.

Notes:

- OpenStackCluster's status.initialization.provisioned will surface into Cluster's status.initialization.infrastructureProvisioned field.
- OpenStackCluster's status.initialization.provisioned must signal the completion of the initial provisioning of the cluster infrastructure. The value of this field should never be updated after provisioning is completed, and Cluster API will ignore any changes to it.
- OpenStackCluster's status.conditions[Ready] will surface into Cluster's status.conditions[InfrastructureReady] condition.
- OpenStackCluster's status.conditions[Ready] must surface issues during the entire lifecycle of the OpenStackCluster (both during initial OpenStackCluster provisioning and after the initial provisioning is completed).
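To make the field paths concrete, here is a minimal Go sketch of what the status could look like once status.ready, failureReason and failureMessage are gone. The type names, the simplified `Condition` struct (a stand-in for metav1.Condition) and the `Provisioned` reason are assumptions made for illustration; they are not the actual CAPO API types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Initialization is a hypothetical stand-in for the status.initialization block.
type Initialization struct {
	// Provisioned reports that the initial provisioning has completed.
	Provisioned bool `json:"provisioned,omitempty"`
}

// Condition is a simplified stand-in for metav1.Condition.
type Condition struct {
	Type    string `json:"type"`
	Status  string `json:"status"`
	Reason  string `json:"reason"`
	Message string `json:"message,omitempty"`
}

// OpenStackClusterStatus is an illustrative shape only, not the actual CAPO type.
// Note the absence of ready, failureReason and failureMessage.
type OpenStackClusterStatus struct {
	Initialization *Initialization `json:"initialization,omitempty"`
	Conditions     []Condition     `json:"conditions,omitempty"`
}

// provisionedStatus builds the status of a cluster whose initial provisioning is done.
// The "Provisioned" reason is a made-up value for this sketch.
func provisionedStatus() OpenStackClusterStatus {
	return OpenStackClusterStatus{
		Initialization: &Initialization{Provisioned: true},
		Conditions: []Condition{
			{Type: "Ready", Status: "True", Reason: "Provisioned"},
		},
	}
}

// toJSON renders the status so the resulting field paths are visible.
func toJSON(s OpenStackClusterStatus) (string, error) {
	b, err := json.Marshal(s)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	out, err := toJSON(provisionedStatus())
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The serialized form exposes initialization.provisioned and a Ready entry under conditions, which are the two signals the notes above say Cluster API will read.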

`OpenStackMachine`

The following changes are planned for the contract for the OpenStackMachine resource:

- Disambiguate the usage of the ready term by renaming fields used for the initial provisioning workflow:
  - Rename status.ready to status.initialization.provisioned.
- Remove failureReason and failureMessage.

Notes:

- OpenStackMachine's status.initialization.provisioned will surface into Machine's status.initialization.infrastructureProvisioned field.
- OpenStackMachine's status.initialization.provisioned must signal the completion of the initial provisioning of the machine infrastructure. The value of this field should never be updated after provisioning is completed, and Cluster API will ignore any changes to it.
- OpenStackMachine's status.conditions[Ready] will surface into Machine's status.conditions[InfrastructureReady] condition.
- OpenStackMachine's status.conditions[Ready] must surface issues during the entire lifecycle of the Machine (both during initial OpenStackMachine provisioning and after the initial provisioning is completed).
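The difference between the two signals can be sketched as follows: provisioned behaves like a one-way latch, while the Ready condition keeps following the observed state. This is a rough illustration under assumptions; `machineStatus`, `observe` and the `serverActive` input are hypothetical names, and the real controller logic will be more involved.

```go
package main

import "fmt"

// machineStatus is a hypothetical, flattened stand-in for the OpenStackMachine status.
type machineStatus struct {
	Provisioned bool   // stands in for status.initialization.provisioned
	Ready       string // stands in for the status of status.conditions[Ready]
}

// observe folds one reconcile observation into the status.
func observe(s machineStatus, serverActive bool) machineStatus {
	if serverActive {
		s.Provisioned = true // only ever flips from false to true
		s.Ready = "True"
	} else {
		s.Ready = "False" // Ready keeps tracking the current state
	}
	return s
}

func main() {
	s := machineStatus{Ready: "Unknown"}
	for _, active := range []bool{false, true, false} {
		s = observe(s, active)
		fmt.Printf("active=%t provisioned=%t ready=%s\n", active, s.Provisioned, s.Ready)
	}
}
```

In the last iteration the server is no longer active: Ready goes back to False, but provisioned stays true, matching the rule that the field is never updated once provisioning has completed.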

Notes on Conditions

Some remarks about Kubernetes API conventions regarding conditions:

- Polarity: condition type names should make sense for humans; neither positive nor negative polarity can be recommended as a general rule.
- Use of the Reason field is required (currently, Cluster API adds a reason only when a condition is false).
- Controllers should apply their conditions to a resource the first time they visit the resource, even if the status is Unknown (currently, Cluster API controllers add conditions at different stages of the reconcile loop). Please note that:
  - If more than one controller adds conditions to the same resource, conditions managed by the different controllers will be applied at different times.
  - Kubernetes API conventions account for exceptions to this rule; for known conditions, the absence of a condition status should be interpreted the same as Unknown, and typically indicates that reconciliation has not yet finished.
- We'll be using metav1.Condition from the Kubernetes API.
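The conventions above could translate into helpers along these lines. This is a dependency-free sketch: `Condition` is a simplified stand-in for metav1.Condition, and the `NetworkReady` type and the `NotYetReconciled` and `Provisioned` reasons are invented for the example. Real code would more likely rely on metav1.Condition together with the helpers in k8s.io/apimachinery/pkg/api/meta, such as `meta.SetStatusCondition`.

```go
package main

import (
	"errors"
	"fmt"
)

// Condition is a simplified stand-in for metav1.Condition.
type Condition struct {
	Type, Status, Reason, Message string
}

// knownTypes lists the condition types this hypothetical controller owns.
var knownTypes = []string{"Ready", "NetworkReady"}

// find returns the index of the condition with the given type, or -1.
func find(conds []Condition, t string) int {
	for i := range conds {
		if conds[i].Type == t {
			return i
		}
	}
	return -1
}

// initConditions adds every owned condition that is still missing, with status
// Unknown and a reason, so they all exist after the first reconcile.
func initConditions(conds []Condition) []Condition {
	for _, t := range knownTypes {
		if find(conds, t) == -1 {
			conds = append(conds, Condition{Type: t, Status: "Unknown", Reason: "NotYetReconciled"})
		}
	}
	return conds
}

// setCondition inserts or updates a condition and rejects an empty reason,
// whatever the status is.
func setCondition(conds []Condition, c Condition) ([]Condition, error) {
	if c.Reason == "" {
		return conds, errors.New("condition " + c.Type + " needs a reason")
	}
	if i := find(conds, c.Type); i >= 0 {
		conds[i] = c
		return conds, nil
	}
	return append(conds, c), nil
}

// statusOf treats an absent condition the same as Unknown.
func statusOf(conds []Condition, t string) string {
	if i := find(conds, t); i >= 0 {
		return conds[i].Status
	}
	return "Unknown"
}

func main() {
	conds := initConditions(nil)
	conds, err := setCondition(conds, Condition{Type: "Ready", Status: "True", Reason: "Provisioned"})
	if err != nil {
		panic(err)
	}
	fmt.Println(statusOf(conds, "Ready"), statusOf(conds, "NetworkReady"))
}
```

Calling `initConditions` at the top of every reconcile is idempotent, which is one simple way to guarantee that conditions appear on the first visit rather than at different stages of the loop.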

Terminal Failures

By getting rid of terminal failures, we have an opportunity to improve how reliably CAPO handles OpenStack infrastructure failures, such as API rate limits or temporary unavailability, which unfortunately happen often in large-scale production clouds.
We'll need to investigate what these failures can be and how we treat them:

- CAPO continues to reconcile the resource and updates conditions with a temporary state
- CAPO stops reconciling the resource and updates conditions with a human-readable error message
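As a starting point for that investigation, the two behaviours could be expressed as a classification step like the one below. Everything here is an assumption made for the sake of the sketch: the `apiError` type, the reason strings, and above all the rule that HTTP 429 and 5xx responses are retryable while other API errors are not. Which OpenStack failures belong in which bucket is exactly what remains to be decided.

```go
package main

import (
	"errors"
	"fmt"
)

// apiError is a hypothetical stand-in for an error returned by an OpenStack API call.
type apiError struct {
	StatusCode int
	Msg        string
}

func (e *apiError) Error() string { return fmt.Sprintf("HTTP %d: %s", e.StatusCode, e.Msg) }

// outcome describes what the reconciler would do next.
type outcome struct {
	Requeue bool   // whether CAPO keeps reconciling the resource
	Status  string // status of the Ready condition
	Reason  string // made-up reason values for this sketch
	Message string // human-readable detail
}

// classify maps an error to an outcome.
// ASSUMPTION: 429 and 5xx are temporary, any other API error is not retryable.
func classify(err error) outcome {
	if err == nil {
		return outcome{Status: "True", Reason: "Provisioned"}
	}
	var apiErr *apiError
	if errors.As(err, &apiErr) {
		switch {
		case apiErr.StatusCode == 429:
			return outcome{Requeue: true, Status: "False", Reason: "OpenStackRateLimited", Message: err.Error()}
		case apiErr.StatusCode >= 500:
			return outcome{Requeue: true, Status: "False", Reason: "OpenStackUnavailable", Message: err.Error()}
		default:
			return outcome{Requeue: false, Status: "False", Reason: "OpenStackRequestRejected", Message: err.Error()}
		}
	}
	// Errors of unknown origin are retried rather than treated as final.
	return outcome{Requeue: true, Status: "False", Reason: "ReconcileError", Message: err.Error()}
}

func main() {
	wrapped := fmt.Errorf("creating port: %w", &apiError{StatusCode: 503, Msg: "service unavailable"})
	fmt.Printf("%+v\n", classify(wrapped))
}
```

In both branches the error ends up in the Ready condition rather than in failureReason or failureMessage; the only difference is whether the resource is requeued.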

Work items

Tasks

@EmilienM Create a new Issue and draft an overview of this effort.

github-project-automation bot added this to CAPO Roadmap Nov 27, 2024

github-project-automation bot moved this to Inbox in CAPO Roadmap Nov 27, 2024

EmilienM assigned EmilienM, MaysaMacedo and lentzi90 Nov 27, 2024

EmilienM added this to the vNext milestone Nov 27, 2024
