Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate the 'Synced' status condition #198

Closed
negz opened this issue Sep 10, 2020 · 11 comments
Closed

Deprecate the 'Synced' status condition #198

negz opened this issue Sep 10, 2020 · 11 comments
Labels
enhancement New feature or request stale

Comments

@negz
Copy link
Member

negz commented Sep 10, 2020

What problem are you facing?

Most Crossplane resources (managed, claims, composites, etc) typically expose two types of status condition:

  • Ready indicates whether the thing being reconciled (e.g. the external resource) is expected to be functional.
  • Synced indicates whether the most recent reconcile attempt succeeded or failed, including any error.

Sometimes Ready goes by different names - e.g. Established for XRDs.

Originally Crossplane only had a Ready condition, but it was conceptually conflated with "synced" and attempted to reflect both whether the underlying resource was ready and whether Crossplane was successfully able to reconcile the underlying resource. The Synced condition has served us well for some time, but I think it should be removed because:

  • It's duplicated by events. Anecdotally, we emit an event corresponding to ~95% of our SetConditions calls.
  • It's duplicated by debug logs in the other ~5% of cases, which are 'internal' errors like being unable to remove a finalizer.
  • It's a footgun when writing multiple controllers that reconcile the same kind of resource.

The apiextensions 'definition' and 'offered' controllers are examples of the latter issue. Both controllers collaborate to reconcile a CompositeResourceDefinition. The definition controller manages the lifecycle of the defined composite resource and its controller, while the offered controller manages the lifecycle of the offered resource claim (if any) and its controller. Currently both reconcilers attempt to set the Synced condition of the XRD; if one controller was reconciling successfully and the other was encountering an error they would get stuck in a tight reconcile loop.

How could Crossplane help solve your problem?

I suggest we deprecate and stop using the Synced condition in any reconciler that has been updated to emit events and logs.

@negz negz added the enhancement New feature or request label Sep 10, 2020
negz added a commit to negz/crossplane that referenced this issue Sep 10, 2020
crossplane/crossplane-runtime#198

See the above issue for context.

Signed-off-by: Nic Cope <negz@rk0n.org>
negz added a commit to negz/crossplane that referenced this issue Sep 10, 2020
crossplane/crossplane-runtime#198

See the above issue for context.

Signed-off-by: Nic Cope <negz@rk0n.org>
@muvaf
Copy link
Member

muvaf commented Sep 10, 2020

I believe conditions are most useful in machine to machine communication, which is not really what events/logs are for. However, Synced doesn't really mean much for other controllers or systems, at least I haven't seen us writing code to check Synced condition, it's mostly the Ready one. So, in the places we're able to communicate that there is a misconfiguration problem to the user via another mechanism like event, I agree that we don't need Synced condition.

@muvaf
Copy link
Member

muvaf commented Sep 10, 2020

Hmm, actually UI systems are usually built using YAML output of the resources, which do not have the events. So that's a machine usage of the Synced condition; to show user that there is a misconfiguration through the UI. I believe Upbound Cloud is built this way. I went to look at Pod to see what conditions it has:

  conditions:
  - lastProbeTime: null
    lastTransitionTime: "2020-09-03T15:12:15Z"
    status: "True"
    type: Initialized
  - lastProbeTime: null
    lastTransitionTime: "2020-09-10T07:54:42Z"
    status: "True"
    type: Ready
  - lastProbeTime: null
    lastTransitionTime: "2020-09-10T07:54:42Z"
    status: "True"
    type: ContainersReady
  - lastProbeTime: null
    lastTransitionTime: "2020-09-03T15:12:15Z"
    status: "True"
    type: PodScheduled

From what I can tell, there is an abundance of conditions which also have corresponding events published. So, it comes down to whether we'd like the YAML, i.e. etcd record of the resource, to fully represent everything we know about it at the time of printing or not. I generally lean towards abundance of information as long as it's easy to keep it all in sync. Maybe merge errors that we report under Synced into Ready condition? i.e. look at Ready condition for all kinds of errors we are able to report?

@negz
Copy link
Member Author

negz commented Sep 11, 2020

Maybe merge errors that we report under Synced into Ready condition? i.e. look at Ready condition for all kinds of errors we are able to report?

This sounds like how Crossplane originally used conditions, which we found to be overloaded and confusing. A controller encountering an error during reconciliation doesn't necessarily mean that the external resource is not ready; they're two different things.

I believe conditions are most useful in machine to machine communication, which is not really what events/logs are for.

I believe machine to machine communication is in scope for events. They're actually a resource like any other, e.g.:

# $ kubectl get event -o yaml compositepostgresqlinstances.database.example.org.16338ed5a523dfe0
apiVersion: v1
count: 2487
eventTime: null
firstTimestamp: "2020-09-10T23:09:00Z"
involvedObject:
  apiVersion: apiextensions.crossplane.io/v1alpha1
  kind: CompositeResourceDefinition
  name: compositepostgresqlinstances.database.example.org
  resourceVersion: "3241413"
  uid: e29f6357-fe42-4a4b-9f9b-781e0efc1d96
kind: Event
lastTimestamp: "2020-09-11T01:14:02Z"
message: Rendered composite resource claim CustomResourceDefinition
metadata:
  creationTimestamp: "2020-09-10T23:09:00Z"
  name: compositepostgresqlinstances.database.example.org.16338ed5a523dfe0
  namespace: default
  resourceVersion: "3371564"
  selfLink: /api/v1/namespaces/default/events/compositepostgresqlinstances.database.example.org.16338ed5a523dfe0
  uid: 2f76e628-d381-41a3-a0e8-9132504fd16b
reason: RenderCRD
reportingComponent: ""
reportingInstance: ""
source:
  component: offered/compositeresourcedefinition.apiextensions.crossplane.io
type: Normal

The events API supports filtering using field selectors on the involvedObject API, and the event API object is like a more detailed status condition. To me this means the functionality is basically a superset of conditions, because you can also see historical transitions rather than just the current state.

@negz
Copy link
Member Author

negz commented Sep 11, 2020

From @hasheddan on crossplane/crossplane#1721:

I wonder if we could just confine synced to resources that talk to APIs outside of the cluster?

I think it's appropriate to remove Synced everywhere; even for external resources. I think they're still redundant as long as we emit an event whenever we would have set the Synced condition.

negz added a commit to crossplane/crossplane that referenced this issue Sep 11, 2020
* Bump crossplane-runtime dependency

This should switch us to CamelCase status condition reasons.

Signed-off-by: Nic Cope <negz@rk0n.org>

* Remove 'Synced' status condition from XRD controllers

crossplane/crossplane-runtime#198

See the above issue for context.

Signed-off-by: Nic Cope <negz@rk0n.org>

* Remove 'Synced' Condition from XR and XRC controllers

crossplane/crossplane-runtime#198

See the above issue for context.

Signed-off-by: Nic Cope <negz@rk0n.org>

* Use briefer, more user-focused event reasons in XRD controllers

These hopefully mean a little more than 'apply XRD' and 'delete XRD', which was
used by both the claim and the composite XRD controllers.

Signed-off-by: Nic Cope <negz@rk0n.org>

* Reintroduce conditions to indicate that XRDs are being deleted

Signed-off-by: Nic Cope <negz@rk0n.org>
@displague
Copy link
Member

displague commented Oct 8, 2021

I'm curious where this stands.
I encountered debate on status Conditions (kubernetes/community#4521) which is ultimately reflected in https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md#typical-status-properties.

I was spelunking relevant Crossplane issues and found this issue (198) and a 2019 comment crossplane/crossplane#495 (comment) where Synced was first proposed and defended.

I think they're still redundant as long as we emit an event whenever we would have set the Synced condition.

Does the 2019 comment in support of Synced offer any other guidance (if only for reasons in support of removal)?

@negz
Copy link
Member Author

negz commented Oct 11, 2021

@displague I'd still like to deprecate it but I haven't yet had time to find out whether folks are onboard - I could imagine it might be a bit controversial. In my mind Ready is fairly idiomatic - it represents the state/'condition' of the thing under management. Synced really just means "did we or did we not hit and error on the most recent reconcile". To me it seems like events (and logs) are the more idiomatic place for that kind of issue.

@MisterMX
Copy link
Contributor

MisterMX commented Jan 24, 2022

I think those conditions are quite useful. Especially in combination with additionalPrinterColumns because they allow you to quickly distinguish between reconcile errors and external resources that are not ready yet without having to kubectl describe
into the resource.

If setting these conditions does not cause any issues I would prefer to keep them. In addition to that I would propose to add the synced condition/column to XRCs and XRs as well: crossplane/crossplane#2850

@negz
Copy link
Member Author

negz commented Feb 9, 2022

https://kpt.dev/reference/schema/crd-status-convention/

More prior art - looks like the kpt folks recommend something similar.

@stale
Copy link

stale bot commented Aug 13, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix This will not be worked on label Aug 13, 2022
@negz negz removed the wontfix This will not be worked on label Jul 6, 2023
@bobh66
Copy link
Contributor

bobh66 commented Jul 22, 2024

My .02 - I like having SYNCED because it tells me whether I need to fix something to get the resource to reconcile properly and it's easy to see in kubectl get output. If all I have is READY=False and I have to pull logs or events to find out why it would be a lot more effort. The combination of SYNCED and READY makes it easy to know where to focus my attention when something isn't working properly.

Copy link

Crossplane does not currently have enough maintainers to address every issue and pull request. This issue has been automatically marked as stale because it has had no activity in the last 90 days. It will be closed in 14 days if no further activity occurs. Leaving a comment starting with /fresh will mark this issue as not stale.

@github-actions github-actions bot added the stale label Oct 21, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Nov 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request stale
Projects
None yet
Development

No branches or pull requests

5 participants