Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pkg/resource: applicator breaks optimistic concurrency and overwrites values #483

Open
sttts opened this issue Jul 14, 2023 · 2 comments · May be fixed by #491
Open

pkg/resource: applicator breaks optimistic concurrency and overwrites values #483

sttts opened this issue Jul 14, 2023 · 2 comments · May be fixed by #491
Labels
bug Something isn't working stale

Comments

@sttts
Copy link
Contributor

sttts commented Jul 14, 2023

What happened?

Looking at the claim controller, updating the XR (cp in the code), the following race with other actors in the system (controllers, user with kubectl) exists:

  1. reconciler gets latest XR state cp
  2. reconciler updates in the cp whatever it wants to update
  3. reconciler calls Apply with cp

Assuming between 1 and 3a some client updates the XR out of band, e.g. some other controller, or kubectl. 3c will wipe out that change, won't it?

Above is just one example for that issue. The apply func is widely used in Crossplane.

How can we reproduce it?

This will likely show up as subtle bugs as this is some kind of time travel, or controllers fighting for their desired state of the world and racing with their informers.

What environment did it happen in?

Crossplane version: main

@sttts sttts added the bug Something isn't working label Jul 14, 2023
@negz negz transferred this issue from crossplane/crossplane Jul 14, 2023
@negz
Copy link
Member

negz commented Jul 14, 2023

Thanks for spotting this @sttts. At first glance I agreed with your assessment, but after digging a little deeper I think we're okay. Or more specifically, I don't think we're breaking optimistic concurrency anywhere that it matters much.

As described in crossplane/crossplane#4047 we have two applicator implementations - APIUpdatingApplicator and APIPatchingApplicator. I just did a search to refresh my memory on all the places we use these.

APIUpdatingApplicator

https://github.com/search?q=org%3Acrossplane%20NewAPIUpdatingApplicator&type=code

The APIUpdatingApplicator definitely does break optimistic concurrency. This is dangerous, but I believe we only use it in places today where that's inconsequential. We use it to intentionally clobber other state, in cases where a resource Crossplane owns should always be synced from the state of another resource and where we don't expect other writers. In all cases we don't use a get-mutate-apply pattern - we only use it to build desired state (a "sparse object") and apply it.

I found the following uses of APIUpdatingApplicator:

  • RBAC manager controllers - use it to render and apply RBAC Roles and ClusterRoles.
  • XRD controller - uses it to render and apply the CRDs that power XRs and XRCs.
  • ProviderConfig controller - uses it to render and apply a ProviderConfigUsage.
  • Claim controller - uses it to render and apply a connection secret.

In these cases I wouldn't expect other clients or controllers to be writing to the resources Crossplane applies, and if they did I think it's reasonable that Crossplane would undo their changes.

APIPatchingApplicator

https://github.com/search?q=org%3Acrossplane+NewAPIPatchingApplicator&type=code

The example you point to in the claim controller uses the APIPatchingApplicator. I think this applicator does actually respect the resource version of the applied resource though, per #258. We send a "merge patch" which is really just our entire serialized desired state as the HTTP PATCH request body to the API server. So if we did a get-mutate-apply, I believe our HTTP request would contain the metadata.resourceVersion that we read and be rejected by the API server if it was stale. So, I think we're safe from data-loss in this much more widely used applicator.

That said, it will lead to a lot of retries and noise (per crossplane/crossplane#2472), and can't delete fields that we no longer want (per crossplane/crossplane#4047). I'd still like to replace it with something better.

I found the following uses of APIPatchingApplicator:

  • The core init container - uses it to render and apply CRDs, package Lock, and optionally packages.
  • Package reconciler - Uses it to render and apply package revisions.
  • Kubernetes secret stores - Use it to render and apply connection secrets.
  • Managed resource reconciler - Uses it to render and apply connection secrets.
  • XR controller - Uses it to render and apply the XR and any composed resources during Composition.
  • Claim controller - Uses it to render and apply the XR and claim.

@sttts sttts linked a pull request Jul 31, 2023 that will close this issue
3 tasks
Copy link

github-actions bot commented Sep 3, 2024

Crossplane does not currently have enough maintainers to address every issue and pull request. This issue has been automatically marked as stale because it has had no activity in the last 90 days. It will be closed in 14 days if no further activity occurs. Leaving a comment starting with /fresh will mark this issue as not stale.

@github-actions github-actions bot added the stale label Sep 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working stale
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants