Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

⚠️ Client: Ensure no stale data remains in target object #1640

Merged
merged 1 commit into from
Aug 26, 2021

Conversation

alvaroaleman
Copy link
Member

Fixes #1639

The json deserializer of the stdlib and the one from Kube which aims to
be compatible won't zero out all field types in the object it
deserializes into, for example it lets slices be if the json does not
contain that field. This means that if a non-empty variable is used for
any api call with the client, the resulting content might be a mixture
of previous content and what is on the server.

This PR adds a wrapper around the deserializer that will first zero the
target object.

/assign @joelanford @vincepri

Fixes kubernetes-sigs#1639

The json deserializer of the stdlib and the one from Kube which aims to
be compatible won't zero out all field types in the object it
deserializes into, for example it lets slices be if the json does not
contain that field. This means that if a non-empty variable is used for
any api call with the client, the resulting content might be a mixture
of previous content and what is on the server.

This PR adds a wrapper around the deserializer that will first zero the
target object.
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Aug 22, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Aug 22, 2021
@vincepri
Copy link
Member

Shouldn't clients make sure that the object passed in is clean?

@alvaroaleman
Copy link
Member Author

Shouldn't clients make sure that the object passed in is clean?

Ideally yes, realistically there are a bunch of situation where that might not happen, for example in "Wait for change" loops. An extreme example of this is Update, if for example a mutatingwebhook would patch a slice out, the returned object would still have it (as we re-use the passed-in object to decode the result into).

@coderanger
Copy link
Contributor

Is the overhead small enough to handle this unconditionally? We could probably get away with a heuristic like skipping it if .Name == "" but I do like how clean the code is so if we think it's fast enough, maybe just leave it as is.

@alvaroaleman
Copy link
Member Author

Is the overhead small enough to handle this unconditionally? We could probably get away with a heuristic like skipping it if .Name == "" but I do like how clean the code is so if we think it's fast enough, maybe just leave it as is.

I haven't done any benchmarking so I don't know how much performance overhead exactly this entails. I did check the generated clients though, which always and uncoditionally allocate a new object to serialize into.

It should also be noted that this only affects the api client. I would think that its generally unlikely someone makes so many mutating request that the performance overhead of this becomes noticeable and for reading requests, it would make sense to use the cache-backed client if one cares about performance. That one doesn't have this issue, as it doesn't deserialize json into the passed object.

@vincepri
Copy link
Member

Not too worried about performance impacts of this change, but more of masking user errors. This change is definitely a behavioral change in the client code (should be marked as a breaking change), although I'm not sure if we should zero out objects altogether for users.

Shouldn't our users be responsible to either clean out the object or just use a new one?

@alvaroaleman
Copy link
Member Author

alvaroaleman commented Aug 23, 2021

Not too worried about performance impacts of this change, but more of masking user errors. This change is definitely a behavioral change in the client code (should be marked as a breaking change), although I'm not sure if we should zero out objects altogether for users.

It is a change in behavior (all bugfixes are), but I can not think of an example where someone would want existing data plus the response to be mixed in an object which led me to the conclusion that this is not going to break anyones code, hence it is marked as bugfix.

Shouldn't our users be responsible to either clean out the object or just use a new one?

I don't think we have anything to that effect documented. It also isn't really possible to test this for an existing codebase. I can say that I have written code that re-uses a variable to fetch data to avoid unnecessary allocations (And assuming wrongly that the deserialization would make sure the object gets cleared)

@joelanford
Copy link
Member

/lgtm

/hold
For @vincepri in case he's got a legit example of this being a breaking change.

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. lgtm "Looks good to me", indicates that a PR is ready to be merged. labels Aug 23, 2021
@alvaroaleman alvaroaleman changed the title 🐛 Client: Ensure no stale data remains in target object ⚠️ Client: Ensure no stale data remains in target object Aug 26, 2021
@alvaroaleman
Copy link
Member Author

Marked it as breaking as it might make tests start failing that (incorrectly) passed before.

@vincepri
Copy link
Member

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Aug 26, 2021
@k8s-ci-robot k8s-ci-robot merged commit 8f9d0df into kubernetes-sigs:master Aug 26, 2021
@k8s-ci-robot k8s-ci-robot added this to the v0.10.x milestone Aug 26, 2021
@alvaroaleman
Copy link
Member Author

Just testing something...
/cherrypick release-0.9

@alvaroaleman
Copy link
Member Author

Lets give this another shot...
/cherrypick release-0.9

@alvaroaleman
Copy link
Member Author

Lets try again:
/cherrypick release-0.9

@alvaroaleman
Copy link
Member Author

/cherrypick release-0.9

@k8s-ci-robot
Copy link
Contributor

@alvaroaleman: new pull request created: #1658

In response to this:

/cherrypick release-0.9

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@spiffxp
Copy link
Member

spiffxp commented Sep 10, 2021

/cherrypick release-0.9
testing again to see if the new bot is used

@spiffxp
Copy link
Member

spiffxp commented Sep 10, 2021

/cherrypick release-0.9
now that kubernetes/test-infra#23556 landed

@k8s-infra-cherrypick-robot

@spiffxp: new pull request created: #1661

In response to this:

/cherrypick release-0.9
now that kubernetes/test-infra#23556 landed

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

darkowlzz added a commit to darkowlzz/image-automation-controller that referenced this pull request Oct 1, 2021
In controller-runtime v0.10.0, the client is updated to clean any stale
data in the target object when performing any operation. This results in
test failure for the code that constructs an object with both spec and
status, and creates the object and updates status it with the same
object. The fix is to set the status separately on the object before
updating it.

Refer: kubernetes-sigs/controller-runtime#1640

Signed-off-by: Sunny <darkowlzz@protonmail.com>
timebertt added a commit to timebertt/gardener that referenced this pull request Oct 14, 2021
Now that the c-r client zeroes fields before decoding into the object,
we can drop our workarounds for this, so basically drop
kutil.CreateResetObjectFunc and its usages.

ref kubernetes-sigs/controller-runtime#1640
timebertt added a commit to gardener/gardener that referenced this pull request Oct 14, 2021
* Upgrade to k8s.io/*@v0.22.2 in go.mod

* [automated] make revendor

* [automated] make generate

* [automated] make revendor

github.com/go-openapi/spec seems to be orphaned after previous make generate

* Upgrade to c-r@v0.10.2 in go.mod

Also, upgrade setup-envtest (doesn't have a tagged release yet, so use
release commit instead)

* [automated] make revendor

* Upgrade to controller-tools@v0.7.0 in go.mod

* [automated] make revendor

* Add missing WarningsOn{Create,Update} to rest strategies

* Replace dot imports for github.com/onsi/gomega/types

Fix linting errors: `Assertion` redeclared in this block (typecheck)

* Switch to typed values for WebhookInstallOptions.*Webhooks

ref kubernetes-sigs/controller-runtime#1626

* RequestCertificate now takes an optional requestedDuration

ref kubernetes/kubernetes#99494

* Switch to matchers.DeepEqual to test semantic equality

Maps (e.g. labels, selectors, resource requirements) might be sorted differently
than expected. Hence, use semantic equality instead of strict equality, as this
is what matters to us.
Also, DeepEqual outputs yaml and adds a nice diff indicator instead of printing
some large confusing go struct representation.

* Add new memorySwap field to expected kubelet config

ref kubernetes/kubernetes#102823

* Round condition.lastUpdateTime to seconds in test

There were several changes in the fake clients that might cause the failure
to happen just now.

* Correct unit tests falsely succeeding

These tests were not preparing the test objects correctly: they only updated
them in memory but not on the fake client. This wasn't caught until now
because the fake client mimicked the real json decoder, which didn't unset
fields not present on the server. Now that the fake client zeroes fields,
the tests started failing (which is correct). So fix the tests.

ref kubernetes-sigs/controller-runtime#1651

* Remove workarounds for missing zeroing in json decoder

Now that the c-r client zeroes fields before decoding into the object,
we can drop our workarounds for this, so basically drop
kutil.CreateResetObjectFunc and its usages.

ref kubernetes-sigs/controller-runtime#1640

* Drop setting webhook gvk explicitly in envtest

webhookConfig.SetGroupVersionKind is not needed anymore with
kubernetes-sigs/controller-runtime#1665

* Add some follow-up TODO comments

* [automated] make generate

but with go 1.16.9

* Address review comments
darkowlzz added a commit to darkowlzz/image-automation-controller that referenced this pull request Oct 20, 2021
In controller-runtime v0.10.0, the client is updated to clean any stale
data in the target object when performing any operation. This results in
test failure for the code that constructs an object with both spec and
status, and creates the object and updates status it with the same
object. The fix is to set the status separately on the object before
updating it.

Refer: kubernetes-sigs/controller-runtime#1640

Signed-off-by: Sunny <darkowlzz@protonmail.com>
darkowlzz added a commit to darkowlzz/image-automation-controller that referenced this pull request Oct 20, 2021
In controller-runtime v0.10.0, the client is updated to clean any stale
data in the target object when performing any operation. This results in
test failure for the code that constructs an object with both spec and
status, and creates the object and updates status it with the same
object. The fix is to set the status separately on the object before
updating it.

Refer: kubernetes-sigs/controller-runtime#1640

Signed-off-by: Sunny <darkowlzz@protonmail.com>
darkowlzz added a commit to darkowlzz/image-automation-controller that referenced this pull request Nov 11, 2021
In controller-runtime v0.10.0, the client is updated to clean any stale
data in the target object when performing any operation. This results in
test failure for the code that constructs an object with both spec and
status, and creates the object and updates status it with the same
object. The fix is to set the status separately on the object before
updating it.

Refer: kubernetes-sigs/controller-runtime#1640

Signed-off-by: Sunny <darkowlzz@protonmail.com>
thbkrkr pushed a commit to elastic/cloud-on-k8s that referenced this pull request Nov 18, 2021
And fix broken unit tests:
- elasticsearch/driver.TestUpgradePodsDeletion_WithNodeTypeMutations
- elasticsearch/driver.TestUpgradePodsDeletion_Delete
- enterprisesearch.TestReconcileEnterpriseSearch_doReconcile_AssociationDelaysVersionUpgrade
- license/trial.TestReconcileTrials_Reconcile/valid_trial_secret_inits_trial

With:
- Set resource version in newTestPod
- Do not reuse the same variable to get trial/license secrets
- Always SetAssociationConf before doReconcile

Because of:
- Client: Ensure no stale data remains in target object kubernetes-sigs/controller-runtime#1640
- Fakeclient: Reject Delete with mismatched ResourceVersion kubernetes-sigs/controller-runtime#1582
darkowlzz added a commit to fluxcd/image-automation-controller that referenced this pull request Nov 30, 2021
In controller-runtime v0.10.0, the client is updated to clean any stale
data in the target object when performing any operation. This results in
test failure for the code that constructs an object with both spec and
status, and creates the object and updates status it with the same
object. The fix is to set the status separately on the object before
updating it.

Refer: kubernetes-sigs/controller-runtime#1640

Signed-off-by: Sunny <darkowlzz@protonmail.com>
krgostev pushed a commit to krgostev/gardener that referenced this pull request Apr 21, 2022
* Upgrade to k8s.io/*@v0.22.2 in go.mod

* [automated] make revendor

* [automated] make generate

* [automated] make revendor

github.com/go-openapi/spec seems to be orphaned after previous make generate

* Upgrade to c-r@v0.10.2 in go.mod

Also, upgrade setup-envtest (doesn't have a tagged release yet, so use
release commit instead)

* [automated] make revendor

* Upgrade to controller-tools@v0.7.0 in go.mod

* [automated] make revendor

* Add missing WarningsOn{Create,Update} to rest strategies

* Replace dot imports for github.com/onsi/gomega/types

Fix linting errors: `Assertion` redeclared in this block (typecheck)

* Switch to typed values for WebhookInstallOptions.*Webhooks

ref kubernetes-sigs/controller-runtime#1626

* RequestCertificate now takes an optional requestedDuration

ref kubernetes/kubernetes#99494

* Switch to matchers.DeepEqual to test semantic equality

Maps (e.g. labels, selectors, resource requirements) might be sorted differently
than expected. Hence, use semantic equality instead of strict equality, as this
is what matters to us.
Also, DeepEqual outputs yaml and adds a nice diff indicator instead of printing
some large confusing go struct representation.

* Add new memorySwap field to expected kubelet config

ref kubernetes/kubernetes#102823

* Round condition.lastUpdateTime to seconds in test

There were several changes in the fake clients that might cause the failure
to happen just now.

* Correct unit tests falsely succeeding

These tests were not preparing the test objects correctly: they only updated
them in memory but not on the fake client. This wasn't caught until now
because the fake client mimicked the real json decoder, which didn't unset
fields not present on the server. Now that the fake client zeroes fields,
the tests started failing (which is correct). So fix the tests.

ref kubernetes-sigs/controller-runtime#1651

* Remove workarounds for missing zeroing in json decoder

Now that the c-r client zeroes fields before decoding into the object,
we can drop our workarounds for this, so basically drop
kutil.CreateResetObjectFunc and its usages.

ref kubernetes-sigs/controller-runtime#1640

* Drop setting webhook gvk explicitly in envtest

webhookConfig.SetGroupVersionKind is not needed anymore with
kubernetes-sigs/controller-runtime#1665

* Add some follow-up TODO comments

* [automated] make generate

but with go 1.16.9

* Address review comments
krgostev pushed a commit to krgostev/gardener that referenced this pull request Jul 5, 2022
* Upgrade to k8s.io/*@v0.22.2 in go.mod

* [automated] make revendor

* [automated] make generate

* [automated] make revendor

github.com/go-openapi/spec seems to be orphaned after previous make generate

* Upgrade to c-r@v0.10.2 in go.mod

Also, upgrade setup-envtest (doesn't have a tagged release yet, so use
release commit instead)

* [automated] make revendor

* Upgrade to controller-tools@v0.7.0 in go.mod

* [automated] make revendor

* Add missing WarningsOn{Create,Update} to rest strategies

* Replace dot imports for github.com/onsi/gomega/types

Fix linting errors: `Assertion` redeclared in this block (typecheck)

* Switch to typed values for WebhookInstallOptions.*Webhooks

ref kubernetes-sigs/controller-runtime#1626

* RequestCertificate now takes an optional requestedDuration

ref kubernetes/kubernetes#99494

* Switch to matchers.DeepEqual to test semantic equality

Maps (e.g. labels, selectors, resource requirements) might be sorted differently
than expected. Hence, use semantic equality instead of strict equality, as this
is what matters to us.
Also, DeepEqual outputs yaml and adds a nice diff indicator instead of printing
some large confusing go struct representation.

* Add new memorySwap field to expected kubelet config

ref kubernetes/kubernetes#102823

* Round condition.lastUpdateTime to seconds in test

There were several changes in the fake clients that might cause the failure
to happen just now.

* Correct unit tests falsely succeeding

These tests were not preparing the test objects correctly: they only updated
them in memory but not on the fake client. This wasn't caught until now
because the fake client mimicked the real json decoder, which didn't unset
fields not present on the server. Now that the fake client zeroes fields,
the tests started failing (which is correct). So fix the tests.

ref kubernetes-sigs/controller-runtime#1651

* Remove workarounds for missing zeroing in json decoder

Now that the c-r client zeroes fields before decoding into the object,
we can drop our workarounds for this, so basically drop
kutil.CreateResetObjectFunc and its usages.

ref kubernetes-sigs/controller-runtime#1640

* Drop setting webhook gvk explicitly in envtest

webhookConfig.SetGroupVersionKind is not needed anymore with
kubernetes-sigs/controller-runtime#1665

* Add some follow-up TODO comments

* [automated] make generate

but with go 1.16.9

* Address review comments
fantapsody pushed a commit to fantapsody/cloud-on-k8s that referenced this pull request Feb 7, 2023
And fix broken unit tests:
- elasticsearch/driver.TestUpgradePodsDeletion_WithNodeTypeMutations
- elasticsearch/driver.TestUpgradePodsDeletion_Delete
- enterprisesearch.TestReconcileEnterpriseSearch_doReconcile_AssociationDelaysVersionUpgrade
- license/trial.TestReconcileTrials_Reconcile/valid_trial_secret_inits_trial

With:
- Set resource version in newTestPod
- Do not reuse the same variable to get trial/license secrets
- Always SetAssociationConf before doReconcile

Because of:
- Client: Ensure no stale data remains in target object kubernetes-sigs/controller-runtime#1640
- Fakeclient: Reject Delete with mismatched ResourceVersion kubernetes-sigs/controller-runtime#1582
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

client.Get() does not zero out slice fields on CRD objects
7 participants