It is too easy to create duplicate caches in different projections #1660
Sorry, I don't follow. The code you are showing results in two watches, one for ReplicaSets and one for Pods in the metadata projection, but not in two watches for Pods in different projections. I am not really sure what you are trying to do here. Yes, resources in the apiserver are independent of each other, which includes watches, so no one can guarantee that if you observe a change in resource A (say, a ReplicaSet) you also observe an expected change in resource B (say, Pods). Watching the same API in different projections is technically possible, and it cannot be guaranteed that they will be identical at all times, but there really is no good reason to ever do that.
I realize that the example with Pods and ReplicaSets does not do a good job of showing the "race" issue. @wallrj and @irbekrm helped me build a minimal working example that shows the "race" issue; the code to reproduce the example below is available at https://github.com/maelvls/controller-runtime-cache-race. In this example, I create a controller that adds the
In normal circumstances (no race), we can see in the logs that the two reflectors
When a race occurs, the two reflectors do not process (or receive, I'm not sure) the event
I totally agree: I don't think there is a good reason to use two different projections for the same type. My point is rather that there should be more warnings in the documentation about the issues you may encounter when mixing projections for the same type, and that doing so is unsupported by controller-runtime. @wallrj and I wrote a controller that was mixing the metadata and "concrete" projections for the Secret resource, and the only reason we know this is that we hit this "race" in our integration tests.
It is still unclear to me what the motivation could be for doing that. It is wasteful for the apiserver and for your component: you are essentially requesting the same data multiple times (in one case only a subset of it). What is your reason for doing that? I don't really see a point in warning people about an issue they might encounter after creating duplicate caches in different projections, because that shouldn't be done in the first place. If anything, we should make it harder to create duplicate caches.
We had no motivation or purpose in mixing the metadata and concrete projections. We did not know that we were mixing things; we just wanted to use the metadata projection. We did not realize that the line `_ = mgr.GetClient().Get(context.Background(), r.NamespaceAndName, secret)` meant that another cache would get created. The symptom of this secondary cache is a race.
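To make the trap concrete, here is a minimal sketch (hypothetical function and variable names, not the controller from this thread) of how the cache-backed client returned by `mgr.GetClient()` picks an informer based on the Go type passed to `Get`:

```go
package example

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// twoProjections shows how the cache-backed client chooses which informer
// serves a read based on the Go type passed to Get: these two reads of the
// same Secret are backed by two separate caches.
func twoProjections(ctx context.Context, c client.Client, key client.ObjectKey) {
	// Read 1: served by (and lazily starting) the metadata-only Secret informer.
	secretMeta := &metav1.PartialObjectMetadata{}
	secretMeta.SetGroupVersionKind(corev1.SchemeGroupVersion.WithKind("Secret"))
	_ = c.Get(ctx, key, secretMeta)

	// Read 2: served by (and lazily starting) a second informer that caches
	// full Secret objects. This is the accidental duplicate cache.
	secret := &corev1.Secret{}
	_ = c.Get(ctx, key, secret)
}
```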
I agree. Users should be made aware when they create two caches, and having two caches of the same object in different projections is not supported by controller-runtime (because of the cache duplication it creates as well as the risk of data races). How can I help?
Ah ok, that is the bit I was missing :)

/retitle It is too easy to create duplicate caches in different projections

So at the end of the day this is a programmer error and not something we can detect at compile time. We could add a check for this into pkg/cache and error out if someone requests something that we already have in a different projection?
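For illustration, a rough sketch of what such a guard could look like (hypothetical code, not what eventually landed in controller-runtime): remember the first projection each GVK was requested in, and error out on a later request in a different projection:

```go
package example

import (
	"fmt"
	"sync"

	"k8s.io/apimachinery/pkg/runtime/schema"
)

type projection string

const (
	projectionConcrete projection = "concrete"
	projectionMetadata projection = "metadata"
)

// projectionGuard is a hypothetical guard that pkg/cache could consult
// before starting an informer.
type projectionGuard struct {
	mu   sync.Mutex
	seen map[schema.GroupVersionKind]projection
}

// check records the projection used for gvk and fails if the same GVK was
// previously requested in a different projection.
func (g *projectionGuard) check(gvk schema.GroupVersionKind, p projection) error {
	g.mu.Lock()
	defer g.mu.Unlock()
	if g.seen == nil {
		g.seen = map[schema.GroupVersionKind]projection{}
	}
	if prev, ok := g.seen[gvk]; ok && prev != p {
		return fmt.Errorf("%s is already cached in the %s projection; refusing to start a duplicate %s cache", gvk, prev, p)
	}
	g.seen[gvk] = p
	return nil
}
```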
There is another similar condition when someone tries to get/list/watch both typed and unstructured objects with the same GVK.
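The same duplication, sketched for the typed vs. unstructured case (hypothetical snippet): both reads target the v1 ConfigMap GVK, but each Go representation is backed by its own informer:

```go
package example

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

func typedAndUnstructured(ctx context.Context, c client.Client, key client.ObjectKey) {
	// Informer 1: typed ConfigMaps.
	cm := &corev1.ConfigMap{}
	_ = c.Get(ctx, key, cm)

	// Informer 2: unstructured ConfigMaps; same GVK, but a separate cache.
	u := &unstructured.Unstructured{}
	u.SetGroupVersionKind(corev1.SchemeGroupVersion.WithKind("ConfigMap"))
	_ = c.Get(ctx, key, u)
}
```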
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
I'd be happy to help add this check; I will open a PR.
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale
Fixed in #1747.

/close
@maelvls: Closing this issue. In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Since v0.7.0, it has become possible to use a "metadata-only" cache (#1174). One caveat that is (I think) not highlighted enough in the Go comments is that you may end up writing racy code when mixing multiple types of clients.
For example, the following snippet watches the same type (v1.Pod) using two watchers:
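(The author's original snippet is not preserved here; below is a minimal reconstruction of the kind of setup being described, assuming `builder.OnlyMetadata` from controller-runtime v0.9+ and hypothetical type names.)

```go
package main

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/builder"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

type podReconciler struct {
	client client.Client
}

func (r *podReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	// Watcher 2: a Get with a concrete *corev1.Pod lazily starts a second
	// informer that caches full Pod objects.
	var pod corev1.Pod
	if err := r.client.Get(ctx, req.NamespacedName, &pod); err != nil {
		// If the metadata cache delivered the event before the concrete
		// cache caught up, this can be a spurious "not found".
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}
	return ctrl.Result{}, nil
}

func main() {
	mgr, err := ctrl.NewManager(ctrl.GetConfigOrDie(), ctrl.Options{})
	if err != nil {
		panic(err)
	}

	// Watcher 1: Pods in the metadata-only projection.
	if err := ctrl.NewControllerManagedBy(mgr).
		For(&corev1.Pod{}, builder.OnlyMetadata).
		Complete(&podReconciler{client: mgr.GetClient()}); err != nil {
		panic(err)
	}

	if err := mgr.Start(ctrl.SetupSignalHandler()); err != nil {
		panic(err)
	}
}
```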
A simple use case could be reconciling ReplicaSets whenever a Pod changes.
When a Pod changes, the apiserver sends a `MODIFIED` event twice, since two watchers are running (one is watching `metav1.PartialObjectMetadata`, the other `v1.Pod`). This means that the metadata cache may get updated first, triggering the reconciliation, while the event has not yet been propagated to the second cache. This race may lead to a "not found" error in the `Get` call above. The issue #1454 also talks about this unexpected behavior.

Update: I had no motivation or purpose in mixing the metadata and concrete projections. I did not know that I was mixing things; I just wanted to use the metadata projection.
Does it make sense to document this caveat?