Implement an Informer in python-client #868
Comments
Philosophical consensus: should the openapi-generated client libraries contain implementations of higher-level abstractions like this? |
/cc |
Any updates on this issue? |
I wrote a SharedIndexInformer in Python, about 4000 lines of code including tests, by directly translating it, its dependencies, and its unit tests from client-go. I could share it but would want to be sure licensing/copyright isn't an issue; maybe it's ok as long as the "Kubernetes Authors" license header is on each translated file and the license is Apache? Also, I'm not sure if it's fit for this client:
Is there any interest in such a translation? Would licensing be a concern? |
@alanjcastonguay if you aren't actively working on this, it may make sense to un-assign yourself from this issue. That way any would-be contributors won't have to wonder whether work is in progress before making their own attempt. |
Issues go stale after 90d of inactivity. If this issue is safe to close now please do so with Send feedback to sig-testing, kubernetes/test-infra and/or fejta. |
There's a lot at https://github.com/tbarrella/aiok8s, although I realized I probably won't be able to maintain it right now... |
/lifecycle frozen |
I'm working on hacking together an informer-like object for kubernetes_asyncio. If it turns into something useful I'll publish it and link it here for reference. Regarding the comment above:
Have there been any discussions about this? |
IMO it should be a part of the library. We already have such high-level abstractions like recently added Leader-Election or applying manifests from yaml files etc. |
kopf has a watch implementation that is claimed to be informer-equivalent, which may be worth looking into. |
fwiw (since I just noticed this issue) there are Informers for both the Java (https://github.com/kubernetes-client/java/tree/master/util/src/main/java/io/kubernetes/client/informer) and JavaScript (https://github.com/kubernetes-client/javascript/blob/master/src/informer.ts) client libraries. |
I'm kind of surprised by how long this has been open, given that the Python client is one of the officially supported ones for K8s. I had naively expected that official support would imply approximate feature parity across the different languages, but in this case it does not. (All the other officially supported client libraries, except the Haskell one, have Informer support.) Also, the difference between repeated LIST calls and an Informer can be significant in terms of API server load and performance. Does anyone on this thread have any updates on whether this issue may be resolved soon? |
any updates on this? |
Is this going to be accepted? |
any updates on this? |
From time to time the watcher connection drops, which leaves the workflow hanging forever, waiting for an event that will never arrive. When we catch this exception: ``` Unexpected error: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read)) ``` we re-establish the watcher's connection. We also configured a timeout of 1m for each watcher. Once informers are implemented in the Kubernetes Python library, switching to them will be more robust. This is tracked here: kubernetes-client/python#868. Signed-off-by: Sébastien Han <seb@redhat.com>
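The reconnect-on-drop workaround described in this commit message can be sketched roughly as below. This is only an illustration, not the code from that commit: `resilient_watch`, `open_stream`, `retry_on`, `max_restarts`, and `backoff_seconds` are hypothetical names, and the stream is injected as a plain callable so the sketch does not depend on any particular client API. A real watcher would also want to resume from the last seen resourceVersion, which this sketch leaves out.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple, Type


def resilient_watch(
    open_stream: Callable[[], Iterable[dict]],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    max_restarts: Optional[int] = None,
    backoff_seconds: float = 1.0,
) -> Iterator[dict]:
    """Yield events from open_stream(), reopening it when it ends or breaks.

    open_stream is a hypothetical callable that opens one watch connection
    (ideally with its own server-side timeout) and yields event dicts.
    """
    restarts = 0
    while True:
        try:
            for event in open_stream():
                yield event
        except retry_on:
            # The connection broke mid-stream; fall through and reconnect.
            pass
        # Reaching this point means the stream timed out, ended, or broke.
        restarts += 1
        if max_restarts is not None and restarts > max_restarts:
            return
        time.sleep(backoff_seconds)
```

With `max_restarts=None` the generator reconnects forever, which matches the "never hang on a dead connection" intent; the per-connection timeout itself is expected to live inside `open_stream`.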
any updates on this? |
It would be really nice to have informers and workqueues in Python maintained by the Kubernetes project. |
I started to look at the kopf code. Here are some things that I found. It uses WATCH as much as possible and LIST otherwise, like a client-go informer. I see this in https://github.com/nolar/kopf/blob/1.37.4/kopf/_cogs/clients/watching.py . The queuing and processing differ from the workqueue used in client-go based controllers. The heart of the matter seems to be in https://github.com/nolar/kopf/blob/1.37.4/kopf/_core/reactor/queueing.py . The approach there is a coroutine per recently-received object, each with an incoming queue of full object values delivered from watching/listing. This is less efficient than in Golang controllers, and I do not see how to implement a controller like the one for StatefulSet, where a watch/list event about a Pod needs to enqueue a reference to the StatefulSet. |
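For readers unfamiliar with the pattern referred to here, the following is a rough sketch of a queue that holds references (keys) rather than full objects, and of a Pod event enqueuing its owning StatefulSet. It is a simplification, not client-go's workqueue (which also tracks in-flight items and rate limiting); `DedupQueue`, `owner_key`, and `on_pod_event` are hypothetical names, and events are assumed to be plain dicts with `type` and `object` fields.

```python
from collections import deque
from typing import Deque, Optional, Set, Tuple

Key = Tuple[str, str, str]  # (kind, namespace, name)


class DedupQueue:
    """FIFO of keys; a key that is already waiting is not added twice."""

    def __init__(self) -> None:
        self._order: Deque[Key] = deque()
        self._pending: Set[Key] = set()

    def add(self, key: Key) -> None:
        if key not in self._pending:
            self._pending.add(key)
            self._order.append(key)

    def get(self) -> Optional[Key]:
        """Pop the oldest key, or return None when the queue is empty."""
        if not self._order:
            return None
        key = self._order.popleft()
        self._pending.discard(key)
        return key

    def __len__(self) -> int:
        return len(self._order)


def owner_key(pod: dict, owner_kind: str = "StatefulSet") -> Optional[Key]:
    """Return the key of the Pod's owner of the given kind, if it has one."""
    meta = pod.get("metadata", {})
    for ref in meta.get("ownerReferences", []):
        if ref.get("kind") == owner_kind:
            return (owner_kind, meta.get("namespace", ""), ref["name"])
    return None


def on_pod_event(queue: DedupQueue, event: dict) -> None:
    """Enqueue a reference to the owning StatefulSet, not the Pod itself."""
    key = owner_key(event["object"])
    if key is not None:
        queue.add(key)
```

Because only keys are queued, a burst of events for many Pods of one StatefulSet collapses into a single pending work item, and the worker reads the latest state from the cache when it picks the key up.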
Thanks for mentioning me. Sorry, I didn’t fully get what the problem statement is that I should comment on. Kopf indeed has a few patterns to implement things that keep the latest state of the object available in memory, e.g. indexers. If there is some activity involved, daemons/timers can be used. I forgot the terminology, but I guess Kopf’s indexers are the closest concept to Go’s informers, as I understood them back then. You are absolutely correct about how the core of Kopf works: it consumes a stream of JSON lines for a resource list, then multiplexes that into per-object tasks (coroutines). That is done intentionally so that no single event (JSON line) is ever missed, e.g. in the on-event handlers. Then either some implicit logic happens (for indexers, or daemons’ body-views, etc.), or the handlers are explicitly invoked. To address the point on “less efficient”, I would need to know what is more efficient and how Go’s controllers work, which I do not know, sadly. If you hint at the cause of this inefficiency, I might have an idea how to solve it. Worth noting that in Python, especially in asynchronous Python, all objects are passed by reference under the hood, so the queue actually holds references to those JSON structures, not the whole things. And it is assumed that the handlers are fast, so the queues never clog. CPU-heavy processing must be moved to threads. If only the latest state of the object is needed in memory, see the indexers/daemons/timers mentioned above.
Make an indexer on StatefulSets (with or without filters), access it from the events of Pods? The lookup key can be e.g. a StatefulSet’s namespace/name, or any other tuple. I hope this helps. And I will be happy to help for more specific questions. Feel free to reach out to me by social media or email if you have specifics of this task to discuss — so that we do not overshare and overflood this issue for this library here ;-) |
https://github.com/kubernetes/client-go has an Informer implementation. Internally it leverages a watcher of some collection of resources, continually streams changes (add/modify/delete events), reflects the resources into a downstream store (cache), handles connectivity drops, and periodically does a full resync. This all happens on some background thread (goroutine). A client of the informer is free to iterate over that stored cache without concern for how it's populated, and immediately get (possibly outdated) state.
Applications using https://github.com/kubernetes-client/python that want a local store of resources reflecting some in-cluster state need to concern themselves with those lower-level details. There's a lot of room for error.
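To make the moving parts concrete, here is a minimal sketch of the list-then-watch loop described above, assuming the list and watch calls are injected as plain callables. It is not client-go's implementation and not an API of this library: `MiniInformer`, `list_fn`, `watch_fn`, `max_cycles`, and `backoff_seconds` are hypothetical names, and re-listing whenever the watch ends or breaks stands in for both the connectivity handling and the periodic resync.

```python
import threading
from typing import Callable, Dict, Iterable, List, Optional, Tuple

ListFn = Callable[[], Tuple[List[dict], str]]  # returns (items, resource_version)
WatchFn = Callable[[str], Iterable[dict]]      # resource_version -> event dicts


def key_of(obj: dict) -> str:
    meta = obj["metadata"]
    return f'{meta.get("namespace", "")}/{meta["name"]}'


class MiniInformer:
    """Keeps a local store in sync on a background thread (sketch only)."""

    def __init__(self, list_fn: ListFn, watch_fn: WatchFn,
                 max_cycles: Optional[int] = None,
                 backoff_seconds: float = 1.0) -> None:
        self._list_fn = list_fn
        self._watch_fn = watch_fn
        self._max_cycles = max_cycles  # None means run until stop()
        self._backoff = backoff_seconds
        self._lock = threading.Lock()
        self._store: Dict[str, dict] = {}
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()

    def join(self, timeout: Optional[float] = None) -> None:
        self._thread.join(timeout)

    def snapshot(self) -> Dict[str, dict]:
        """Return a copy of the (possibly slightly outdated) cached state."""
        with self._lock:
            return dict(self._store)

    def _run(self) -> None:
        cycles = 0
        while not self._stop.is_set():
            if self._max_cycles is not None and cycles >= self._max_cycles:
                return
            cycles += 1
            try:
                # Full list replaces the store, then watch applies deltas.
                items, version = self._list_fn()
                with self._lock:
                    self._store = {key_of(o): o for o in items}
                for event in self._watch_fn(version):
                    if self._stop.is_set():
                        return
                    self._apply(event)
            except ConnectionError:
                pass  # watch dropped; the next cycle re-lists from scratch
            self._stop.wait(self._backoff)

    def _apply(self, event: dict) -> None:
        obj = event["object"]
        with self._lock:
            if event["type"] == "DELETED":
                self._store.pop(key_of(obj), None)
            else:  # ADDED or MODIFIED
                self._store[key_of(obj)] = obj
```

A consumer would call `start()` once and then read `snapshot()` whenever it needs the current view, without caring how the store is populated.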
On 2019-06-25, go#28 added a simple informer implementation to the openapi-generated client for Golang. It defines a `Cache` struct, with both a list of all objects and event handler callbacks that a consumer could register. https://github.com/kubernetes-client/python should contain a similar implementation.
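As a rough idea of what a similar shape could look like in Python, here is a sketch of a cache that holds all known objects and lets a consumer register add/update/delete callbacks. It does not mirror the field or method names of the Go implementation; `Cache`, `add_handlers`, and `handle` here are hypothetical, events are assumed to be dicts with `type` and `object` fields, and the class is not thread-safe.

```python
from typing import Callable, Dict, List, Optional

Handler = Callable[[dict], None]


def key_of(obj: dict) -> str:
    meta = obj["metadata"]
    return f'{meta.get("namespace", "")}/{meta["name"]}'


class Cache:
    """Holds the latest copy of every object and notifies registered handlers."""

    def __init__(self) -> None:
        self._objects: Dict[str, dict] = {}
        self._on_add: List[Handler] = []
        self._on_update: List[Handler] = []
        self._on_delete: List[Handler] = []

    def add_handlers(self,
                     on_add: Optional[Handler] = None,
                     on_update: Optional[Handler] = None,
                     on_delete: Optional[Handler] = None) -> None:
        if on_add is not None:
            self._on_add.append(on_add)
        if on_update is not None:
            self._on_update.append(on_update)
        if on_delete is not None:
            self._on_delete.append(on_delete)

    def list(self) -> List[dict]:
        """Return all objects currently in the cache."""
        return list(self._objects.values())

    def handle(self, event: dict) -> None:
        """Apply one watch event to the cache, then call matching handlers."""
        obj = event["object"]
        kind = event["type"]
        if kind == "ADDED":
            self._objects[key_of(obj)] = obj
            handlers = self._on_add
        elif kind == "MODIFIED":
            self._objects[key_of(obj)] = obj
            handlers = self._on_update
        elif kind == "DELETED":
            self._objects.pop(key_of(obj), None)
            handlers = self._on_delete
        else:
            return  # unknown event types are ignored in this sketch
        for handler in handlers:
            handler(obj)
```

Usage would be to feed every event from a watch stream into `handle()` and read `list()` from application code.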
People have been talking about this a bit in various places.