Implement an Informer in python-client #868

Open
ellieayla opened this issue Jul 8, 2019 · 21 comments
Labels
help wanted Denotes an issue that needs help from a contributor. Must meet "help wanted" guidelines. lifecycle/frozen Indicates that an issue or PR should not be auto-closed due to staleness.

Comments

@ellieayla

ellieayla commented Jul 8, 2019

https://github.com/kubernetes/client-go has an Informer implementation. Internally it leverages a watcher of some collection of resources, continually streams changes (add/modify/delete events), reflects the resources into a downstream store (cache), handles connectivity drops, and periodically does a full resync. This all happens on some background thread (goroutine). A client of the informer is free to iterate over that stored cache without concern for how it's populated, and immediately get (possibly outdated) state.
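To make the moving parts concrete, here is a minimal sketch of that pattern in Python. It is not client-go's design or anything in the Python client: `MiniInformer`, `list_fn`, and `watch_fn` are hypothetical names, `ConnectionError` stands in for whatever a real transport raises, and the "resync" here is simply a relist whenever the watch stream ends.

```python
import threading
from typing import Callable, Dict, Iterable, List, Tuple

# An event is (type, key, obj) with type in {"ADDED", "MODIFIED", "DELETED"}.
Event = Tuple[str, str, dict]


class MiniInformer:
    """Keeps a local dict in sync with a remote collection (illustrative only)."""

    def __init__(self,
                 list_fn: Callable[[], Dict[str, dict]],
                 watch_fn: Callable[[], Iterable[Event]],
                 retry_delay: float = 0.1) -> None:
        self._list_fn = list_fn      # hypothetical: returns {key: object}
        self._watch_fn = watch_fn    # hypothetical: yields events until the stream ends
        self._retry_delay = retry_delay
        self._store: Dict[str, dict] = {}
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self) -> None:
        self._thread.start()

    def stop(self) -> None:
        self._stop.set()
        self._thread.join(timeout=2)

    def snapshot(self) -> List[dict]:
        """Return the cached objects immediately; they may be slightly stale."""
        with self._lock:
            return list(self._store.values())

    def _run(self) -> None:
        while not self._stop.is_set():
            try:
                # A full (re)list replaces the store, which doubles as a resync.
                fresh = self._list_fn()
                with self._lock:
                    self._store = dict(fresh)
                for kind, key, obj in self._watch_fn():
                    if self._stop.is_set():
                        return
                    with self._lock:
                        if kind == "DELETED":
                            self._store.pop(key, None)
                        else:
                            self._store[key] = obj
            except ConnectionError:
                pass  # dropped connection: fall through and relist
            self._stop.wait(self._retry_delay)
```

A consumer would call `start()` once and then read `snapshot()` whenever it likes, without caring how the store is populated.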

Applications using https://github.com/kubernetes-client/python that want a local store of resources reflecting some in-cluster state need to concern themselves with those lower-level details. There's a lot of room for error.

On 2019-06-25, go#28 added a simple informer implementation to the openapi-generated client for Golang. It defines a Cache struct, with both a list of all objects and event handler callbacks that a consumer could register.

https://github.com/kubernetes-client/python should contain a similar implementation.
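As a rough illustration of the Cache idea described above (a list of all objects plus event-handler callbacks a consumer can register), a Python analogue might look like the following. This is only a sketch: the class and method names are hypothetical and are not taken from go#28 or from this library.

```python
from typing import Callable, Dict, List, Optional

# handler(event_type, old_obj, new_obj); old/new are None when not applicable.
Handler = Callable[[str, Optional[dict], Optional[dict]], None]


class Cache:
    """Object store plus event-handler callbacks (names are illustrative)."""

    def __init__(self) -> None:
        self._objects: Dict[str, dict] = {}
        self._handlers: List[Handler] = []

    def add_event_handler(self, handler: Handler) -> None:
        self._handlers.append(handler)

    def list_objects(self) -> List[dict]:
        return list(self._objects.values())

    def apply(self, event_type: str, key: str, obj: dict) -> None:
        """Update the store for one watch event, then notify every handler."""
        old = self._objects.get(key)
        if event_type == "DELETED":
            self._objects.pop(key, None)
            new = None
        else:
            self._objects[key] = obj
            new = obj
        for handler in self._handlers:
            handler(event_type, old, new)
```

Whatever drives the watch would call `apply()` for each event; consumers either read `list_objects()` or react through their registered callbacks.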

People have been talking about this a bit in various places.

@ellieayla
Author

> https://github.com/kubernetes-client/python should contain a similar implementation.

Philosophical consensus: should the openapi-generated client libraries contain implementations of higher-level abstractions like this?

@alejandrox1

/cc

@rfum

rfum commented Sep 17, 2019

Any updates on this issue?

@tbarrella

I wrote a SharedIndexInformer in Python, ~4000 lines of code including tests, by directly translating it, its dependencies, and its unit tests from client-go. I could share it but would want to be sure licensing/copyright isn't an issue; maybe it's ok as long as the "Kubernetes Authors" license header is on each translated file and the license is Apache? Also, I'm not sure if it's fit for this client:

  • It uses asyncio (and isn't thread-safe) and currently requires Python 3.7. I think it would make sense to add types because it's relatively complex
  • The APIs would need some work. They aren't always Pythonic because I focused on translating it first
  • I got an example working with a kind cluster but haven't tried to measure its performance. There may be stray tasks as of now
  • I'm missing context about client-go such as knowledge about technical debt that shouldn't be replicated. In general I'm not sure if it makes sense to maintain a Python version of DeltaFIFO, Reflector, and SharedIndexInformer...

Is there any interest in such a translation? Would licensing be a concern?

@gtaylor

gtaylor commented Feb 6, 2020

@alanjcastonguay if you aren't actively working on this, it may make sense to un-assign yourself from this issue. That way any would-be contributors won't have to wonder whether work is in progress before making their own attempt.

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale label May 6, 2020
@tbarrella

There's a lot at https://github.com/tbarrella/aiok8s, although I realized I probably won't be able to maintain it right now...

@ellieayla
Author

/lifecycle frozen

@k8s-ci-robot k8s-ci-robot added lifecycle/frozen and removed lifecycle/stale labels May 11, 2020
@roycaihw roycaihw added the help wanted label May 11, 2020
@JacobHenner

I'm working on hacking together an informer-like object for kubernetes_asyncio. If it turns into something useful I'll publish it and link it here for reference.

Regarding the comment above:

> Philosophical consensus: should the openapi-generated client libraries contain implementations of higher-level abstractions like this?

Have there been any discussions about this?

@tomplus
Member

tomplus commented Feb 15, 2021

IMO it should be a part of the library. We already have high-level abstractions such as the recently added Leader-Election or applying manifests from yaml files.

@roycaihw
Member

kopf has a watch implementation that is claimed to be informer-equivalent, which may be interesting to look into.

@brendandburns
Contributor

@JohnRusk

I'm kind of surprised by how long this has been open, given that the Python client is one of the officially supported ones for K8s. I had naively expected that official support would imply approximate feature parity across the different languages, but in this case it does not. (All the other officially supported client libraries, except the Haskell one, have Informer support.)

Also, the difference between repeated LIST calls and an Informer can be significant in terms of API server load and performance.

Does anyone on this thread have any updates on whether this issue may be resolved soon?

@ellieayla ellieayla removed their assignment Mar 24, 2022
@wukunliu

any updates on this?

@centromere

Is this going to be accepted?

@fighterhit

any updates on this?

leseb added a commit to leseb/ilab-on-ocp that referenced this issue Oct 16, 2024
From time to time the watcher connection drops, which leaves the
workflow hanging forever, waiting for an event that will never arrive.
Upon catching the exception:

```
Unexpected error: ("Connection broken: InvalidChunkLength(got length b'', 0 bytes read)", InvalidChunkLength(got length b'', 0 bytes read))
```

We re-establish the watcher's connection.
We also configured a timeout of 1m for each watcher.

Once informers are implemented in the kubernetes python library, it
will be more robust to switch to them. This is tracked here:
kubernetes-client/python#868.

Signed-off-by: Sébastien Han <seb@redhat.com>
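The workaround this commit message describes (a per-watch timeout plus reconnecting when the stream breaks) can be sketched roughly as below. This is not the code from that commit. `open_stream` is a hypothetical callable that opens one watch from a given resourceVersion, events are assumed to be dicts shaped like raw watch events, and `ConnectionError` stands in for the library-specific exception quoted above.

```python
import time
from typing import Callable, Iterator, Optional


def resilient_watch(open_stream: Callable[[Optional[str], int], Iterator[dict]],
                    timeout_seconds: int = 60,
                    backoff_seconds: float = 1.0,
                    max_restarts: Optional[int] = None) -> Iterator[dict]:
    """Yield events, reopening the stream whenever it drops or times out.

    With max_restarts=None the generator never finishes on its own.
    """
    resource_version: Optional[str] = None
    restarts = 0
    while max_restarts is None or restarts <= max_restarts:
        try:
            for event in open_stream(resource_version, timeout_seconds):
                # Remember the position so the next stream resumes from it.
                resource_version = event["object"]["metadata"]["resourceVersion"]
                yield event
        except ConnectionError:
            time.sleep(backoff_seconds)  # broken connection: pause, then reopen
        restarts += 1
```

A real implementation would also need to handle a resourceVersion that has become too old (the API server answers 410 Gone), which requires a fresh LIST before watching again; an informer takes care of that bookkeeping.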
@waltforme

any updates on this?

@MikeSpreitzer

It would be really nice to have informers and workqueues in Python maintained by the Kubernetes project.

@MikeSpreitzer

I, too, found kopf when I went looking for such a thing. It might be good to hear from @nolar on this subject.

@MikeSpreitzer

MikeSpreitzer commented Mar 13, 2025

I started to look at the kopf code. Here are some things that I found.

It uses WATCH as much as possible, LIST otherwise, like a client-go informer. I see this in https://github.com/nolar/kopf/blob/1.37.4/kopf/_cogs/clients/watching.py .

The queuing and working is different from the workqueue used in client-go based controllers. The heart of the matter seems to be in https://github.com/nolar/kopf/blob/1.37.4/kopf/_core/reactor/queueing.py . The approach there is a coroutine per recently-received object, each with an incoming queue of full object values delivered from watching/listing. This is less efficient than in Golang controllers and I do not see how to implement a controller like the one for StatefulSet, where a watch/list event about a Pod needs to enqueue a reference to the StatefulSet.
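For readers unfamiliar with the client-go pattern being contrasted here, the following is a simplified sketch of it, not client-go's actual workqueue (which also tracks in-flight items and supports rate limiting). The queue holds keys rather than full objects, and a key that is already pending is not added twice. `WorkQueue` and `on_pod_event` are hypothetical names, and the Pod is assumed to be a plain dict shaped like the API's JSON.

```python
from collections import deque
from typing import Deque, Optional, Set, Tuple

Key = Tuple[str, str]  # (namespace, name)


class WorkQueue:
    """FIFO of keys in which a key that is already pending is not added again."""

    def __init__(self) -> None:
        self._queue: Deque[Key] = deque()
        self._pending: Set[Key] = set()

    def add(self, key: Key) -> None:
        if key not in self._pending:
            self._pending.add(key)
            self._queue.append(key)

    def get(self) -> Optional[Key]:
        """Pop the next key, or return None when nothing is pending."""
        if not self._queue:
            return None
        key = self._queue.popleft()
        self._pending.discard(key)
        return key

    def __len__(self) -> int:
        return len(self._queue)


def on_pod_event(pod: dict, queue: WorkQueue) -> None:
    """Enqueue the owning StatefulSet's key rather than the Pod itself."""
    meta = pod["metadata"]
    for ref in meta.get("ownerReferences", []):
        if ref.get("kind") == "StatefulSet":
            queue.add((meta["namespace"], ref["name"]))
```

With this shape, a burst of events for many Pods of one StatefulSet collapses into a single pending key, and the worker reads the latest state from the informer's cache when it processes that key.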

@nolar

nolar commented Mar 13, 2025

Thanks for mentioning me. Sorry, I didn’t fully get what the problem statement is that I should comment on.

Kopf indeed has a few patterns to implement things that keep the latest state of the object available in memory. E.g., indexers. If there is some activity involved, daemons/timers can be used.

I forgot the terminology, but I guess, Kopf’s indexers are the closest concept to Go’s informers — as I understood them back then.

You are absolutely correct about how the core of Kopf works: it consumes a stream of jsons for a resource list, then multiplexes that into per-object tasks (coroutines). That is done intentionally so that no single event (json line) is ever missed, e.g. in the on-event handlers. Then, either some implicit logic happens (for indexers, or daemons’ body-views, etc.), or the handlers are explicitly invoked.

To address the point on “less efficient”, I should know what is more efficient and how Go’s controllers work, which I do not know, sadly. If you hint at the cause of this inefficiency, I might have an idea how to solve it.

Worth noting that in Python, especially in asynchronous Python, all objects are passed by reference under the hood, so the queue actually holds the references to those json structures, not the whole things. And it is supposed that the handlers are fast, so the queues never clog. The cpu-heavy processing must be moved to threads. If only the latest state of the object is needed in memory, see indexers/daemons/timers mentioned above.
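The multiplexing described in this comment can be illustrated with a small asyncio sketch. This is not Kopf's code. `multiplex` and `handle` are hypothetical names, events are assumed to be dicts shaped like raw watch events, and the stream is a finite iterable so that the example terminates.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Iterable, List, Tuple

Key = Tuple[str, str]  # (namespace, name)


async def multiplex(events: Iterable[dict],
                    handle: Callable[[dict], Awaitable[None]]) -> None:
    """Route each event to a per-object queue drained by its own task."""
    queues: Dict[Key, asyncio.Queue] = {}
    workers: List[asyncio.Task] = []

    async def worker(queue: asyncio.Queue) -> None:
        while True:
            event = await queue.get()
            if event is None:      # sentinel: no more events for this object
                return
            await handle(event)    # every event is delivered, in arrival order

    for event in events:
        meta = event["object"]["metadata"]
        key = (meta["namespace"], meta["name"])
        if key not in queues:
            queues[key] = asyncio.Queue()
            workers.append(asyncio.create_task(worker(queues[key])))
        # The queue stores a reference to the event dict, not a copy.
        queues[key].put_nowait(event)

    for queue in queues.values():
        queue.put_nowait(None)
    await asyncio.gather(*workers)
```

Each object gets its own ordered stream of events, so nothing is coalesced; that is the trade-off against a key-based queue, which drops intermediate states by design.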

> where a watch/list event about a Pod needs to enqueue a reference to the StatefulSet.

Make an indexer on StatefulSets (with or without filters), access it from the events of Pods? The lookup key can be e.g. a StatefulSet’s namespace/name, or any other tuple.

I hope this helps. And I will be happy to help with more specific questions. Feel free to reach out to me by social media or email if you have specifics of this task to discuss, so that we do not overshare and overflood this issue for this library here ;-)
