notification events from CRI runtime #39

Closed
ffromani opened this issue Oct 1, 2021 · 7 comments

@ffromani

ffromani commented Oct 1, 2021

The major container runtimes, containerd and cri-o, both offer extensive hooking mechanisms that we can leverage to get container lifecycle events while the podresources API catches up. It could work like this:

  1. RTE opens a notification socket at a well-known location
  2. RTE offers a client program to be called from the CRI runtime hooks
  3. When a noteworthy container lifecycle event (create and remove; what else?) happens, the hook calls the client to notify RTE
  4. Depending on further design decisions, RTE either trusts the data the client program sent over the notification socket, or performs a full GetAllocatable/List poll to update its status.

@swatisehgal

This is a good idea and definitely worth exploring! Currently, I only see PreStart and PostStop hooks here, which should take care of the container lifecycle events you mentioned above (create and remove). In addition to this, I can think of an update lifecycle event, where resources are updated, but that could be phase 2 of this work.

@AlexeyPerevalov

@swatisehgal, @fromanirh my colleague is interested in this, but for solving issues with PLEG.
He has a draft at https://github.com/ikeeip/containerd/tree/cri_subscribe_events

@ffromani

ffromani commented Oct 5, 2021

@AlexeyPerevalov very nice! thanks for letting us know. I'll surely have a look ASAP.

@ffromani

Brainstorming a bit more on implementation details.
Prerequisites:

  1. RTE must be the listener (i.e. the one creating the endpoint and waiting for notifications)
  2. The protocol must be as simple as possible
  3. It should be possible to send notifications using plain shell scripts, to make the job of the hook writers as simple as possible
  4. No pod or container details should be passed alongside the notifications, e.g. not the pod spec.
  5. We should leave the option open to be forward compatible and to be able to send the container resources alongside the notification in the future

Hence the implementation could look like this:

  1. RTE gains an option to enable this feature
  2. If enabled, RTE creates a fifo (not a unix domain socket, see prerequisites 2 and 3), optionally at a user-supplied location (let's have a sane default)
  3. RTE adds an event loop to read from the fifo
  4. Messages in the fifo are fixed size; considering we target amd64, I'd say exactly 8 bytes
  5. We don't define the actual content of the messages now. The content of each message is discarded; we only get the notification, and the notification triggers a poll event as usual.
  6. Because of the point above, each message can be just "0" x 8 (eight "0" chars)
  7. Throttling (if ever a concern) will be done on the server side (aka RTE), meaning clients can just try to write the message to the fifo, discarding (maybe just logging) any error
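
To make the fifo proposal above more concrete, here is a minimal sketch of both sides. It is written in Python only for brevity and is not the RTE code: the function names (`open_listener`, `drain`, `notify`) and the temporary fifo path are hypothetical, and opening the fifo with `O_RDWR` is a Linux-specific assumption used so that the listener never sees EOF when a hook closes its end.

```python
import os
import select
import tempfile

MSG_SIZE = 8               # fixed message size from the proposal
MESSAGE = b"0" * MSG_SIZE  # content is ignored by the listener


def open_listener(path):
    """Listener side: create the fifo if missing and open it non-blocking."""
    try:
        os.mkfifo(path, 0o600)
    except FileExistsError:
        pass
    # Assumption: Linux allows O_RDWR on a fifo; holding a write end open
    # ourselves means reads never report EOF when a hook closes its end.
    return os.open(path, os.O_RDWR | os.O_NONBLOCK)


def drain(fd, timeout):
    """Wait up to `timeout` seconds; return how many full messages arrived.

    All pending messages are consumed in one go, so a burst of notifications
    can be coalesced into a single poll by the caller (server-side throttling).
    """
    ready, _, _ = select.select([fd], [], [], timeout)
    if not ready:
        return 0
    pending = b""
    while True:
        try:
            chunk = os.read(fd, 4096)
        except BlockingIOError:
            break  # fifo is empty again
        if not chunk:
            break
        pending += chunk
    return len(pending) // MSG_SIZE


def notify(path):
    """Hook side: best-effort write; any error is swallowed and reported as False."""
    try:
        fd = os.open(path, os.O_WRONLY | os.O_NONBLOCK)
    except OSError:
        return False  # no fifo, or nobody is listening
    try:
        os.write(fd, MESSAGE)
        return True
    except OSError:
        return False
    finally:
        os.close(fd)


if __name__ == "__main__":
    fifo = os.path.join(tempfile.mkdtemp(), "rte-notify")  # hypothetical location
    listener = open_listener(fifo)
    notify(fifo)
    if drain(listener, timeout=1.0):
        print("notification received: trigger a podresources poll")
    os.close(listener)
```

Because the message content is discarded, the listener only needs to know that at least one message arrived. A plain shell hook could do the equivalent of `notify` with something like `printf 00000000 > "$FIFO"`, with the caveat that a blocking shell redirect waits until a reader has the fifo open.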

@ffromani

Even simpler implementation discussed offline with @cynepco3hahue:

  1. Start the ds with some host directory /path/to/whatever/rte
  2. the ds pod will create a new file under /path/to/whatever/rte, say /path/to/whatever/rte/notify
  3. the hooks will touch the file each time a (guaranteed) pod is created or deleted
  4. the fsnotifier in the ds pod will watch for CHMOD events

I think this is actually better than my proposal, because leaving room for future expansion is a double-edged sword. The real path forward is to make the podresources kubelet API watchable.
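
A rough sketch of the touch-and-watch idea above, again in Python for brevity and Linux-only. Assumptions: it talks to inotify directly through `ctypes` instead of a file-notification library, it watches for `IN_ATTRIB` (the metadata-change event that, as far as I know, the Go fsnotify library reports as CHMOD), and the helper names `watch_attrib`, `wait_for_touch` and `touch` are made up for illustration.

```python
import ctypes
import os
import select
import struct
import tempfile

IN_ATTRIB = 0x00000004                 # inotify mask: file metadata changed
_EVENT_HEADER = struct.Struct("iIII")  # wd, mask, cookie, name length

# Assumption: Linux, where dlopen(NULL) exposes the libc inotify symbols.
_libc = ctypes.CDLL(None, use_errno=True)


def watch_attrib(path):
    """Watcher side: return an inotify fd reporting metadata changes on `path`."""
    fd = _libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), "inotify_init failed")
    wd = _libc.inotify_add_watch(fd, os.fsencode(path), IN_ATTRIB)
    if wd < 0:
        err = ctypes.get_errno()
        os.close(fd)
        raise OSError(err, "inotify_add_watch failed")
    return fd


def wait_for_touch(fd, timeout):
    """Return True if an IN_ATTRIB event shows up within `timeout` seconds."""
    ready, _, _ = select.select([fd], [], [], timeout)
    if not ready:
        return False
    data = os.read(fd, 4096)
    offset, seen = 0, False
    while offset + _EVENT_HEADER.size <= len(data):
        _wd, mask, _cookie, name_len = _EVENT_HEADER.unpack_from(data, offset)
        seen = seen or bool(mask & IN_ATTRIB)
        offset += _EVENT_HEADER.size + name_len
    return seen


def touch(path):
    """Hook side: update the file timestamps, as touch(1) would."""
    os.utime(path)


if __name__ == "__main__":
    notify_file = os.path.join(tempfile.mkdtemp(), "notify")  # hypothetical path
    open(notify_file, "w").close()   # step 2: the ds pod creates the file
    watcher = watch_attrib(notify_file)
    touch(notify_file)               # step 3: a hook touches it
    if wait_for_touch(watcher, timeout=1.0):
        print("file touched: trigger a podresources poll")
    os.close(watcher)
```

The hook side shrinks to a single `touch`, which is about as simple as a shell hook can get, and no message format has to be agreed on at all.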

@ffromani

tentative implementation: #54

@ffromani

implemented in #54 (merged)
