
add implementation details to sidecar kep #841

Merged
merged 1 commit on Mar 22, 2019

Conversation

Joseph-Irving
Member

This adds some clarification around a few points and adds some initial implementation details.

I didn't want to get too specific with code examples, as I'm not sure how valuable that would be at this stage; instead I just call out the places where I think we should implement the logic, so we can discuss whether this seems like a sensible approach. Once some PoC work has been done, we can link to the PoC branch in the implementation details.

/assign @enisoc @kow3ns

@k8s-ci-robot
Contributor

Hi @Joseph-Irving. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot added the needs-ok-to-test, kind/kep, size/M, and sig/apps labels on Feb 18, 2019
@k8s-ci-robot requested a review from kow3ns on Feb 18, 2019
@k8s-ci-robot added the sig/architecture label on Feb 18, 2019
@k8s-ci-robot added the sig/pm and cncf-cla: yes labels on Feb 18, 2019
@Joseph-Irving
Member Author

/sig node

cc @dchen1107 @derekwaynecarr, let me know if you have any further questions, parts you would like expanded, etc.

@k8s-ci-robot added the sig/node label on Feb 18, 2019
@@ -67,7 +67,7 @@ Solve issues so that they don't require application modification:

## Non-Goals

-Allowing multiple containers to run at once during the init phase. //TODO See if we can solve this problem with this proposal
+Allowing multiple containers to run at once during the init phase - this could be solved using the same principal but can be implemented separately. //TODO should this be a non-goal or an extension? is there enough support for this?
Member

Maybe not quite what you were asking: Do we have any use-cases or even hypotheticals that speak to the intersection of this design and init containers? E.g. Do init containers need service-mesh functionality? I would assume they do.

In other words, should sidecars start before init containers? And then, do we have sequencing within init containers? Concretely, consider something like Istio, which has an init container with privileges to write iptables rules, then a sidecar which processes network traffic. It seems to me that the sidecar would be needed for any user-provided init containers. I don't know if this has come up in real life yet.

How do we think that holds up in general?

@louiscryan @geeknoid @costinm for some perspective

Member Author

@thockin So this was brought up by Istio specifically, see #753 and https://github.com/istio/cni/issues/77. I think init containers needing service mesh access seems like a fairly valid use case. I think more generally the idea of an init container needing a sidecar isn't unheard of, e.g. I have an init container that performs a DB schema migration but it should go through the MySQL proxy.

I think you can also assume that some sidecars may need the init containers to run before they start, such as the Istio example of an init container setting up iptables and the sidecar then starting to handle traffic.

You could potentially achieve this using what we've already suggested but adapting it for init, e.g.:

initContainers:
- name: iptables
  command: ['set up some iptables']
- name: proxy
  command: ['proxy-stuff']
  sidecar: true
- name: myinitapp
  command: ['do some init stuff that requires the proxy']

The difference being that it would follow ordering like init already does, but when it gets to a sidecar it would wait for it to be ready instead of waiting for it to complete; the init sidecars would then be cleaned up at the end of the init phase.
This has the advantage that it keeps the UX relatively simple and the init phase logically separated, while hopefully enabling the use case.
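
To make that ordering concrete, here is a minimal Go sketch of the init-phase behaviour described above; every type and helper name is a hypothetical stand-in, not an existing kubelet API:

```go
// Hypothetical sketch only: initContainer and the function parameters are
// illustrative stand-ins for the kubelet's real types and machinery.
package sketch

type initContainer struct {
	Name    string
	Sidecar bool
}

// runInitPhase walks the init containers in order. Regular init containers
// must run to completion before the next one starts; an init sidecar only
// needs to become ready. All init sidecars are cleaned up when the phase ends.
func runInitPhase(
	containers []initContainer,
	start, waitReady, waitCompleted, stop func(initContainer),
) {
	var sidecars []initContainer
	for _, c := range containers {
		start(c)
		if c.Sidecar {
			waitReady(c) // wait for readiness, not completion
			sidecars = append(sidecars, c)
			continue
		}
		waitCompleted(c) // existing init-container semantics
	}
	for _, s := range sidecars {
		stop(s) // tear down init sidecars at the end of the init phase
	}
}
```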

Contributor

I agree with the need for the functionality, but something feels wonky UX-wise about boot order jumping between the containers and initContainers sections.

Would it make sense to have a more explicit DAG of boot operations?

Member Author

@vllry, what do you mean by boot order jumping? Perhaps I wasn't that clear but I was suggesting we still have all the init containers finishing before the containers start. You could just have init sidecars that run alongside the init containers for that phase only.

Contributor

What I meant was that having a sidecar toggle for initContainers seems clunky in the current config (the overall idea makes sense).

E.g. I might have a boot order like this:

initContainers:
- name: 1
  sidecar: true
  command: ...
- name: 3
  command: ...
containers:
- name: 2
  sidecar: true
- name: 4

It definitely works, but is awkward to read.

Member Author

I was suggesting that init container sidecars would start and finish in the init phase, and container sidecars would start after the init phase is over, so there would be no way to have a boot order like the one you described. I agree it would be confusing if that were the case.

Member

My vote would be to explore this just enough to be able to write a short section in the doc explaining how this could be extended to init containers in the future (at a similar level of implementation detail as the current section for implementing non-init sidecars).

We have clear, concrete use cases for non-init sidecars (Jobs, log/result shippers, sql proxy, Istio) and I don't think we should block addressing those. Non-init sidecar is already a pattern that's in use in the real world, and we just want to replace ugly workarounds (wrapper scripts, etc) with true support. Init sidecars are not in use because it's not possible even as a workaround.

Member

I think init containers should continue to be executed to completion strictly before sidecars are launched. The purpose of init containers is to perform actions that pertain to the Pod and need to happen prior to the main containers starting (e.g. setting up configuration files, preparing the file system, pulling data down). Launching run-to-completion containers subsequent to "main" containers might be problematic. Perhaps in the future we could add init-sidecars?

Member

@enisoc enisoc left a comment

This looks like a good first pass to me. I added some suggestions for things to consider in the second pass.

/lgtm

###### Shutdown triggering:
Package `kuberuntime`

Modify `kuberuntime_manager.go`, function `computePodActions`. Add a check in this function that sees whether all the non-sidecars have permanently exited; if they have, return all the running sidecars in `ContainersToKill`.
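
As a rough illustration of that check (not a real patch; the types are simplified stand-ins for the kubelet's container status structs), it could look something like:

```go
// Illustrative sketch only: once every non-sidecar has permanently exited,
// return the running sidecars so they end up in ContainersToKill.
package sketch

type containerStatus struct {
	Name              string
	Sidecar           bool
	Running           bool
	PermanentlyExited bool // exited and will not be restarted under the pod's restart policy
}

// sidecarsToKill returns the running sidecars if, and only if, all
// non-sidecar containers have permanently exited.
func sidecarsToKill(statuses []containerStatus) []containerStatus {
	for _, s := range statuses {
		if !s.Sidecar && !s.PermanentlyExited {
			return nil // a non-sidecar is still running or may restart
		}
	}
	var kill []containerStatus
	for _, s := range statuses {
		if s.Sidecar && s.Running {
			kill = append(kill, s)
		}
	}
	return kill
}
```
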
Member

When we trigger shutdown of sidecars through ContainersToKill, what exit status do they get? Do we send them SIGTERM and give them a chance to terminate gracefully? We need to verify that the overall Pod phase (Succeeded/Failed) will reflect the right semantics in the presence of sidecars, so that Job will work as intended.

Member Author

So when you return `ContainersToKill`, it iterates over them and calls `KillContainer` on each container. That is the function that sends the PreStop hook and SIGTERM/SIGKILL, etc., and the grace period is obeyed. So as long as the sidecar exits 0, it should count towards Succeeded.

###### Sidecars terminated last
Package `kuberuntime`

Modify `kuberuntime_container.go`, function `killContainersWithSyncResult`. Break up the loop over containers so that it kills all the non-sidecars before terminating the sidecars.
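
A minimal sketch of what splitting that loop might look like, assuming simplified types in place of `kubecontainer.Container` (the real function also handles sync results, which this sketch omits):

```go
// Illustrative sketch only: terminate non-sidecars first, then sidecars last.
package sketch

type runningContainer struct {
	Name    string
	Sidecar bool
}

// killNonSidecarsFirst terminates all non-sidecars, then the sidecars.
func killNonSidecarsFirst(containers []runningContainer, kill func(runningContainer)) {
	var sidecars []runningContainer
	for _, c := range containers {
		if c.Sidecar {
			sidecars = append(sidecars, c)
			continue
		}
		kill(c) // non-sidecars go first
	}
	for _, c := range sidecars {
		kill(c) // sidecars are terminated last
	}
}
```
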
Member

The inner loop iterates over kubecontainer.Container (as opposed to v1.Container), which doesn't seem to have a way to tell whether the container is a sidecar. You might be able to cross-reference from the v1.Pod that also gets passed in, but we need to be careful to validate that that argument contains what we think it does. For example, I saw some comments that seemed to imply pod might be nil sometimes; we need to know what those cases are, and whether we can handle them properly.

It might end up being necessary to add data to kubecontainer.Container. I'm not sure if that represents a change to CRI? I'm also new to this code.

Member Author

Oh yeah, I missed that. Looking around, it seems to indicate that `pod` may be nil but the running pod must not be. If the pod is dead we're not as bothered about termination ordering, so cross-referencing may work with an `if pod != nil` check.
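
A minimal sketch of that cross-referencing idea, with simplified stand-ins for `v1.Pod`/`v1.Container` and with the `Sidecar` field standing in for the new API field this KEP proposes:

```go
// Illustrative sketch only: look the runtime container up by name in the pod
// spec; if the pod object is nil (e.g. it was force-deleted), we cannot tell,
// so treat it as a non-sidecar and give up on termination ordering.
package sketch

type specContainer struct {
	Name    string
	Sidecar bool
}

type podSpec struct {
	Containers []specContainer
}

func isSidecar(pod *podSpec, runtimeContainerName string) bool {
	if pod == nil {
		return false
	}
	for _, c := range pod.Containers {
		if c.Name == runtimeContainerName {
			return c.Sidecar
		}
	}
	return false
}
```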

Member

If we can verify that pod is only ever nil when the Pod object on the API server has been force-deleted, then I agree it's reasonable to say, "We don't respect sidecar termination order if you force-delete the Pod." Have you found any evidence to say exactly when pod could be nil? For example, is it possible as a transient condition during a Node restart, before it fetches Pods from the API server?

Member Author

I'm not confident from looking at the code that I could tell you all the scenarios where pod could be nil. If @feiskyer could weigh in it would be appreciated. If there are scenarios where it can be nil and we still care about ordering then we'll need to look at a different way to enable this behaviour. Changing kubecontainer.Container is a possibility but it means changing the container runtime.

###### PreStop hooks sent to Sidecars first
Package `kuberuntime`

Modify `kuberuntime_container.go`, function `killContainersWithSyncResult`. Loop over sidecars and execute `executePreStopHook` on them before moving on to terminating the containers.
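
A minimal sketch of that ordering, with `executePreStop` and the kill step as stand-ins for the kubelet's `executePreStopHook` and `killContainer` logic:

```go
// Illustrative sketch only: sidecars receive their PreStop hooks before any
// container is terminated, so e.g. a proxy can start draining while the main
// containers shut down.
package sketch

type podContainer struct {
	Name       string
	Sidecar    bool
	HasPreStop bool
}

func preStopSidecarsFirst(
	containers []podContainer,
	executePreStop func(podContainer),
	killNonSidecarsThenSidecars func([]podContainer),
) {
	for _, c := range containers {
		if c.Sidecar && c.HasPreStop {
			executePreStop(c) // sidecar PreStop hooks go out first
		}
	}
	// Then terminate non-sidecars followed by sidecars, as sketched earlier.
	killNonSidecarsThenSidecars(containers)
}
```
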
Member

The current code groups pre-stop and stop in one function (killContainer). Do you plan to break that up and fix all call sites?

Member Author

I'm quite unsure about this part. `killContainer` does not appear in too many places, so we could potentially change every call site to split the pre-stop from the termination; this definitely complicates the code a little, as going forward everyone would have to be aware that they need to call both pre-stop and kill. Alternatively, you could execute the pre-stop hook on the sidecars twice, once before the containers are terminated and again when the sidecars themselves are terminated; this would of course rely on the assumption that pre-stop hooks on sidecars are idempotent.

###### Sidecars started first
Package `kuberuntime`

Modify `kuberuntime_manager.go`, function `computePodActions`. If the pod has sidecars, return these first in `ContainersToStart`; until they are all ready, do not return the non-sidecars.
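
A minimal sketch of that start-ordering logic, with simplified stand-in types rather than the real `podActions`/`ContainersToStart` structures:

```go
// Illustrative sketch only: start sidecars first and hold back the
// non-sidecars until every sidecar reports ready.
package sketch

type desiredContainer struct {
	Name    string
	Sidecar bool
	Running bool
	Ready   bool
}

func containersToStart(containers []desiredContainer) []string {
	var start []string
	allSidecarsReady := true
	for _, c := range containers {
		if !c.Sidecar {
			continue
		}
		if !c.Running {
			start = append(start, c.Name)
		}
		if !c.Ready {
			allSidecarsReady = false
		}
	}
	if !allSidecarsReady {
		return start // non-sidecars wait for a later sync
	}
	for _, c := range containers {
		if !c.Sidecar && !c.Running {
			start = append(start, c.Name)
		}
	}
	return start
}
```

The key design point is that the non-sidecars stay out of the start list until every sidecar is ready, which is why the question below about when the next sync happens matters.
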
Member

After we return the sidecars, is it guaranteed that we'll sync again at the right time to get a chance to check readiness and start the non-sidecars? For example, do we sync every time any container status (as in, status on the machine, not k8s API object .status) changes? It might be obvious to people familiar with this code, but I just want to make sure.

Member Author

From my understanding, pod syncs are run frequently, with the kubelet option `SyncFrequency` indicating the maximum gap between syncs; this defaults to 1 minute.

@k8s-ci-robot added and removed the lgtm label on Feb 26, 2019
###### Sidecars started first
Package `kuberuntime`

Modify `kuberuntime_manager.go`, function `computePodActions`. If the pod has sidecars, return these first in `ContainersToStart`; until they are all ready, do not return the non-sidecars. `computePodActions` is run frequently, with the maximum time between runs determined by the kubelet's `SyncFrequency` option (default 1 minute).
Member

A minute is actually a large enough delay to be annoying. For example, if we always waited 1 minute after starting sidecars before realizing we can now start regular containers, we'd be adding that minute to the startup of any Pod using sidecars.

I would hope that there is also an event-based sync trigger in addition to the periodic one. Did you see any evidence of that?

Member Author

My understanding is that any change to the Pod status should trigger a sync via an update channel: `config/config.go` should send a `kubetypes.RECONCILE` when `pod.Status` has changed. This is given to the kubelet's `syncLoop` update channel, which should trigger `handler.HandlePodReconcile(u.Pods)` and cause the pod to be synced. So when the sidecar becomes ready, that should trigger a sync.

Member Author

So this was not quite correct: it does indeed trigger `handler.HandlePodReconcile(u.Pods)`, but that only triggers a pod sync if `NeedToReconcilePodReadiness` returns true, which currently only does anything if the pod is using readiness gates. Potentially we could put another check here that looks to see if all sidecars are ready while the non-sidecars haven't been started, and if so triggers a sync.
The kubelet doesn't currently have an update channel for readiness changes, only liveness, so that isn't an option without making a fair amount of changes.
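
A minimal sketch of the extra condition being suggested here, with simplified stand-in types (this is not the real kubelet reconcile code, just the shape of the check that could sit alongside `NeedToReconcilePodReadiness`):

```go
// Illustrative sketch only: force an immediate pod sync once all sidecars are
// ready but the non-sidecars have not been started yet.
package sketch

type observedContainer struct {
	Name    string
	Sidecar bool
	Ready   bool
	Started bool
}

// needSidecarStartSync reports whether the pod should be synced now rather
// than waiting for the periodic sync.
func needSidecarStartSync(containers []observedContainer) bool {
	hasSidecar := false
	for _, c := range containers {
		if c.Sidecar {
			hasSidecar = true
			if !c.Ready {
				return false // still waiting on a sidecar
			}
		}
	}
	if !hasSidecar {
		return false // nothing special to do for pods without sidecars
	}
	for _, c := range containers {
		if !c.Sidecar && !c.Started {
			return true // sidecars ready, main containers still pending
		}
	}
	return false
}
```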

@enisoc
Member

enisoc commented Mar 1, 2019

/ok-to-test
/lgtm

@k8s-ci-robot added the lgtm and ok-to-test labels and removed the needs-ok-to-test label on Mar 1, 2019
@Joseph-Irving mentioned this pull request on Mar 7, 2019
@Joseph-Irving
Member Author

I have created a PoC for the sidecars and raised a WIP PR for people to view, kubernetes/kubernetes#75099. It's very much a first draft, but it should hopefully help illustrate what these implementation details might look like.

@Joseph-Irving changed the title from "add draft implementation details to sidecar kep" to "add implementation details to sidecar kep" on Mar 21, 2019
@k8s-ci-robot removed the lgtm label on Mar 21, 2019
@Joseph-Irving
Member Author

Unless anyone has big objections, it would be good to get this merged soon and revisit some of the unanswered questions in another PR.
This is also because I would like to update the KEP to match the new format in https://github.com/kubernetes/enhancements/blob/master/keps/YYYYMMDD-kep-template.md, but I'd rather do that separately as I think it will be a very large diff.

cc @enisoc @kow3ns

Member

@enisoc enisoc left a comment

/lgtm

@k8s-ci-robot added the lgtm label on Mar 21, 2019
@kow3ns
Member

kow3ns commented Mar 22, 2019

/approve

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enisoc, Joseph-Irving, kow3ns

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label on Mar 22, 2019