
Handling exit of kerberos sidecar container for k8s < 1.28 #35748

Closed
wants to merge 7 commits

Conversation

amoghrajesh
Contributor

@amoghrajesh amoghrajesh commented Nov 20, 2023

As a part of the effort to enable kerberos to work e2e with airflow in a kubernetes deployment, some changes related to the pod template and worker deployment have gone in recently in PR: #35146 and #35548.

However, one problem remains: the airflow kerberos sidecar container does not exit gracefully even after the "base" container completes or exits.

To get this working, I was able to do a POC by letting the airflow kerberos container and the airflow base container see each other's processes via shareProcessNamespace, and then using the "args" field of the airflow kerberos sidecar to watch the base container periodically and kill its own process once the airflow base container has exited. This does not add much overhead on the performance side either, because the command run by airflow kerberos is essentially just looking up a pid in /proc/*/.

This solution has been tested out on a secured environment (Red Hat OpenShift) as well as an on-prem kubernetes distribution.
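The watch command from the POC can be sketched roughly as follows; the process pattern and polling interval are assumptions for illustration, not the exact args used in the PR:

```shell
#!/bin/sh
# Sketch of the POC approach: with shareProcessNamespace: true, the
# kerberos sidecar can see the base container's processes under /proc
# and poll until the worker's command line disappears, then exit.
watch_base() {
    pattern="$1"
    # The [c] bracket trick keeps this script's own text from matching
    # itself; grep over /proc/*/cmdline is cheap, sleep keeps it periodic.
    while grep -q "$pattern" /proc/*/cmdline 2>/dev/null; do
        sleep 30
    done
}

watch_base "[c]elery"   # returns once no matching process remains
echo "base container gone, sidecar can exit"
```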

EDIT:
Some approaches were discussed in this PR. The consensus is to handle this with a combination of an EXIT file for k8s versions < 1.28 and native sidecar containers for 1.28+.
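On the sidecar side, the EXIT-file handshake could look roughly like this sketch (the marker path is an assumption; the review threads below mention /opt/kerberos-exit/EXIT, while a /tmp default is used here so the sketch runs anywhere):

```shell
#!/bin/sh
# Sketch (path assumed): poll the shared volume for the EXIT marker that
# the worker writes on shutdown, then stop the kerberos sidecar.
EXIT_FILE="${EXIT_FILE:-/tmp/kerberos-exit/EXIT}"

wait_for_exit() {
    while [ ! -f "$EXIT_FILE" ]; do
        sleep 1
    done
}

mkdir -p "$(dirname "$EXIT_FILE")"
touch "$EXIT_FILE"      # simulate the worker dropping the marker
wait_for_exit           # returns once the file exists
echo "EXIT file seen, sidecar shutting down"
```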



@potiuk
Member

potiuk commented Nov 20, 2023

Shared process namespace in this case kind of violates the whole assumption the separate kerberos container was introduced for. The whole idea is that the "airflow" containers do not see the kerberos keytab used to refresh the short-lived ticket. Only the kerberos container/refreshing process should ever be able to get access to the keytab.

Keytabs are long-lived and provide full access to the kerberos service. They use symmetric encryption, and once you get hold of one, you are able to communicate with the Kerberos server and obtain the short-lived ticket to do the job you are supposed to do, so a keytab should be a very strongly guarded property.

Generally, Airflow components should never be able to see or obtain keytab files (only the airflow kerberos refreshing container should have access to them); instead, all the components should only access the short-lived ticket. This is the basic assumption the whole separation of the containers is based on: the two containers share only the filesystem where the ticket is refreshed and nothing else.

See https://www.fortinet.com/resources/cyberglossary/kerberos-authentication for example.

If airflow components (specifically the worker) get access to the keytab, someone could write a DAG to, for example, send the keytab to a remote system, and once that happens such a keytab can be used by anyone to do anything. When you do the same with a short-lived ticket, you are limited to what the service allows the ticket to do, plus it is time-limited, so the exposure from such a breach is potentially much more limited.

Of course, using shareProcessNamespace does not explicitly violate this assumption. It does not give the airflow component direct access to the keytab. So far so good. But it opens the way to other methods of obtaining the keytab by malicious actors. For example, you could attach to the kerberos process with gdb, dump its memory, and retrieve the keytab from the memory of that process. And this is only one of the ways you could retrieve that information; there are many others. Some of them you might protect against better, but generally speaking, it significantly decreases the isolation that the container mechanism introduces (and the reason the two are run in separate containers).

So while this solution does not directly give access to the keytab, it does decrease isolation and introduce security risks. I am quite sure that if we introduce it, it will be flagged as a security issue and we will have to fix it.

I think there are two options:

  1. We should figure out a different way to communicate with the kerberos refreshing process and stop it. One of the options is to let the running process be stoppable by sending a "shutdown" message via a TCP connection, exposing the sidecar container's port to the main containers. Another option (probably simpler to implement) is to keep some kind of "shutdown" lock file in the same filesystem where the ticket is stored and signal the refreshing process that it should exit. It could be based on the lock mechanism available natively in python.

  2. If we limit the solution to only kubernetes 1.28+, we have the option of using Native Sidecar Containers and marking the kerberos sidecar as such https://kubernetes.io/blog/2023/08/25/native-sidecar-containers/

Likely the best is a combination of these: 1) for k8s < 1.28 and 2) for k8s >= 1.28.
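For reference on option 2, a native sidecar is declared as an init container with restartPolicy: Always; a minimal sketch (pod and container names and the image are illustrative, not the chart's actual template):

```yaml
# Sketch for k8s 1.28+: a native sidecar is an initContainer with
# restartPolicy: Always. It starts before the main container and is
# terminated automatically after the main container exits, so no
# EXIT-file handshake is needed.
apiVersion: v1
kind: Pod
metadata:
  name: worker-with-kerberos    # hypothetical name
spec:
  initContainers:
    - name: kerberos            # the sidecar, marked as native
      image: apache/airflow     # image assumed for illustration
      args: ["kerberos"]
      restartPolicy: Always
  containers:
    - name: base
      image: apache/airflow
      args: ["celery", "worker"]
```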

@amoghrajesh
Contributor Author

amoghrajesh commented Nov 20, 2023

Thank you for the detailed review @potiuk. This is extremely helpful to me.

Another option (probably simpler to implement) is to keep some kind of "shutdown" lock file in the same filesystem where the ticket is stored and signal the refreshing process that it should exit. It could be based on the lock mechanism available natively in python.

I like this idea. It is simple yet effective. I would like to explore more along these lines and get back in case of any doubts.

If we limit the solution to only kubernetes 1.28+, we have the option of using Native Sidecar Containers and marking the kerberos sidecar as such https://kubernetes.io/blog/2023/08/25/native-sidecar-containers/

Came across this too. It is not recommended as of now since it is in its alpha stage. However, a few questions here:

  1. If 1) works for both old and new k8s, is there no need to have 2) at all?
  2. Do we have any utils to help us identify the k8s version at the time of helm install itself?

@amoghrajesh
Contributor Author

amoghrajesh commented Nov 20, 2023

we should figure out a different way how to communicate with the kerberos refreshing process and stop it - one of the options is to let the running process be stoppable by sending a "shutdown" message via TCP connection - and expose the sidecar container's port to the main containers

This is not a bad option either. The main container could send a kill signal over TCP as part of its postRun hook or similar. That way, it would send the signal and exit.

This solution is briefly discussed here: https://medium.com/apache-airflow/enable-kerberos-with-airflow-kubernetesexecutor-6e86621e97a5

I would like to explore the better one among the two and come up with the proposal soon :D

@potiuk
Member

potiuk commented Nov 20, 2023

I think we have ways to generate different resources for 1.28+. I believe helm takes the K8s version into account when generating resources. Look in our chart; I think we have a few cases of that.

I also think native sidecars are the long-term way to go. It's been a long time in the making (the KEP for it has been discussed for at least 3 years), and it handles many edge cases and hides a lot of complexity under the hood of a simple label.

Also, the 1) case is complex to develop and has limitations: needing to open a TCP/IP port, or having write access from airflow components to the shared filesystem, also impacts the security properties of this setup. Currently you just need read access from the worker side to the filesystem and you do not need a TCP/IP connection, so both proposals also weaken the isolation (though not as much as a shared process namespace does). So I'd say we should treat native sidecar containers as THE solution, and the others are merely workarounds for versions of kubernetes that do not have native sidecar containers yet.

@amoghrajesh
Contributor Author

I also think native sidecars are the long-term way to go. It's been a long time in the making (the KEP for it has been discussed for at least 3 years), and it handles many edge cases and hides a lot of complexity under the hood of a simple label.

Thanks for letting me know about this. I was not aware of it. If this is the case, we should consider this as "THE ONLY" solution, along with the lock file you mentioned for earlier versions.

@amoghrajesh
Contributor Author

So I'd say we should treat native sidecar containers as THE solution, and the others are merely workarounds for versions of kubernetes that do not have native sidecar containers yet.

Right. This would be a good way forward and we should go this route. I am going to try this out by deploying a 1.28 k8s cluster; hopefully things will be good there 👍🏽

@potiuk
Member

potiuk commented Nov 21, 2023

Thanks for letting me know about this. I was not aware of it. If this is the case, we should consider this as "THE ONLY" solution, along with the lock file you mentioned for earlier versions.

You can read about it, for example, here: https://buoyant.io/blog/kubernetes-1-28-revenge-of-the-sidecars. It explains the history (it started in 2019, so I was not too far off with 3 years).

@amoghrajesh amoghrajesh marked this pull request as draft November 21, 2023 15:58
@amoghrajesh
Contributor Author

I think we have ways to generate different resources for 1.28+. I believe helm takes the K8s version into account when generating resources. Look in our chart; I think we have a few cases of that.

@potiuk I did check for this. Unfortunately, we do not seem to have such a mechanism. We will eventually need it to get the 1.28 features, and it is also useful in the long run.

I was thinking of introducing an env variable at the worker deployment level, where the main container has an env var that holds the kube version. We can conditionally set the args for containers and do a lot more things too.

For example:

      containers:
      - name: base
        env:
        - name: KUBE_VERSION
          valueFrom:
            configMapKeyRef:
              name: your-scripts
              key: kubeVersionScript.sh
        volumeMounts:
        - name: script-volume
          mountPath: /scripts
        args:
{{- if eq .Env.KUBE_VERSION "1.25" }}
        - /bin/sh
        - -c
        - |
          echo "Kube version is 1.25, do something"
{{- else if eq .Env.KUBE_VERSION "1.26" }}
        - /bin/sh
        - -c
        - |
          echo "Kube version is 1.26, do something else"
{{- else }}
        - /bin/sh
        - -c
        - |
          echo "Kube version is not 1.25 or 1.26, do default"
{{- end }}
      initContainers:
      - name: init-script
        image: your-image
        command: ["/bin/sh", "-c"]
        args:
        - |
          export KUBE_VERSION=$(source /scripts/kubeVersionScript.sh)
        volumeMounts:
        - name: script-volume
          mountPath: /scripts
      volumes:
      - name: script-volume
        configMap:
          name: your-scripts

Thoughts?
cc @jedcunningham looping you in. Pls correct me if I am misinformed here
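For what it's worth, Helm exposes the cluster version to templates at render time through the built-in .Capabilities object, which is one way to branch without an init container; a minimal sketch (the branch comments are illustrative):

```yaml
# Sketch: Helm's built-in .Capabilities.KubeVersion.Version gives the
# cluster version at render time, so templates can branch on it directly
# without init containers or env vars.
{{- $kubeVersion := .Capabilities.KubeVersion.Version }}
{{- if semverCompare "<1.28" $kubeVersion }}
# pre-1.28: plain sidecar container plus an EXIT-file handshake
{{- else }}
# 1.28+: declare the kerberos sidecar as a native sidecar
# (initContainer with restartPolicy: Always)
{{- end }}
```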

@pankajkoti
Member

@amoghrajesh just in case you wonder why Jed is not able to respond anytime soon: Jed is on parental leave 👶

@amoghrajesh amoghrajesh marked this pull request as ready for review November 28, 2023 06:17
@amoghrajesh amoghrajesh changed the title Connecting airflow containers with shareProcessNamespace Handling exit of kerberos sidecar container for k8s < 1.28 Nov 28, 2023
@amoghrajesh
Contributor Author

@potiuk reworked the PR to support it in a better way. Can you take a look when you have some time?

@@ -256,6 +257,10 @@ spec:
- name: worker-logs
containerPort: {{ .Values.ports.workerLogs }}
volumeMounts:
{{- if and (semverCompare ">=2.8.0" .Values.airflowVersion) .Values.workers.kerberosSidecar.enabled (semverCompare "<1.28" $kubeVersion) }}
Contributor Author

Looks like 2.8 will change now, no? I see the RC out for 2.8 already.
cc @potiuk

@amoghrajesh
Contributor Author

Pending:

  1. Unit tests for the changes
  2. Regression testing

@amoghrajesh
Contributor Author

Ok seems like the issue in CI was intermittent. It is green now :D

@amoghrajesh amoghrajesh removed the request for review from ashb November 28, 2023 07:51
@pankajkoti pankajkoti removed their request for review December 3, 2023 07:53
@amoghrajesh
Contributor Author

Yes. No need to add 2.8.0 in this case, as we are not using any new feature added in the kerberos command, as I understand?

You are right, no new feature is used by airflow kerberos. Removing the 2.8.0 check.

chart/templates/_helpers.yaml (outdated review comments, resolved)
chart/templates/workers/worker-deployment.yaml (outdated review comments, resolved)
chart/templates/workers/worker-deployment.yaml (outdated review comments, resolved)
@amoghrajesh
Contributor Author

@potiuk I think some tests fail in the CI. Working on fixing those, but the PR is up to date now. Could you take a look when you have some time?

{{- if .Values.workers.args }}
{{- if and .Values.workers.kerberosSidecar.enabled (semverCompare "<1.28" $kubeVersion) }}
args:
{{ tpl (toYaml .Values.workers.args) . | nindent 12 }} &&
Member

I think this will not work as expected for two reasons:

  1. What you can see in the test: exec airflow celery worker will replace the bash interpreter with the python one (this is how exec works). This means that whatever comes after && will not be executed, because the bash interpreter will no longer be there to execute it.

  2. Even if it did work and we had no exec, this would not work if the worker crashed and exited with an error, because in that case what's after && is not executed either.

Contributor Author

Thanks for that @potiuk
Looking for alternatives, any ideas on this front?

Contributor Author

I tried something like this:

          {{- if and .Values.workers.kerberosSidecar.enabled (semverCompare "<1.28" $kubeVersion) }}
          args:
            - {{ tpl (toYaml .Values.workers.args) . | nindent 12 }}
            - "&&"
            - "sh"
            - "-c"
            - >
              echo "EXIT file content" > /opt/kerberos-exit/EXIT
          {{- else if .Values.workers.args }}
          args: {{ tpl (toYaml .Values.workers.args) . | nindent 12 }}
          {{- end }}

This evaluates to:

          args:
            - 
            - bash
            - -c
            - |-
              exec \
              airflow celery worker
            - "&&"
            - "sh"
            - "-c"
            - >
              echo "EXIT file content" > /opt/kerberos-exit/EXIT

Contributor Author

And adding a trim in helm makes it like this:

          args:
            - - bash
            - -c
            - |-
              exec \
              airflow celery worker
            - "&&"
            - "sh"
            - "-c"
            - >
              echo "EXIT file content" > /opt/kerberos-exit/EXIT

Thinking of some alternatives :)

@amoghrajesh
Contributor Author

@potiuk I made a change and I think it will work because:
In this example, the && is part of the command inside the shell script. This way, the entire script is executed, including the command after &&.

@amoghrajesh
Contributor Author

@potiuk made some changes to the PR. Can you take a look when you have some time?

- bash
- -c
- |-
exec airflow celery worker &&
Member

I do not think it's going to work either. The exec will replace the bash interpreter at the moment airflow celery worker executes, and what is after && will never get executed.

You can check it by just

bash -c "exec ls && echo TEST"

It will only list the current directory, but it will not echo TEST.

You just can't use exec when you are running a bash-interpreted command, because at the moment of exec, the process that you run replaces the bash interpreter.

What you really need to do here, if you want to use a shell, is to run the worker without exec, but you also need to make sure signal propagation happens, so that the parent bash process will propagate the signals. However, maybe you do not need to do that when you are using our image, because we are using

dumb-init as the entrypoint with DUMB_INIT_SETSID set to 1, which means that dumb-init will propagate signals to ALL the children of bash, not only to bash. In that case

bash -c "airflow celery worker && sh -c \"echo 'EXIT file content' > /opt/kerberos-exit/EXIT\""

as the command might just work, but you need to test it: run it in kubernetes, find the process of the worker, send a signal (say TERM), and see if it completes and correctly writes the EXIT file.
https://airflow.apache.org/docs/docker-stack/entrypoint.html#signal-propagation
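A wrapper that also covers the crash case raised earlier (an && chain skips the write when the worker exits non-zero) could look like this sketch; the paths and the helper name are assumptions, not the chart's actual command:

```shell
#!/bin/sh
# Sketch (paths assumed; the review threads use /opt/kerberos-exit, a
# /tmp default is used here so the sketch runs anywhere): run the worker
# command without exec, then write the EXIT marker regardless of whether
# the worker succeeded or crashed.
EXIT_DIR="${EXIT_DIR:-/tmp/kerberos-exit}"

run_and_signal() {
    "$@"                        # the worker command, run without exec
    status=$?
    mkdir -p "$EXIT_DIR"
    echo "worker exited with status $status" > "$EXIT_DIR/EXIT"
    return "$status"
}

# Demonstrate with a failing command: the EXIT file is still written.
run_and_signal false || echo "worker failed, EXIT file written anyway"
```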

Contributor Author

Thanks for the comment. Looking into what is the best way to do this


github-actions bot commented Mar 1, 2024

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 5 days if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale Stale PRs per the .github/workflows/stale.yml policy file label Mar 1, 2024
@github-actions github-actions bot closed this Mar 7, 2024
Labels
area:helm-chart Airflow Helm Chart stale Stale PRs per the .github/workflows/stale.yml policy file