From 6e1ecc9be7eae89254ea14a08b9b4c69d5189bc9 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Sat, 10 Jan 2026 18:44:35 +0000
Subject: [PATCH 1/2] Initial plan

From a1480707df692cb17552a9ae28fd16997ebff14d Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Sat, 10 Jan 2026 18:48:21 +0000
Subject: [PATCH 2/2] Add Kubernetes security context requirements and troubleshooting docs

Co-authored-by: mudler <2420543+mudler@users.noreply.github.com>
---
 docs/content/getting-started/kubernetes.md | 76 ++++++++++++++++++++++
 docs/content/installation/kubernetes.md    | 76 ++++++++++++++++++++++
 2 files changed, 152 insertions(+)

diff --git a/docs/content/getting-started/kubernetes.md b/docs/content/getting-started/kubernetes.md
index 8adfa8277dee..7179ddb84e81 100644
--- a/docs/content/getting-started/kubernetes.md
+++ b/docs/content/getting-started/kubernetes.md
@@ -29,3 +29,79 @@ helm show values go-skynet/local-ai > values.yaml
 helm install local-ai go-skynet/local-ai -f values.yaml
 ```
+
+## Security Context Requirements
+
+LocalAI spawns child processes to run model backends (e.g., llama.cpp, diffusers, whisper). To stop these processes properly and free resources such as VRAM, LocalAI needs permission to send signals to its child processes.
+
+If you're using restrictive security contexts, ensure the `CAP_KILL` capability is available:
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: local-ai
+spec:
+  containers:
+    - name: local-ai
+      image: quay.io/go-skynet/local-ai:latest
+      securityContext:
+        allowPrivilegeEscalation: false
+        capabilities:
+          drop:
+            - ALL
+          add:
+            - KILL # Required for LocalAI to stop backend processes
+        seccompProfile:
+          type: RuntimeDefault
+        runAsNonRoot: true
+        runAsUser: 1000
+```
+
+Without the `KILL` capability, LocalAI cannot terminate backend processes when models are stopped, leading to:
+- VRAM and memory not being freed
+- Orphaned backend processes holding GPU resources
+- Error messages like `error while deleting process error=permission denied`
+
+## Troubleshooting
+
+### Issue: VRAM is not freed when stopping models
+
+**Symptoms:**
+- Models appear to stop, but GPU memory remains allocated
+- Logs show `(deleteProcess) error while deleting process error=permission denied`
+- Backend processes remain running after model unload
+
+**Common Causes:**
+- All capabilities are dropped without adding back `CAP_KILL`
+- User namespacing (`hostUsers: false`) is used with certain configurations
+- Overly restrictive seccomp profiles block signal-related syscalls
+- Pod Security Policies or Pod Security Standards block the required capability
+
+**Solution:**
+
+1. Add the `KILL` capability to your container's security context, as shown in the example above.
+
+2. If you're using a Helm chart, configure the security context in your `values.yaml`:
+
+```yaml
+securityContext:
+  allowPrivilegeEscalation: false
+  capabilities:
+    drop:
+      - ALL
+    add:
+      - KILL
+  seccompProfile:
+    type: RuntimeDefault
+```
+
+3. Verify the capability is present in the running pod:
+
+```bash
+kubectl exec -it <pod-name> -- grep CapEff /proc/1/status
+```
+
+4. If running in privileged mode works but the configuration above doesn't, check your cluster's Pod Security Policies or Pod Security Standards. You may need to adjust cluster-level policies to allow the `KILL` capability.
+
+5. Ensure your seccomp profile (if custom) allows the `kill` syscall; the `RuntimeDefault` profile typically includes it.
diff --git a/docs/content/installation/kubernetes.md b/docs/content/installation/kubernetes.md
index f3047851d81c..054837397356 100644
--- a/docs/content/installation/kubernetes.md
+++ b/docs/content/installation/kubernetes.md
@@ -29,3 +29,79 @@ helm show values go-skynet/local-ai > values.yaml
 helm install local-ai go-skynet/local-ai -f values.yaml
 ```
+
+## Security Context Requirements
+
+LocalAI spawns child processes to run model backends (e.g., llama.cpp, diffusers, whisper). To stop these processes properly and free resources such as VRAM, LocalAI needs permission to send signals to its child processes.
+
+If you're using restrictive security contexts, ensure the `CAP_KILL` capability is available:
+
+```yaml
+apiVersion: v1
+kind: Pod
+metadata:
+  name: local-ai
+spec:
+  containers:
+    - name: local-ai
+      image: quay.io/go-skynet/local-ai:latest
+      securityContext:
+        allowPrivilegeEscalation: false
+        capabilities:
+          drop:
+            - ALL
+          add:
+            - KILL # Required for LocalAI to stop backend processes
+        seccompProfile:
+          type: RuntimeDefault
+        runAsNonRoot: true
+        runAsUser: 1000
+```
+
+Without the `KILL` capability, LocalAI cannot terminate backend processes when models are stopped, leading to:
+- VRAM and memory not being freed
+- Orphaned backend processes holding GPU resources
+- Error messages like `error while deleting process error=permission denied`
+
+## Troubleshooting
+
+### Issue: VRAM is not freed when stopping models
+
+**Symptoms:**
+- Models appear to stop, but GPU memory remains allocated
+- Logs show `(deleteProcess) error while deleting process error=permission denied`
+- Backend processes remain running after model unload
+
+**Common Causes:**
+- All capabilities are dropped without adding back `CAP_KILL`
+- User namespacing (`hostUsers: false`) is used with certain configurations
+- Overly restrictive seccomp profiles block signal-related syscalls
+- Pod Security Policies or Pod Security Standards block the required capability
+
+**Solution:**
+
+1. Add the `KILL` capability to your container's security context, as shown in the example above.
+
+2. If you're using a Helm chart, configure the security context in your `values.yaml`:
+
+```yaml
+securityContext:
+  allowPrivilegeEscalation: false
+  capabilities:
+    drop:
+      - ALL
+    add:
+      - KILL
+  seccompProfile:
+    type: RuntimeDefault
+```
+
+3. Verify the capability is present in the running pod:
+
+```bash
+kubectl exec -it <pod-name> -- grep CapEff /proc/1/status
+```
+
+4. If running in privileged mode works but the configuration above doesn't, check your cluster's Pod Security Policies or Pod Security Standards. You may need to adjust cluster-level policies to allow the `KILL` capability.
+
+5. Ensure your seccomp profile (if custom) allows the `kill` syscall; the `RuntimeDefault` profile typically includes it.
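
The requirement documented in the patch comes down to ordinary Unix signal permissions: on model unload, LocalAI's main process signals its backend children, and `kill(2)` fails with `EPERM` (the `permission denied` seen in the logs) unless the sender shares the target's UID or holds `CAP_KILL`. A minimal sketch of that parent-to-child shutdown, runnable in any POSIX shell (the `sleep` is a stand-in for a real backend):

```shell
#!/bin/sh
# Stand-in for a backend process (e.g. llama.cpp) spawned by LocalAI.
sleep 300 &
child=$!

# On model unload, LocalAI signals the child. With matching UIDs this
# succeeds; when UIDs differ (or user namespaces remap them), kill(2)
# requires CAP_KILL and otherwise fails with EPERM.
kill -TERM "$child"
wait "$child" 2>/dev/null

echo "backend stopped, resources freed"
```

Here the script itself spawned the child, so the UIDs match and the signal is delivered; the failure mode in the troubleshooting section arises exactly when that UID assumption breaks and no `CAP_KILL` is available.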
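
For step 3 of the troubleshooting guide, the `CapEff` field is a hexadecimal bitmask and `CAP_KILL` is capability number 5 (see capabilities(7)), so its presence can be checked by testing bit 5. A sketch, using a hypothetical `CapEff` value in place of real `grep` output:

```shell
#!/bin/sh
# Hypothetical mask as copied from `grep CapEff /proc/1/status`;
# substitute the value your pod actually reports.
capeff="00000000a80425fb"

# CAP_KILL is capability number 5, i.e. bit 5 of the effective-set bitmask.
if [ $(( (0x$capeff >> 5) & 1 )) -eq 1 ]; then
  echo "CAP_KILL present"
else
  echo "CAP_KILL missing"
fi
```

With `libcap` installed, `capsh --decode=<mask>` prints the same information by capability name; a pod that dropped `ALL` without adding `KILL` back will report a mask with bit 5 clear.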