KubernetesPodOperator fails to delete pods with None value labels #53472

@Vasu-Madaan

Description

Apache Airflow Provider(s)

cncf-kubernetes

Versions of Apache Airflow Providers

8.4.1

Apache Airflow version

Airflow 2

Operating System

linux

Deployment

Astronomer

Deployment details

No response

What happened

The KubernetesPodOperator does not clean up pods that were created with labels whose value is None. Orphaned pods remain in the Kubernetes namespace after the task has finished or failed, even when on_finish_action is set to delete_pod.

What you think should happen instead

Root Cause Analysis:

The issue stems from an inconsistency between how labels are handled during pod creation versus pod deletion.

Pod Creation: When a pod is created, the _get_ti_pod_labels method iterates through the labels and uses str(label) to process the value. In Python, str(None) evaluates to the string "None". Consequently, a pod is created with a valid Kubernetes label like my-label="None".
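To illustrate why the pod is accepted at creation time, the converted value can be checked against the Kubernetes label-value syntax (at most 63 characters; either empty, or alphanumeric at both ends with "-", "_" and "." allowed in between). This is only a sketch: is_valid_label_value is a hypothetical helper written for this report, not part of Airflow or the Kubernetes client.

```python
import re

# Kubernetes label-value syntax: empty, or alphanumeric at both ends
# with '-', '_' and '.' allowed in between (length checked separately).
_LABEL_VALUE = re.compile(r"(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])?")


def is_valid_label_value(value: str) -> bool:
    """Hypothetical helper: check a string against the label-value rules."""
    return len(value) <= 63 and _LABEL_VALUE.fullmatch(value) is not None


# str() turns None into the four-character string "None", not an empty string.
converted = str(None)
print(repr(converted), is_valid_label_value(converted))
```

The converted value passes the check, so the API server has no reason to reject the pod.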

Pod Deletion: When the task finishes, the cleanup method attempts to find the pod to delete it. It calls _build_find_pod_label_selector to construct a query for the Kubernetes API. This method, however, does not apply the same str() conversion. It uses the raw None object from the operator's self.labels dictionary.

This inconsistency produces a label selector that does not match the stringified label stored on the pod, so the Kubernetes API returns no matching pods. Since the operator cannot find the pod it created, it cannot delete it, and the pod is left orphaned.
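One possible direction for a fix is to normalize label values once and reuse the result for both pod creation and the lookup selector. The sketch below only illustrates that idea: normalize_labels and build_label_selector are hypothetical names, the "team" label is made up for the example, and none of this is the provider's actual code.

```python
from __future__ import annotations


def normalize_labels(labels: dict[str, object] | None) -> dict[str, str]:
    """Hypothetical helper: stringify every label value, as pod creation is described to do."""
    return {key: str(value) for key, value in (labels or {}).items()}


def build_label_selector(labels: dict[str, str]) -> str:
    """Hypothetical helper: join labels into an equality-based label selector."""
    return ",".join(f"{key}={value}" for key, value in sorted(labels.items()))


raw_labels = {"custom-label-with-none": None, "team": "data"}

# Normalize once, then use the same dict for the pod and for the selector.
pod_labels = normalize_labels(raw_labels)
selector = build_label_selector(pod_labels)

print(pod_labels)
print(selector)
```

Because the selector is derived from the same normalized dictionary that would be written to the pod, the two cannot drift apart when a value is None.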

How to reproduce

Example DAG to Reproduce:

from __future__ import annotations

import pendulum

from airflow.models.dag import DAG
from airflow.providers.cncf.kubernetes.operators.pod import KubernetesPodOperator

with DAG(
    dag_id="kpo_none_label_bug_report",
    start_date=pendulum.datetime(2025, 1, 1, tz="UTC"),
    catchup=False,
    schedule=None,
    tags=["k8s", "bug"],
) as dag:
    kpo_task = KubernetesPodOperator(
        task_id="kpo_with_none_label",
        namespace="default",
        image="faulty:latest30",
        cmds=["sh", "-c"],
        arguments=["echo 'Starting...'; sleep 60; echo 'Done sleeping'"],
        # This label with a `None` value triggers the bug
        labels={"custom-label-with-none": None},
        name="kpo-none-label-test",
        on_finish_action="delete_pod",
        # Ensure a new pod is created each time to reliably test deletion
        reattach_on_restart=False,
        config_file="/files/.kube/config.yml"
    )

Expected Behavior: After the task kpo_with_none_label completes, or fails with ImagePullBackOff, the corresponding pod (kpo-none-label-test-*) should be deleted from the Kubernetes namespace.

Actual Behavior: The pod is not deleted and remains in the namespace in an ImagePullBackOff or Completed state.

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
