
RedshiftCreateClusterOperator leaks Redshift cluster on failure with partial IAM permissions #61974

@potiuk


Discussed in #61930

Originally posted by SameerMesiah97 February 1, 2026

Apache Airflow Provider(s)

amazon

Versions of Apache Airflow Providers

apache-airflow-providers-amazon>=9.21.0rc1

Apache Airflow version

main

Operating System

Debian GNU/Linux 12 (bookworm)

Deployment

Other

Deployment details

No response

What happened

When using RedshiftCreateClusterOperator, a Redshift cluster may be successfully created even when the AWS execution role has partial Redshift permissions, for example lacking redshift:DescribeClusters.

In this scenario, the operator successfully calls create_cluster and the Redshift cluster begins provisioning in AWS. However, subsequent steps—such as waiting for the cluster to become available when wait_for_completion=True—fail due to insufficient permissions.

The Airflow task then fails, but the Redshift cluster continues provisioning or remains active in AWS, resulting in leaked infrastructure and ongoing cost.

This can occur, for example, when the execution role allows redshift:CreateCluster but explicitly denies redshift:DescribeClusters, which is required by the waiter used to monitor cluster availability.

What you think should happen instead

If the operator fails after successfully initiating cluster creation (for example due to missing DescribeClusters or other follow-up permissions), it should make a best-effort attempt to clean up the partially created resource by deleting the cluster.

Cleanup should be attempted opportunistically (i.e. only if the cluster identifier is known and the necessary permissions are available), and failure to clean up should not mask or replace the original exception.
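A minimal sketch of the behaviour proposed above, not the provider's actual implementation. The helper name `create_cluster_with_cleanup` is hypothetical, and `client` is assumed to be a boto3 Redshift client (or anything exposing `create_cluster`, `get_waiter` and `delete_cluster` in the same way). The sketch treats a failure of `create_cluster` itself as meaning nothing was created.

```python
import logging

log = logging.getLogger(__name__)


def create_cluster_with_cleanup(client, cluster_identifier, **create_kwargs):
    """Create a cluster, wait for it, and try to delete it if the wait fails.

    Hypothetical helper for illustration only.
    """
    # Assumption: if this call raises, no cluster was created, so no cleanup runs.
    client.create_cluster(ClusterIdentifier=cluster_identifier, **create_kwargs)
    try:
        # The waiter polls DescribeClusters, which is where a partial role fails.
        client.get_waiter("cluster_available").wait(
            ClusterIdentifier=cluster_identifier
        )
    except Exception:
        log.warning(
            "Wait failed for %s; attempting best-effort cleanup", cluster_identifier
        )
        try:
            client.delete_cluster(
                ClusterIdentifier=cluster_identifier,
                SkipFinalClusterSnapshot=True,
            )
        except Exception:
            # Cleanup failure is only logged, so it cannot replace the original error.
            log.exception("Best-effort cleanup of %s failed", cluster_identifier)
        raise  # re-raise the original exception from the waiter
```

The bare `raise` sits in the outer `except` block, so the caller still sees the waiter's exception even when the delete call also fails. Whether AWS accepts a delete request while the cluster is still provisioning is not addressed by this sketch.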

How to reproduce

  1. Create an IAM role that allows redshift:CreateCluster but denies redshift:DescribeClusters.

  2. Configure an AWS connection in Airflow using this role.
    (The connection ID aws_test_conn is used for this reproduction.)

  3. Ensure a valid Redshift cluster subnet group exists.
    (For example: example-subnet-group.)

  4. Use the following DAG:

from datetime import datetime

from airflow import DAG
from airflow.providers.amazon.aws.operators.redshift_cluster import (
    RedshiftCreateClusterOperator,
)

with DAG(
    dag_id="redshift_partial_auth_cluster_leak_repro",
    start_date=datetime(2025, 1, 1),
    schedule=None,
    catchup=False,
) as dag:
    create_cluster = RedshiftCreateClusterOperator(
        task_id="create_redshift_cluster",
        aws_conn_id="aws_test_conn",
        cluster_identifier="leaky-redshift-cluster",
        node_type="ra3.large",
        master_username="example",
        master_user_password="example",
        cluster_type="single-node",
        cluster_subnet_group_name="example-subnet-group",
        wait_for_completion=True,  # triggers DescribeClusters via waiter
    )
  5. Trigger the DAG.
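For step 1, the role's permissions could look roughly like the policy document built below. This is an illustrative sketch: the `Resource` values are assumptions, and a real role will likely need further permissions beyond the two Redshift actions named in this report.

```python
import json

# Illustrative policy: allow cluster creation, explicitly deny describing clusters.
partial_redshift_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "redshift:CreateCluster",
            "Resource": "*",  # assumption: not scoped to a specific cluster ARN
        },
        {
            "Effect": "Deny",
            "Action": "redshift:DescribeClusters",
            "Resource": "*",
        },
    ],
}

# JSON form suitable for pasting into the IAM console or passing to an IAM API call.
policy_json = json.dumps(partial_redshift_policy, indent=2)
print(policy_json)
```

An explicit `Deny` overrides any `Allow` the role may pick up from other attached policies, which keeps the reproduction deterministic.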

Observed Behaviour

The task fails due to missing redshift:DescribeClusters permissions, but the Redshift cluster is successfully created and remains active in AWS. The cluster is not cleaned up automatically and continues incurring cost.

Anything else

Redshift clusters begin incurring cost immediately once creation starts, even if the cluster never reaches an available state. When post-creation failures occur, leaked clusters can therefore result in unexpected and ongoing cost.

This issue follows a broader pattern across AWS operators where resources are created successfully but not cleaned up when subsequent steps fail. Apache Airflow has been introducing best-effort cleanup behavior to address this class of problems consistently across providers.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct
