aws_conn_id in RDS operators (e.g., RdsStartExportTaskOperator, RdsCreateDbSnapshotOperator) is ignored and falls back to aws_default in apache-airflow-providers-amazon>=9.6.0 #50766

@frobb

Description

Apache Airflow Provider(s)

amazon

Versions of Apache Airflow Providers

apache-airflow-providers-amazon>=9.6.0

Apache Airflow version

2.10.1

Operating System

MWAA

Deployment

Amazon (AWS) MWAA

Deployment details

No response

What happened

In apache-airflow-providers-amazon>=9.6.0, the aws_conn_id parameter passed to RDS operators such as RdsStartExportTaskOperator or RdsCreateDbSnapshotOperator at instantiation is not honored; the operator falls back to the aws_default connection instead.
This appears to be a regression from earlier versions of the provider (e.g., the 8.x series), where the specified aws_conn_id was used correctly.
The issue seems to stem from the RdsBaseOperator.__init__ method, which the other RDS operators inherit. In version 9.6.0 of airflow/providers/amazon/aws/operators/rds.py, the aws_conn_id parameter appears in the RdsBaseOperator.__init__ signature:

class RdsBaseOperator(AwsBaseOperator[RdsHook]):
    # ...
    def __init__(
        self,
        *args,
        aws_conn_id: str | None = "aws_conn_id",  # <--- This line
        region_name: str | None = None,
        **kwargs,
    ):
        self.aws_conn_id = aws_conn_id
        self.region_name = region_name
        super().__init__(*args, **kwargs)
        # ...

This explicitly captures aws_conn_id if it is passed in kwargs. Consequently, when super().__init__(*args, **kwargs) is called to initialize the parent AwsBaseOperator, aws_conn_id is no longer present in kwargs. AwsBaseOperator therefore falls back to the default of its own aws_conn_id parameter, which is "aws_default", and the hook used by the operator then incorrectly uses this default connection.

What you think should happen instead

The RDS operator should use the specific AWS connection ID provided in its aws_conn_id parameter. If a user specifies aws_conn_id="my_custom_conn", the operator should use the Airflow connection named my_custom_conn, not aws_default.

How to reproduce

  1. Prerequisites:
    Airflow version 2.10
    apache-airflow-providers-amazon==9.6.0 (or any subsequent version where this issue persists).

  2. Airflow Connections:
    In the Airflow UI, create an AWS connection named my_custom_rds_conn. The actual credentials can be placeholders for this reproduction; the existence of the named connection is key.

    Ensure that there is no Airflow connection named aws_default. (Alternatively, if aws_default must exist for other reasons, make sure its credentials would obviously fail or differ from those of my_custom_rds_conn for the RDS operation.)

  3. Minimal DAG:
    Create and run the following DAG:

    from __future__ import annotations
    
    import pendulum
    
    from airflow.models.dag import DAG
    from airflow.providers.amazon.aws.operators.rds import RdsCreateDbSnapshotOperator # Or RdsStartExportTaskOperator
    
    with DAG(
        dag_id="rds_aws_conn_id_bug_repro",
        start_date=pendulum.datetime(2025, 5, 19, tz="UTC"),
        catchup=False,
        schedule=None,
        tags=["bug-repro"],
    ) as dag:
        create_snapshot_task = RdsCreateDbSnapshotOperator(
            task_id="create_db_snapshot_bug_test",
            aws_conn_id="my_custom_rds_conn",  # Intended connection
            db_type="instance",              # Placeholder
            db_identifier="my-dummy-db-id",  # Placeholder
            db_snapshot_identifier="my-dummy-snapshot-id" # Placeholder
        )
  4. Observe:
    Trigger the DAG.
    The create_db_snapshot_bug_test task is expected to fail (as the DB identifier is a dummy).
    Inspect the logs for the failed task instance.

  5. Expected vs. Actual Outcome:
    Expected (if the bug were not present): with my_custom_rds_conn actually in use, the error would relate to the DB instance my-dummy-db-id not being found under the (placeholder) credentials from my_custom_rds_conn.

Actual (due to the bug): The task fails with an error indicating that the connection aws_default could not be found (e.g., airflow.exceptions.AirflowNotFoundException: The conn_id aws_default isn't defined), or an equivalent Boto3/AWS SDK error if it falls through to the default credential chain after failing to find aws_default. This demonstrates that the operator ignored aws_conn_id="my_custom_rds_conn" and attempted to use aws_default.

Anything else

No response

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Labels

area:providers, kind:bug, needs-triage
