
Conversation

@ashb (Member) commented Aug 16, 2025

If the connection is loaded from a secrets backend (be it something like
Hashicorp Vault, or even just as simple as env vars!) the mask will only be
applied in the subprocess, so it won't catch much of the output. To fix this we
send a message to the Supervisor process with the value to redact.

This also captures and "mirrors" direct calls to `mask_secret` from user code
to the supervisor so that it can mask the output correctly.
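Conceptually, the mirroring looks like this (a minimal toy sketch with hypothetical names; the real Task SDK uses its msgspec-based comms messages, not these classes):

```python
import re
from dataclasses import dataclass, field


@dataclass
class MaskSecret:
    # Hypothetical comms message: tells the supervisor about a value to redact.
    value: str


@dataclass
class Supervisor:
    # Toy supervisor that redacts known secrets from subprocess log output.
    patterns: list = field(default_factory=list)

    def handle_message(self, msg: object) -> None:
        if isinstance(msg, MaskSecret):
            self.patterns.append(re.compile(re.escape(msg.value)))

    def redact(self, line: str) -> str:
        for pat in self.patterns:
            line = pat.sub("***", line)
        return line


# Subprocess side: in the real SDK this also updates the local masker;
# here we only show the "mirror to the supervisor" half.
def mask_secret(value: str, supervisor: Supervisor) -> None:
    supervisor.handle_message(MaskSecret(value=value))


sup = Supervisor()
mask_secret("hunter2", sup)
print(sup.redact("password is hunter2"))  # -> password is ***
```

The key point is that the value is registered on both sides of the process boundary, so lines emitted by the supervisor itself are masked too.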

Docs look like this

[Screenshot: rendered docs, 2025-08-17 16:46]

Closes #54540

I was testing with this DAG. In a follow-up I'll add it to the SDK integration tests to ensure masking works fully end-to-end.

```python
from __future__ import annotations

import logging

from airflow import DAG
from airflow.providers.standard.operators.empty import EmptyOperator
from airflow.providers.standard.operators.python import PythonOperator
from airflow.sdk import Variable

x = Variable.get("my_variable")


def my_function(my_var: str) -> None:
    logging.getLogger(__name__).info(my_var)


with DAG("test_dag") as dag:
    start = EmptyOperator(task_id="start")

    py_func = PythonOperator(task_id="py_func", python_callable=my_function, op_kwargs={"my_var": x})

    end = EmptyOperator(task_id="end")

    start >> py_func >> end

if __name__ == "__main__":
    dag.test()
```


@ashb ashb requested review from amoghrajesh and kaxil as code owners August 16, 2025 15:00
@ashb ashb added the backport-to-v3-1-test label Aug 16, 2025
@ashb ashb added this to the Airflow 3.0.5 milestone Aug 16, 2025
@ashb (Member, Author) commented Aug 16, 2025

I think I've caused an infinite loop

@ashb (Member, Author) commented Aug 17, 2025

Ah no, it wasn't an infinite loop; I just wasn't updating the dag processor handler.

@ashb ashb force-pushed the mask-secret-send-to-supervisor branch from 9af3346 to 700acd1 Compare August 17, 2025 16:16
@ashb ashb force-pushed the mask-secret-send-to-supervisor branch from 700acd1 to ce03d93 Compare August 17, 2025 16:21
@ashb ashb added the full tests needed label Aug 17, 2025
@ashb ashb closed this Aug 17, 2025
@ashb ashb reopened this Aug 17, 2025
@potiuk (Member) commented Aug 17, 2025

Nice! Good one @ashb, pulling this off so quickly.

@ashb ashb merged commit 1f4c55c into main Aug 17, 2025
179 checks passed
@ashb ashb deleted the mask-secret-send-to-supervisor branch August 17, 2025 19:21
@github-actions

Backport failed to create: v3-0-test. View the failure log for run details.

| Status | Branch    | Result      |
|--------|-----------|-------------|
| failed | v3-0-test | Commit Link |

You can attempt to backport this manually by running:

```
cherry_picker 1f4c55c v3-0-test
```

This should apply the commit to the v3-0-test branch and leave the commit in a conflict state, marking the files that need manual conflict resolution.

After you have resolved the conflicts, you can continue the backport process by running:

```
cherry_picker --continue
```

ashb added a commit that referenced this pull request Aug 18, 2025
…from secrets backends (#54574)

(cherry picked from commit 1f4c55c)
ashb added a commit that referenced this pull request Aug 18, 2025
…from secrets backends (#54574)

(cherry picked from commit 1f4c55c)
ashb added a commit that referenced this pull request Aug 18, 2025
Somewhat annoyingly it has to be `sync_to_async(mask_secret)` but that is
unavoidable, unfortunately.

Similar to #54574, but for the trigger.
ashb added a commit that referenced this pull request Aug 18, 2025
Somewhat annoyingly it has to be `sync_to_async(mask_secret)` but that is
unavoidable unfortunately.

Similar to #54574, but for the trigger.
ashb added a commit that referenced this pull request Aug 18, 2025
…ers (#54612)

Somewhat annoyingly it has to be `sync_to_async(mask_secret)` but that is
unavoidable unfortunately.

Similar to #54574, but for the trigger.

(cherry picked from commit a2bd625)
ashb added a commit that referenced this pull request Aug 18, 2025
…from secrets backends (#54574)

(cherry picked from commit 1f4c55c)
ashb added a commit that referenced this pull request Aug 18, 2025
…ers (#54612)

Somewhat annoyingly it has to be `sync_to_async(mask_secret)` but that is
unavoidable unfortunately.

Similar to #54574, but for the trigger.

(cherry picked from commit a2bd625)
kaxil pushed a commit that referenced this pull request Aug 18, 2025
kaxil pushed a commit that referenced this pull request Aug 18, 2025
…ers (#54612)

Somewhat annoyingly it has to be `sync_to_async(mask_secret)` but that is
unavoidable unfortunately.

Similar to #54574, but for the trigger.

(cherry picked from commit a2bd625)
@dshvedchenko

After upgrading to Airflow 3.0.5, operators/hooks fail with messages like this:

```
[2025-08-21, 07:05:11] INFO - Connection Retrieved 'snowflake_default': source="airflow.hooks.base"
[2025-08-21, 07:05:11] ERROR - Task failed with exception: source="task"
NotImplementedError: Objects of type <class 'pydantic_core._pydantic_core.SerializationIterator'> are not supported
File "/usr/local/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py", line 918 in run
File "/usr/local/lib/python3.12/site-packages/airflow/sdk/execution_time/task_runner.py", line 1213 in _execute_task
File "/usr/local/lib/python3.12/site-packages/airflow/sdk/bases/operator.py", line 397 in wrapper
File "/usr/local/lib/python3.12/site-packages/airflow/providers/common/sql/operators/sql.py", line 307 in execute
File "/usr/local/lib/python3.12/site-packages/airflow/providers/common/sql/operators/sql.py", line 201 in get_db_hook
File "/usr/local/lib/python3.12/functools.py", line 998 in __get__
File "/usr/local/lib/python3.12/site-packages/airflow/providers/common/sql/operators/sql.py", line 177 in _hook
File "/usr/local/lib/python3.12/site-packages/airflow/providers/common/sql/operators/sql.py", line 166 in get_hook
File "/usr/local/lib/python3.12/site-packages/airflow/sdk/definitions/connection.py", line 162 in extra_dejson
File "/usr/local/lib/python3.12/site-packages/airflow/sdk/execution_time/secrets_masker.py", line 130 in mask_secret
File "/usr/local/lib/python3.12/site-packages/airflow/sdk/execution_time/comms.py", line 187 in send
File "/usr/local/lib/python3.12/site-packages/airflow/sdk/execution_time/comms.py", line 145 in as_bytes
File "/usr/local/lib/python3.12/site-packages/airflow/sdk/execution_time/comms.py", line 122 in _msgpack_enc_hook
```

@VladaZakharova (Contributor)

Hi @dshvedchenko, yes, the same problem occurs with the Google provider and all of its operators and hooks.
@ashb, can you please take a look? Is there now a new way to create connections?

@ashb (Member, Author) commented Aug 21, 2025

Which hook is that coming from?

@VladaZakharova (Contributor)

All of them :) When creating a connection and running any hook, I see the same error as @dshvedchenko.
Connection: google_cloud_default

@ashb (Member, Author) commented Aug 21, 2025

I think it depends on which fields are in the connection. In my (admittedly simple) testing with an HTTP connection it worked fine.

Can you give me a connection to test with?

@ashb (Member, Author) commented Aug 21, 2025

```
[2025-08-21, 09:36:10] INFO - conn.conn_id='test' conn.password=None conn=Connection(conn_id='test', conn_type='google_cloud_platform', description=None, host=None, schema=None, login=None, password=None, port=None, extra='{\n  "project": null,\n  "key_path": null,\n  "keyfile_dict": null,\n  "credential_config_file": null,\n  "scope": null,\n  "key_secret_name": null,\n  "key_secret_project_id": null,\n  "num_retries": 5,\n  "impersonation_chain": null,\n  "idp_issuer_url": null,\n  "client_id": null,\n  "client_secret": null,\n  "idp_extra_parameters": null,\n  "is_anonymous": false\n}'): chan="stdout": source="task"
```

@VladaZakharova (Contributor)

I was creating the connection using:

```
AIRFLOW_CONN_GOOGLE_CLOUD_DEFAULT='{"conn_type": "google_cloud_default", "extra": {"key_path": "/files/airflow-breeze-config/keys/keys.json", "scope": "https://www.googleapis.com/auth/cloud-platform", "project": "project_id", "num_retries": 5}}'
```

The problem was originally with some of the triggers, where we couldn't pass Enum values from the triggerer to the worker (that also needs to be reworked).
Now, after this masking change, the problem is with creating the connection, because of the changes made in #51699:

```python
def _msgpack_enc_hook(obj: Any) -> Any:
    import pendulum

    if isinstance(obj, pendulum.DateTime):
        # convert the pendulum DateTime subclass into a raw datetime so that
        # msgspec can use its native encoding
        return datetime(
            obj.year, obj.month, obj.day, obj.hour, obj.minute, obj.second, obj.microsecond, tzinfo=obj.tzinfo
        )
    if isinstance(obj, Path):
        return str(obj)
    if isinstance(obj, BaseModel):
        return obj.model_dump(exclude_unset=True)

    # Raise a NotImplementedError for other types
    raise NotImplementedError(f"Objects of type {type(obj)} are not supported")
```

I think we need to support more types here; that would solve the problem for everyone.

@ashb (Member, Author) commented Aug 21, 2025

Looks like this is an unsolved bug in pydantic: pydantic/pydantic#9541 (and also improper testing on our part).

@VladaZakharova (Contributor) commented Aug 21, 2025

Looks like it, yes; more complicated test cases would be good to have :)
Are we planning to address this issue so it works for providers? It looks like more people will come with the same issue.

@ashb (Member, Author) commented Aug 21, 2025

Nothing providers can do; it needs a change in Airflow core. For people hitting this, if you can't or don't want to downgrade, you can apply a patch to your Airflow to work around it; instructions in #54769 (comment).


Labels

area:task-sdk, backport-to-v3-1-test, full tests needed


Successfully merging this pull request may close these issues.

Secret masker often doesn't always mask values (user defined mask, or secrets backend)

5 participants