
Asset extra metadata not persisted: extra field always empty via API/DB for emitted asset alias events #58146

@jgoedeke

Apache Airflow version

3.1.2

If "Other Airflow 2/3 version" selected, which one?

No response

What happened?

When a task emits an Asset that was defined with a non-empty extra dictionary (using the standard Asset/AssetAlias machinery in Airflow 3.1.2), the stored asset has an empty extra field in both the database and the API response, even though extra metadata was set at creation. Example DAG file:

from datetime import UTC, datetime

from airflow.providers.standard.operators.python import PythonOperator
from airflow.sdk import Asset, AssetAlias, Metadata, dag, task

first_alias = AssetAlias(
    name='a.first.data.alias',
)
first_asset = Asset(
    name='a.first.data.asset',
    uri='s3://bucket/mydata.csv',
    extra={'path': '/some/local/path.txt'},
)


@dag(
    schedule=None,
    tags=['test'],
)
def test_alias_producer_dag():
    def produce_asset():
        yield Metadata(
            first_asset,
            {'timestamp': datetime.now(tz=UTC).isoformat()},
            first_alias,
        )

    PythonOperator(
        task_id='produce_asset',
        python_callable=produce_asset,
        outlets=[first_alias],
    )


@dag(
    schedule=[first_alias],
    tags=['test'],
)
def test_alias_consumer_dag():
    @task(inlets=[first_alias])
    def process_asset(triggering_asset_events: dict):
        for event in triggering_asset_events:
            print(f'Event: {event}')

    process_asset()


test_alias_producer_dag()
test_alias_consumer_dag()

Task log:

[2025-11-10 14:54:24] INFO - Event: Asset(name='a.first.data.asset', uri='s3://bucket/mydata.csv', group='asset', extra={}, watchers=[]) source=task.stdout
[2025-11-10 14:54:24] INFO - Event: AssetAlias(name='a.first.data.alias', group='asset') source=task.stdout

Example API response (assets endpoint):

{
  "id": 116,
  "name": "a.first.data.asset",
  "uri": "s3://bucket/mydata.csv",
  "group": "",
  "extra": {},
  ...
}

What you think should happen instead?

The extra dictionary provided at Asset creation should be persisted to the database and returned by the API. An asset should always include its extra metadata when it has been set.
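
The expectation can be written as a small check against the API payload. This is only a sketch: `missing_extra_keys` is a hypothetical helper (not part of Airflow), and the payload is the response from this report with the elided fields left out.

```python
from typing import Any


def missing_extra_keys(expected: dict[str, Any], asset_payload: dict[str, Any]) -> dict[str, Any]:
    """Return the expected extra entries that are absent or different in the payload.

    Hypothetical helper for illustration; it only inspects a plain dict and
    does not call Airflow or the REST API.
    """
    actual = asset_payload.get("extra") or {}
    return {
        key: value
        for key, value in expected.items()
        if key not in actual or actual[key] != value
    }


# Payload copied from the assets endpoint response in this report
# (fields elided in the report are omitted here).
observed = {
    "id": 116,
    "name": "a.first.data.asset",
    "uri": "s3://bucket/mydata.csv",
    "group": "",
    "extra": {},
}

# The extra dictionary passed to Asset(...) in the example DAG file.
expected_extra = {"path": "/some/local/path.txt"}

print(missing_extra_keys(expected_extra, observed))
# prints {'path': '/some/local/path.txt'}
```

With the behavior described in this report the helper returns the whole expected dictionary; if the extra were persisted, it would return an empty dictionary.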

How to reproduce

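Deploy the example DAG file above, trigger test_alias_producer_dag, then check the consumer task log and the assets API endpoint.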

Operating System

docker

Versions of Apache Airflow Providers

No response

Deployment

Other Docker-based deployment

Deployment details

No response

Anything else?

Cannot verify the Asset extra in the UI because of #57566.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!
