
@amoghrajesh
Contributor

closes: #58887
closes: #58885

Motivation

This change is part of the broader effort to achieve client-server separation. The serialization and deserialization utilities (serde) are execution-time utilities, used primarily during task execution for:

  • Serializing/deserializing XComs when communicating with the API server
  • Processing deadline alerts

Since these utilities are only needed at execution time (in workers, DAG processors, and triggerers), they belong in the task-sdk rather than airflow-core. This allows the core server components (Scheduler, API Server) to operate without requiring the Task SDK. It also means the Task SDK no longer has to import serde from airflow-core, which enables independent deployments and upgrades.

Blockers and Solutions

Several blockers were identified and resolved:

  1. Migration File: The migration 0092_3_2_0_replace_deadline_inline_callback_with_fkey.py uses serde.deserialize().
    Effort tracked by @ramitkataria in "Do not use serde library for database migrations".

  2. XCom API Stringification: The core API's xcom endpoint (/api/v2/dags/{dag_id}/dagRuns/{run_id}/taskInstances/{task_id}/xcomEntries) used deserialize(full=False) to stringify XCom values for UI display.
    The solution was to create a dedicated stringify.py module in airflow-core that provides stringification without any SDK dependencies. This module matches the behavior of deserialize(full=False) exactly but is self-contained and independent of any external libraries.

  3. XComEncoder/XComDecoder: These JSON encoder/decoder classes in airflow-core directly imported serde for full serialization/deserialization of XCom values.
    The fix here was to implement lazy imports that attempt to load serde from the SDK, making the dependency optional. If the SDK is not available, an appropriate error message is shown. Most consumers of these classes were already handled by "Decouple xcom public API from using XcomEncoder" (#58900), which found that the same effect can be achieved without using those encoders. In a follow-up, I might move them to the Task SDK as well.
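
A minimal sketch of the lazy-import idea from blocker 3, not the code in this PR. The helper name load_serde_function is hypothetical, and the default module path airflow.sdk.serialization.serde is an assumption inferred from the directory layout described in this PR. The import happens only when a serde function is actually needed, so code paths that never touch serde work without the Task SDK.

```python
import importlib
from typing import Any, Callable


def load_serde_function(
    name: str,
    module_path: str = "airflow.sdk.serialization.serde",  # assumed location
) -> Callable[..., Any]:
    """Import the serde module on first use and return one of its functions.

    Hypothetical helper for illustration only: the import is deferred until
    call time, so merely importing the caller's module never requires the SDK.
    """
    try:
        module = importlib.import_module(module_path)
    except ImportError as exc:
        # Turn a bare ImportError into a message that names the missing piece.
        raise RuntimeError(
            f"{name}() requires the Task SDK, which is not installed "
            f"(could not import {module_path!r})."
        ) from exc
    return getattr(module, name)
```

An encoder built this way would call load_serde_function("serialize") inside its encoding method rather than importing serde at module level.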

High Level Changes

  1. Serde module is moved to task sdk now

    • All serializers moved to task-sdk/src/airflow/sdk/serialization/serializers/
  2. Stringification Separation

    • Created airflow-core/src/airflow/serialization/stringify.py for UI stringification
    • This module is self-contained and does not import from SDK
    • Matches deserialize(full=False) behavior exactly
  3. Deprecation Path

    • Old imports (from airflow.serialization.serde import ) continue to work
    • Redirection to SDK variant with deprecation warnings
    • Ensures backward compatibility till we remove it
  4. Tests

    • Serialize/deserialize tests moved to task-sdk/tests/task_sdk/serialization/test_serde.py
    • Stringify tests created in airflow-core/tests/unit/serialization/test_stringify.py
  5. Docs

    • Updated serializer documentation to reference airflow.sdk.serialization.serializers
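
One common way to implement a deprecation path like item 3 is a module-level __getattr__ (PEP 562) in the old module that forwards attribute lookups to the new location and emits a DeprecationWarning. The sketch below illustrates that technique only; it is not necessarily the mechanism this PR uses, and make_redirecting_getattr is a hypothetical name.

```python
import importlib
import warnings
from typing import Any, Callable


def make_redirecting_getattr(old_module: str, new_module: str) -> Callable[[str], Any]:
    """Build a PEP 562 module-level __getattr__ that forwards to new_module.

    Hypothetical helper: the returned function is meant to be assigned to
    __getattr__ inside the deprecated module.
    """

    def __getattr__(name: str) -> Any:
        # Do not redirect dunder lookups such as __path__ or __wrapped__.
        if name.startswith("__"):
            raise AttributeError(name)
        warnings.warn(
            f"{old_module}.{name} is deprecated; import it from {new_module} instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return getattr(importlib.import_module(new_module), name)

    return __getattr__


# Hypothetical wiring inside the old module (target path is an assumption):
# __getattr__ = make_redirecting_getattr(
#     "airflow.serialization.serde", "airflow.sdk.serialization.serde"
# )
```

With this in place, an old-style import keeps working and warns once per call site under the default warning filters.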

What does this mean for new serializers? (Debatable, check open questions below)

Important: New serializers should now be added to the SDK namespace, airflow.sdk.serialization.serializers (under task-sdk/src/airflow/sdk/serialization/serializers/).

The old airflow-core/src/airflow/serialization/serializers/ directory still exists for backward compatibility but is deprecated. New serializers should not be added there.

Open Questions

  1. When should we delete airflow-core/src/airflow/serialization/serializers/? It is currently kept for backward compatibility but should be removed in a future release. No code reads from it, so shall we just delete it? What about people who have added their own serializers there?

  2. Should XComEncoder and XComDecoder stay in airflow-core or move to the Task SDK?

Testing

  • All serialize/deserialize tests moved to SDK and passing
  • New stringify tests created to test all possible situations
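
To illustrate the kind of behaviour the stringify tests target, here is a self-contained sketch of turning a serialized value into a display string without importing any serializer. The envelope keys (__classname__, __version__, __data__) and the output format are assumptions made for this sketch; it does not claim to reproduce the exact output of stringify.py or of deserialize(full=False).

```python
from typing import Any

# Envelope keys assumed for this sketch.
CLASSNAME = "__classname__"
VERSION = "__version__"
DATA = "__data__"


def stringify(value: Any) -> Any:
    """Return a display-friendly form of a serialized value.

    Primitives pass through, containers are walked recursively, and
    serialized objects become a "classname@version=N(...)" string.
    No user class is ever imported or instantiated.
    """
    if value is None or isinstance(value, (str, int, float, bool)):
        return value
    if isinstance(value, (list, tuple)):
        return [stringify(v) for v in value]
    if isinstance(value, dict):
        if CLASSNAME in value and VERSION in value:
            return _format(value[CLASSNAME], value[VERSION], value.get(DATA))
        return {str(k): stringify(v) for k, v in value.items()}
    return str(value)


def _format(classname: str, version: int, data: Any) -> str:
    """Render one serialized object as a single human-readable string."""
    if isinstance(data, dict):
        body = ",".join(f"{k}={stringify(v)}" for k, v in data.items())
    elif isinstance(data, (list, tuple)):
        body = ",".join(str(stringify(v)) for v in data)
    else:
        body = "" if data is None else str(data)
    return f"{classname}@version={version}({body})"
```

Because the function only inspects dictionaries and builds strings, it needs nothing from the Task SDK, which is the property the API server relies on.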


@potiuk
Member

potiuk commented Dec 7, 2025

Shouldn't serialization be moved to shared? That sounds like the most reasonable place to have it, and then we would be able to use it anywhere. Is there a particular reason we want to do this weird dance of checking whether task-sdk is installed or not?

@amoghrajesh
Contributor Author

Shouldn't serialization be moved to shared? That sounds like the most reasonable place to have it, and then we would be able to use it anywhere. Is there a particular reason we want to do this weird dance of checking whether task-sdk is installed or not?

Yes. There is a reason to do it actually.

This is actually splitting serialization properly as it should rightfully be and it separates things rightly towards the future direction. So, serialization is in two parts (somewhat messy, but let me still explain):

  1. Serialization related to DAG: this should certainly be in airflow-core and it is going to continue to be there (everything but serde in here: https://github.com/apache/airflow/tree/main/airflow-core/src/airflow/serialization)
  2. Serde: this is a module that should ONLY belong at execution time, and I think moving serde out of airflow.serialization is on the right track in a reorganised way. The rest of the package (except serde) is to deal with dag serialisation, and should probably be renamed/moved to reflect that (later by me).

So the task-sdk is only loaded when we try to deserialise an object relevant to airflow.sdk, and such an object was pushed during task execution anyway, which means the Task SDK is installed.

@potiuk
Member

potiuk commented Dec 8, 2025

This is actually splitting serialization properly as it should rightfully be and it separates things rightly towards the future direction. So, serialization is in two parts (somewhat messy, but let me still explain):

Makes sense. I never liked the idea of having "one serialization to rule them all". Thanks for explanation.

@potiuk
Member

potiuk commented Dec 8, 2025

Makes sense. I never liked the idea of having "one serialization to rule them all". Thanks for explanation.

Also, it is far better from a security standpoint. It also opens the door to having serde implementations tied to the installed providers (and likely discovered by the Providers Manager).

@amoghrajesh
Contributor Author

This is actually splitting serialization properly as it should rightfully be and it separates things rightly towards the future direction. So, serialization is in two parts (somewhat messy, but let me still explain):

Makes sense. I never liked the idea of having "one serialization to rule them all". Thanks for explanation.

Absolutely me too!

Member

@pierrejeambrun pierrejeambrun left a comment

API side LGTM.

@amoghrajesh
Contributor Author

amoghrajesh commented Dec 9, 2025

Thanks, I will merge it once I am back from holidays (16 Dec), just because of the scope of this PR.

@amoghrajesh
Contributor Author

Alright, let's merge this one today

@amoghrajesh
Contributor Author

Broken tests are unrelated. Same failure on main too: https://github.com/apache/airflow/actions/runs/20298266352/job/58300350219

@amoghrajesh
Contributor Author

Merging this PR

@amoghrajesh amoghrajesh merged commit 3af4d28 into apache:main Dec 17, 2025
235 of 237 checks passed
@amoghrajesh amoghrajesh deleted the move-serde-to-task-sdk branch December 17, 2025 10:28
@github-actions

Backport failed to create: v3-1-test.

Status Branch Result
v3-1-test Commit Link

You can attempt to backport this manually by running:

cherry_picker 3af4d28 v3-1-test

This should apply the commit to the v3-1-test branch and leave the commit in conflict state marking
the files that need manual conflict resolution.

After you have resolved the conflicts, you can continue the backport process by running:

cherry_picker --continue

@amoghrajesh
Contributor Author

No backport needed, this is targeted for 3.2.0.

@potiuk
Member

potiuk commented Dec 17, 2025

#protm


Labels

area:API, area:dev-tools, area:serialization, area:task-sdk, backport-to-v3-1-test, kind:documentation

Development

Successfully merging this pull request may close these issues.

Isolate xcom API from using serde deserialize for UI display
Move over serde library to task sdk

6 participants