AIP-72: Simplify the XCOM interface between task sdk and execution API #46719
Yeah, I think I have to fix the execution API tests.
Oh, some task sdk tests will fail now, will take a look!
kaxil
left a comment
few comments -- won't be looking until I am back -- but reviewing it for some context :)
kaxil
left a comment
FYI: even after this PR eliminates one step, we would still have multiple serialization/deserialization steps that we should consolidate.
Storing Data (for GO-client example):
- Go serializes the object (json.Marshal) → Sends it as an HTTP request.
- FastAPI deserializes it (JsonValue) → Converts it into a Python dict.
- SQLAlchemy stores it (Column(JSON)) → Serializes it back into JSON in the database.
airflow/airflow/models/xcom.py, line 83 in 8f63b82:
value = Column(JSON().with_variant(postgresql.JSONB, "postgresql"))
Retrieving Data:
- SQLAlchemy loads it as a Python dict from the DB.
- FastAPI serializes it back into JSON.
- Go receives it and deserializes it (json.Unmarshal or a different library).
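To make the number of conversions concrete, the store and retrieve paths listed above can be mimicked with the standard library alone. This is only an illustrative sketch: every function name is hypothetical, Python's `json` module stands in for the Go client, FastAPI, and the SQLAlchemy JSON column, and no real HTTP or database calls are made.

```python
import json

# Hypothetical stand-ins for each hop; only the encode/decode step is modelled.

def client_send(obj):
    """Client: serialise the object into the HTTP request body."""
    return json.dumps(obj)

def api_receive(body):
    """API server: parse the request body into a Python object."""
    return json.loads(body)

def db_store(value):
    """JSON column: serialise the Python object again when writing the row."""
    return json.dumps(value)

def db_load(raw):
    """JSON column: parse the stored JSON back into a Python object."""
    return json.loads(raw)

def api_respond(value):
    """API server: serialise the object into the HTTP response body."""
    return json.dumps(value)

def client_receive(body):
    """Client: parse the response body."""
    return json.loads(body)

original = {"greeting": "Hello, XCom!", "count": 3}

# Storing: three conversions (dump, load, dump).
stored = db_store(api_receive(client_send(original)))

# Retrieving: three more conversions (load, dump, load).
result = client_receive(api_respond(db_load(stored)))

print(stored)
print(result == original)
```

The value survives the round trip unchanged, but it is encoded or decoded six times along the way, which is the overhead the comment suggests consolidating.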
Not always; for example:
airflow/providers/common/io/src/airflow/providers/common/io/xcom/backend.py, lines 116 to 118 in 8f63b82
airflow/providers/common/io/src/airflow/providers/common/io/xcom/backend.py, lines 126 to 133 in 8f63b82
ashb
left a comment
https://github.com/apache/airflow/pull/46719/files#r1956536132 needs doing, then LGTM
Thanks! So, yeah, that's true. We are still keeping the same behaviour even with this change: for smaller XCom values, we store the value in the DB in a serialised manner, and for larger data, we store the path in a serialised manner. So when get xcom is called for an XCom which went to the object store, the data would be returned as a JSON object, correct? And so would it for smaller XCom values too. Is that right, or have I misunderstood?
Hmmm, I didn't think it would make the tests fail like this:
closes: #46513
related: #45231
Current issues
And here:

looks like this: (first 2 are AF2 and last 2 are with task sdk)
Current flow for setting an xcom from task sdk:
'"Hello, XCom!"')
Current flow for getting an xcom:
Proposal to simplify
The new API response for get xcom now looks like:
contrary to:
Impact on custom XCOM backends.
Flow:
- On set, the custom XCom backend stores the data in the object store and stores the path in the database, after serialising it: https://github.com/apache/airflow/blob/main/airflow/models/xcom.py#L185-L192
- On get_value, the custom XCom backend does this: https://github.com/apache/airflow/blob/main/providers/common/io/src/airflow/providers/common/io/xcom/backend.py#L165 which means it returns raw data and not serialised data.
Testing:
DAG used:
This tests both push and pull behaviour.
task 1

task 2

task 3 (puller)

DB state (single serialisation only):
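For readers without the screenshot, the difference between a singly and a doubly serialised value can be reproduced with plain `json`. This is a sketch under the assumption that the step removed by this PR is a second JSON encoding of an already-encoded string; it does not touch Airflow or the database.

```python
import json

value = "Hello, XCom!"

# Serialised once: a JSON string literal wrapping the text.
single = json.dumps(value)

# Serialised twice: the JSON text is itself encoded as a JSON string,
# so the inner quotes end up escaped.
double = json.dumps(single)

print(single)   # "Hello, XCom!"
print(double)   # "\"Hello, XCom!\""

# One decode recovers the value from the singly serialised form...
print(json.loads(single) == value)
# ...but the doubly serialised form needs two decodes.
print(json.loads(double) == value)
print(json.loads(json.loads(double)) == value)
```

With single serialisation, a reader of the stored value needs exactly one decode to get the original object back.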
