-
Notifications
You must be signed in to change notification settings - Fork 16.3k
updated comms to support subclasses of generic types #53939
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
@ashb let me know if this is what you had in mind |
|
I wonder if the user should be responsible for casting to a supported built-in. In the case of the issue (i.e. using np.float64), what if they expect and want to use float64 downstream when accessing the extra? I guess that's not a big deal since If we proceed with this, I think we need to update docs to highlight that types that can be forced into a builtin will be forced into a builtin |
|
It actually seems that the asset metadata never reaches |
|
@RNHTTR @amoghrajesh You're both totally right. I think we should ensure that the serializer is called and ensure that on the other side of things, the deserialize method is called correctly too. Will update and test today |
|
I've tried updating the _serialize_outlet_events method to use the serialize function defined in serde.py, but I'm still getting the same error. def _serialize_outlet_events(events: OutletEventAccessorsProtocol) -> Iterator[dict[str, Any]]:
from airflow.serialization.serde import serialize
if TYPE_CHECKING:
assert isinstance(events, OutletEventAccessors)
# We just collect everything the user recorded in the accessors.
# Further filtering will be done in the API server.
for key, accessor in events._dict.items():
if isinstance(key, AssetUniqueKey):
yield {"dest_asset_key": serialize(key), "extra": serialize(accessor.extra)}
for alias_event in accessor.asset_alias_events:
yield attrs.asdict(alias_event) |
|
You have found a good bug. It never reaches the np serializer. Feel free to create an issue for that. |
|
Issue created and fixed here. Once merged, will update the outlet events with serialization & deserialization. Food for thought, do you think that asset metadata should respect the configured Xcom backend? Or is it fine that this will all be persisted in the metadata db? |
| ToSupervisor, | ||
| ) | ||
| from airflow.sdk.execution_time.task_runner import SUPERVISOR_COMMS | ||
| from airflow.serialization.serde import deserialize |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are not allowed to use non-sdk modules in sdk code. Why does this need to use deserialize (and serialize below)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes you're right. This PR is not ready yet. I was trying to test if we could pass the Asset metadata through the serialization layer (like XComs) per @amoghrajesh's recommendation and this implementation was just meant to be a quick check if it would work. What do you think the best way to reuse existing serializers would be?
aa355a3 to
9fea8e0
Compare
|
There was a decision some time ago to move serde to SDK so it should be fine to use it in execution_time now. cc @amoghrajesh to confirm. |
|
@astro-anand serde has now been moved into task sdk: https://github.com/apache/airflow/tree/main/task-sdk/src/airflow/sdk/serde. You can rebase and import from there. |
|
This pull request has been closed because the author has not responded to a request for more information. |
Updated _msgpack_enc_hook to support sending subclasses of builtin types
closes: #53474