Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement XComArg concat() #40172

Merged
merged 6 commits into from
Jun 18, 2024
Merged

Implement XComArg concat() #40172

merged 6 commits into from
Jun 18, 2024

Conversation

uranusjr
Copy link
Member

This is useful when you want to do a thing against more than one list of things. You can sort of already do this with an extra task, but this is lazier and saves some resources both in the XCom storage and memory needed to run tasks.

Copy link
Member

@hussein-awala hussein-awala left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be honest, I'm not completely convinced of the importance of this feature, but it might be useful in some case.

Could you update the documentation by describing the new XCom operation chain?

Copy link
Contributor

@josh-fell josh-fell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial reaction was the semantics could be confused with a TaskFlow-specific implementation of airflow.models.baseoperator.chain but it doesn't function the same way. This seems to mirror itertools.chain(), but because there is already a chain() method at the operator level we might need a different semantic? Maybe .combine() is another option?

Curious what others think too.

This is cool, and I would definitely use this personally.

@josh-fell
Copy link
Contributor

Oddly enough, this seems useful for the discussion in #40124?

@uranusjr
Copy link
Member Author

I briefly searched how other ecosystems call this and concat seems popular (behind chain). combine is more often like join but I think it still is an option in this case since the joining semantic does not make sense.

@potiuk
Copy link
Member

potiuk commented Jun 13, 2024

I briefly searched how other ecosystems call this and concat seems popular (behind chain). combine is more often like join but I think it still is an option in this case since the joining semantic does not make sense.

Maybe append is a better name ? I agree with @hussein-awala -> we need some docs to make it discoverable, it's a rather useful feature, but having a description and example of use is pretty much a "MUST HAVE" here.

@uranusjr
Copy link
Member Author

I don’t like append personally since in Python it’s a method that modifies the object in-place, not creating a new object. (Same for extend although nobody mentions it yet.) I plan to add docs after we agree on the interface.

@potiuk
Copy link
Member

potiuk commented Jun 13, 2024

Maybe just xcom_chain() or chain_xcom() ? That would remove confusion and keep itertools relation - at the expense of somewhat redundancy.

@uranusjr
Copy link
Member Author

I think I’m going to go with concat. This is used in JavaScript. Java has Stream.concat (which is more like itertools.chain, close enough). Various languages also have concat for strings. Probably among the most solid choices beside chain.

@potiuk
Copy link
Member

potiuk commented Jun 17, 2024

SGTM

@uranusjr uranusjr requested a review from potiuk as a code owner June 17, 2024 10:09
@uranusjr uranusjr changed the title Implement ChainXComArg Implement XComArg concat() Jun 17, 2024
@uranusjr
Copy link
Member Author

Alright, I’ve made the name change and added docs.

@eladkal
Copy link
Contributor

eladkal commented Jun 17, 2024

This is useful when you want to do a thing against more than one list of things.

Very nice! BTW I think it's also interesting to be able to do thing against specific items in the XCOM rather than all of it.

Copy link
Contributor

@josh-fell josh-fell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

Just curious if DAG authors would also be able to concat XComArgs from two sets of mapped tasks? If yes, then maybe worth adding to the docs too?

airflow/models/xcom_arg.py Outdated Show resolved Hide resolved
@uranusjr
Copy link
Member Author

concat XComArgs from two sets of mapped tasks

Not sure I follow this

@josh-fell
Copy link
Contributor

concat XComArgs from two sets of mapped tasks

Not sure I follow this

Related to a question I could think of DAG authors asking: "Does this work with mapped-task inputs?".

  graph LR;
      mapped_task_1-->aggregate_mapped_tasks;
      mapped_task_2-->aggregate_mapped_tasks;
Loading

meaning

aggregate_mapped_tasks(input=mapped_task_1.concat(mapped_task_2))

Looking again the "... concat function takes arbitrary positional arguments ..." statement covers this so ignore. Still looks great.

@uranusjr
Copy link
Member Author

I added some text to call out chaining calls.

@uranusjr uranusjr merged commit 5a3823d into apache:main Jun 18, 2024
52 checks passed
@uranusjr uranusjr deleted the mapped-chain branch June 18, 2024 03:36
@utkarsharma2 utkarsharma2 added the type:improvement Changelog: Improvements label Jul 1, 2024
@utkarsharma2 utkarsharma2 added this to the Airflow 2.10.0 milestone Jul 1, 2024
romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request Jul 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:improvement Changelog: Improvements
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants