[AIP-34] Rewrite SubDagOperator #9243
airflow/models/task_group.py
Outdated
With the serialized DAG already storing everything, I'm not sure whether this needs its own model.
Hi Ash,
I am not familiar with the serialized DAG code. Can you point me to the code you are referring to? I will take a look at it.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Hi @xinbinhuang, can you rebase the PR? Based on our last discussion in Slack (last month, or the month before), I think it needs an update.
Add subdag tasks to root dag
- This was originally proposed to allow running SubDag tasks as part of the parent DAG while still keeping the visual grouping effect in the Graph/Tree view; see issue 8078.
- This can be further extended to allow arbitrary grouping of tasks, which makes other metadata operations possible.
tests are failing @xinbinhuang
In favor of #10153
#8078 cc: @dimberman @kaxil
Hi guys,
Sorry for the long wait. It took me a while to figure out where to properly put the changes, and I have also been occupied by other things. Anyway, here is a draft PR for the proposed rewrite of the SubDagOperator. I will write out more details on the AIP tonight, but here is the gist.
There are 3 main changes:
- `DagBag.bag_dag` logic to attach tasks of the subdags to the root dag during dag file parsing.
- `SubDagOperator`
- `current_group`/`current_dag` and `parent_group`/`parent_dag`, used for grouping related tasks. This replaces the concept of SubDag for grouping tasks together and rendering the visual grouping effect in the UI.

Here is an example subdag DAG I used to test out the new SubDagOperator. It can successfully unpack the subdags.
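For readers who want a feel for the idea, here is a rough, self-contained sketch of attaching subdag tasks to the root DAG while keeping grouping metadata. It is not the code in this PR and does not use Airflow: `Task`, `Dag`, and `flatten_into_root` are hypothetical stand-ins, and the dotted task-id prefix is an assumption. Only the attribute names `current_group` and `parent_group` come from the description above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Task:
    """Toy task; NOT an Airflow operator."""
    task_id: str
    # Grouping attributes named after the PR description; None means top level.
    current_group: Optional[str] = None
    parent_group: Optional[str] = None
    # If set, this task stands in for a nested DAG (a toy SubDagOperator).
    subdag: Optional["Dag"] = None


@dataclass
class Dag:
    """Toy DAG holding an ordered list of tasks."""
    dag_id: str
    tasks: List[Task] = field(default_factory=list)


def flatten_into_root(
    dag: Dag, group: Optional[str] = None, parent: Optional[str] = None
) -> List[Task]:
    """Return every task, including subdag tasks, as one flat list.

    Tasks that came from a subdag get a prefixed task_id (assumed scheme:
    "<group>.<task_id>") and record which group they belong to.
    """
    flat: List[Task] = []
    for task in dag.tasks:
        task_id = f"{group}.{task.task_id}" if group else task.task_id
        flat.append(Task(task_id=task_id, current_group=group, parent_group=parent))
        if task.subdag is not None:
            # The subdag's tasks form a new group named after the operator task.
            flat.extend(flatten_into_root(task.subdag, group=task_id, parent=group))
    return flat


# Example: a root DAG with one subdag that itself contains a nested subdag.
inner = Dag("inner", [Task("x")])
section = Dag("section_1", [Task("a"), Task("inner", subdag=inner)])
root = Dag("root", [Task("start"), Task("section_1", subdag=section), Task("end")])

flat = flatten_into_root(root)
for t in flat:
    print(t.task_id, t.current_group, t.parent_group)
```

With a flat list like this, a UI could rebuild the visual grouping from `current_group` and `parent_group` alone, without a separate DAG per group.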
This is the comparison on the generated graph:
Old:

New:

Also, there seems to be a lot of SubDag-related handling logic in the codebase, which means that if we decide to go forward with the rewrite, we will probably need to clean up a lot of code. I believe that's a good thing?
`grep -r subdag airflow | wc -l` prints `162`.

Here are the things that I think need to be done:
- current_group=subdag.dag_id=subdag_operator.task_id, to the task_ids.

I have not created any tests for it yet, as I want to have some discussion first. Let me know your thoughts :)
Cheers,
Bin
Make sure to mark the boxes below before creating PR: [x]
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.