Add support for custom Trigger Rule logic #17010
Problem: what to do about someone who needs all_done_and_X_succeed, where X could be 2 and not one? It looks like Airflow is missing an expressive pattern that lets the user implement any custom TriggerRule logic. Maybe there could be an interface:

class TriggerRule(ABC):
    @abstractmethod
    def check_condition(self) -> bool:
        pass

that users could extend, and the current https://github.com/apache/airflow/blob/main/airflow/utils/trigger_rule.py would also implement that interface. |
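For illustration, a minimal sketch of how such an interface might be subclassed to cover the "all_done_and_X_succeed" case above. The class names and the upstream-states argument are hypothetical additions for this sketch (the original proposal shows a no-argument check_condition); none of this is an existing Airflow API.

from abc import ABC, abstractmethod

class BaseTriggerRule(ABC):
    # Hypothetical interface along the lines proposed above; not an existing Airflow class.

    @abstractmethod
    def check_condition(self, upstream_states: list[str]) -> bool:
        # Return True when the task should be allowed to run, given its upstream task states.
        ...

class AllDoneAndMinSucceeded(BaseTriggerRule):
    # "all_done_and_X_succeed": every upstream task finished and at least `min_success` succeeded.

    def __init__(self, min_success: int = 2):
        self.min_success = min_success

    def check_condition(self, upstream_states: list[str]) -> bool:
        finished = {"success", "failed", "skipped", "upstream_failed"}
        all_done = all(state in finished for state in upstream_states)
        succeeded = sum(state == "success" for state in upstream_states)
        return all_done and succeeded >= self.min_success

# Three upstream tasks finished, two of them succeeded -> the rule fires.
print(AllDoneAndMinSucceeded(min_success=2).check_condition(["success", "success", "failed"]))  # True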
Having the ability to define a custom trigger rule would be nice. |
You can use the none_failed_or_skipped trigger rule:

task_at_least_one_previous_succeed = SomeOperator(
    trigger_rule="none_failed_or_skipped",
    ...
)
[task_a, task_b, task_c] >> task_at_least_one_previous_succeed |
The specific request of @SangwanP can be handled by an existing trigger rule. |
It'd be nice to be able to have custom trigger rules, but it seems this was already investigated and declined in AIRFLOW-389. I too need an 'all done and none upstream_failed' rule after an 'optional' task. If the 'optional' task fails by itself then downstream tasks should proceed, but if it has upstream_failed then downstream tasks should have upstream_failed status too. It's possible to get the desired behavior by adding 2 dummy tasks: one downstream from the 'optional' task with the 'all_done' rule, and another one afterwards with the 'all_success' rule and the 'optional' task's parents included too, but that makes an already bloated DAG even bigger. |
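For clarity, a sketch of the two-dummy-task workaround described above, using assumed task names and a single parent task; DummyOperator was later renamed EmptyOperator, so adjust the import for your Airflow version.

from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator
from airflow.operators.dummy import DummyOperator  # EmptyOperator in newer releases

with DAG(dag_id="optional_task_workaround", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
    parent = BashOperator(task_id="parent", bash_command="exit 0")
    optional = BashOperator(task_id="optional", bash_command="exit 0")
    # Runs once 'optional' finishes in any state, so a failure of 'optional' alone does not block.
    join_all_done = DummyOperator(task_id="join_all_done", trigger_rule="all_done")
    # Re-checks the original parents: if they failed, this gets upstream_failed and so does everything after it.
    join_all_success = DummyOperator(task_id="join_all_success", trigger_rule="all_success")
    downstream = BashOperator(task_id="downstream", bash_command="echo run")

    parent >> optional >> join_all_done
    [join_all_done, parent] >> join_all_success >> downstream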
I think custom trigger rules are possible, but we need to make sure they are well implemented from the isolation/security point of view (especially since defining them in DAGs should not be allowed). The Airflow scheduler makes scheduling decisions based on them, and the rule is that only Workers and File Processors (which run in isolated, sandboxed Python processes) can execute user-DAG-provided code. If we want to implement custom trigger rule logic, it would have to be installable via providers/plugins only, similar to the new scheduling Timetables in Airflow 2.2. Which means they have to be "installed" when Airflow is installed rather than added with DAGs. |
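For comparison, this is roughly how custom Timetables are registered through a plugin in Airflow 2.2+; a plugin-installed trigger rule would presumably need an analogous hook. The my_company imports are assumptions, and the trigger_rules attribute is purely hypothetical (no such plugin field exists).

from airflow.plugins_manager import AirflowPlugin
from my_company.timetables import TradingHoursTimetable  # assumption: a timetable in your own installable package

class MyCompanyPlugin(AirflowPlugin):
    name = "my_company_plugin"
    # Real extension point since Airflow 2.2: custom timetables ship with the plugin/provider.
    timetables = [TradingHoursTimetable]
    # Hypothetical extension point for this discussion only -- "trigger_rules" does NOT exist on AirflowPlugin.
    # trigger_rules = [AllDoneAndMinSucceeded]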
Agree with @potiuk. There is also the question of whether it is worth the effort. |
Agree that given the security constraints, it's not worth it and a few more triggers would cover most needs. Are you willing to submit a PR? |
Following the ideas in #19361, it seems these ideas can combine, e.g.:

trigger_rule=TriggerRule.WAIT_ALL & TriggerRule.ONE_SUCCESS & TriggerRule.MONDAY

These are all static rules that could be combined with boolean logic. Like timetables (see AIP-39), there could be support for custom implementations, but it might be reasonable to start with a set of builtin classes which can then be combined as proposed above. In some scheduling systems there is support for custom calendars such as "bank holidays". This seems like a good case for a custom implementation since there is no clear single solution. There are Python libraries such as holidays that address official holidays, but an organization might need much more specific calendars. |
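A rough sketch of what the combinable-rule idea might look like, assuming each rule is an object that evaluates upstream task states and that & builds a composite; none of this exists in Airflow today, and the WAIT_ALL / ONE_SUCCESS names simply mirror the example above.

class Rule:
    # Hypothetical combinable rule: wraps a predicate over the upstream task states.

    def __init__(self, predicate):
        self.predicate = predicate

    def __and__(self, other):
        # "&" builds a composite rule that requires both operands to hold.
        return Rule(lambda states: self.predicate(states) and other.predicate(states))

    def evaluate(self, states):
        return self.predicate(states)

# Static building blocks, analogous to TriggerRule.WAIT_ALL / TriggerRule.ONE_SUCCESS above.
WAIT_ALL = Rule(lambda states: all(s in {"success", "failed", "skipped"} for s in states))
ONE_SUCCESS = Rule(lambda states: any(s == "success" for s in states))

combined = WAIT_ALL & ONE_SUCCESS
print(combined.evaluate(["success", "failed"]))   # True: everything finished and one succeeded
print(combined.evaluate(["failed", "running"]))   # False: not everything finished yet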
I like the boolean combination idea. This will likely be a big undertaking though, with a lot of tests needed, since there is much existing code that uses trigger_rule. |
To be honest it does not seem as bad when it comes to the "size" of the change. There are not that many places where trigger_rule is used (< 80 cases from a quick look), so changing the rules would be possible. However, I am not sure it is needed. Maybe just the name of the rule should be changed (NONE_FAILED_OR_SKIPPED -> WAIT_FOR_ALL_AND_ONE_SUCCESS); it could even be an alias. I think not everything that can be done should be done. If we allow combining several triggers, there will be nonsense combinations (ALL_SUCCESS & ONE_FAILED for example) and we would have to detect all such combinations and mark them as invalid. That sounds really dangerous IMHO and opens a new class of potential errors when defining DAGs. I really like that people have a predefined set of triggers they can use rather than combining them freely. I think this is a bit too much freedom, with very little benefit and a lot of potential problems. Very similarly, we could add a MultiExecutor (I actually had a working POC) to be able to combine any of the executors we have, but eventually we only have CeleryKubernetesExecutor as "another executor", because that is the combination of executors that covers 90% of cases and saves us from solving a lot of problems for people who would like to combine, say, the sequential, local and K8S executors. Additionally it opens up new cases which can add even more confusion and problems. I think throwing a calendar into the mix as a trigger is a really, really bad idea. Previously you only looked at the state of upstream tasks to determine whether the trigger fired. If we add a logically connected calendar, that really changes the whole "logic" of triggers, and it is far too much IMHO. A "day"-based trigger is not a "structural" decision, it is a "time" decision, and it should not be mixed in here. |
@potiuk I really like the alias idea. Currently, the doc says, in regards to none_failed_or_skipped:
After this discussion, that doesn't seem accurate to me. "WAIT_FOR_ALL_AND_ONE_SUCCESS" describes the real behavior of the TriggerRule. |
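A minimal sketch of the alias idea, assuming TriggerRule is (or becomes) a str-backed Enum; the actual class in airflow/utils/trigger_rule.py may be structured differently.

from enum import Enum

class TriggerRule(str, Enum):
    NONE_FAILED_OR_SKIPPED = "none_failed_or_skipped"
    # Same value as the member above, so Enum treats this name as a pure alias.
    WAIT_FOR_ALL_AND_ONE_SUCCESS = "none_failed_or_skipped"

# Both names resolve to the same member, so existing DAGs keep working unchanged.
assert TriggerRule.WAIT_FOR_ALL_AND_ONE_SUCCESS is TriggerRule.NONE_FAILED_OR_SKIPPED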
I changed the name of the trigger_rule; check the latest version of the docs (not 2.1.2). |
Let me summarize this issue: Quoting my reply some months ago:
Given the time passed, and since no further requests were made in this area (also considering what we saw in the past), I think we should close this feature request as won't fix. It seems that we don't really need to provide custom trigger rule logic, but just add specific trigger rules to Airflow core. What do others think? |
Agree with @eladkal |
I think the boolean combination idea is a good one, and probably @potiuk is right that throwing time into the mix is a bad one. That said, if it ain't broke, don't fix it: while boolean combinations are arguably more elegant, if there's already a system in place that works, then I suppose that is good enough. |
So I'm closing this issue for now. |
Is there an appropriate place to request specific additional rules? The one which I would really like is |
There is 'none_failed_min_one_success' |
There isn't an option like this in Airflow v2! You can select one of these rules:
|
Yes. You can actually make a PR adding it. Airflow is created by more than 2000 contributors like you. If you are convinced that a new rule is needed, there is nothing preventing you from opening a PR proposing it. @repl-chris You might not understand that this is free software, developed mostly by volunteers, so there is no "request" place. The easiest (and fastest) way to add a feature is to do it yourself as a PR. The second best is to convince someone that your case is needed: ideally you engage with the community, show the need, prove it, and then ask if someone would love to implement it. It works best if there are people who agree with you. If you approach it the "request" way (as if you actually paid for the software, which you did not), you will not get far. You can also pay (or your company can pay) someone to contribute it for you if you do not feel competent enough. So you have plenty of options here. Most of them require more than "raising a request" and expecting it to happen. Most of them require investing your time or money if you want something delivered by volunteers. I just wanted to set expectations here, since you were, probably without knowing how things work here, downvoting good answers. You will probably downvote this answer too, but if you do, it means that you completely misunderstood how free software works. |
@potiuk My apologies for using the "request" terminology. I intended to ask whether there was an active issue/discussion somewhere where the specific "2-3 missing trigger rules" @eladkal mentioned were being planned, which I could take part in. I did not mean to imply that I would not be willing to do the work myself... I realize my question was not worded appropriately and my intent was completely lost; I will be more careful in the future. I do understand this is free/open-source software maintained by volunteers (myself included, as recently as v2.3.1). |
Sorry for my "harsh" words too. It's just that too often people approach free software as something that is "magically" maintained and where they can "demand" and "expect" things to happen. Sometimes written communication makes it very difficult to convey your real intentions and attitude :) And in this case, a PR really is the best way to attempt a new rule. Having a PR opened would also give you (or whoever creates it) an opportunity to see how the rules are implemented and understand them better, and sometimes it might end up with... not creating the PR at all, after realising that either you can do it differently or that there are performance/complexity implications that are difficult to reason about without writing the code. It's just a little investment of time to make further discussions more productive. This is a great example where the code to implement it and the analysis of how it works is not a huge effort, and even if the resulting code were dumped eventually, it's good learning and the best use of time for everyone. |
Agree that it would be very helpful if one could define custom trigger_rules. |
In case anyone finds this thread in the future like myself, I'm using Airflow 2.4.3 and I tested |
Description
I would like to have a trigger rule that triggers a task if all the upstream tasks have executed (all_done) and there is at least one successful upstream task. This would be like the trigger rule one_success, except it'll wait for all upstream tasks to finish.
Use case / motivation
I have downstream tasks that should run after all the upstream tasks have finished execution. I can't trigger the downstream task if none of the upstream tasks pass, but I need to trigger it if even one of them passes.
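As a concrete illustration of this use case, a small DAG fragment with the behavior requested here. The task names are made up, and "all_done_min_one_success" is used as a placeholder for whatever the final rule would be called; it may not exist in your Airflow version, so check the TriggerRule values available to you.

from datetime import datetime
from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(dag_id="min_one_success_example", start_date=datetime(2021, 1, 1), schedule_interval=None) as dag:
    upstream = [BashOperator(task_id=f"extract_{i}", bash_command="exit 0") for i in range(3)]
    # Desired behavior: wait for every upstream task to finish, then run if at least one succeeded.
    report = BashOperator(
        task_id="report",
        bash_command="echo done",
        trigger_rule="all_done_min_one_success",  # placeholder rule name for the behavior described above
    )
    upstream >> report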
Are you willing to submit a PR?
I can work on a PR, if the feature is approved.