Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: never_fail in sensor #40915

Merged
merged 4 commits into from
Jul 21, 2024
Merged

Conversation

raphaelauv
Copy link
Contributor

@raphaelauv raphaelauv commented Jul 21, 2024

fix: #40787

introduce the settings never_fail so soft_fail no more ignore technical errors

before a probably full refacto of how sensors skip themself

@boring-cyborg boring-cyborg bot added the area:core-operators Operators, Sensors and hooks within Core Airflow label Jul 21, 2024
@shahar1 shahar1 added the type:bug-fix Changelog: Bug Fixes label Jul 21, 2024
@potiuk potiuk merged commit dc2f2dd into apache:main Jul 21, 2024
47 of 48 checks passed
@raphaelauv raphaelauv deleted the fix_precise_soft_fail branch July 21, 2024 21:32
@Lee-W
Copy link
Member

Lee-W commented Jul 22, 2024

I feel I'm still not sure what the main difference is between soft_fail and never_fail. At least based on the current description. I'm not sure whether it's because I'm not a native speaker or maybe we could improve the description.

@eladkal
Copy link
Contributor

eladkal commented Jul 22, 2024

I agree

I feel I'm still not sure what the main difference is between soft_fail and never_fail. At least based on the current description. I'm not sure whether it's because I'm not a native speaker or maybe we could improve the description.

I agree. This is very confusing.

@eladkal
Copy link
Contributor

eladkal commented Jul 22, 2024

To my understanding soft_fail and never_fail can't co-exist. Wouldn't it be easier to have 1 parameter with modes (enum)?

@raphaelauv
Copy link
Contributor Author

yes but this will need all providers to required airflow 2.10.0 since it would be a big breaking change, that's why I wrote before a probably full refacto of how sensors skip themself

@shahar1
Copy link
Contributor

shahar1 commented Jul 22, 2024

To my understanding soft_fail and never_fail can't co-exist. Wouldn't it be easier to have 1 parameter with modes (enum)?

I thought about it when reviewing it, but I came to the same conclusion as Raphael regarding being a breaking change.
Maybe we could have a better name for the flag, though (skip_on_failure maybe?)

@raphaelauv
Copy link
Contributor Author

yes never_fail do not explicit the fact that task will be skipped

what about skip_on_any_failure ?

@eladkal
Copy link
Contributor

eladkal commented Jul 22, 2024

yes but this will need all providers to required airflow 2.10.0 since it would be a big breaking change, that's why I wrote before a probably full refacto of how sensors skip themself

I don't follow why? We can have backward compatiblity for the current functionality and place the new functionality under version flag.

@Lee-W
Copy link
Member

Lee-W commented Jul 22, 2024

Sorry, I guess it was me that did not express myself well enough here. What I meant is introducing a new parameter for specific sensor instead of adding it to BaseSensor. soft_fail should be kept as it is unless we all agree to do a breaking change in airflow 3.0.

@raphaelauv
Copy link
Contributor Author

yes I can introduce a skip_policy parameters and keep the other settings.

what about

FAIL_ON_ERROR ( default )
SKIP_ON_ANY_ERROR ( never_fail )
SKIP_ONLY_SOFT_ERROR ( soft_fail )
IGNORE_ERRORS ( silent_fail )

?

@raphaelauv
Copy link
Contributor Author

the current soft_fail descript Set to true to mark the task as SKIPPED on failure.

Yes I updated the doc in the second PR #40923

@Lee-W
Copy link
Member

Lee-W commented Jul 22, 2024

yes I can introduce a skip_policy parameters and keep the other settings.

Yes, this is something we could discuss whether we want to make it a new feature and I'm ok with it.

Yes I updated the doc in the second PR #40923

It looks like a breaking change to me and not something we can fix just by updating the description. It's possible that soft_fail was not named well enough when it was introduced, but the description states how it used to work. Set to true to mark the task as SKIPPED on failure. Sounds like any kind of failure to me. Unless I misunderstood this description, we should even revert this PR.

@shahar1 's suggestion on changing it to skip_on_failure is great, and skip_on_any_failure is ok for me as well. We can choose to deprecate soft_fail and instead use skip_on_failure/skip_on_any_failure in the future, or even change the behavior of soft_fail but I think that should only happen in the next major release.

@raphaelauv
Copy link
Contributor Author

before 2.7.1 soft_fail was not related to every exception. So it's a quit recent and breaking change that my PR isolated in never_fail parameter

@Lee-W
Copy link
Member

Lee-W commented Jul 22, 2024

The reason why we made those changes was that the sensor did not do soft_fail states. I would also like to know what others think 🙂 Maybe we just misunderstood how it should work.

@ephraimbuddy ephraimbuddy added this to the Airflow 2.10.0 milestone Jul 23, 2024
romsharon98 pushed a commit to romsharon98/airflow that referenced this pull request Jul 26, 2024
* fix: never_fail in sensor

* fix: tests

* fix: tests 2

* review 1 - doc

---------

Co-authored-by: raphaelauv <raphaelauv@users.noreply.github.com>
@jedcunningham
Copy link
Member

Coming in a bit late, but I took it the same way and have the same concern as @Lee-W.

@raphaelauv
Copy link
Contributor Author

raphaelauv commented Aug 9, 2024

soft_fail have always been about skip the sensor ( and following task) if it raise AirflowSensorTimeout or AirflowTaskTimeout or AirflowFailException and nothing else.

to ignore other errors there is the silent_fail.

So soft_fail was always use as a way to skip following tasks if sensor timeout (without failing on a technical error)

@jaklan
Copy link

jaklan commented Aug 31, 2024

Hi, just to put my 2 cents - I was just reading the docs for both modes:

soft_fail (bool) – Set to true to mark the task as SKIPPED on failure. Mutually exclusive with never_fail.

never_fail (bool) – If true, and poke method raises an exception, sensor will be skipped. Mutually exclusive with soft_fail.

and tbh - I still understand nothing. It seems like soft_fail is related to both poke and reschedule methods, and never_fail only to the poke one, but except of that - it's extremely confusing what's the actual difference between failure (failure of sensor? failure of monitored task?) and exception in that context.

Astro docs make it even more confusing as they refer to exception in both descriptions:

soft_fail=True: If an exception is raised within the task, it is marked as skipped, affecting downstream tasks according to their defined trigger rules.

never_fail=True: (Airflow 2.10+) If the poke method raises any exception, the sensor task is skipped. This parameter is mutually exclusive with soft_fail.

@jaklan
Copy link

jaklan commented Aug 31, 2024

@raphaelauv what you mean is:

  • soft_fail -> skip on exceptions: AirflowSensorTimeout, AirflowTaskTimeout, AirflowSkipException and AirflowFailException
  • silent_fail -> skip on any exception other than the above ones (so... what exactly are these other exceptions?)
  • never_fail -> skip on all possible exceptions

Is that correct? But I still don't get it why silent_fail and never_fail only refer to the poke method - are these mysterious "other exceptions" never related to the reschedule mode, so then soft_fail covers all the possible options anyway?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:core-operators Operators, Sensors and hooks within Core Airflow type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

soft_fail | operator is skipped in all cases and not only "data" related fail
8 participants