Fail connection manager workflow on non-deterministic exception #14758
I am wondering, how complex would it be to write a test for this?
I'm not exactly sure what a test for this would look like, since it would require changing the order or names of activities for a running workflow, which I've only done by changing code. It might be possible, though, and is probably worth having someone spend some time looking into.
@benmoriceau I added a note to the description of this ticket. It's up to you all, but we may want to wait to merge this until the issue to find and restart failed workflows has been completed (#14043). If we merge this first, then non-deterministic exceptions will cause workflows to be set to Failed without anything automatically fixing them. That said, I'm not sure whether this is actually worse or better than the current behavior (Temporal just retrying the workflows indefinitely, running into the non-deterministic exception over and over). The current behavior is probably preferable: fixing it just requires rolling back to the previous deployment, which should result in workflows automatically recovering. If we instead mark them as failed without the automatic recovery process, we will have to manually restart a bunch of workflows even if we roll back. I'll leave that decision up to the team!
#14043 has been merged. @jdpgrailsdev / @benmoriceau is it time to merge this?
@evantahler if we merge this before we have the process in place that automatically restarts failed workflows (I think this is ticket #15218?), then non-deterministic exceptions will result in workflows being set to Failed. See my earlier comment: I think it may be slightly preferable to keep the current behavior until we have that, but it's up to you all.
This is ready to be merged. I will do it once CI is green.
…ytehq#14758)
* fail connection manager workflow on non-deterministic exception
* Update where the config is added
Co-authored-by: Benoit Moriceau <benoit@airbyte.io>
What
Resolves #13973
How
Uses the Temporal API so that a non-deterministic exception causes the workflow to be marked as failed, rather than being retried indefinitely.
Note: This should probably only be merged once this other issue has been completed, so that the failed workflows are automatically restarted when this occurs: #14043
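To make the behavior change concrete, here is a minimal, dependency-free sketch of the decision described above: a list of exception types that should fail the workflow instead of triggering another retry. It is only a model, not Airbyte's or Temporal's code. `NonDeterministicReplayException` and `WorkflowTaskFailurePolicy` are hypothetical names invented for illustration. In the Temporal Java SDK, a setting of this kind is exposed through `WorkflowImplementationOptions.Builder#setFailWorkflowExceptionTypes`, supplied when the workflow implementation is registered on a worker; whether this PR uses exactly that call is an assumption, since the diff is not shown here.

```java
import java.util.List;

// Hypothetical stand-in for a non-deterministic replay error; NOT the Temporal SDK class.
class NonDeterministicReplayException extends RuntimeException {
    NonDeterministicReplayException(String message) {
        super(message);
    }
}

// Hypothetical model of "which errors fail the workflow instead of being retried".
final class WorkflowTaskFailurePolicy {
    enum Decision { RETRY_WORKFLOW_TASK, FAIL_WORKFLOW }

    private final List<Class<? extends Throwable>> failWorkflowTypes;

    WorkflowTaskFailurePolicy(List<Class<? extends Throwable>> failWorkflowTypes) {
        this.failWorkflowTypes = List.copyOf(failWorkflowTypes);
    }

    // An error whose type is listed fails the workflow; anything else is retried.
    Decision decide(Throwable error) {
        for (Class<? extends Throwable> type : failWorkflowTypes) {
            if (type.isInstance(error)) {
                return Decision.FAIL_WORKFLOW;
            }
        }
        return Decision.RETRY_WORKFLOW_TASK;
    }

    public static void main(String[] args) {
        Throwable error = new NonDeterministicReplayException("history does not match workflow code");

        // Before this change: no types listed, so the error is retried over and over.
        WorkflowTaskFailurePolicy before = new WorkflowTaskFailurePolicy(List.of());
        // After this change: the non-deterministic error type is listed, so the workflow fails.
        WorkflowTaskFailurePolicy after =
            new WorkflowTaskFailurePolicy(List.of(NonDeterministicReplayException.class));

        System.out.println("before: " + before.decide(error));
        System.out.println("after:  " + after.decide(error));
    }
}
```

The trade-off raised in the conversation shows up directly in this model: with an empty list, a rollback lets the retried workflow recover on its own, whereas once the type is listed the workflow ends in a failed state and needs something else to restart it.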