-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update critical_service_loop
to throw a runtime error on failure
#9267
Conversation
Fixes bug where the agent hangs when only one of its critical service loops fails
✅ Deploy Preview for prefect-docs-preview ready!
To edit notification comments on pull requests, go to your Netlify site settings. |
except KeyboardInterrupt: | ||
return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should never swallow keyboard interrupts
This should also resolve #9052 |
@desertaxle thank you that's the one I was looking for! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
Fixes bug where the agent hangs when only one of its critical service loops fails.
We've been observing this for a while, but it was not clear what the cause was. It turns out we just returned from this function on error because we assumed it was being used in a context where that would result in an exit. Since the agent now checks for cancelled runs in addition to scheduled runs in separate loops, if one loop failed the agent could continue running without reporting the failure. This almost definitely patches the same bug for workers.
Loosely related to some behavior in #7442 e.g. #7442 (comment)
cc @jawnsy who first reported this behavior
Closes #9052