Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Force exit of process if a timeout has occurred #1354

Merged
merged 1 commit into from
Nov 10, 2022

Conversation

edoardopirovano
Copy link
Contributor

Our current withTimeout function lets us proceed with the rest of our code, but Node still won't terminate once we reach the end of execution because there'll still be unresolved promises. Sadly, it doesn't sound like actions/cache#958 will be prioritised soon to give us a proper timeout, so this workaround will remain in place for a while. Hence, I think we need to make sure we actually terminate properly after a timeout occurs to avoid alarming our users if there's ever an outage in the Actions cache.

We talked a while back about two possible solutions: force exiting the Node process at the end of execution if we had a timeout or spawning a new process just for the TRAP cache upload and then killing that. This PR implements the former, since I think it is the more lightweight (if somewhat ugly) solution.

Merge / deployment checklist

  • Confirm this change is backwards compatible with existing workflows.
  • Confirm the readme has been updated if necessary.
  • Confirm the changelog has been updated if necessary.

@edoardopirovano edoardopirovano requested a review from a team as a code owner November 9, 2022 17:33
@@ -884,14 +893,37 @@ export async function withTimeout<T>(
};
const timeout: Promise<undefined> = new Promise((resolve) => {
setTimeout(() => {
if (!finished) onTimeout();
if (!finished) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a much simpler solution here? Maybe unref will be a better option. https://nodejs.org/dist/latest-v18.x/docs/api/timers.html#immediateunref

Calling unref on a timer prevents the timer from blocking the process from exiting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for sharing that, I wasn't aware of unref. Having said that, I've played around for a bit trying to get this to work and I'm not sure I follow how it can be applied to our use-case here. The problem is that we have an Immediate object corresponding to the timeout code we've created. But that's not the promise that's left unresolved, so calling unref on that doesn't achieve what we need. The unresolved promise is the one we get back from the Actions cache library that here is wrapped in mainTask, and I don't think we have an Immediate object that we can unref on for that.

Perhaps I'm misunderstanding things, though, I'll admit my knowledge of the Node runtime is fairly limited. Maybe we could pair for a bit so you can show me what you had in mind?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be mistaken. My original understanding was that the call to setTimeout was preventing the action from finishing until the timer expired. But, it looks like the problem is actually coming from a timer inside of the cache library and we don't have access to it. I took a brief look at the cache library code and couldn't find it, but maybe we can raise an issue and submit a PR to change this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apologies for setting you down a wrong path. 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries! Indeed, I've filed that feature request in actions/cache#958, but it doesn't sound like it will be prioritised soon so we do need a functioning workaround before we can declare TRAP caching complete.

Copy link
Contributor

@aeisenberg aeisenberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the limitations, I think this solution is fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants