Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Services: Don't fail in watch mode when a dependency restarts or fails #526

Merged
merged 5 commits into from
Nov 14, 2022

Conversation

aomarks
Copy link
Member

@aomarks aomarks commented Nov 12, 2022

In non-watch mode, if service A depends on service B, and service B exits unexpectedly, then service A is automatically killed. We do this because otherwise wireit will just continue running, even though there has been an error. We instead want to exit with a non-zero code.

Before this PR, we did the same thing in watch mode. However, this was wrong for 2 reasons:

  1. If we're iteration N-1 which is about to be adopted into iteration N, our dependencies will sometimes intentionally restart, but we were treating this as though it was unexpected. This should not cause us to fail, since we'll either also restart very shortly (when cascade is true), or we'll just keep running (when cascade is false).

  2. If we're iteration N and our dependency unexpectedly exits by itself, it's not actually helpful if we also aggressively exit. We can just keep running, and hopefully the user can fix the problem with the server dependency, which will then either cause us to restart or not depending on the cascade setting.

As part of this I found these other related issues, which were compounding the problem above and making it harder to debug:

  • We were returning undefined when we detached, if the service was in its "failing" or "stopping" state, instead of the child process. This was compounding the issue described above, because if iteration N-1 was failing, iteration N would think the child wasn't running at all, and would not wait for it to fully shut down. We are now much stricter about communicating from iteration N-1 to iteration N that there is a running child process, regardless of the state.

  • We didn't log when a service shut down because of a failure in its dependencies.

Part of #33

Copy link
Collaborator

@augustjk augustjk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!


wireit.kill();
await wireit.exit;
assert.not(service1Inv1.isRunning);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noticed a failure here https://github.com/google/wireit/actions/runs/3465483821/jobs/5788267526

Maybe an extra 100ms wait before this line to de-flake?

Copy link
Collaborator

@AndrewJakubowicz AndrewJakubowicz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great!

Comment on lines 1153 to 1154
const service1 = await rig.newCommand();
const service2 = await rig.newCommand();
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unbelievably minor nit and can be discarded. Naming these parentService and childService could help make it easier to follow the test. I kept losing track of whether 1 or 2 was the child.
Also please ignore this comment.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can +1 that I did have to look at the config being written to tell which one was the dependent. But also, super minor since it is indeed right there.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@aomarks aomarks enabled auto-merge (squash) November 14, 2022 23:07
@aomarks aomarks merged commit f12cb4f into main Nov 14, 2022
@aomarks aomarks deleted the service-watch-issue branch November 14, 2022 23:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants