Avoid 'weaver multi' zombie main components. #644

Merged on Sep 28, 2023 (1 commit)

mwhittaker
Member

Recall that when you run `weaver multi deploy`, you start a `weaver
multi deploy` process that runs a bunch of remote weavelets in
subprocesses. The weavelets communicate with the `weaver multi deploy`
process over a set of pipes. Here's an example of what `top` shows when
deploying the collatz app:

```
-bash
 └─ cmd/weaver/weaver multi deploy weaver.toml
    ├─ examples/collatz/collatz
    ├─ examples/collatz/collatz
    ├─ examples/collatz/collatz
    ├─ examples/collatz/collatz
    ├─ examples/collatz/collatz
    └─ examples/collatz/collatz
```

If you `kill` the `weaver multi deploy` process, it doesn't kill the
child weavelet processes, at least not directly. However, killing the
`weaver multi deploy` process closes the pipes to the weavelets. This
should cause the weavelet processes to exit.

This worked correctly for every weavelet not running the main
component. Weavelets running the main component, however, blocked on
the user-provided lambda. If this lambda did something like run an HTTP
server, then the process wouldn't exit, even if the underlying weavelet
was broken. This left a bunch of zombie processes lingering around
forever.

This PR fixes the bug by exiting whenever the weavelet encounters a
broken pipe.

Also note that `examples/examples_test.go` runs into this zombie
behavior. So, if you've been running `./dev/build_and_test.sh`, you
might want to run `ps aux | grep weaver` to see if you have a bunch of
zombies.
@mwhittaker mwhittaker self-assigned this Sep 27, 2023
Contributor

@spetrovic77 spetrovic77 left a comment

Nice fix, I indeed had a bunch of zombies on my local machine!

@mwhittaker mwhittaker merged commit ce76255 into main Sep 28, 2023
@mwhittaker mwhittaker deleted the killing_zombies branch September 28, 2023 21:37