Remote actors don't appear in the local registry #93
Comments
Ran into one issue: when an actor is stopped on the remote node, it causes errors on the local node:
The name of the remote actor is 'actorname', and I'm not sure why the termination of the actor is trying to re-register the actor. I will have to look into it.
The problem seems to occur due to
I believe we need to be able to remove terminated actors from the process groups without recreating them, but currently when we unregister from the process group it uses the ractor/ractor/src/actor/actor_cell/mod.rs Lines 283 to 284 in 6e36b8b
This approach wouldn't work for this situation, since we don't control the entire terminate+pgleave process. I'm open to ideas for solving this. One approach could be that when the NodeSession receives a Terminate event, we manually unregister the actor from the process group (and in the process notify any supervisors that are monitoring it), and then in the PgLeave event handler only unregister actors from the process group if they still exist in
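The ordering proposed above can be sketched with a small standalone model. This is not ractor's code: `SessionModel`, `on_terminate`, and `on_pg_leave` are hypothetical names, and since the comment is cut off before saying where the actor must "still exist", the sketch assumes the check is against the session's own set of known remote actors.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical stand-in for an actor identifier; not ractor's real type.
type ActorId = u64;

/// Toy model of a node session that tracks remote actors and their groups.
#[derive(Default)]
struct SessionModel {
    /// Remote actors this session still knows about (assumed lookup target).
    remote_actors: HashSet<ActorId>,
    /// Process group name -> members of that group.
    groups: HashMap<String, HashSet<ActorId>>,
}

impl SessionModel {
    /// Record a remote actor and add it to a process group.
    fn join(&mut self, group: &str, id: ActorId) {
        self.remote_actors.insert(id);
        self.groups.entry(group.to_string()).or_default().insert(id);
    }

    /// Terminate event: forget the actor and remove it from every group
    /// immediately, instead of waiting for a later leave event.
    fn on_terminate(&mut self, id: ActorId) {
        self.remote_actors.remove(&id);
        for members in self.groups.values_mut() {
            members.remove(&id);
        }
        // Drop groups that became empty.
        self.groups.retain(|_, members| !members.is_empty());
    }

    /// PgLeave event: only act on actors that are still known, and never
    /// recreate one. Returns true if the actor was actually removed.
    fn on_pg_leave(&mut self, group: &str, id: ActorId) -> bool {
        if !self.remote_actors.contains(&id) {
            return false; // already cleaned up by on_terminate
        }
        let removed = self
            .groups
            .get_mut(group)
            .map_or(false, |members| members.remove(&id));
        self.groups.retain(|_, members| !members.is_empty());
        removed
    }
}
```

In this model a PgLeave that arrives after a Terminate for the same actor is a no-op, which is the property the proposal is after. Notifying supervisors is left out because the thread does not say how that would be done.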
Hmm, so the local remote actor isn't fully shut down perhaps? And therefore couldn't self-leave the process group? Yeah, it might be something we need to look at. Let me see if I can get a repro scenario together, replicate it, and work out what might be good solutions for solving it. Thanks for reporting!
I have a minimal reproducible example available here: https://gist.github.com/calebfletcher/057e9ca35acf3aaa181c1f0a5f679357
I need to re-run this after the merged PR, but I'm hoping this solved your issue. If yes, I'll add an integration test of the situation to make sure we don't regress on it.
Reran my test case with the upstream changes, and it seems to have fixed the issue, so thanks for that. I have opened PR #103 to implement the change and to add an integration test.
Additionally (although this may be better separated out as a new issue) calling
Yes, for this I don't have a good solution for remote supervision, which is why for now
I was looking for a way to be able to communicate with a remote actor without having to join it to a process group, and found this section in the code that prevents the RemoteActors from registering with the remote actor's name into the local registry:
ractor/ractor/src/actor/actor_cell/mod.rs, lines 222 to 225 in 6e36b8b
Uncommenting this code (and cloning the name to get around the move into ActorProperties::new_remote) seems to work as expected, with all the remote actors successfully being registered. Is there an issue that this approach causes, or could this be enabled?
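The clone-before-move workaround described above can be illustrated with a standalone sketch. Everything here is hypothetical: `Props::new_remote` only mimics the idea of a constructor that takes ownership of the name (the real signature of `ActorProperties::new_remote` is not shown in this thread), and `Registry` is a plain map, not ractor's registry.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for actor properties; not ractor's real type.
struct Props {
    name: Option<String>,
}

impl Props {
    /// Takes ownership of the name, which is why the caller must clone it
    /// first if the name is also needed for registration.
    fn new_remote(name: Option<String>) -> Self {
        Props { name }
    }
}

/// Toy name registry: actor name -> actor id.
#[derive(Default)]
struct Registry {
    by_name: HashMap<String, u64>,
}

impl Registry {
    /// Register a name, returning the name back as an error if it is taken.
    fn register(&mut self, name: String, id: u64) -> Result<(), String> {
        if self.by_name.contains_key(&name) {
            return Err(name);
        }
        self.by_name.insert(name, id);
        Ok(())
    }
}

/// Register a remote actor under its remote name, then build its properties.
fn spawn_remote(reg: &mut Registry, name: Option<String>, id: u64) -> Result<Props, String> {
    // Clone so that `name` can still be moved into `new_remote` below.
    if let Some(n) = name.clone() {
        reg.register(n, id)?;
    }
    Ok(Props::new_remote(name))
}
```

One thing a change like this would have to decide is what happens when a remote actor's name is already taken in the local registry. The sketch simply returns an error in that case; whether that is the right behaviour for ractor is an open question here.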