Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multi-arbiter systems leave lingering Arbiters behind after shutdown. #464

Open
matklad opened this issue Feb 15, 2021 · 8 comments
Open
Assignees
Labels
C-bug Category: bug

Comments

@matklad
Copy link

matklad commented Feb 15, 2021

Expected Behavior

After System::run returns, all system's actors are stopped and dropped, all Arbiters are joined.

Current Behavior

If, inside the system, additional arbiters are created, they outlive the system.

Reproducible example:

#[test]
fn actix_sample() {
    System::run(|| {
        let arb = actix::Arbiter::new();
        A::start_in_arbiter(&arb, |_| A);
        std::thread::sleep_ms(100);
        System::current().stop();
    });
    eprintln!("B");
    std::thread::sleep_ms(300);
}

struct A;

impl Drop for A {
    fn drop(&mut self) {
        std::thread::sleep_ms(200);
        println!("A");
    }
}

impl actix::Actor for A {
    type Context = actix::Context<A>;
}

This prints B, A while it should print A, B. Otherwise, it's not really defined when the actor get's dropped and it might happen after main exits, skipping drops

Context

near/nearcore#3925

Your Environment

  • Rust Version (I.e, output of rustc -V): rustc 1.49.0-nightly (91a79fb29 2020-10-07)
  • Actix Version: 0.11.0-beta.1
@robjtede
Copy link
Member

robjtede commented Feb 15, 2021

Firstly, update to beta 2 to avoid a bug that was present regarding non-registered arbiters and to expose actix-rt v2 based API.

Arbiters are glorified threads and just like with threads you need to join them to await their destruction.

Therefore, the incantation to make these run in order is as follows:

#![allow(warnings)]

use actix::prelude::*;

#[test]
fn actix_sample() {
    let sys = System::new();
    let arb = actix::Arbiter::new();

    A::start_in_arbiter(&arb.handle(), |_| A);

    std::thread::sleep_ms(100);

    eprintln!("send sys stop");
    System::current().stop();

    eprintln!("run sys to completion");
    sys.run().unwrap();

    eprintln!("join arb");
    arb.join().unwrap();

    eprintln!("B");
    std::thread::sleep_ms(300);
}

struct A;

impl Drop for A {
    fn drop(&mut self) {
        std::thread::sleep_ms(200);
        println!("A");
    }
}

impl actix::Actor for A {
    type Context = actix::Context<A>;
}

@matklad
Copy link
Author

matklad commented Feb 15, 2021

Yeah, that works for the given example, but I am not feeling easy about this solution:

  • it still feels wrong that the Actor itself outlives the system it belongs to, regardless of the mechanism used to execute the actor's actions.
  • Semantics of Arbiter::join is really weird: Arbiter is Clone, and you can join any copy of, but only for the original instance join actually joins.

@robjtede
Copy link
Member

robjtede commented Feb 15, 2021

Arbiter is no longer Clone since Handles now exist.

It would be possible to collect all Arbiter thread join handles and wait on them before System::run completes but it would limit the ability of downstream to choose how they're handled since they are not Clone in std.

This issue actually relates to actix-rt not to actix since they're re-exports. Maybe the solution is that in actix-land the threads are joined.

@robjtede
Copy link
Member

robjtede commented Feb 15, 2021

Do you have a use case in mind for wanting this kind of ordering guarantee?

As I see it, you can't do anything useful with a system after sending the stop message anyway. So, as long as your System is set up on the main thread you'll never have a dangling Arbiter (again, not that it seems to matter) and never, in any case, have a thread destroyed without calling it's destructor, if my reading and understanding of std::thread is correct.

@matklad
Copy link
Author

matklad commented Feb 15, 2021

I definitely do want actor's destructors to have been called when run returns, because destructors often enforce correctness critical invariant. IE, in NEAR's case, an actor manages a database, and Actor's Drop closes this db. It is pretty unfortunate to return from main without waiting for that to happen.

I don't think joining the thread itself is that important, as long as actors themselves are dropped.

@robjtede
Copy link
Member

I'm 90% sure Rust will guarantee the destructors run on all Actors before the program terminates.

@matklad
Copy link
Author

matklad commented Feb 15, 2021

Here's the minimal repro demonstrating that it's not the case.

#![allow(warnings)]

use actix::prelude::*;

#[test]
fn actix_sample() {
    let sys = System::new();
    let arb = actix::Arbiter::new();
    A::start_in_arbiter(&arb.handle(), |_| A);
    System::current().stop();
    sys.run().unwrap();
    std::thread::sleep_ms(100);
}

struct A;

impl Drop for A {
    fn drop(&mut self) {
        println!("start drop");
        std::thread::sleep_ms(200);
        println!("finish drop");
    }
}

impl actix::Actor for A {
    type Context = actix::Context<A>;
}

leaked threads are terminated abruptly. In this case, the thread is terminated in the middle of Drop. It is a giant footgun and a reliability hazard, and people tend to learn about it the hard way (I published a blog post with a bug, lol: rust-lang/rust#48820).

@robjtede robjtede added the C-bug Category: bug label Feb 16, 2021
@robjtede robjtede self-assigned this Feb 16, 2021
@fakeshadow
Copy link
Contributor

fakeshadow commented Feb 22, 2021

#[test]
fn actix_sample() {
    actix::System::new().block_on(async {
        let arb = Arb(Some(actix::Arbiter::new()));
        let addr = A::start_in_arbiter(&arb.handle(), |_| A);
        std::thread::sleep_ms(100);
        actix::System::current().stop();
        actix_web::rt::task::yield_now().await;
    });
    eprintln!("B");
    std::thread::sleep_ms(300);
}

struct Arb(Option<actix::Arbiter>);

impl Drop for Arb {
    fn drop(&mut self) {
        self.0.take().unwrap().join().unwrap();
    }
}

impl Deref for Arb {
    type Target = actix::Arbiter;

    fn deref(&self) -> &Self::Target {
        self.0.as_ref().unwrap()
    }
}

struct A;

impl Drop for A {
    fn drop(&mut self) {
        std::thread::sleep_ms(200);
        println!("A");
    }
}

impl actix::Actor for A {
    type Context = actix::Context<A>;
}

System::run is just a block_on in latest actix-rt. So it behaves just like tokio's block_on. actix-rt is mainly used an re-export and a manager of multiple current thread runtime. It does not try to offer anything more than tokio offers.

If any change need to be done to let System stop yields and join all arbiter it should happen in actix repo or user code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-bug Category: bug
Projects
None yet
Development

No branches or pull requests

3 participants