Multi-arbiter systems leave lingering Arbiters behind after shutdown. #464

matklad · 2021-02-15T17:42:28Z

Expected Behavior

After System::run returns, all system's actors are stopped and dropped, all Arbiters are joined.

Current Behavior

If, inside the system, additional arbiters are created, they outlive the system.

Reproducible example:

#[test]
fn actix_sample() {
    System::run(|| {
        let arb = actix::Arbiter::new();
        A::start_in_arbiter(&arb, |_| A);
        std::thread::sleep_ms(100);
        System::current().stop();
    });
    eprintln!("B");
    std::thread::sleep_ms(300);
}

struct A;

impl Drop for A {
    fn drop(&mut self) {
        std::thread::sleep_ms(200);
        println!("A");
    }
}

impl actix::Actor for A {
    type Context = actix::Context<A>;
}

This prints B, A while it should print A, B. Otherwise, it's not really defined when the actor get's dropped and it might happen after main exits, skipping drops

Context

near/nearcore#3925

Your Environment

Rust Version (I.e, output of rustc -V): rustc 1.49.0-nightly (91a79fb29 2020-10-07)
Actix Version: 0.11.0-beta.1

The text was updated successfully, but these errors were encountered:

robjtede · 2021-02-15T18:24:19Z

Firstly, update to beta 2 to avoid a bug that was present regarding non-registered arbiters and to expose actix-rt v2 based API.

Arbiters are glorified threads and just like with threads you need to join them to await their destruction.

Therefore, the incantation to make these run in order is as follows:

#![allow(warnings)]

use actix::prelude::*;

#[test]
fn actix_sample() {
    let sys = System::new();
    let arb = actix::Arbiter::new();

    A::start_in_arbiter(&arb.handle(), |_| A);

    std::thread::sleep_ms(100);

    eprintln!("send sys stop");
    System::current().stop();

    eprintln!("run sys to completion");
    sys.run().unwrap();

    eprintln!("join arb");
    arb.join().unwrap();

    eprintln!("B");
    std::thread::sleep_ms(300);
}

struct A;

impl Drop for A {
    fn drop(&mut self) {
        std::thread::sleep_ms(200);
        println!("A");
    }
}

impl actix::Actor for A {
    type Context = actix::Context<A>;
}

matklad · 2021-02-15T18:32:56Z

Yeah, that works for the given example, but I am not feeling easy about this solution:

it still feels wrong that the Actor itself outlives the system it belongs to, regardless of the mechanism used to execute the actor's actions.
Semantics of Arbiter::join is really weird: Arbiter is Clone, and you can join any copy of, but only for the original instance join actually joins.

robjtede · 2021-02-15T18:37:05Z

Arbiter is no longer Clone since Handles now exist.

It would be possible to collect all Arbiter thread join handles and wait on them before System::run completes but it would limit the ability of downstream to choose how they're handled since they are not Clone in std.

This issue actually relates to actix-rt not to actix since they're re-exports. Maybe the solution is that in actix-land the threads are joined.

robjtede · 2021-02-15T19:43:27Z

Do you have a use case in mind for wanting this kind of ordering guarantee?

As I see it, you can't do anything useful with a system after sending the stop message anyway. So, as long as your System is set up on the main thread you'll never have a dangling Arbiter (again, not that it seems to matter) and never, in any case, have a thread destroyed without calling it's destructor, if my reading and understanding of std::thread is correct.

matklad · 2021-02-15T19:50:39Z

I definitely do want actor's destructors to have been called when run returns, because destructors often enforce correctness critical invariant. IE, in NEAR's case, an actor manages a database, and Actor's Drop closes this db. It is pretty unfortunate to return from main without waiting for that to happen.

I don't think joining the thread itself is that important, as long as actors themselves are dropped.

robjtede · 2021-02-15T20:01:11Z

I'm 90% sure Rust will guarantee the destructors run on all Actors before the program terminates.

matklad · 2021-02-15T20:11:45Z

Here's the minimal repro demonstrating that it's not the case.

#![allow(warnings)]

use actix::prelude::*;

#[test]
fn actix_sample() {
    let sys = System::new();
    let arb = actix::Arbiter::new();
    A::start_in_arbiter(&arb.handle(), |_| A);
    System::current().stop();
    sys.run().unwrap();
    std::thread::sleep_ms(100);
}

struct A;

impl Drop for A {
    fn drop(&mut self) {
        println!("start drop");
        std::thread::sleep_ms(200);
        println!("finish drop");
    }
}

impl actix::Actor for A {
    type Context = actix::Context<A>;
}

leaked threads are terminated abruptly. In this case, the thread is terminated in the middle of Drop. It is a giant footgun and a reliability hazard, and people tend to learn about it the hard way (I published a blog post with a bug, lol: rust-lang/rust#48820).

fakeshadow · 2021-02-22T07:05:47Z

#[test]
fn actix_sample() {
    actix::System::new().block_on(async {
        let arb = Arb(Some(actix::Arbiter::new()));
        let addr = A::start_in_arbiter(&arb.handle(), |_| A);
        std::thread::sleep_ms(100);
        actix::System::current().stop();
        actix_web::rt::task::yield_now().await;
    });
    eprintln!("B");
    std::thread::sleep_ms(300);
}

struct Arb(Option<actix::Arbiter>);

impl Drop for Arb {
    fn drop(&mut self) {
        self.0.take().unwrap().join().unwrap();
    }
}

impl Deref for Arb {
    type Target = actix::Arbiter;

    fn deref(&self) -> &Self::Target {
        self.0.as_ref().unwrap()
    }
}

struct A;

impl Drop for A {
    fn drop(&mut self) {
        std::thread::sleep_ms(200);
        println!("A");
    }
}

impl actix::Actor for A {
    type Context = actix::Context<A>;
}

System::run is just a block_on in latest actix-rt. So it behaves just like tokio's block_on. actix-rt is mainly used an re-export and a manager of multiple current thread runtime. It does not try to offer anything more than tokio offers.

If any change need to be done to let System stop yields and join all arbiter it should happen in actix repo or user code.

robjtede added the C-bug Category: bug label Feb 16, 2021

robjtede self-assigned this Feb 16, 2021

robjtede mentioned this issue Feb 21, 2021

assure arbiter stop ordering actix/actix-net#280

Closed

4 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Multi-arbiter systems leave lingering Arbiters behind after shutdown. #464

Multi-arbiter systems leave lingering Arbiters behind after shutdown. #464

matklad commented Feb 15, 2021 •

edited

Loading

robjtede commented Feb 15, 2021 •

edited

Loading

matklad commented Feb 15, 2021

robjtede commented Feb 15, 2021 •

edited

Loading

robjtede commented Feb 15, 2021 •

edited

Loading

matklad commented Feb 15, 2021

robjtede commented Feb 15, 2021

matklad commented Feb 15, 2021

fakeshadow commented Feb 22, 2021 •

edited

Loading

Multi-arbiter systems leave lingering Arbiters behind after shutdown. #464

Multi-arbiter systems leave lingering Arbiters behind after shutdown. #464

Comments

matklad commented Feb 15, 2021 • edited Loading

Expected Behavior

Current Behavior

Context

Your Environment

robjtede commented Feb 15, 2021 • edited Loading

matklad commented Feb 15, 2021

robjtede commented Feb 15, 2021 • edited Loading

robjtede commented Feb 15, 2021 • edited Loading

matklad commented Feb 15, 2021

robjtede commented Feb 15, 2021

matklad commented Feb 15, 2021

fakeshadow commented Feb 22, 2021 • edited Loading

matklad commented Feb 15, 2021 •

edited

Loading

robjtede commented Feb 15, 2021 •

edited

Loading

robjtede commented Feb 15, 2021 •

edited

Loading

robjtede commented Feb 15, 2021 •

edited

Loading

fakeshadow commented Feb 22, 2021 •

edited

Loading