Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up cycle detector reaping some actors #3649

Merged
merged 1 commit into from
Sep 11, 2020
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
47 changes: 47 additions & 0 deletions .release-notes/dipin-cd-patch.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,47 @@
## Speed up cycle detector reaping some actors

When an actor is done running, we now do a check similar to the check we do for when --ponynoblock is on:

- Check to see if the actor's message queue is empty
- Check to see if it has a reference count of 0

If both are true, then the actor can be deleted. When --ponynoblock is used, this is done immediately. It can't be done immediately when the cycle detector is in use because, it might have references to the actor in question.

In order to safely cleanup and reap actors, the cycle detector must be the one to do the reaping. This would eventually be done without this change. However, it is possible to write programs where the speed at which the cycle detectors protocol will operate is unable to keep up with the speed at which new "one shot" actors are created.

Here's an example of such a program:

```pony
actor Main
new create(e: Env) =>
Spawner.run()

actor Spawner
var _living_children: U64 = 0

new create() =>
None

be run() =>
_living_children = _living_children + 1
Spawnee(this).run()

be collect() =>
_living_children = _living_children - 1
run()

actor Spawnee
let _parent: Spawner

new create(parent: Spawner) =>
_parent = parent

be run() =>
_parent.collect()
```

Without this patch, that program will have large amount of memory growth and eventually, run out of memory. With the patch, Spawnee actors will be reaped much more quickly and stable memory growth can be achieved.

This is not a complete solution to the problem as issue #1007 is still a problem due to reasons I outlined at https://github.com/ponylang/ponyc/issues/1007#issuecomment-689826407.

It should also be noted that in the process of developing this patch, we discovered a use after free race condition in the runtime mpmcq implementation. That race condition can and will result in segfaults when running the above example program. There's no issue opened yet for that problem, but one will be coming.
35 changes: 27 additions & 8 deletions src/libponyrt/actor/actor.c
Original file line number Diff line number Diff line change
Expand Up @@ -423,17 +423,36 @@ bool ponyint_actor_run(pony_ctx_t* ctx, pony_actor_t* actor, bool polling)
}

bool empty = ponyint_messageq_markempty(&actor->q);
if (empty && actor_noblock && (actor->gc.rc == 0))
if (!ponyint_is_cycle(actor) && empty && (actor->gc.rc == 0))
{
// when 'actor_noblock` is true, the cycle detector isn't running.
// this means actors won't be garbage collected unless we take special
// action. Here, we know that:
// Here, we know that:
// - the actor has no messages in its queue
// - there's no references to this actor
// therefore if `noblock` is on, we should garbage collect the actor.
ponyint_actor_setpendingdestroy(actor);
ponyint_actor_final(ctx, actor);
ponyint_actor_destroy(actor);
// therefore the actor is a zombie and can be reaped.
if (actor_noblock)
{
// when 'actor_noblock` is true, the cycle detector isn't running.
// this means actors won't be garbage collected unless we take special
// action.
// therefore if `noblock` is on, we should garbage collect the actor.
ponyint_actor_setpendingdestroy(actor);
ponyint_actor_final(ctx, actor);
ponyint_actor_destroy(actor);
} else {
// tell cycle detector that this actor is a zombie and will not get
// any more messages/work and can be reaped
// Mark the actor as FLAG_BLOCKED_SENT and send a BLOCKED message
// to speed up reaping otherwise waiting for the cycle detector
// to get around to asking if we're blocked would result in a
// huge memory leak
if(has_flag(actor, FLAG_BLOCKED) && !has_flag(actor, FLAG_BLOCKED_SENT))
{
// We're blocked, send block message.
set_flag(actor, FLAG_BLOCKED_SENT);
pony_assert(ctx->current == actor);
ponyint_cycle_block(actor, &actor->gc);
}
}
}

// Return true (i.e. reschedule immediately) if our queue isn't empty.
Expand Down
50 changes: 48 additions & 2 deletions src/libponyrt/gc/cycle.c
Original file line number Diff line number Diff line change
Expand Up @@ -710,9 +710,55 @@ static void actor_destroyed(detector_t* d, pony_actor_t* actor)
}
}

static void block(detector_t* d, pony_actor_t* actor,
static void block(detector_t* d, pony_ctx_t* ctx, pony_actor_t* actor,
size_t rc, deltamap_t* map)
{
if (rc == 0)
{
// if rc == 0 then this is a zombie actor with no references left to it
// - the actor blocked because it has no messages in its queue
// - there's no references to this actor because rc == 0
// therefore the actor is a zombie and can be reaped.

view_t* view = get_view(d, actor, false);

if (view != NULL)
{
// remove from the deferred set
if(view->deferred)
ponyint_viewmap_remove(&d->deferred, view);

// if we're in a perceived cycle, that cycle is invalid
expire(d, view);

// free the view on the actor
ponyint_viewmap_remove(&d->views, view);
view_free(view);
}

if (map != NULL)
{
// free deltamap
#ifdef USE_MEMTRACK
d->mem_used -= ponyint_deltamap_total_mem_size(map);
d->mem_allocated -= ponyint_deltamap_total_alloc_size(map);
#endif

ponyint_deltamap_free(map);
}

// invoke the actor's finalizer and destroy it
ponyint_actor_setpendingdestroy(actor);
ponyint_actor_final(ctx, actor);
ponyint_actor_sendrelease(ctx, actor);
ponyint_actor_destroy(actor);

d->destroyed++;

return;
}


view_t* view = get_view(d, actor, true);

// update reference count
Expand Down Expand Up @@ -989,7 +1035,7 @@ static void cycle_dispatch(pony_ctx_t* ctx, pony_actor_t* self,
#endif

block_msg_t* m = (block_msg_t*)msg;
block(d, m->actor, m->rc, m->delta);
block(d, ctx, m->actor, m->rc, m->delta);
break;
}

Expand Down