schedule the cycle detector with higher priority using the inject queue #3507

sblessing · 2020-04-19T11:49:56Z

This addresses #3501.

PR #2709 introduced lazy scheduling of the cycle detector: it is scheduled once a certain number of CPU ticks (--ponycdinterval) has passed. Scheduler thread 0 would then 'wake up' the cycle detector by sending it a CHECK_BLOCKED message, and the cycle detector would be appended to the back of scheduler 0's queue. The cycle detector would then only be scheduled if either:

1. Scheduler 0 is available (i.e. its queue was empty in the previous loop iteration), or
2. Another scheduler thread is available and steals the cycle detector from scheduler 0's queue. However, this would only happen once the cycle detector has 'travelled' to the head of said queue.

This could create pathological behavior in which cycle detection is deferred until the program is almost quiescent. Programs that create lots of actors right at program start are especially affected, because the cycle detector may have a long way to travel before it reaches the head of the scheduling queue.

This change treats the cycle detector as a special system actor that is maintained on the scheduler's global inject queue. That is, it is always scheduled using its original start-up main context (which is not related to any runtime thread), making sure that a CHECK_BLOCKED message (and any other message it receives after being blocked) causes it to be pushed to the global inject queue. Every scheduler thread that becomes available first tries to consume an item from this queue before attempting to get an actor from its local queue.
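The lookup order described above can be sketched with a toy model. This is only an illustration under stated assumptions: `toy_queue_t`, `toy_push`, `toy_pop` and `toy_next_actor` are hypothetical names, actors are plain integer ids, and the queues are simple single-threaded FIFOs rather than the runtime's real queues.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_CAP 64

// Hypothetical fixed-size FIFO standing in for a scheduler queue.
// Actors are represented by plain integer ids.
typedef struct {
  int items[QUEUE_CAP];
  size_t head;
  size_t tail;
} toy_queue_t;

// Append an actor id to the back of the queue.
static void toy_push(toy_queue_t* q, int actor_id)
{
  assert(q->tail < QUEUE_CAP);
  q->items[q->tail++] = actor_id;
}

// Remove and return the id at the head, or -1 if the queue is empty.
static int toy_pop(toy_queue_t* q)
{
  if(q->head == q->tail)
    return -1;

  return q->items[q->head++];
}

// A scheduler thread looking for work: consult the global inject
// queue first and fall back to the local queue only if it is empty.
static int toy_next_actor(toy_queue_t* inject, toy_queue_t* local)
{
  int actor_id = toy_pop(inject);

  if(actor_id != -1)
    return actor_id;

  return toy_pop(local);
}
```

In this model, an id pushed to the inject queue is returned by the next call to `toy_next_actor` no matter how many ids are already waiting on the local queue, whereas an id appended to the local queue has to wait for everything in front of it.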

This should have the following effect on runtime system behavior:

- Get the cycle detector scheduled closer to --ponycdinterval, making this setting a theoretical upper bound.
- Have the cycle detector scheduled with higher priority, as it can be 'pop_global'd' by any scheduler thread from the inject queue.
- Not defer garbage collection of large actor graphs in long-running programs right to the end of the program (or close to it), when the system is quiescent.

SeanTAllen · 2020-04-19T12:22:24Z

@sblessing there's a lot of good information about this change in the issue, can you update your commit comment to include a write up of the why for this change? eventually that information will get lost or send someone off having to go look at this issue. I don't want to lose that information and as such want it to be part of the commit comment itself.

SeanTAllen · 2020-04-19T12:28:19Z

@sblessing additionally, can you write up a user facing description of this change (the what and why) that a pony user might be interested in so that it can be used in the release notes for this? you can add those release notes as a comment on this PR.

thanks.

dipinhora

looks good overall. some minor comments/changes (and a question for the core team). also, as @SeanTAllen mentioned, please update the commit comment with more context about the change so it doesn't get lost (or become hard to find).

once this is merged, i have an idea for how to accomplish the batching i mentioned in the issue.

thanks @sblessing for finding and fixing this unnecessary performance limitation of my changes to the cycle detector stuff (and the detailed issue about it to prompt the discussion and additional thought). and welcome to the (large) group of folks who've found such silliness on my part. 8*P

dipinhora · 2020-04-19T14:39:11Z

src/libponyrt/actor/actor.c

@@ -86,8 +86,10 @@ static bool well_formed_msg_chain(pony_msg_t* first, pony_msg_t* last)
 static void send_unblock(pony_ctx_t* ctx, pony_actor_t* actor)
 {
  // Send unblock before continuing.
+  (void)ctx;


Sorry, i didn't catch this earlier, but instead of this approach, can you change send_unblock to only take actor as an argument and update the call sites accordingly?
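A small sketch of the difference between the two approaches, using stand-in types rather than the runtime's real ones. `toy_ctx_t`, `toy_actor_t` and both function names are hypothetical, and the function bodies are placeholders, not what send_unblock actually does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical stand-ins for the runtime's context and actor types.
typedef struct { int unused; } toy_ctx_t;
typedef struct { bool blocked; int unblock_msgs; } toy_actor_t;

// Approach in the diff: keep the parameter and silence the
// unused-parameter warning with a (void) cast.
static void toy_send_unblock_with_ctx(toy_ctx_t* ctx, toy_actor_t* actor)
{
  (void)ctx;
  assert(actor != NULL);
  actor->blocked = false;
  actor->unblock_msgs++;
}

// Requested approach: drop the parameter, so the signature only
// lists what the function actually needs. Call sites pass one
// argument fewer.
static void toy_send_unblock(toy_actor_t* actor)
{
  assert(actor != NULL);
  actor->blocked = false;
  actor->unblock_msgs++;
}
```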

dipinhora · 2020-04-19T14:40:36Z

src/libponyrt/actor/actor.c


  DTRACE2(ACTOR_ALLOC, (uintptr_t)ctx->scheduler, (uintptr_t)actor);
  return actor;
 }

 PONY_API void ponyint_destroy(pony_ctx_t* ctx, pony_actor_t* actor)
 {
+  (void)ctx;


@ponylang/core how do you folks feel about changing the signature of ponyint_destroy to remove the ctx argument? (asking because it's marked as PONY_API meaning it's a breaking change)

dipinhora · 2020-04-19T14:42:37Z

src/libponyrt/sched/scheduler.c

        {
          last_cd_tsc = current_tsc;

          // cycle detector should now be on the queue
-          if(actor == NULL)
+          if(actor == NULL) 


i'm assuming this additional trailing space was accidental. if yes, please undo this change.

dipinhora · 2020-04-19T14:44:39Z

src/libponyrt/sched/scheduler.c

@@ -1143,7 +1145,9 @@ pony_ctx_t* ponyint_sched_init(uint32_t threads, bool noyield, bool pin,
  ponyint_mpmcq_init(&inject);
  ponyint_asio_init(asio_cpu);

-  return pony_ctx();
+  inject_context = pony_ctx();


please add code to ponyint_sched_shutdown to set inject_context = NULL to clean things up.
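A minimal sketch of the init/shutdown pairing being asked for. Only the name `inject_context` comes from the diff; `toy_ctx_t`, `main_ctx`, `toy_sched_init` and `toy_sched_shutdown` are hypothetical stand-ins, and the assertion in the init function is an assumption of this toy model, not something the runtime is known to check.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for the runtime's context type.
typedef struct { int id; } toy_ctx_t;

// Stands in for the context the real init obtains from pony_ctx().
static toy_ctx_t main_ctx;

// File-level pointer that outlives a single init call.
static toy_ctx_t* inject_context = NULL;

static toy_ctx_t* toy_sched_init(void)
{
  // Toy-model assumption: a previous run must have been shut down.
  assert(inject_context == NULL);
  inject_context = &main_ctx;
  return inject_context;
}

static void toy_sched_shutdown(void)
{
  // Reset the pointer so no stale context is left behind.
  inject_context = NULL;
}
```

Resetting the pointer in shutdown keeps the two functions symmetric: after shutdown, the file-level state looks the same as it did before init.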

sblessing · 2020-04-20T07:54:24Z

I have made the changes requested by @dipinhora. Will amend the commit message and write the user facing documentation later today @SeanTAllen.

@dipinhora please loop me in on the batch processing thoughts. I have some ideas as well and I am also happy to implement those!

…the inject queue (ponylang#3501)

This commit treats the cycle detector as a special system actor that is scheduled using the global inject queue. Consequently, the cycle detector is subject to work stealing with a higher priority between all scheduler threads.

Prior to this change, the cycle detector was treated like any other application actor. Moreover, scheduler thread 0 decided when the cycle detector should be activated for it to potentially do some work. In some cases, this may have caused pathological behavior, where the time interval --ponycdinterval for 'poking' the cycle detector was theoretically unbounded.

With this commit, the cycle detector always uses the global non-runtime context (inject_context) whenever messages are sent to it, independent of the sending actor's context. This guarantees that it will always be pushed to the global inject queue and will then be pop_global'd by the first runtime thread to become available.

This has the following effect:

a) get the cycle detector scheduled closer to --ponycdinterval
b) have the cycle detector scheduled with higher priority, as it can be 'pop_global'd' by any scheduler thread from the inject queue.
c) not defer garbage collection of large actor graphs for long running programs right to the end (or close to) a quiescent system.
d) locality is not as important for the cycle detector as for application actors, thus scheduling a previously blocked cycle detector using a runtime thread on a first-come, first-served basis should have no negative effect on performance.

sblessing · 2020-04-20T10:34:33Z

user facing documentation and commit message amended @SeanTAllen !

SeanTAllen · 2020-04-20T12:26:24Z

thanks @sblessing. welcome back.

@dipinhora this can be merged once you sign off.

SeanTAllen requested a review from dipinhora April 19, 2020 12:27

SeanTAllen added the "changelog - changed" label Apr 19, 2020

dipinhora suggested changes Apr 19, 2020

sblessing force-pushed the cycle branch 2 times, most recently from fea2604 to 9f759f2 Compare April 20, 2020 07:54

sblessing force-pushed the cycle branch from 9f759f2 to 4a1744e Compare April 20, 2020 10:33

sblessing requested a review from dipinhora April 20, 2020 10:43

SeanTAllen mentioned this pull request Apr 20, 2020

Release 0.34.0 #3462

Closed

dipinhora approved these changes Apr 21, 2020

dipinhora merged commit 9032e01 into ponylang:master Apr 21, 2020

github-actions bot pushed a commit that referenced this pull request Apr 21, 2020

Update CHANGELOG for PR #3507 [skip ci]

4499fd9
