Adding an action in the terminator should produce/consume on epoch? #649

lifflander · 2020-01-08T19:26:46Z

Describe the bug
If you call theTerm()->addAction(new_epoch, []{}); while a current epoch is on the stack, it should probably produce/consume on the outer epoch so that the outer epoch doesn't terminate before an enqueued action on the child epoch completes.

pnstickne · 2020-01-19T21:08:49Z

Add a (mandatory) parameter or distinct method for each and force the caller to choose?

PhilMiller · 2020-02-20T20:47:58Z

Agreed with Paul - we don't want to just change present behavior. Making it explicit, and consequently auditing all current calls, is probably the right approach.

Moreover, what should the epoch stack look like when that action runs? I think my default expectation would be bare, but I don't know if that's currently accurate or the best design.

lifflander · 2020-02-20T21:00:58Z

So let's say we have the following program. It is currently non-deterministic: sometimes send X is grouped inside ep1, and other times it is not. It all depends on whether the action for ep2 fires before or after the popEpoch(ep1).

int main() {
  auto ep1 = theTerm()->makeEpochCollective();
  theMsg()->pushEpoch(ep1);
  // send some messages with ep1

  { // scope for illustration
    auto ep2 = theTerm()->makeEpochCollective();
    theMsg()->pushEpoch(ep2);
    // send some messages with ep2
    theMsg()->popEpoch(ep2);
    theTerm()->finishedEpoch(ep2);
   
    theTerm()->addAction(ep2, []{
      theMsg()->sendMsg<MyMsg, handler>(msg); // send X
    });
  }
  theMsg()->popEpoch(ep1);
  theTerm()->finishedEpoch(ep1);
}
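
The two possible outcomes can be reproduced with a toy model. This is a minimal sketch, not vt code: ToyMessenger, stampForSend, and epochOfSendX are hypothetical names, epochs are plain integers, and the only assumption carried over from the example is that a sent message is stamped with whatever epoch is on top of the stack at that moment.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy stand-ins, not the vt API: an epoch is an integer, 0 means "no epoch".
using Epoch = std::uint64_t;
constexpr Epoch no_epoch = 0;

struct ToyMessenger {
  std::vector<Epoch> stack;  // ambient epoch stack

  void pushEpoch(Epoch ep) { stack.push_back(ep); }

  void popEpoch(Epoch ep) {
    // Pops must match the most recent push.
    assert(!stack.empty() && stack.back() == ep);
    stack.pop_back();
  }

  // Assumption: a send is stamped with the epoch on top of the stack.
  Epoch stampForSend() const {
    return stack.empty() ? no_epoch : stack.back();
  }
};

// Returns the epoch that "send X" is stamped with, depending on whether the
// ep2 action fires before or after popEpoch(ep1).
inline Epoch epochOfSendX(bool action_fires_before_pop) {
  ToyMessenger msg;
  Epoch const ep1 = 1;
  Epoch const ep2 = 2;
  Epoch stamped = no_epoch;
  auto action = [&] { stamped = msg.stampForSend(); };  // stands in for send X

  msg.pushEpoch(ep1);
  msg.pushEpoch(ep2);
  msg.popEpoch(ep2);
  if (action_fires_before_pop) action();
  msg.popEpoch(ep1);
  if (!action_fires_before_pop) action();
  return stamped;
}
```

In this model epochOfSendX(true) yields ep1 and epochOfSendX(false) yields no epoch at all, which is the ambiguity described above.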

PhilMiller · 2020-02-20T23:43:36Z

Ok, that example is pretty close to what I would have imagined. I didn't realize we were currently completely at the mercy of ambient state for how it would work out, though. We definitely can't stick with that.

I see at least a few options:

1. Require an epoch argument to addAction, produce on that at call, then push, pop and consume it around the action
2. Store the parent epoch of ep2, and do the same with it
3. Store ep1 as the then-active epoch, and do the same with it
4. Run all such actions in the global epoch, and require callers to handle things as they choose.

I think only 1 and 4 are reasonable choices here. I'd support 1, since it's easiest to simulate any of the others with that simply by what epoch argument gets passed.
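
For illustration, the first option could look roughly like the following toy model. It is a sketch under stated assumptions, not the vt implementation: ToyTerminator, the two-epoch addAction signature, and fireActions are all hypothetical, and the detection of the waited-on epoch's termination is not modeled.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <map>
#include <utility>
#include <vector>

using Epoch = std::uint64_t;

// Toy stand-in for a terminator; none of these names are the vt API.
struct ToyTerminator {
  std::map<Epoch, int> outstanding;  // produced-minus-consumed per epoch
  std::vector<Epoch> stack;          // ambient epoch stack
  std::vector<std::pair<Epoch, std::function<void()>>> pending;

  void produce(Epoch ep) { ++outstanding[ep]; }

  void consume(Epoch ep) {
    assert(outstanding[ep] > 0);
    --outstanding[ep];
  }

  bool quiescent(Epoch ep) const {
    auto it = outstanding.find(ep);
    return it == outstanding.end() || it->second == 0;
  }

  // Option 1: the caller names the epoch the action runs in. Producing on
  // it here keeps that epoch from terminating before the action has run.
  void addAction(Epoch waited_on, Epoch run_in, std::function<void()> action) {
    (void)waited_on;  // termination of waited_on is not modeled in this toy
    produce(run_in);
    pending.emplace_back(run_in, std::move(action));
  }

  // Stands in for "the waited-on epochs have terminated".
  void fireActions() {
    auto ready = std::move(pending);
    pending.clear();
    for (auto& entry : ready) {
      stack.push_back(entry.first);  // push around the action
      entry.second();
      stack.pop_back();              // pop afterward
      consume(entry.first);          // matches the produce in addAction
    }
  }
};
```

Passing the parent epoch, the then-active epoch, or a global epoch as run_in would simulate options 2 through 4, which is the flexibility argued for above.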

PhilMiller · 2020-02-20T23:43:44Z

Oops, sorry

PhilMiller · 2020-02-24T21:08:37Z

I'm going to start a branch to eliminate some of the unnecessary addAction call sites that just set some boolean to use as a loop control variable.

PhilMiller · 2020-02-24T21:09:30Z

The resolution to this issue can then go on there as well.

lifflander · 2020-04-08T16:10:28Z

Has this moved forward? Since I want to close out beta.6, I'm removing this from beta.6 for now.

PhilMiller · 2020-04-09T03:15:53Z

Sorry, I started on it, and then paused efforts. I'll probably be able to push something coherent later this week or early next, though.

PhilMiller · 2020-07-28T23:59:40Z

In conversation with Jonathan, there's a decent way we can enforce option 4: designate a poison_epoch sentinel value, then in debug builds, push that on the stack when setting up to call an action, check in the message stamping path that it's not still the top, and pop it afterward.
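
A rough sketch of that check, as a self-contained toy rather than vt code. The sentinel's value, ToyRuntime, stampForSend, runAction, and the debug_checks flag (standing in for a debug build) are all assumptions made for illustration, as is the choice to throw instead of aborting.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <limits>
#include <stdexcept>
#include <vector>

using Epoch = std::uint64_t;
constexpr Epoch no_epoch = 0;
// Hypothetical sentinel value; the real one is not specified in this thread.
constexpr Epoch poison_epoch = std::numeric_limits<Epoch>::max();
// Stands in for "debug build" so the toy behaves the same everywhere.
constexpr bool debug_checks = true;

struct ToyRuntime {
  std::vector<Epoch> stack;  // ambient epoch stack

  // Message stamping path: reject a send while the poison is still on top.
  Epoch stampForSend() const {
    Epoch const top = stack.empty() ? no_epoch : stack.back();
    if (debug_checks && top == poison_epoch) {
      throw std::logic_error(
        "message sent in an action without an explicit epoch");
    }
    return top;
  }

  // Push the poison before calling the action and pop it afterward.
  void runAction(std::function<void()> const& action) {
    if (debug_checks) stack.push_back(poison_epoch);
    try {
      action();
    } catch (...) {
      if (debug_checks) stack.pop_back();
      throw;
    }
    if (debug_checks) {
      // The action must leave the stack as it found it.
      assert(!stack.empty() && stack.back() == poison_epoch);
      stack.pop_back();
    }
  }
};
```

An action that pushes its own epoch before sending passes the check; one that sends while relying on ambient state trips it.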

PhilMiller · 2020-08-18T19:13:58Z

Checking whether any of the PRs that removed calls to addAction exposed previously erroneous usage:

PhilMiller · 2020-08-18T19:22:46Z

https://github.com/DARMA-tasking/vt/pull/979/files#diff-63280c3ae5ef295b295afa9290cfe4d9L211 may have been misuse-adjacent. If we were to add an error check that messages not be sent inside an action without an explicit epoch being activated, that would trigger it. It wasn't broken in context, though.

PhilMiller · 2020-08-18T19:26:47Z

https://github.com/DARMA-tasking/vt/pull/979/files#diff-3b24b898691f8d36d7ffbb22cf09562bL92 et seq. were called inside addAction. I think those were likely problematic.

PhilMiller · 2020-08-18T19:27:53Z

All of the sequencer tests' usages of addAction were worrisome, but I don't know sequencer well enough to analyze in depth.

lifflander · 2021-09-06T13:02:50Z

Looks like we refactored addAction out of most of the real code in VT.

lifflander added the type: bug label Jan 8, 2020

lifflander self-assigned this Jan 8, 2020

PhilMiller closed this as completed Feb 20, 2020

PhilMiller reopened this Feb 20, 2020

PhilMiller added the 1.0.0-beta.6 label Feb 20, 2020

lifflander removed the 1.0.0-beta.6 label Apr 8, 2020

PhilMiller added the 1.0.0 label May 11, 2020

PhilMiller assigned PhilMiller and unassigned lifflander May 26, 2020

PhilMiller mentioned this issue May 26, 2020

Refactor code away from using low-level termination primitives, to higher-level routines #824

Merged

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: LBManager: Start rearranging to eliminate use of addAction

5ae1e13

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: LBManager: refactor makeLB to return void

a82747a

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: LBManager: use epoch/scheduler helper routines instead of expli…

0640e0e

…cit creation/management

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: BaseLB: ensure asynchronous work runs in correct epoch

f7ca546

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: Remove disused Scoped interfaces, that were duplicative of runI…

7a914e3

…nEpochCollective/Rooted, with added addAction as a nuisance

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: Don't include deleted file

2cd73a6

lifflander mentioned this issue Jul 28, 2020

Meeting Agenda [do not close] #925

Open

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: Include explicitly, where it was implicit before

2de1ddd

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: Convert some tests away from addAction

a4dce50

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: LBManager: Start rearranging to eliminate use of addAction

0d46659

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: LBManager: refactor makeLB to return void

a3bf94d

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: LBManager: use epoch/scheduler helper routines instead of expli…

31325d3

…cit creation/management

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: BaseLB: ensure asynchronous work runs in correct epoch

25203dc

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: Remove disused Scoped interfaces, that were duplicative of runI…

d7ff4b3

…nEpochCollective/Rooted, with added addAction as a nuisance

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: Don't include deleted file

74d1ca7

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: Include explicitly, where it was implicit before

879f871

PhilMiller added a commit that referenced this issue Jul 28, 2020

#649: Convert some tests away from addAction

a4c769e

lifflander added the 1.0.0-beta.10 label Aug 4, 2020

lifflander removed the 1.0.0-beta.10 label Aug 20, 2020

lifflander removed the 1.0.0 label Sep 8, 2020

lifflander closed this as completed Sep 6, 2021

Matthew-Whitlock mentioned this issue Jul 26, 2023

Nondeterministic epoch behavior for enqueue/addAction/threads #2179

Closed
