Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding an action in the terminator should produce/consume on epoch? #649

Closed
lifflander opened this issue Jan 8, 2020 · 15 comments
Closed
Assignees

Comments

@lifflander
Copy link
Collaborator

Describe the bug
If you do, theTerm()->addAction(new_epoch, []{}); with a current epoch on the stack, it should probably produce/consume on the outer epoch so it doesn't terminate before an enqueued action on the child epoch completes.

@lifflander lifflander self-assigned this Jan 8, 2020
@pnstickne
Copy link
Contributor

Add a (mandatory) parameter or distinct method for each and force the caller to choose?

@PhilMiller
Copy link
Member

Agreed with Paul - we don't want to just change present behavior. Making it explicit, and consequently auditing all current calls, is probably the right approach.

Moreover, what should the epoch stack look like when that action runs? I think my default expectation would be bare, but I don't know if that's currently accurate or the best design.

@lifflander
Copy link
Collaborator Author

So let's say we have the following program. It is currently non-deterministic. Sometimes, send X may be grouped inside ep1 other times, it may not. It all depends on whether the action for ep2 fires before or after the popEpoch(ep1).

int main() {
  auto ep1 = theTerm()->makeEpochCollective();
  theMsg()->pushEpoch(ep1);
  // send some messages with ep1

  { // scope for illustration
    auto ep2 = theTerm()->makeEpochCollective();
    theMsg()->pushEpoch(ep2);
    // send some messages with ep2
    theMsg()->popEpoch(ep2);
    theTerm()->finishedEpoch(ep2);
   
    theTerm()->addAction(ep2, []{
      theMsg()->sendMsg<MyMsg, handler>(msg); // send X
    });
  }
  theMsg()->popEpoch(ep1);
  theTerm()->finishedEpoch(ep1);
}

@PhilMiller
Copy link
Member

PhilMiller commented Feb 20, 2020

Ok, that example is pretty close to what I would have imagined. I didn't realize we were currently completely at the mercy of ambient state for how it would work out, though. We definitely can't stick with that.

I see at least a few options

  1. require an epoch argument to addAction, produce on that at call, then push, pop and consume it around the action
  2. store the parent epoch of ep2, and do the same with it
  3. store ep1 as the then-active epoch, and do the same with it
  4. Run all such actions in the global epoch, and require callers to handle things as they choose.

I think only 1 and 4 are reasonable choices here. I'd support 1, since it's easiest to simulate any of the others with that simply by what epoch argument gets passed.

@PhilMiller
Copy link
Member

Oops, sorry

@PhilMiller
Copy link
Member

I'm going to start a branch to eliminate some of the unnecessary addAction call sites that just set some boolean to use as a loop control variable.

@PhilMiller
Copy link
Member

The resolution to this issue can then go on there as well.

@lifflander
Copy link
Collaborator Author

Has this moved forward? Since I want to close out beta.6, I'm removing this from beta.6 for now.

@PhilMiller
Copy link
Member

Sorry, I started on it, and then paused efforts. I'll probably be able to push something coherent later this week or early next, though.

@PhilMiller PhilMiller assigned PhilMiller and unassigned lifflander May 26, 2020
PhilMiller added a commit that referenced this issue Jul 28, 2020
PhilMiller added a commit that referenced this issue Jul 28, 2020
…nEpochCollective/Rooted, with added addAction as a nuisance
PhilMiller added a commit that referenced this issue Jul 28, 2020
PhilMiller added a commit that referenced this issue Jul 28, 2020
PhilMiller added a commit that referenced this issue Jul 28, 2020
…nEpochCollective/Rooted, with added addAction as a nuisance
PhilMiller added a commit that referenced this issue Jul 28, 2020
@PhilMiller
Copy link
Member

In conversation with Jonathan, there's a decent way we can enforce option 4 - designate a poison_epoch sentinel value, then in debug builds, push that on the stack when setting up to call an epoch, check in the message stamping path that it's not still the top, and pop it afterward.

@PhilMiller
Copy link
Member

PhilMiller commented Aug 18, 2020

@PhilMiller
Copy link
Member

https://github.com/DARMA-tasking/vt/pull/979/files#diff-63280c3ae5ef295b295afa9290cfe4d9L211 may have been misuse-adjacent. If we were to add an error check that messages not be sent inside an action without an explicit epoch being activated, that would trigger it. It wasn't broken in context, though.

@PhilMiller
Copy link
Member

https://github.com/DARMA-tasking/vt/pull/979/files#diff-3b24b898691f8d36d7ffbb22cf09562bL92 et seq were called inside addAction. I think those were likely problematic

@PhilMiller
Copy link
Member

PhilMiller commented Aug 18, 2020

All of the sequencer tests' usage of addAction were worrisome, but I don't know sequencer well enough to analyze in depth.

@lifflander
Copy link
Collaborator Author

Looks like we refactored addAction out of most of the real code in VT.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants