-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
1938 remove unneeded calls to mpi wtime in trace and lb #1998
1938 remove unneeded calls to mpi wtime in trace and lb #1998
Conversation
Pipelines resultsPR tests (gcc-12, ubuntu, mpich) Build for d46c6d1 (2023-02-15 01:01:05 UTC)
PR tests (gcc-5, ubuntu, mpich) Build for 5046d72 (2022-11-17 20:24:02 UTC)
PR tests (clang-3.9, ubuntu, mpich) Build for 5046d72 (2022-11-17 20:24:02 UTC)
PR tests (clang-5.0, ubuntu, mpich) Build for 5046d72 (2022-11-17 20:24:02 UTC)
PR tests (gcc-10, ubuntu, openmpi, no LB) Build for d46c6d1 (2023-02-15 01:01:05 UTC)
PR tests (gcc-6, ubuntu, mpich) Build for 5046d72 (2022-11-17 20:24:02 UTC)
PR tests (clang-9, ubuntu, mpich) Build for d46c6d1 (2023-02-15 01:01:05 UTC)
PR tests (clang-13, alpine, mpich) Build for 5046d72 (2022-11-17 20:24:02 UTC)
PR tests (gcc-9, ubuntu, mpich, zoltan) Build for d46c6d1 (2023-02-15 01:01:05 UTC)
PR tests (gcc-8, ubuntu, mpich, address sanitizer) Build for d46c6d1 (2023-02-15 01:01:05 UTC)
PR tests (clang-11, ubuntu, mpich) Build for d46c6d1 (2023-02-15 01:01:05 UTC)
PR tests (clang-12, ubuntu, mpich) Build for d46c6d1 (2023-02-15 01:01:05 UTC)
PR tests (intel icpx, ubuntu, mpich) Build for 5046d72 (2022-11-17 20:24:02 UTC)
PR tests (clang-13, ubuntu, mpich) Build for d46c6d1 (2023-02-15 01:01:05 UTC)
PR tests (nvidia cuda 10.1, ubuntu, mpich) Build for 5046d72 (2022-11-17 20:24:02 UTC)
PR tests (nvidia cuda 11.0, ubuntu, mpich) Build for d46c6d1 (2023-02-15 01:01:05 UTC)
PR tests (clang-14, ubuntu, mpich) Build for d46c6d1 (2023-02-15 01:01:05 UTC)
PR tests (clang-10, ubuntu, mpich) Build for d46c6d1 (2023-02-15 01:01:05 UTC)
PR tests (gcc-11, ubuntu, mpich, json schema test) Build for d46c6d1 (2023-02-15 01:01:05 UTC)
PR tests (intel icpc, ubuntu, mpich) Build for d46c6d1 (2023-02-15 01:01:05 UTC)
PR tests (gcc-7, ubuntu, mpich, trace runtime, LB) Build for 5046d72 (2022-11-17 20:24:02 UTC)
PR tests (nvidia cuda 11.2, ubuntu, mpich) Build for ( UTC)
|
57775d0
to
6ef8a04
Compare
Codecov Report
@@ Coverage Diff @@
## develop #1998 +/- ##
===========================================
- Coverage 84.86% 82.99% -1.87%
===========================================
Files 720 731 +11
Lines 25674 25851 +177
===========================================
- Hits 21788 21456 -332
- Misses 3886 4395 +509
|
I would approve the first 3 commits of this PR as resolving 1938 (modulo my comments about default parameters). The latter two commits seem to be less motivated, and less clearly correct. |
Thanks for your comments @PhilMiller. I'll drop the last 2 commits and see about the default arguments. |
7f463fa
to
31b91f3
Compare
31b91f3
to
5046d72
Compare
5046d72
to
5880acc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Frankly, I was much happier with the approach of just passing a time down the call stack.
This PR now adds a bunch of calls to timing::getCurrentTime
, though I haven't checked whether they're in specifically cold paths.
This change in approach may have been discussed, but that discussion is supposed to be memorialized in comments here. I'll note that the PR description no longer quite matches the code.
Let's talk about this in the meeting this week.
src/vt/scheduler/scheduler.cc
Outdated
@@ -258,7 +258,8 @@ void Scheduler::runSchedulerOnceImpl(bool msg_only) { | |||
} | |||
} else if (work_queue_.empty()) { | |||
if (curRT->needsCurrentTime()) { | |||
runProgress(msg_only, timing::getCurrentTime()); | |||
current_sched_time_ = timing::getCurrentTime(); | |||
runProgress(msg_only, current_sched_time_); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may not be worth changing, but since runProgress
is a sibling member, it doesn't need to take current_sched_time_
as an argument, since it can access it directly.
Thanks @PhilMiller. Yes, the idea was to just pass time down the call stack. After removing the default arguments (per your first review), then a time argument needed to be passed through to any runnable (
I'll update the PR description to reflect these changes. |
@@ -140,7 +140,7 @@ static trace::TraceProcessingTag BeginProcessingInvokeEvent() { | |||
const auto trace_event = theTrace()->messageCreation(trace_id, 0); | |||
const auto from_node = theContext()->getNode(); | |||
|
|||
return theTrace()->beginProcessing(trace_id, 0, trace_event, from_node); | |||
return theTrace()->beginProcessing(trace_id, 0, trace_event, from_node, timing::getCurrentTime()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may actually illustrate a case for why I like the 'pass the time as an argument to contexts' approach better - if we want to add LB attribution for methods called through invoke
, we would want to share this timer call for the start time, rather than have it reference the scheduler's recorded time, or need another new interface for the divergent case.
@@ -62,6 +62,7 @@ void BaseUnit::execute() { | |||
#endif | |||
} else if (work_) { | |||
work_(); | |||
theSched()->setRecentTimeToStale(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be outside the if / else if
, or duplicated in both arms? Why shouldn't the time be updated after we've run stuff in a thread?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's in the run function (runnable.cc line 183). Should we just pull it out to here (outside the if/else)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left as is, since the runnable branch will also need to get a finish time.
I think I'm generally ok with this, and just need to see the couple comments addressed. This does need to be run through the same fine-grain overhead benchmarks we used to identify time calls as a concern to begin with. I'd like to be confident that added calls to |
…d adjusting for CI
e6bb911
to
44f3132
Compare
Some runs of the microbenchmark were getting ~30% reduction in time, directly attributable to reduced time spent in |
Improvements to the tests will be pursued in VT #2017 |
Closes #1938.
Pass the current time from the scheduler to LB and Trace contexts.This has been modified to allow the scheduler to hold a
current_sched_time_
and then the appropriate runnable contexts (here LB and Trace) can query their start time from the scheduler. This change was primarily the result of not wanting to force all runnable contexts to take a time (even in contexts where that is inappropriate). Still targets the issue by reducing calls to MPI_Wtime (although increased dereferencing of thetheSched()
)