1899 improve makerunnable allocation performance #1900
Do you want to add the node_ to the debug prints in set_context.cc?
@@ -146,3 +147,9 @@ template struct MemoryPoolEqual<memory_size_small>;
template struct MemoryPoolEqual<memory_size_medium>;

}} //end namespace vt::pool

namespace vt { namespace pool {
Why close the namespace and reopen immediately?
OK, I fixed that.
src/vt/runnable/runnable.impl.h
Outdated
for (int i = 0; i < ci_; i++) {
  auto t = dynamic_cast<T*>(contexts_[i].get());
Given the limited number of context types, would it make sense to designate a specific slot to each context type, to minimize the number of dynamic_cast calls?
I refactored the code to have a separate member for each context type instead of an array. Is that something that you had in mind?
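To illustrate the idea discussed above, here is a minimal sketch of one dedicated slot per context type, so a lookup resolves at compile time instead of scanning an array with dynamic_cast. This is not the code from the PR: RunnableSketch, ContextBase, SetContext, TraceContext, and their fields are hypothetical stand-ins, and the tuple-based storage is just one way to express "a separate member for each context type".

```cpp
#include <memory>
#include <tuple>
#include <utility>

// Hypothetical context types; the real ones in the PR are not shown here.
struct ContextBase {
  virtual ~ContextBase() = default;
};
struct SetContext : ContextBase {
  explicit SetContext(int n) : node(n) {}
  int node;
};
struct TraceContext : ContextBase {
  explicit TraceContext(int e) : event(e) {}
  int event;
};

// Hypothetical runnable holding one slot per context type.
class RunnableSketch {
public:
  // Construct the context of type T in its dedicated slot.
  template <typename T, typename... Args>
  T* emplace(Args&&... args) {
    auto& slot = std::get<std::unique_ptr<T>>(slots_);
    slot = std::make_unique<T>(std::forward<Args>(args)...);
    return slot.get();
  }

  // Return the context of type T, or nullptr if none was set.
  // The slot is selected by type at compile time: no dynamic_cast, no loop.
  template <typename T>
  T* get() const {
    return std::get<std::unique_ptr<T>>(slots_).get();
  }

private:
  std::tuple<std::unique_ptr<SetContext>, std::unique_ptr<TraceContext>> slots_;
};
```

With this layout, r.get<SetContext>() costs a single pointer read. The trade-off is that the set of context types must be fixed at compile time, which matches the "limited number of context types" premise of the review comment.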
Code looks reasonable enough. How much performance impact does this have?
Pipelines results
PR tests (gcc-5, ubuntu, mpich) Build for b138399
PR tests (gcc-10, ubuntu, openmpi, no LB) Build for b138399
PR tests (clang-5.0, ubuntu, mpich) Build for b138399
PR tests (clang-3.9, ubuntu, mpich) Build for b138399
PR tests (gcc-9, ubuntu, mpich, zoltan) Build for b138399
PR tests (gcc-6, ubuntu, mpich) Build for b138399
PR tests (clang-9, ubuntu, mpich) Build for b138399
PR tests (clang-13, alpine, mpich) Build for b138399
PR tests (nvidia cuda 11.0, ubuntu, mpich) Build for b138399
PR tests (clang-12, ubuntu, mpich) Build for b138399
PR tests (nvidia cuda 10.1, ubuntu, mpich) Build for b138399
PR tests (clang-11, ubuntu, mpich) Build for b138399
PR tests (intel icpx, ubuntu, mpich) Build for b138399
PR tests (clang-13, ubuntu, mpich) Build for b138399
PR tests (gcc-8, ubuntu, mpich, address sanitizer) Build for 1942c6b
PR tests (clang-14, ubuntu, mpich) Build for b138399
PR tests (gcc-11, ubuntu, mpich) Build for b138399
PR tests (gcc-12, ubuntu, mpich) Build for b138399
PR tests (clang-10, ubuntu, mpich) Build for b138399
PR tests (intel icpc, ubuntu, mpich) Build for b138399
PR tests (gcc-7, ubuntu, mpich, trace runtime, LB) Build for cbd764e
The gcc-7 CI build is repeatedly hanging the Azure agent at
I saw many such hangs in this vicinity from my PR to change the number of ranks used for running the ping pong test, so the hang here may not be related to this PR. Any thoughts on how to figure out what's actually going wrong? Do you think we're exceeding memory in our parallel build (trace is enabled so it might be one of the tougher builds)?
Codecov Report
@@ Coverage Diff @@
## develop #1900 +/- ##
===========================================
+ Coverage 84.39% 84.42% +0.02%
===========================================
Files 761 763 +2
Lines 26866 26930 +64
===========================================
+ Hits 22673 22735 +62
- Misses 4193 4195 +2
cbd764e to 1942c6b
1942c6b to b138399
Superseded by later, smaller PRs.
Fixes #1899