Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

1899 improve makerunnable allocation performance #1900

Closed

Conversation

lifflander
Copy link
Collaborator

Fixes #1899

@lifflander lifflander linked an issue Aug 5, 2022 that may be closed by this pull request
3 tasks
Copy link
Collaborator

@nlslatt nlslatt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to add the node_ to the debug prints in set_context.cc?

@@ -146,3 +147,9 @@ template struct MemoryPoolEqual<memory_size_small>;
template struct MemoryPoolEqual<memory_size_medium>;

}} //end namespace vt::pool

namespace vt { namespace pool {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why close the namespace and reopen immediately?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok I fexed that.

Comment on lines 55 to 56
for (int i = 0; i < ci_; i++) {
auto t = dynamic_cast<T*>(contexts_[i].get());
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the limited number of context types, would it make sense to designate a specific slot to each context type, to minimize the number of dynamic_cast calls?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I refactored the code to have a separate member for each context type instead of an array. Is that something that you had in mind?

@PhilMiller
Copy link
Member

Code looks reasonable enough. How much performance impact does this have?

@github-actions
Copy link

github-actions bot commented Aug 5, 2022

Pipelines results

PR tests (gcc-5, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (gcc-10, ubuntu, openmpi, no LB)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (clang-5.0, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (clang-3.9, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (gcc-9, ubuntu, mpich, zoltan)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (gcc-6, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (clang-9, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (clang-13, alpine, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (nvidia cuda 11.0, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (clang-12, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (nvidia cuda 10.1, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (clang-11, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (intel icpx, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (clang-13, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (gcc-8, ubuntu, mpich, address sanitizer)

Build for 1942c6b

Compilation - successful

Testing - passed

Build log


PR tests (clang-14, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (gcc-11, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (gcc-12, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (clang-10, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (intel icpc, ubuntu, mpich)

Build for b138399

Compilation - successful

Testing - passed

Build log


PR tests (gcc-7, ubuntu, mpich, trace runtime, LB)

Build for cbd764e

Compilation - successful

Testing - passed

Build log


@PhilMiller
Copy link
Member

The gcc-7 CI build is repeatedly hanging the Azure agent at Linking CXX executable tests/atomic_basic

@nlslatt
Copy link
Collaborator

nlslatt commented Aug 16, 2022

The gcc-7 CI build is repeatedly hanging the Azure agent at Linking CXX executable tests/atomic_basic

I saw many such hangs in this vicinity from my PR to change the number of ranks used for running the ping pong test, so the hang here may not be related to this PR. Any thoughts on how to figure out what's actually going wrong? Do you think we're exceeding memory in our parallel build (trace is enabled so it might be one of the tougher builds)?

@codecov
Copy link

codecov bot commented Aug 24, 2022

Codecov Report

Merging #1900 (cbd764e) into develop (ff3ec68) will increase coverage by 0.02%.
The diff coverage is 73.52%.

❗ Current head cbd764e differs from pull request most recent head b138399. Consider uploading reports for the commit b138399 to get more accurate results

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #1900      +/-   ##
===========================================
+ Coverage    84.39%   84.42%   +0.02%     
===========================================
  Files          761      763       +2     
  Lines        26866    26930      +64     
===========================================
+ Hits         22673    22735      +62     
- Misses        4193     4195       +2     
Impacted Files Coverage Δ
src/vt/context/runnable_context/collection.h 100.00% <ø> (ø)
src/vt/pool/static_sized/memory_pool_equal.cc 100.00% <ø> (ø)
src/vt/runnable/runnable.h 77.77% <ø> (-2.23%) ⬇️
src/vt/vrt/collection/manager.impl.h 96.48% <ø> (ø)
src/vt/context/runnable_context/collection.cc 50.00% <50.00%> (ø)
src/vt/runnable/runnable.cc 71.42% <70.96%> (+0.69%) ⬆️
src/vt/utils/ptr/unique_fixed.h 76.92% <76.92%> (ø)
src/vt/context/context.cc 100.00% <100.00%> (ø)
src/vt/context/runnable_context/collection.impl.h 100.00% <100.00%> (+40.00%) ⬆️
src/vt/context/runnable_context/set_context.h 100.00% <100.00%> (ø)
... and 50 more

@thearusable thearusable force-pushed the 1899-improve-makerunnable-allocation-performance branch from cbd764e to 1942c6b Compare August 24, 2022 12:14
@thearusable thearusable force-pushed the 1899-improve-makerunnable-allocation-performance branch from 1942c6b to b138399 Compare August 25, 2022 09:52
@PhilMiller
Copy link
Member

Superseded by later smaller PRs

@PhilMiller PhilMiller closed this Aug 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve makeRunnable allocation performance
4 participants