
Conversation

@pcarruscag
Member

@pcarruscag pcarruscag commented May 30, 2020

Proposed Changes

This makes the routines InitiateComms and CompleteComms (and their periodic counterparts) safe to call in parallel; until now they had to be guarded inside SU2_OMP_MASTER sections.
The MPI calls are still made only by the master thread (funneled communication), but the buffers are packed and unpacked by all threads.
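For illustration, here is a minimal sketch of that funneled pattern, assuming a plain OpenMP + MPI setup (MPI initialized with MPI_THREAD_FUNNELED); the names field, sendBuf, count, dest and tag are placeholders for this example, not the SU2 routines:

#include <mpi.h>
#include <omp.h>
#include <vector>

/*--- Sketch only: call from inside an omp parallel region. All threads pack a
 *    share of the send buffer, then only the master thread posts the MPI call. ---*/
void PackAndSend(const std::vector<std::vector<double> >& field,
                 std::vector<double>& sendBuf, int count,
                 int dest, int tag, MPI_Request* req) {
  const int nPoint = static_cast<int>(field.size());

  /*--- Work-sharing loop: the implicit barrier at its end guarantees the
   *    buffer is fully packed before it is sent. ---*/
  #pragma omp for schedule(static)
  for (int iPoint = 0; iPoint < nPoint; ++iPoint)
    for (int iVar = 0; iVar < count; ++iVar)
      sendBuf[iPoint*count + iVar] = field[iPoint][iVar];

  /*--- Funneled communication: only the master thread talks to MPI
   *    (SU2 wraps this in SU2_OMP_MASTER). ---*/
  #pragma omp master
  MPI_Isend(sendBuf.data(), nPoint*count, MPI_DOUBLE, dest, tag,
            MPI_COMM_WORLD, req);

  /*--- The master construct has no implicit barrier, so add one before the
   *    buffer may be reused by other threads. ---*/
  #pragma omp barrier
}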
I also made a small change that seems to make communications more efficient. We were always communicating the entire buffer, which is sized for the maximum number of variables per point, because the data was packed like this:
o o o o _ _ _ _ o o o o _ ... (count = 4, maxCount = 8);
I changed it to:
o o o o o o o o ... _ _ _ _ ...
This allows only the used part of the buffer to be communicated.
(The maximum size nPrimVarGrad*nDim*2 is actually quite large compared to the median nVar)
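To make the layout change concrete, here is a minimal sketch (illustrative names only, not the actual SU2 packing routines) contrasting the old padded layout with the new contiguous one:

#include <cstddef>
#include <vector>

/*--- Old, padded layout: every point occupies a fixed stride of maxCount entries,
 *    so a message of field.size()*maxCount doubles had to be sent even though
 *    only 'count' entries per point were in use. ---*/
void PackPadded(const std::vector<std::vector<double> >& field,
                std::vector<double>& buf, int count, int maxCount) {
  for (std::size_t iPoint = 0; iPoint < field.size(); ++iPoint)
    for (int iVar = 0; iVar < count; ++iVar)
      buf[iPoint*maxCount + iVar] = field[iPoint][iVar];
}

/*--- New, contiguous layout: the stride equals the actual count, so only
 *    field.size()*count doubles need to be communicated. ---*/
void PackContiguous(const std::vector<std::vector<double> >& field,
                    std::vector<double>& buf, int count) {
  for (std::size_t iPoint = 0; iPoint < field.size(); ++iPoint)
    for (int iVar = 0; iVar < count; ++iVar)
      buf[iPoint*count + iVar] = field[iPoint][iVar];
}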

In the process I also had to make a few more CGeometry routines thread safe.

Related Work

#789
Resolves #1011

PR Checklist

  • I am submitting my contribution to the develop branch.
  • My contribution generates no new compiler warnings (try with the '-Wall -Wextra -Wno-unused-parameter -Wno-empty-body' compiler flags).
  • My contribution is commented and consistent with SU2 style.

Comment on lines 299 to 300
void PostP2PRecvs(CGeometry *geometry, CConfig *config, unsigned short commType,
                  unsigned short countPerPoint, bool val_reverse) const;
Member Author


A consequence of this is that the current count-per-point needs to be explicitly passed to a few routines, whereas before we could always rely on the maximum value.
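For example, a call site might now look like the line below (illustrative only; the COMM_TYPE_DOUBLE constant and the surrounding variables are assumptions for the sketch, not an exact SU2 call site):

/*--- The active count per point is passed explicitly instead of relying on the
 *    buffer-sizing maximum (illustrative call). ---*/
geometry->PostP2PRecvs(geometry, config, COMM_TYPE_DOUBLE, countPerPoint, false);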

@pcarruscag
Member Author

pcarruscag commented May 31, 2020

This is the sort of scalability (in terms of time to solution, not per iteration) we get now:
[scalability plot: time to solution vs. core count]
You can compare it directly with #861.

Edit: The results at 192c are actually better; it depends on the placement of the cluster nodes in the network. The updated comparison is apples to apples.

Member

@talbring talbring left a comment


Thanks @pcarruscag! Just two small comments below.

Comment on lines +50 to 54
CGeometry::CGeometry(void) :
  size(SU2_MPI::GetSize()),
  rank(SU2_MPI::GetRank()) {

}
Member Author


The requested spring cleaning is done; I think I'll stop here.

Member

@economon economon left a comment


LGTM

@pcarruscag pcarruscag merged commit 48ee558 into develop Jun 5, 2020
@pcarruscag pcarruscag deleted the hybrid_parallel_mpi branch June 5, 2020 14:41
pcarruscag added a commit that referenced this pull request Jun 18, 2020
