Skip to content

Conversation

@pcarruscag
Copy link
Member

Proposed Changes

Remove some more duplication, fix some issues, and make the incompressible solvers compatible with OpenMP.

Related Work

Fix #1175

PR Checklist

  • I am submitting my contribution to the develop branch.
  • My contribution generates no new compiler warnings (try with the '-Wall -Wextra -Wno-unused-parameter -Wno-empty-body' compiler flags, or simply --warnlevel=2 when using meson).
  • My contribution is commented and consistent with SU2 style.
  • I have added a test case that demonstrates my contribution, if necessary.
  • I have updated appropriate documentation (Tutorials, Docs Page, config_template.cpp) , if necessary.

StrainMag_Max = max(StrainMag_Max, strainMax);
Omega_Max = max(Omega_Max, omegaMax);
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose you do the reduction over the threads yourself because of efficiency, when the if statement is false, but wouldn't it be more readable to have a reduction clause in the OMP for loop?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for having a look at the code Edwin, the issues with these kind of reductions were:

  • The microsoft compilers do not support max/min reductions;
  • For the code to compile with AD (forward) we would need to define custom reduction operators, which might not be very readable;
  • Earlier discussions about OpenMP+AD (reverse) suggested that avoiding reduction clauses would make it easier to develop a thread-safe AD tool.

For performance the reduction clause would be more efficient, as the compiler can make min/max atomic, however I do not know of a nice simple way to implement atomic min/max.

SU2_OMP_BARRIER
SU2_OMP_MASTER
{
SU2_OMP_MASTER {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For efficiency reasons then the Allreduce could be done in a single call with two arguments, as for both Omega and the Strain you compute the maximum.

class CIncEulerSolver : public CFVMFlowSolverBase<CIncEulerVariable, INCOMPRESSIBLE> {
protected:
CFluidModel *FluidModel = nullptr; /*!< \brief fluid model used in the solver */
vector<CFluidModel*> FluidModel; /*!< \brief fluid model used in the solver. */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't you want to put the fluid model in the base class? CEulerSolver has exactly the same variable.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a bit different for the NEMO solver, it also has a "FluidModel" but of a different type.

Copy link
Contributor

@vdweide vdweide left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The few questions I had have been answered, so this one looks good to me.

@pcarruscag pcarruscag merged commit be378f2 into develop Jan 30, 2021
@pcarruscag pcarruscag deleted the hybrid_parallel_incomp branch January 30, 2021 14:12
Copy link
Contributor

@TobiKattmann TobiKattmann left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot for adding omp support to the inc solver 💐 and moving some reg test accordingly
And thanks for moving some functions to the FVMFlowSolverBase 👍
Much appreciated

*/
class CEulerSolver : public CFVMFlowSolverBase<CEulerVariable, COMPRESSIBLE> {
protected:
using BaseClass = CFVMFlowSolverBase<CEulerVariable, COMPRESSIBLE>;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need this to call an implementation of the base class later like BaseClass::SetInitialCondition in this very CEulerSolver. But SetInitialCondition is also public in the FVMBase class so it should be callable anyway because we inherit that via class CEulerSolver : public CFVMFlowSolverBase ... where am I wrong?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you call SetInitialCondition in SetInitialCondition you get an infinite loop, I just wanted to make more expressive I was calling the implementation of the parent first.

Comment on lines +264 to +268
inline virtual void Viscous_Residual(unsigned long iEdge, CGeometry *geometry, CSolver **solver_container,
CNumerics *numerics, CConfig *config) { }
void Viscous_Residual_impl(unsigned long iEdge, CGeometry *geometry, CSolver **solver_container,
CNumerics *numerics, CConfig *config);
using CSolver::Viscous_Residual; /*--- Silence warning ---*/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea behind this to be able to 'fake' override Viscous_Residual for viscous flows with just calling Viscous_Residual_impl because for inviscid flow the empty implementation from here is taken. This trick has to be used because FVMBase -> Euler -> NS is the inheritance and Viscous_Residual itself cannot be overloaded because Euler is the middleman?
And what happens to the call in CIntegration to Viscous_Residual for NS flow. Is there a doubled contribution then because it is called there and in the centered/upwind residual?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, in Cintegration the empty impl of CSolver is used and in the centered/upwind residual the correct Viscous_Residual impl with the Viscous_Residual_impl init ...
Seems that i have to go back and read a couple of pages in 'Inheritance for complete morons'. fml

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep that's all correct.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Except the self flagellation part, no need for that xD

Comment on lines +256 to +257
* \note The convective residual methods include a call to this for each edge,
* this allows convective and viscous loops to be "fused".
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a speed-up due to this? Or is this just for simplification?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fast code goes brrrrrrrrrrrrrrrrrrrr

Comment on lines +1044 to +1048
template <class V, ENUM_REGIME R>
void CFVMFlowSolverBase<V, R>::LoadRestart(CGeometry **geometry, CSolver ***solver,
CConfig *config, int iter, bool update_geo) {
LoadRestart_impl(geometry, solver, config, iter, update_geo);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why having the LoadRestart_impl is beneficial? The LoadRestart moved from the Euler and IncEuler to FVMFlowSolverBase so that is good but why the additional wrapper function

* \return Value of the pressure at the infinity.
*/
inline CFluidModel* GetFluidModel(void) const final { return FluidModel;}
inline CFluidModel* GetFluidModel(void) const final { return FluidModel[omp_get_thread_num()]; }
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait, I guess (real) shared-memory is not viable for the FluidModel as it is filled and computes stuff on the fly so each thread gets its own?

}
}
}
BaseClass::SetInitialCondition(geometry, solver_container, config, TimeIter);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the BaseClass call I mean above

CConfig *config, unsigned short iMesh, unsigned short iRKStep) {

CNumerics* numerics = numerics_container[CONV_TERM];
CNumerics* numerics = numerics_container[CONV_TERM + omp_get_thread_num()*MAX_TERMS];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you need multiple numerics instances because in the OMP loop they all individually fill the container and evaluate their residual, right?

Comment on lines 983 to 985
void CIncEulerSolver::Centered_Residual(CGeometry *geometry, CSolver **solver_container, CNumerics **numerics_container,
CConfig *config, unsigned short iMesh, unsigned short iRKStep) {

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it would be prob nice to reflect in the name that it contains the viscous residual.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah... but then I would need to change all the solvers, the vectorized residual loop is called EdgeFluxResidual to be more inclusive...

Comment on lines -3092 to 2833

/*--- Read the restart data from either an ASCII or binary SU2 file. ---*/

if (config->GetRead_Binary_Restart()) {
Read_SU2_Restart_Binary(geometry[MESH_0], config, restart_filename);
} else {
Read_SU2_Restart_ASCII(geometry[MESH_0], config, restart_filename);
}

/*--- Load data from the restart into correct containers. ---*/

counter = 0;
for (iPoint_Global = 0; iPoint_Global < geometry[MESH_0]->GetGlobal_nPointDomain(); iPoint_Global++ ) {

/*--- Retrieve local index. If this node from the restart file lives
on the current processor, we will load and instantiate the vars. ---*/

iPoint_Local = geometry[MESH_0]->GetGlobal_to_Local_Point(iPoint_Global);

if (iPoint_Local > -1) {

/*--- We need to store this point's data, so jump to the correct
offset in the buffer of data from the restart file and load it. ---*/

index = counter*Restart_Vars[1] + skipVars;
for (iVar = 0; iVar < nVar_Restart; iVar++) Solution[iVar] = Restart_Data[index+iVar];
nodes->SetSolution(iPoint_Local,Solution);
iPoint_Global_Local++;

/*--- For dynamic meshes, read in and store the
grid coordinates and grid velocities for each node. ---*/

if (dynamic_grid && val_update_geo) {

/*--- Read in the next 2 or 3 variables which are the grid velocities ---*/
/*--- If we are restarting the solution from a previously computed static calculation (no grid movement) ---*/
/*--- the grid velocities are set to 0. This is useful for FSI computations ---*/

/*--- Rewind the index to retrieve the Coords. ---*/
index = counter*Restart_Vars[1];
for (iDim = 0; iDim < nDim; iDim++) { Coord[iDim] = Restart_Data[index+iDim]; }

su2double GridVel[MAXNDIM] = {0.0};
if (!steady_restart) {
/*--- Move the index forward to get the grid velocities. ---*/
index = counter*Restart_Vars[1] + skipVars + nVar_Restart + turbVars;
for (iDim = 0; iDim < nDim; iDim++) { GridVel[iDim] = Restart_Data[index+iDim]; }
}

for (iDim = 0; iDim < nDim; iDim++) {
geometry[MESH_0]->nodes->SetCoord(iPoint_Local, iDim, Coord[iDim]);
geometry[MESH_0]->nodes->SetGridVel(iPoint_Local, iDim, GridVel[iDim]);
}
}

/*--- For static FSI problems, grid_movement is 0 but we need to read in and store the
grid coordinates for each node (but not the grid velocities, as there are none). ---*/

if (static_fsi && val_update_geo) {
/*--- Rewind the index to retrieve the Coords. ---*/
index = counter*Restart_Vars[1];
for (iDim = 0; iDim < nDim; iDim++) { Coord[iDim] = Restart_Data[index+iDim];}

for (iDim = 0; iDim < nDim; iDim++) {
geometry[MESH_0]->nodes->SetCoord(iPoint_Local, iDim, Coord[iDim]);
}
}

/*--- Increment the overall counter for how many points have been loaded. ---*/
counter++;

}
}

/*--- Detect a wrong solution file ---*/

if (iPoint_Global_Local < nPointDomain) {
SU2_MPI::Error(string("The solution file ") + restart_filename + string(" doesn't match with the mesh file!\n") +
string("It could be empty lines at the end of the file."), CURRENT_FUNCTION);
}

/*--- Update the geometry for flows on deforming meshes ---*/

if ((dynamic_grid || static_fsi) && val_update_geo) {

/*--- Communicate the new coordinates and grid velocities at the halos ---*/

geometry[MESH_0]->InitiateComms(geometry[MESH_0], config, COORDINATES);
geometry[MESH_0]->CompleteComms(geometry[MESH_0], config, COORDINATES);

if (dynamic_grid) {
geometry[MESH_0]->InitiateComms(geometry[MESH_0], config, GRID_VELOCITY);
geometry[MESH_0]->CompleteComms(geometry[MESH_0], config, GRID_VELOCITY);
}

/*--- Recompute the edges and dual mesh control volumes in the
domain and on the boundaries. ---*/

geometry[MESH_0]->SetControlVolume(config, UPDATE);
geometry[MESH_0]->SetBoundControlVolume(config, UPDATE);
geometry[MESH_0]->SetMaxLength(config);

/*--- Update the multigrid structure after setting up the finest grid,
including computing the grid velocities on the coarser levels. ---*/

for (iMesh = 1; iMesh <= config->GetnMGLevels(); iMesh++) {
iMeshFine = iMesh-1;
geometry[iMesh]->SetControlVolume(config, geometry[iMeshFine], UPDATE);
geometry[iMesh]->SetBoundControlVolume(config, geometry[iMeshFine],UPDATE);
geometry[iMesh]->SetCoord(geometry[iMeshFine]);
if (dynamic_grid) {
geometry[iMesh]->SetRestricted_GridVelocity(geometry[iMeshFine], config);
}
geometry[iMesh]->SetMaxLength(config);
}
}

/*--- Communicate the loaded solution on the fine grid before we transfer
it down to the coarse levels. We alo call the preprocessing routine
on the fine level in order to have all necessary quantities updated,
especially if this is a turbulent simulation (eddy viscosity). ---*/

solver[MESH_0][FLOW_SOL]->InitiateComms(geometry[MESH_0], config, SOLUTION);
solver[MESH_0][FLOW_SOL]->CompleteComms(geometry[MESH_0], config, SOLUTION);

/*--- For turbulent simulations the flow preprocessing is done by the turbulence solver
* after it loads its variables (they are needed to compute flow primitives). ---*/
if (!turbulent) {
solver[MESH_0][FLOW_SOL]->Preprocessing(geometry[MESH_0], solver[MESH_0], config, MESH_0, NO_RK_ITER, RUNTIME_FLOW_SYS, false);
}

/*--- Interpolate the solution down to the coarse multigrid levels ---*/

for (iMesh = 1; iMesh <= config->GetnMGLevels(); iMesh++) {
for (iPoint = 0; iPoint < geometry[iMesh]->GetnPoint(); iPoint++) {
Area_Parent = geometry[iMesh]->nodes->GetVolume(iPoint);
for (iVar = 0; iVar < nVar; iVar++) Solution[iVar] = 0.0;
for (iChildren = 0; iChildren < geometry[iMesh]->nodes->GetnChildren_CV(iPoint); iChildren++) {
Point_Fine = geometry[iMesh]->nodes->GetChildren_CV(iPoint, iChildren);
Area_Children = geometry[iMesh-1]->nodes->GetVolume(Point_Fine);
Solution_Fine = solver[iMesh-1][FLOW_SOL]->GetNodes()->GetSolution(Point_Fine);
for (iVar = 0; iVar < nVar; iVar++) {
Solution[iVar] += Solution_Fine[iVar]*Area_Children/Area_Parent;
}
}
solver[iMesh][FLOW_SOL]->GetNodes()->SetSolution(iPoint,Solution);
}
solver[iMesh][FLOW_SOL]->InitiateComms(geometry[iMesh], config, SOLUTION);
solver[iMesh][FLOW_SOL]->CompleteComms(geometry[iMesh], config, SOLUTION);

if (!turbulent) {
solver[iMesh][FLOW_SOL]->Preprocessing(geometry[iMesh], solver[iMesh], config, iMesh, NO_RK_ITER, RUNTIME_FLOW_SYS, false);
}
}

/*--- Update the old geometry (coordinates n and n-1) in dual time-stepping strategy ---*/
if (dual_time && config->GetGrid_Movement() && !config->GetDeform_Mesh() &&
(config->GetKind_GridMovement() != RIGID_MOTION)) {
Restart_OldGeometry(geometry[MESH_0], config);
}

/*--- Delete the class memory that is used to load the restart. ---*/

delete [] Restart_Vars; Restart_Vars = nullptr;
delete [] Restart_Data; Restart_Data = nullptr;
LoadRestart_impl(geometry, solver, config, val_iter, val_update_geo, Solution, nVar_Restart);

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TobiKattmann to satisfy the quirkiness of the incompressible solver that needs a default value for temperature when there is no energy equation

@jblueh jblueh mentioned this pull request Mar 1, 2021
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Incorrect aero coeffs for compressible moving grid simulations

4 participants