Merged
129 commits
17b90ab
all numerics allocated one-per-thread
pcarruscag Jan 22, 2020
55739da
Merge branch 'feature_hybrid_parallel_and_SIMD_tmp' into feature_hybr…
pcarruscag Jan 23, 2020
a772260
const correct MMS classes for thread safety
pcarruscag Jan 23, 2020
04f7b37
array of FluidModels for parallel execution (not trivial to make it t…
pcarruscag Jan 23, 2020
83521b7
begin SetTimeStep
pcarruscag Jan 24, 2020
c316c8d
intermediate changes
pcarruscag Jan 24, 2020
f58bd8f
Merge branch 'restructure_solvers' into tmp
pcarruscag Jan 24, 2020
92200e8
finish time_step routine, allow array of numerics to be passed to sol…
pcarruscag Jan 24, 2020
87cb8bb
split and move CNumerics files
pcarruscag Jan 26, 2020
6ec4a40
spaces and tabs
pcarruscag Jan 26, 2020
0cdac84
update meson
pcarruscag Jan 26, 2020
64010eb
removed unused functions / variables
pcarruscag Jan 26, 2020
1d6939e
more cleaning
pcarruscag Jan 26, 2020
4be6a0b
reduce number of includes
pcarruscag Jan 26, 2020
b56b39c
fix some indentation and comments in CConfig
pcarruscag Jan 27, 2020
37b36e3
faster (much) compilation by disabling map definition when not needed
pcarruscag Jan 27, 2020
b6aa9ae
fix un-init warning
pcarruscag Jan 27, 2020
9172ca0
fix SU2_DOT compilation
pcarruscag Jan 27, 2020
038769a
fix some indentation in option_structure
pcarruscag Jan 27, 2020
088fdf2
reduce number of numerics files, difficult to understand inheritance …
pcarruscag Jan 27, 2020
118bb22
update build systems
pcarruscag Jan 27, 2020
fa5b9c5
fix build again
pcarruscag Jan 27, 2020
4f2f9d9
Merge remote-tracking branch 'upstream/develop' into feature_hybrid_p…
pcarruscag Jan 27, 2020
048f9d2
work around a segfault during parallel allocation of numerics
pcarruscag Jan 27, 2020
4a0716e
compressible centered schemes working in parallel
pcarruscag Jan 28, 2020
aa4ce5f
thread-safe compressible upwind residual loop
pcarruscag Jan 28, 2020
0e3a8dc
delete some unused variables
pcarruscag Jan 28, 2020
5534707
thread-safe AUSM family
pcarruscag Jan 28, 2020
c071319
thread safe CUSP
pcarruscag Jan 28, 2020
91303e9
thread safe MSW
pcarruscag Jan 28, 2020
b159d1d
thread safe HLLC
pcarruscag Jan 28, 2020
2dca913
thread safe Roe family
pcarruscag Jan 28, 2020
ce5447c
fix variable lifetime issues
pcarruscag Jan 28, 2020
9a30827
change the signature of Source_Residual to pass another index of nume…
pcarruscag Jan 28, 2020
8444c6b
fix calls to CNumerics::ComputeResidual in BC contexts
pcarruscag Jan 28, 2020
d3d338c
make the improved compilation mechanism more readable
pcarruscag Jan 28, 2020
03c5109
Merge remote-tracking branch 'upstream/develop' into feature_hybrid_p…
pcarruscag Jan 29, 2020
efe60d9
fix blanks and tabs in config_structure.hpp
pcarruscag Jan 29, 2020
7e307ea
fix a warning
pcarruscag Jan 29, 2020
d8aba8f
remove the config .inl
pcarruscag Jan 29, 2020
dfd8902
Merge branch 'feature_faster_compilation' into feature_hybrid_paralle…
pcarruscag Jan 29, 2020
b6387e1
fix merge
pcarruscag Jan 29, 2020
41d50e2
fix warning the right way
pcarruscag Jan 29, 2020
b46bb21
Merge remote-tracking branch 'upstream/feature_faster_compilation' in…
pcarruscag Jan 29, 2020
3ed72ef
better ComputeResidual overload for the thread-safe classes + const C…
pcarruscag Jan 29, 2020
ae9486b
Merge branch 'develop' into feature_hybrid_parallel_and_SIMD
pcarruscag Jan 29, 2020
317753a
parallel CEulerSolver::SourceResidual
pcarruscag Jan 29, 2020
4ff3d71
fluid source classes made thread-safe
pcarruscag Jan 29, 2020
740f3b9
remove a forgotten debug statement
pcarruscag Jan 29, 2020
be2824a
carried away with const, caused overload resolution problems
pcarruscag Jan 29, 2020
3072afb
fix source residual of CIncEulerSolver
pcarruscag Jan 30, 2020
d97ddda
fully exploit the fast lookup mechanisms in CSysMatrix
pcarruscag Jan 30, 2020
c632de1
parallel SetMax_Eigenvalue
pcarruscag Jan 30, 2020
d4fd63d
parallel SetUndivided_Laplacian
pcarruscag Jan 30, 2020
d18c17f
parallel SetCentered_Dissipation_Sensor
pcarruscag Jan 30, 2020
f9e9ba3
parallel SetUpwind_Ducros_Sensor
pcarruscag Jan 30, 2020
08f9972
cleanup some mpi code
pcarruscag Jan 30, 2020
e8eec6d
clean more postprocessing code
pcarruscag Jan 30, 2020
3471e4d
parallel explicit iterations
pcarruscag Jan 30, 2020
684f39b
wrong order of cases in switch
pcarruscag Jan 30, 2020
a5427c0
parallel implicit Euler iteration
pcarruscag Jan 30, 2020
2f82358
more cleanup of turbo stuff
pcarruscag Jan 30, 2020
31bc0bf
more postprocessing cleanup in NS solver
pcarruscag Jan 30, 2020
e5096f1
parallel SetVorticity_StrainMag
pcarruscag Jan 30, 2020
81de82f
primitive variables loop
pcarruscag Jan 30, 2020
49f928f
SetTime_Step of the NS solver
pcarruscag Jan 30, 2020
bbc7daf
reduce duplication of SetTime_Step
pcarruscag Jan 30, 2020
713c84f
pass one more dim of numerics in viscous residual
pcarruscag Jan 30, 2020
df42c31
parallel NSSolver viscous residual loop
pcarruscag Jan 30, 2020
064ff3d
small cleanup
pcarruscag Jan 30, 2020
29138ee
update boundary conditions to use the right overload of ComputeResidual
pcarruscag Jan 30, 2020
b3619c3
fix segfault
pcarruscag Jan 30, 2020
bbc47be
fix segfault on laminar cases
pcarruscag Jan 31, 2020
4ac3ccc
Merge remote-tracking branch 'upstream/develop' into feature_hybrid_p…
pcarruscag Jan 31, 2020
42a3982
messed up the merge
pcarruscag Jan 31, 2020
a121394
cleanup computation of aero coefficients
pcarruscag Jan 31, 2020
3a84305
complete the aero coefficients auxiliary type
pcarruscag Jan 31, 2020
46fddd8
merge still not ok, files left behind
pcarruscag Jan 31, 2020
7d0bb9b
make OpenMP parallel sections easier to find by using always a versi…
pcarruscag Jan 31, 2020
594bfda
cleaner handling of AD-compatible dot-product
pcarruscag Jan 31, 2020
3910f3d
update version of new files to 7.0.1
pcarruscag Feb 1, 2020
3cb4ad7
simplify some OpenMP reductions
pcarruscag Feb 1, 2020
253580a
parallel upwind and viscous loops of CTurbSolver
pcarruscag Feb 1, 2020
f53aec3
thread-safe convection and diffusion turbulence numerics
pcarruscag Feb 1, 2020
1061736
parallel implicit euler of CTurbSolver
pcarruscag Feb 1, 2020
7c6a91f
parallel pre and postprocessing SA and SST
pcarruscag Feb 1, 2020
9ea05ef
parallel source loops SA and SST
pcarruscag Feb 1, 2020
080f0e0
thread safe turbulence sources
pcarruscag Feb 1, 2020
4d048a0
color balancing on coarse grids, config options for edge coloring gro…
pcarruscag Feb 3, 2020
d3c016f
more ctor cleanup
pcarruscag Feb 3, 2020
ded9c08
fix #857
pcarruscag Feb 3, 2020
f14ba27
natural coloring when running single threaded
pcarruscag Feb 4, 2020
810f8b4
transform the edge loop in SetTimeStep to point loop
pcarruscag Feb 4, 2020
93eb726
make SetMax_Eigenvalue a point loop
pcarruscag Feb 4, 2020
3fd3d24
reducer strategy on coarse grids (coloring fails too often)
pcarruscag Feb 4, 2020
caa4d7b
fix AD compilation
pcarruscag Feb 4, 2020
e9ad0d5
more ctor cleanup
pcarruscag Feb 4, 2020
10aa2ce
document and cleanup reducer strategy
pcarruscag Feb 4, 2020
7ef421a
move incompressible convective numerics to new return type
pcarruscag Feb 6, 2020
2620ac6
parallel CTurbSolver::SetResidual_DualTime
pcarruscag Feb 6, 2020
32451e7
Merge remote-tracking branch 'upstream/develop' into hybrid_parallel_…
pcarruscag Feb 7, 2020
d3dbfd0
fix automake
pcarruscag Feb 7, 2020
9ba1988
cleanup of CSysVector access
pcarruscag Feb 7, 2020
b56c46c
update sliding interface testcases
pcarruscag Feb 7, 2020
c6907c2
update cases with minute changes
pcarruscag Feb 7, 2020
f013845
update UQ testcases after comparing solution
pcarruscag Feb 7, 2020
a3df199
add worksharing to some CVariable methods
pcarruscag Feb 7, 2020
1993f85
move the parallel region out of CSysSolve, clients call "Solve" alrea…
pcarruscag Feb 7, 2020
704bf09
missing sliding_interface updates
pcarruscag Feb 7, 2020
466338b
Merge remote-tracking branch 'upstream/develop' into hybrid_parallel_…
pcarruscag Feb 7, 2020
9d95257
add worksharing construct to MG routines
pcarruscag Feb 8, 2020
c06342a
split CIntegration files
pcarruscag Feb 8, 2020
3c01897
cleanup integration classes, unused/not implemented, unnecessary virt…
pcarruscag Feb 8, 2020
0f97d62
add worksharing to SingleGridIntegration
pcarruscag Feb 8, 2020
65787cc
more worksharing in CIntegration
pcarruscag Feb 8, 2020
22bbb44
prepare limiters and gradients to be called in parallel
pcarruscag Feb 10, 2020
252b2d2
fix build
pcarruscag Feb 10, 2020
2687904
single parallel section for all preprocessing, reduce number of CSolv…
pcarruscag Feb 11, 2020
fcc39c0
single parallel regions for pre/post processing turbulence solvers
pcarruscag Feb 11, 2020
c3b06ce
single parallel region for entire multigrid and singlegrid iterations
pcarruscag Feb 11, 2020
f590376
leave TODO in CSolver::AdaptCFLNumber
pcarruscag Feb 11, 2020
f188072
do the MG and SG iterations in parallel only if the solver supports it
pcarruscag Feb 12, 2020
3a123cd
cleanup unreachable calls to FEA time integration methods
pcarruscag Feb 12, 2020
e1eb786
update some sliding cases (again)
pcarruscag Feb 12, 2020
6ea86f4
allow compilation with AD+OpenMP (experimental)
pcarruscag Feb 12, 2020
9bb71bb
cleanup color loops no more need for #ifdef HAVE_OMP
pcarruscag Feb 13, 2020
f48e3e4
fix some typos
pcarruscag Feb 16, 2020
3fdfd0c
fix #846
pcarruscag Feb 16, 2020
3e98ddd
Merge remote-tracking branch 'upstream/develop' into feature_hybrid_p…
pcarruscag Feb 16, 2020
15 changes: 11 additions & 4 deletions Common/include/CConfig.hpp
@@ -1125,6 +1125,8 @@ class CConfig {

string caseName; /*!< \brief Name of the current case */

unsigned long edgeColorGroupSize; /*!< \brief Size of the edge groups colored for OpenMP parallelization of edge loops. */

/*!
* \brief Set the default values of config options not set in the config file using another config object.
* \param config - Config object to use the default values from.
@@ -4247,7 +4249,7 @@ class CConfig {
* \brief Get whether to "Use Accurate Jacobians" for AUSM+up(2) and SLAU(2).
* \return yes/no.
*/
bool GetUse_Accurate_Jacobians(void) { return Use_Accurate_Jacobians; }
bool GetUse_Accurate_Jacobians(void) const { return Use_Accurate_Jacobians; }

/*!
* \brief Get the kind of integration scheme (explicit or implicit)
@@ -4450,7 +4452,7 @@ class CConfig {
* \brief Factor by which to multiply the dissipation contribution to Jacobians of central schemes.
* \return The factor.
*/
su2double GetCent_Jac_Fix_Factor(void) { return Cent_Jac_Fix_Factor; }
su2double GetCent_Jac_Fix_Factor(void) const { return Cent_Jac_Fix_Factor; }

/*!
* \brief Get the kind of integration scheme (explicit or implicit)
@@ -5829,7 +5831,7 @@ class CConfig {
* \brief Get a pointer to the body force vector.
* \return A pointer to the body force vector.
*/
su2double* GetBody_Force_Vector(void) { return Body_Force_Vector; }
const su2double* GetBody_Force_Vector(void) const { return Body_Force_Vector; }

/*!
* \brief Get information about the rotational frame.
@@ -8487,7 +8489,7 @@ class CConfig {
* \return A pointer to the electric field direction vector.
*/
su2double* Get_Electric_Field_Dir(void) { return Electric_Field_Dir; }
const su2double* Get_Electric_Field_Dir(void) const { return Electric_Field_Dir; }

/*!
* \brief Check if the user wants to apply the load as a ramp.
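The getters touched in these hunks become `const` member functions, and the two that return arrays now return pointers to `const`. A minimal sketch of why that matters when one configuration object is read by several threads, using a hypothetical `MiniConfig` class and plain `double` in place of `su2double` (neither name comes from the PR):

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a configuration class; double replaces su2double.
class MiniConfig {
  double bodyForce_[3] = {0.0, 0.0, -9.81};
  bool accurateJacobians_ = true;

public:
  // A const member function promises not to modify the object, so it can be
  // called through a const reference that is shared between threads.
  bool GetUseAccurateJacobians() const { return accurateJacobians_; }

  // Returning a pointer-to-const stops callers from writing through it.
  const double* GetBodyForceVector() const { return bodyForce_; }
};

// Read-only consumer: it takes the config by const reference, which only
// compiles because the getters above are const-qualified.
inline double bodyForceMagnitudeSquared(const MiniConfig& config) {
  const double* f = config.GetBodyForceVector();
  assert(f != nullptr);
  double sum = 0.0;
  for (std::size_t i = 0; i < 3; ++i) sum += f[i] * f[i];
  return sum;
}
```

With the non-const signatures, a numerics class holding a `const MiniConfig&` could not call these getters at all, and a caller could write into the returned array behind the config's back.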
@@ -9201,4 +9203,9 @@ class CConfig {
*/
unsigned long GetLinear_Solver_Prec_Threads(void) const { return Linear_Solver_Prec_Threads; }

/*!
* \brief Get the size of the edge groups colored for OpenMP parallelization of edge loops.
*/
unsigned long GetEdgeColoringGroupSize(void) const { return edgeColorGroupSize; }

};
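The new `edgeColorGroupSize` option sets how many edges are colored together for the OpenMP edge loops. The coloring code itself is not part of this excerpt, so the following is only a generic greedy sketch of the idea, with hypothetical names: consecutive edges are bundled into groups, and two groups may share a color only if they touch no common point, so all groups of one color can be processed concurrently without two threads writing to the same point.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Edge = std::pair<std::size_t, std::size_t>;  // the two points of an edge

// Greedy coloring of groups of `groupSize` consecutive edges (assumed scheme,
// not the PR's algorithm). Returns one color index per group.
inline std::vector<int> colorEdgeGroups(const std::vector<Edge>& edges,
                                        std::size_t nPoint,
                                        std::size_t groupSize) {
  assert(groupSize > 0);
  const std::size_t nGroup = (edges.size() + groupSize - 1) / groupSize;
  std::vector<int> groupColor(nGroup, -1);
  // touched[c][p] is true once a group with color c uses point p.
  std::vector<std::vector<bool>> touched;

  for (std::size_t g = 0; g < nGroup; ++g) {
    const std::size_t begin = g * groupSize;
    const std::size_t end = std::min(begin + groupSize, edges.size());

    // Find the first color whose groups do not touch any point of this group.
    int color = 0;
    for (;; ++color) {
      if (color == static_cast<int>(touched.size()))
        touched.emplace_back(nPoint, false);  // open a new, empty color
      bool isFree = true;
      for (std::size_t e = begin; e < end && isFree; ++e)
        isFree = !touched[color][edges[e].first] &&
                 !touched[color][edges[e].second];
      if (isFree) break;
    }

    // Reserve the points of this group for the chosen color.
    for (std::size_t e = begin; e < end; ++e) {
      touched[color][edges[e].first] = true;
      touched[color][edges[e].second] = true;
    }
    groupColor[g] = color;
  }
  return groupColor;
}
```

In this sketch a larger group size gives each thread a longer contiguous run of edges, but makes conflicts between groups more likely and so tends to need more colors. Edges inside one group may share points because a single thread handles the whole group.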
6 changes: 3 additions & 3 deletions Common/include/interpolation_structure.hpp
@@ -1,7 +1,7 @@
/*!
* \file interpolation_structure.hpp
* \brief Headers of the main subroutines used by SU2_FSI.
* The subroutines and functions are in the <i>interpolation_structure.cpp</i> file.
* \brief Headers of classes used for multiphysics interpolation.
* The implementation is in the <i>interpolation_structure.cpp</i> file.
* \author H. Kline
* \version 7.0.1 "Blackbird"
*
@@ -244,7 +244,7 @@ class CIsoparametric : public CInterpolator {
* \param[in] nDim - the dimension of the coordinates.
* \param[in] iZone_1 - zone index of the element to use for interpolation (the DONOR zone)
* \param[in] donor_elem - element index of the element to use for interpolation (or global index of a point in 2D)
* \param[in[ nDonorPoints - number of donor points in the element.
* \param[in] nDonorPoints - number of donor points in the element.
* \param[in] xj - point projected onto the plane of the donor element.
* \param[out] isoparams - isoparametric coefficients. Must be allocated to size nNodes ahead of time. (size> nDonors)
*
92 changes: 88 additions & 4 deletions Common/include/linear_algebra/CSysMatrix.hpp
@@ -95,6 +95,7 @@ class CSysMatrix {
const unsigned long *row_ptr; /*!< \brief Pointers to the first element in each row. */
const unsigned long *dia_ptr; /*!< \brief Pointers to the diagonal element in each row. */
const unsigned long *col_ind; /*!< \brief Column index for each of the elements in val(). */
vector<const ScalarType*> col_ptr;/*!< \brief The transpose of col_ind, pointer to blocks with the same column index. */

ScalarType *ILU_matrix; /*!< \brief Entries of the ILU sparse matrix. */
unsigned long nnz_ilu; /*!< \brief Number of possible nonzero entries in the matrix (ILU). */
@@ -440,7 +441,7 @@ class CSysMatrix {
* \param[in] val_block - Block to set to A(i, j).
*/
template<class OtherType>
inline void SetBlock(unsigned long block_i, unsigned long block_j, OtherType **val_block) {
inline void SetBlock(unsigned long block_i, unsigned long block_j, const OtherType* const* val_block) {

unsigned long iVar, jVar, index;

@@ -503,7 +504,7 @@ class CSysMatrix {
* \param[in] val_block - Block to add to A(i, j).
*/
template<class OtherType>
inline void AddBlock(unsigned long block_i, unsigned long block_j, OtherType **val_block) {
inline void AddBlock(unsigned long block_i, unsigned long block_j, const OtherType* const* val_block) {

unsigned long iVar, jVar, index;

@@ -524,7 +525,7 @@ class CSysMatrix {
* \param[in] val_block - Block to subtract from A(i, j).
*/
template<class OtherType>
inline void SubtractBlock(unsigned long block_i, unsigned long block_j, OtherType **val_block) {
inline void SubtractBlock(unsigned long block_i, unsigned long block_j, const OtherType* const* val_block) {

unsigned long iVar, jVar, index;

@@ -550,7 +551,7 @@ class CSysMatrix {
*/
template<class OtherType, int Sign = 1>
inline void UpdateBlocks(unsigned long iEdge, unsigned long iPoint, unsigned long jPoint,
OtherType **block_i, OtherType **block_j) {
const OtherType* const* block_i, const OtherType* const* block_j) {

ScalarType *bii = &matrix[dia_ptr[iPoint]*nVar*nEqn];
ScalarType *bjj = &matrix[dia_ptr[jPoint]*nVar*nEqn];
@@ -570,6 +571,84 @@ class CSysMatrix {
}
}

/*!
* \brief Short-hand for the "subtractive" version (sub from i* add to j*) of UpdateBlocks.
*/
template<class OtherType>
inline void UpdateBlocksSub(unsigned long iEdge, unsigned long iPoint, unsigned long jPoint,
const OtherType* const* block_i, const OtherType* const* block_j) {
UpdateBlocks<OtherType,-1>(iEdge, iPoint, jPoint, block_i, block_j);
}

/*!
* \brief Update 2 blocks ij and ji (add to i* sub from j*).
* \note The template parameter Sign can be used to create a "subtractive"
* update i.e. subtract from row i and add to row j instead.
* \param[in] iEdge - Index of edge that connects iPoint and jPoint.
* \param[in] block_i - Subs from ji.
* \param[in] block_j - Adds to ij.
*/
template<class OtherType, int Sign = 1>
inline void UpdateBlocks(unsigned long iEdge, const OtherType* const* block_i, const OtherType* const* block_j) {

ScalarType *bij = &matrix[edge_ptr(iEdge,0)*nVar*nEqn];
ScalarType *bji = &matrix[edge_ptr(iEdge,1)*nVar*nEqn];

unsigned long iVar, jVar, offset = 0;

for (iVar = 0; iVar < nVar; iVar++) {
for (jVar = 0; jVar < nEqn; jVar++) {
bij[offset] += PassiveAssign<ScalarType,OtherType>(block_j[iVar][jVar]) * Sign;
bji[offset] -= PassiveAssign<ScalarType,OtherType>(block_i[iVar][jVar]) * Sign;
++offset;
}
}
}

/*!
* \brief Short-hand for the "subtractive" version (sub from i* add to j*) of UpdateBlocks.
*/
template<class OtherType>
inline void UpdateBlocksSub(unsigned long iEdge, const OtherType* const* block_i, const OtherType* const* block_j) {
UpdateBlocks<OtherType,-1>(iEdge, block_i, block_j);
}
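The edge-indexed `UpdateBlocks` overload above reaches the two off-diagonal blocks through `edge_ptr(iEdge,0)` and `edge_ptr(iEdge,1)` rather than searching a row for the right column. A minimal sketch of that lookup pattern, assuming a hypothetical `MiniBlockMatrix` with square blocks, `double` entries, and no `PassiveAssign` conversion:

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature of a block-sparse matrix with nVar x nVar blocks.
struct MiniBlockMatrix {
  std::size_t nVar;
  std::vector<double> values;  // all blocks, stored contiguously
  // For edge e: position of block (i,j) and of block (j,i), found once up front.
  std::vector<std::array<std::size_t, 2>> edgePtr;

  // Add block_j to block ij and subtract block_i from block ji.
  // Sign = -1 flips both operations (the "subtractive" update).
  template <int Sign = 1>
  void updateBlocks(std::size_t iEdge, const double* const* block_i,
                    const double* const* block_j) {
    assert(iEdge < edgePtr.size());
    double* bij = &values[edgePtr[iEdge][0] * nVar * nVar];
    double* bji = &values[edgePtr[iEdge][1] * nVar * nVar];

    std::size_t offset = 0;
    for (std::size_t iVar = 0; iVar < nVar; ++iVar) {
      for (std::size_t jVar = 0; jVar < nVar; ++jVar) {
        bij[offset] += block_j[iVar][jVar] * Sign;
        bji[offset] -= block_i[iVar][jVar] * Sign;
        ++offset;
      }
    }
  }
};
```

Because each edge owns its two off-diagonal blocks, this kind of update needs no synchronization between threads that work on different edges. The diagonal blocks are shared between edges, which is presumably why the point-indexed overload that also updates them is kept separate.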

/*!
* \brief Adds the specified block to the (i, i) subblock of the matrix-by-blocks structure.
* \param[in] block_i - Diagonal index.
* \param[in] val_block - Block to add to the diagonal of the matrix.
*/
template<class OtherType>
inline void AddBlock2Diag(unsigned long block_i, const OtherType* const* val_block) {

ScalarType *bii = &matrix[dia_ptr[block_i]*nVar*nEqn];

unsigned long iVar, jVar, offset = 0;

for (iVar = 0; iVar < nVar; iVar++)
for (jVar = 0; jVar < nEqn; jVar++)
bii[offset++] += PassiveAssign<ScalarType,OtherType>(val_block[iVar][jVar]);

}

/*!
* \brief Subtracts the specified block from the (i, i) subblock of the matrix-by-blocks structure.
* \param[in] block_i - Diagonal index.
* \param[in] val_block - Block to subtract from the diagonal of the matrix.
*/
template<class OtherType>
inline void SubtractBlock2Diag(unsigned long block_i, const OtherType* const* val_block) {

ScalarType *bii = &matrix[dia_ptr[block_i]*nVar*nEqn];

unsigned long iVar, jVar, offset = 0;

for (iVar = 0; iVar < nVar; iVar++)
for (jVar = 0; jVar < nEqn; jVar++)
bii[offset++] -= PassiveAssign<ScalarType,OtherType>(val_block[iVar][jVar]);

}

/*!
* \brief Adds the specified value to the diagonal of the (i, i) subblock
* of the matrix-by-blocks structure.
@@ -616,6 +695,11 @@ class CSysMatrix {
template<class OtherType>
void EnforceSolutionAtNode(const unsigned long node_i, const OtherType *x_i, CSysVector<OtherType> & b);

/*!
* \brief Sets the diagonal entries of the matrix as the sum of the blocks in the corresponding column.
*/
void SetDiagonalAsColumnSum();

/*!
* \brief Add a scaled sparse matrix to "this" (axpy-type operation, A = A+alpha*B).
* \note Matrices must have the same sparse pattern.
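Only the declaration of `SetDiagonalAsColumnSum` appears in this header, so its exact semantics are not visible here. As an illustration of what "diagonal as column sum" can mean, the sketch below works on a scalar CSR matrix and sets each diagonal entry to the sum of the off-diagonal entries stored in the same column. Excluding the old diagonal value, the sign of the sum, and the `MiniCsr` type are all assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical scalar CSR matrix (the real class stores dense blocks).
struct MiniCsr {
  std::vector<std::size_t> rowPtr;  // size nRow + 1
  std::vector<std::size_t> colInd;  // column of each stored entry
  std::vector<double> val;          // value of each stored entry
};

// Overwrite every diagonal entry with the sum of the off-diagonal entries
// of its column (assumed semantics, see the lead-in).
inline void setDiagonalAsColumnSum(MiniCsr& A) {
  assert(!A.rowPtr.empty());
  const std::size_t n = A.rowPtr.size() - 1;
  const std::size_t none = A.val.size();  // marker for "diagonal not found"

  std::vector<double> colSum(n, 0.0);
  std::vector<std::size_t> diag(n, none);

  // One pass over the rows: remember where each diagonal lives and scatter
  // off-diagonal values into the sum of their column.
  for (std::size_t i = 0; i < n; ++i) {
    for (std::size_t k = A.rowPtr[i]; k < A.rowPtr[i + 1]; ++k) {
      const std::size_t j = A.colInd[k];
      if (j == i) diag[i] = k;
      else colSum[j] += A.val[k];
    }
  }

  for (std::size_t i = 0; i < n; ++i) {
    assert(diag[i] != none);  // every row must store its diagonal
    A.val[diag[i]] = colSum[i];
  }
}
```

The scatter pass in this sketch is not thread-safe as written. A transposed lookup such as the `col_ptr` member added in this diff would let each row gather the entries of its own column instead, so that every thread writes only its own diagonal.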
4 changes: 4 additions & 0 deletions Common/include/linear_algebra/CSysSolve.hpp
@@ -57,6 +57,10 @@ using namespace std;
* creating CSysSolve objects we can more easily assign different
* matrix-vector products and preconditioners to different problems
* that may arise in a hierarchical solver (i.e. multigrid).
*
* The methods of this class are designed to be called by multiple OpenMP threads.
* Beware of writes to class member variables, for example "Residual" should only
* be modified by one thread.
*/
template<class ScalarType>
class CSysSolve {
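The note added to the class documentation says that the methods are called by all threads of a parallel region and that members such as `Residual` must be written by one thread only. A minimal sketch of that discipline with plain OpenMP pragmas, using a hypothetical `MiniSolver` class (the real solver code is not shown in this excerpt):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class MiniSolver {
  double residual_ = 0.0;   // shared member: written by the master thread only
  double sharedSum_ = 0.0;  // shared accumulator, updated atomically

public:
  // Intended to be called by every thread of an enclosing parallel region.
  // It also works when called serially or when OpenMP is disabled.
  void computeResidual(const std::vector<double>& r) {
    const long n = static_cast<long>(r.size());

    #pragma omp barrier
    #pragma omp master
    sharedSum_ = 0.0;
    #pragma omp barrier

    // Each thread sums its share of the entries into a private variable.
    double local = 0.0;
    #pragma omp for nowait
    for (long i = 0; i < n; ++i) local += r[i] * r[i];

    #pragma omp atomic
    sharedSum_ += local;
    #pragma omp barrier

    // Exactly one thread writes the member that other code reads later.
    #pragma omp master
    residual_ = sharedSum_;
    #pragma omp barrier
  }

  double residual() const { return residual_; }
};

// Driver that owns the parallel region; the method itself contains no
// "parallel" directive, only worksharing and synchronization.
inline double residualInParallel(MiniSolver& s, const std::vector<double>& r) {
  #pragma omp parallel
  s.computeResidual(r);
  return s.residual();
}
```

Keeping the `parallel` directive in the caller matches the commit "move the parallel region out of CSysSolve": the same threads can then run several routines back to back without being re-spawned.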
8 changes: 6 additions & 2 deletions Common/include/linear_algebra/CSysVector.hpp
@@ -59,7 +59,7 @@ class CSysVector {
* \brief Generic initialization from a scalar or array.
* \note If val==nullptr vec_val is not initialized, only allocated.
* \param[in] numBlk - number of blocks locally
* \param[in] numBlkDomain - number of blocks locally (without g cells)
* \param[in] numBlkDomain - number of blocks locally (without ghost cells)
* \param[in] numVar - number of variables in each block
* \param[in] val - default value for elements
* \param[in] valIsArray - if true val is treated as array
@@ -360,7 +360,11 @@ class CSysVector {
* \param[in] val_var - index of the residual to be set.
* \return Value of the residual.
*/
inline ScalarType GetBlock(unsigned long val_ipoint, unsigned long val_var) const {
inline const ScalarType& operator() (unsigned long val_ipoint, unsigned long val_var) const {
return vec_val[val_ipoint*nVar+val_var];
}
inline ScalarType& operator() (unsigned long val_ipoint, unsigned long val_var) {
return vec_val[val_ipoint*nVar+val_var];
}

};
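This hunk replaces the by-value `GetBlock(iPoint, iVar)` accessor with a pair of `operator()` overloads that return references into the flat storage. A minimal sketch of the same pattern on a hypothetical `MiniBlockVector` (with `double` in place of the template scalar type):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical block vector: nVar values per point, stored contiguously.
class MiniBlockVector {
  std::size_t nVar_;
  std::vector<double> val_;

public:
  MiniBlockVector(std::size_t nBlk, std::size_t nVar)
      : nVar_(nVar), val_(nBlk * nVar, 0.0) {}

  // Read-only access, selected when the vector is const.
  const double& operator()(std::size_t iPoint, std::size_t iVar) const {
    assert(iVar < nVar_ && iPoint * nVar_ + iVar < val_.size());
    return val_[iPoint * nVar_ + iVar];
  }

  // Read-write access, selected for non-const vectors.
  double& operator()(std::size_t iPoint, std::size_t iVar) {
    assert(iVar < nVar_ && iPoint * nVar_ + iVar < val_.size());
    return val_[iPoint * nVar_ + iVar];
  }
};
```

Call sites can then write `v(iPoint, iVar) += x` directly, while code that only holds a `const` reference still gets read access through the first overload.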
53 changes: 52 additions & 1 deletion Common/include/omp_structure.hpp
@@ -38,6 +38,8 @@

#pragma once

#include <type_traits>

#if defined(_MSC_VER)
#define PRAGMIZE(X) __pragma(X)
#else
@@ -46,7 +48,8 @@

/*--- Detect compilation with OpenMP support, protect against
* using OpenMP with AD (not supported yet). ---*/
#if defined(_OPENMP) && !defined(CODI_REVERSE_TYPE) && !defined(CODI_FORWARD_TYPE)
//#if defined(_OPENMP) && !defined(CODI_REVERSE_TYPE) && !defined(CODI_FORWARD_TYPE)
#if defined(_OPENMP)
#define HAVE_OMP
#include <omp.h>

@@ -84,7 +87,9 @@ inline constexpr int omp_get_thread_num(void) {return 0;}
#define SU2_OMP_SIMD SU2_OMP(simd)

#define SU2_OMP_MASTER SU2_OMP(master)
#define SU2_OMP_ATOMIC SU2_OMP(atomic)
#define SU2_OMP_BARRIER SU2_OMP(barrier)
#define SU2_OMP_CRITICAL SU2_OMP(critical)

#define SU2_OMP_PARALLEL SU2_OMP(parallel)
#define SU2_OMP_PARALLEL_(ARGS) SU2_OMP(parallel ARGS)
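The `SU2_OMP_*` names above are thin wrappers around OpenMP directives. The definition of `SU2_OMP` and the non-MSVC branch of `PRAGMIZE` fall outside the hunks shown here, so the sketch below is an assumption about how such wrappers are commonly built (standard `_Pragma` plus stringification), written with hypothetical `MY_*` names:

```cpp
#include <cassert>

// MSVC spells the pragma operator __pragma(tokens); standard C++11 uses
// _Pragma("string"), so the argument is stringified with #X.
#if defined(_MSC_VER)
#define MY_PRAGMIZE(X) __pragma(X)
#else
#define MY_PRAGMIZE(X) _Pragma(#X)
#endif

// With OpenMP enabled the wrapper emits "#pragma omp <args>",
// otherwise it expands to nothing and the code runs serially.
#if defined(_OPENMP)
#define MY_OMP(ARGS) MY_PRAGMIZE(omp ARGS)
#else
#define MY_OMP(ARGS)
#endif

#define MY_OMP_PARALLEL MY_OMP(parallel)
#define MY_OMP_ATOMIC MY_OMP(atomic)

// Counts the threads of a parallel region; returns 1 without OpenMP.
inline int countThreads() {
  int n = 0;
  MY_OMP_PARALLEL
  {
    MY_OMP_ATOMIC
    ++n;
  }
  return n;
}
```

Routing every directive through one macro family gives a single switch for disabling OpenMP and makes parallel sections easy to find with a text search, which is the motivation stated in the commit list.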
@@ -122,3 +127,49 @@ inline size_t computeStaticChunkSize(size_t totalWork,
return roundUpDiv(workPerThread, chunksPerThread);
}

/*!
* \brief Copy data from one array-like object to another in parallel.
* \param[in] size - Number of elements.
* \param[in] src - Source array.
* \param[in] dst - Destination array.
*/
template<class T, class U>
void parallelCopy(size_t size, const T* src, U* dst)
{
SU2_OMP_FOR_STAT(4196)
for(size_t i=0; i<size; ++i) dst[i] = src[i];
}

/*!
* \brief Set the entries of an array-like object to a constant value in parallel.
* \param[in] size - Number of elements.
* \param[in] val - Value to set.
* \param[in] dst - Destination array.
*/
template<class T, class U>
void parallelSet(size_t size, T val, U* dst)
{
SU2_OMP_FOR_STAT(4196)
for(size_t i=0; i<size; ++i) dst[i] = val;
}
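`parallelCopy` and `parallelSet` contain a worksharing loop but no `parallel` directive, so they are meant to be called by all threads of a region that the caller has already opened. A minimal usage sketch with raw pragmas, assuming that `SU2_OMP_FOR_STAT(N)` maps to a statically scheduled `omp for` with chunk size `N` (that macro's definition is not shown here); the names and the chunk size 1024 are hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Orphaned worksharing loop: the iterations are divided among whichever
// threads call this function together. Called serially, it is a plain loop.
template <class T, class U>
void copyWorkshared(std::size_t size, const T* src, U* dst) {
  const long n = static_cast<long>(size);
  #pragma omp for schedule(static, 1024)
  for (long i = 0; i < n; ++i) dst[i] = src[i];
}

// The caller owns the parallel region; every thread enters copyWorkshared.
inline std::vector<double> duplicate(const std::vector<double>& src) {
  std::vector<double> dst(src.size());
  #pragma omp parallel
  copyWorkshared(src.size(), src.data(), dst.data());
  return dst;
}
```

If only one thread of a multi-threaded region called `copyWorkshared`, the worksharing construct would not be encountered by the whole team, which OpenMP does not allow. The implicit barrier at the end of `omp for` means `dst` is complete once the call returns.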

/*!
* \brief Atomically update a (shared) lhs value with a (local) rhs value.
* \note For types without atomic support (non-arithmetic) this is done via critical.
* \param[in] rhs - Local variable being added to the shared one.
* \param[in,out] lhs - Shared variable being updated.
*/
template<class T,
typename std::enable_if<!std::is_arithmetic<T>::value,bool>::type = 0>
inline void atomicAdd(T rhs, T& lhs)
{
SU2_OMP_CRITICAL
lhs += rhs;
}
template<class T,
typename std::enable_if<std::is_arithmetic<T>::value,bool>::type = 0>
inline void atomicAdd(T rhs, T& lhs)
{
SU2_OMP_ATOMIC
lhs += rhs;
}
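The two `atomicAdd` overloads are selected at compile time: arithmetic types use `omp atomic`, everything else falls back to `omp critical`. A minimal sketch of the same dispatch and of a typical use, accumulating per-thread partial sums into a shared total; `addShared`, `Pair`, and `sumOfSquares` are hypothetical names, and the sketch uses an `int` dummy template parameter instead of the `bool` in the diff:

```cpp
#include <cassert>
#include <type_traits>

// Fallback for types without hardware atomics: serialize the update.
template <class T,
          typename std::enable_if<!std::is_arithmetic<T>::value, int>::type = 0>
inline void addShared(T rhs, T& lhs) {
  #pragma omp critical
  lhs += rhs;
}

// Arithmetic types can use the cheaper atomic update.
template <class T,
          typename std::enable_if<std::is_arithmetic<T>::value, int>::type = 0>
inline void addShared(T rhs, T& lhs) {
  #pragma omp atomic
  lhs += rhs;
}

// Example of a non-arithmetic, user-defined number-like type.
struct Pair {
  double a, b;
  Pair& operator+=(const Pair& o) { a += o.a; b += o.b; return *this; }
};

// Each thread accumulates privately, then adds to the shared total once.
inline double sumOfSquares(int n) {
  double total = 0.0;
  #pragma omp parallel
  {
    double local = 0.0;
    #pragma omp for nowait
    for (int i = 1; i <= n; ++i) local += static_cast<double>(i) * i;
    addShared(local, total);
  }
  return total;
}
```

Doing one shared update per thread, rather than one per loop iteration, keeps contention on the shared variable low regardless of which overload is selected.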