Hybrid OpenMP-MPI implementation Strategy + Vectorization (SIMD) #789

@pcarruscag

Preamble

I am moving the discussion about SIMD that started in #716 here and adding hybrid parallelization.
The two topics go hand in hand since both (SPMD and SIMD) consist of processing multiple data (MD) elements simultaneously, either by a single program (SP) that is run by multiple threads (generally with a shared view of memory), or by a single instruction (SI) run by a single core.
The reason SIMD came up in #716 is that, as I will demonstrate, vectorization needs to be supported by data structures. SPMD, on the other hand, needs to be supported by algorithms designed to avoid race conditions, i.e. two or more threads modifying the same memory location at the same time.
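
To make the two requirements concrete, here is a minimal sketch. It is not taken from the code base: the `Edge` struct and the functions `residualScatter`, `residualGather` and `axpy` are hypothetical names invented for this illustration. It assumes a residual that is accumulated from fluxes on edges connecting two points. Scattering over edges would race if the loop were split across threads (two edges sharing a point write to the same entry), whereas gathering over points gives each thread exclusive ownership of the entry it writes. This is only one possible way to avoid the race. The last function shows the kind of loop over contiguous data that a compiler can vectorize.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical edge: connects points i and j, carries a flux from i to j.
struct Edge { std::size_t i, j; double flux; };

// Scatter over edges. Left serial on purpose: if this loop were shared
// among threads, two edges with a common point could update the same
// entry of res simultaneously (a race condition).
std::vector<double> residualScatter(const std::vector<Edge>& edges,
                                    std::size_t nPoint) {
  std::vector<double> res(nPoint, 0.0);
  for (const auto& e : edges) {
    res[e.i] -= e.flux;
    res[e.j] += e.flux;
  }
  return res;
}

// Gather over points. Each iteration writes only to res[p], so the loop
// can be shared among threads without two threads touching one location.
std::vector<double> residualGather(const std::vector<Edge>& edges,
                                   std::size_t nPoint) {
  // Point-to-edge adjacency, built serially.
  std::vector<std::vector<std::size_t>> edgesOfPoint(nPoint);
  for (std::size_t k = 0; k < edges.size(); ++k) {
    edgesOfPoint[edges[k].i].push_back(k);
    edgesOfPoint[edges[k].j].push_back(k);
  }
  std::vector<double> res(nPoint, 0.0);
  const long long n = static_cast<long long>(nPoint);
  // The pragma is ignored (and the loop runs serially) without OpenMP.
  #pragma omp parallel for
  for (long long p = 0; p < n; ++p) {
    const auto point = static_cast<std::size_t>(p);
    double sum = 0.0;
    for (std::size_t k : edgesOfPoint[point]) {
      const Edge& e = edges[k];
      sum += (e.j == point) ? e.flux : -e.flux;
    }
    res[point] = sum;
  }
  return res;
}

// SIMD-friendly loop: x and y are contiguous, so one instruction can
// process several consecutive entries. Assumes x.size() >= y.size().
void axpy(double a, const std::vector<double>& x, std::vector<double>& y) {
  assert(x.size() >= y.size());
  const double* xp = x.data();
  double* yp = y.data();
  const std::size_t n = y.size();
  #pragma omp simd
  for (std::size_t k = 0; k < n; ++k) yp[k] += a * xp[k];
}
```

The gather version trades extra memory (the adjacency lists) and repeated reads of each edge for the absence of write conflicts. Whether that trade is worthwhile compared with alternatives such as edge coloring or atomic updates depends on the scheme, so treat this as an illustration of the problem rather than a recommendation.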

Instead of continuing #716 I think it is better to let that become documentation for #753.
I did not add "without loss of readability" to the title because it is long as is; that requirement is present nonetheless.
I open these issues in the hope that people participate (I am not a fast writer, so this is actually a lot of work), and so far great comments and insights have come from those with experience in these topics (kudos to @economon and @vdweide). But please participate even if you have never heard of these topics: your opinion about the readability and "developability" of the code is important! I think the code style should be accessible to people starting a PhD (after they read a bit about C++...).

My (ambitious) goal with this work is to lay down an architecture for performance, i.e. not just to improve the performance of a few key numerical schemes but to create mechanisms applicable to all existing and future ones. Moreover, I want that to be possible with minimal changes to the way those bits of code are currently written.
