Parallel algorithms #120

Open
wants to merge 14 commits into
base: master
from
124 changes: 124 additions & 0 deletions sources/modules/concurrency-parallelism/async-programming.md
@@ -0,0 +1,124 @@
## Module name: Asynchronous programming

_Skeleton descriptions are typeset in italic text,_
_so please don't remove these descriptions when editing the topic._

### Overview

_Provides a short natural language abstract of the module’s contents._
_Specifies the different levels of teaching._

------------------------------------------------------------------------
Level Objective
----------------- ------------------------------------------------------
Foundational --- Knowledge about build systems

Main --- Usage of build system to compile an executable

Advanced --- Add external libraries as dependencies

------------------------------------------------------------------------

### Motivation

_Why is this important?_
_Why do we want to learn/teach this topic?_

* Asynchronous programming allows non-blocking function launches, so that a function can execute asynchronously while the caller continues with other work.
* Asynchronous programming is one form of parallelism and concurrency included in the C++ standard.

### Topic introduction

_Very brief introduction to the topic._

Build systems are used to configure, build, and install complex C++ projects.


### Foundational: Knowledge about build systems

#### Background/Required Knowledge

A student:
* Should know [lambdas](../functions/lambdas.md)

#### Student outcomes

_A list of things "a student should be able to" after the curriculum._
_The next word should be an action word and testable in an exam._
_Max 5 items._

A student should be able to:

1. Explain what asynchronous programming is
2. Explain what futures and promises are

#### Caveats

_This section mentions subtle points to understand, like anything resulting in
implementation-defined, unspecified, or undefined behavior._

None

#### Points to cover

_This section lists important details for each point._

* Mention how to launch a function or lambda asynchronously using `std::async`
* Mention how to use a `std::future` as a placeholder for the result

### Main: Launch functions asynchronously and obtain the results

#### Background/Required Knowledge

* All of the above.

#### Student outcomes

A student should be able to:

1. Define a function or lambda for the computational task
2. Launch the function or lambda asynchronously and obtain the results

#### Caveats

The concept of asynchronous programming is not easily digestible for most students.


#### Points to cover

* The header `<future>` needs to be included
* The return type of the function or lambda will be the template type of the future
* The first argument of `std::async` is the function or lambda; the arguments to pass to it follow

Example using a function
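```
#include <future>
#include <iostream>

void print_square(double a)
{
    std::cout << "Result=" << a * a << std::endl;
}

int main()
{
    // Launch print_square asynchronously; it returns void, so the future is std::future<void>
    std::future<void> f = std::async(print_square, 5.0);
    // We could do other work here
    f.get(); // Wait until print_square has finished
}
```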

Example using lambdas
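```
#include <future>
#include <iostream>

int main()
{
    // Compute the sum (1.0 and 2.0 are example arguments)
    std::future<double> f1 = std::async([](double a, double b){ return a + b; }, 1.0, 2.0);
    // Compute the square (3.0 is an example argument)
    std::future<double> f2 = std::async([](double a){ return a * a; }, 3.0);

    // Gather the results and add them up
    double res = f1.get() + f2.get();
    std::cout << "Result=" << res << std::endl;
}
```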

### Advanced

_These are important topics that are not expected to be covered but provide
guidance where one can continue to investigate this topic in more depth._

* How to build libraries
* How to have external libraries be downloaded during the build process
* Mention that build systems provide support for unit testing

140 changes: 140 additions & 0 deletions sources/modules/concurrency-parallelism/parallel-algorithms.md
@@ -0,0 +1,140 @@
## Module name: Parallel algorithms

_Skeleton descriptions are typeset in italic text,_
_so please don't remove these descriptions when editing the topic._

### Overview

_Provides a short natural language abstract of the module’s contents._
_Specifies the different levels of teaching._

------------------------------------------------------------------------
Level Objective
----------------- ------------------------------------------------------
Foundational --- Knowledge about build systems

Main --- Usage of build system to compile an executable

Advanced --- Add external libraries as dependencies

------------------------------------------------------------------------

### Motivation

_Why is this important?_
_Why do we want to learn/teach this topic?_

* Parallel algorithms allow *most* of the algorithms in the C++ standard library to be executed in parallel.
* Parallel algorithms are one form of parallelism included in the C++ standard.

### Topic introduction

_Very brief introduction to the topic._

Algorithms that range over a large data set can be accelerated by parallel execution.
Execution policies allow the algorithms in the C++ standard library to be executed on multiple cores or on a single core.

### Foundational: Knowledge about build systems

#### Background/Required Knowledge

A student:
* Should know [lambdas](../functions/lambdas.md)
* Should know [algorithms](../program-design/algorithms.md)
* Should know [iterators](../program-design/iterators.md)
* Should know [containers](../program-design/containers.md)

#### Student outcomes

_A list of things "a student should be able to" after the curriculum._
_The next word should be an action word and testable in an exam._
_Max 5 items._

A student should be able to:

1. Explain parallel and sequential execution
2. Specify the appropriate execution policy for sequential or parallel execution

#### Caveats

_This section mentions subtle points to understand, like anything resulting in
implementation-defined, unspecified, or undefined behavior._

1. The programmer has to make sure that the algorithm can, in principle, be executed in parallel. For example, specifying parallel execution with stateful lambdas might give wrong results.
2. The programmer has to make sure that the parallel execution does not lead to race conditions.

#### Points to cover

_This section lists important details for each point._

* Add execution policies as an additional argument to the algorithm
* The C++17 standard is required
* Mention `std::atomic` or `std::mutex` to avoid race conditions

### Main:

#### Background/Required Knowledge

* All of the above.

#### Student outcomes

A student should be able to:

1. Define a function or lambda for the compute kernel
2. Split the work into independent tasks to avoid race conditions
3. Explain the meaning of the four policies (`std::execution::seq`, `std::execution::par`, `std::execution::par_unseq`, and `std::execution::unseq`; the last one requires C++20)

#### Caveats

Parallel programming introduces a new class of bugs caused by race conditions.


#### Points to cover

* The header `<execution>` needs to be included
* The first argument of the algorithm is the execution policy

Example using a function
```
#include <algorithm>
#include <execution>
#include <vector>

// Compute kernel: squares one element in place
void square(double& a)
{
    a = a * a;
}

int main()
{
    std::vector<double> values = {1,2,3,4,5,6};

    // Parallel execution
    std::for_each(std::execution::par_unseq, std::begin(values), std::end(values), square);

    // Serial execution
    std::for_each(std::execution::seq, std::begin(values), std::end(values), square);
}
```

Example using a predefined algorithm
```
#include <algorithm>
#include <execution>
#include <random>
#include <vector>

int main()
{
    std::vector<int> values(10000);

    // Seed the random number generator
    std::random_device rd;
    std::mt19937 gen(rd());

    // Define the range for the random numbers
    std::uniform_int_distribution<> distrib(1, 100); // Generates numbers between 1 and 100

    // Fill the vector with random numbers
    std::generate(values.begin(), values.end(), [&]() { return distrib(gen); });

    // Sort the vector in parallel
    std::sort(std::execution::par, values.begin(), values.end());
}
```

### Advanced

_These are important topics that are not expected to be covered but provide
guidance where one can continue to investigate this topic in more depth._

* If the implementation cannot parallelize or vectorize (e.g. due to lack of resources), all standard execution policies can fall back to sequential execution.
* None of the execution policies guarantees reproducibility. This is obvious for the parallel execution policies, but even `std::execution::seq` can execute the iterations in any order.
* Nvidia supports running `std::execution::par` on Nvidia GPUs. However, this is not yet part of the C++ standard and only works with Nvidia's HPC compiler.
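* Currently, GCC implements parallel algorithms using Intel's TBB library. You can set the number of cores used by creating a `tbb::global_control` object, for example `tbb::global_control gc(tbb::global_control::max_allowed_parallelism, nthreads);`, provided by the header `tbb/tbb.h`. The limit applies only while the object is alive, so it must be a named variable and not a temporary.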