4 GPU acceleration
If the GPU memory is large enough to hold all variables, GPU processing is recommended: it runs roughly 60 to 100 times faster than CPU processing.
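A minimal sketch of the "large enough to hold all variables" check is given here. The stack size, the element type, and the number of working copies (nCopies) are illustrative assumptions, not values taken from Sparse-SIM; gpuDeviceCount, gpuDevice, and gpuArray are standard Parallel Computing Toolbox functions.

```matlab
% Hypothetical example values; replace them with your own data dimensions.
stackSize       = [512 512 100];  % x, y, t size of the image stack (assumed)
bytesPerElement = 4;              % single precision
nCopies         = 6;              % assumed number of same-sized working arrays

% Rough estimate of the memory the reconstruction would need, in bytes.
neededBytes = prod(stackSize) * bytesPerElement * nCopies;

% Fall back to the CPU if no GPU (or no Parallel Computing Toolbox) is found.
useGpu = false;
try
    if gpuDeviceCount > 0
        dev = gpuDevice;                               % currently selected GPU
        useGpu = neededBytes < dev.AvailableMemory;    % free GPU memory in bytes
    end
catch
    useGpu = false;
end

f = rand(stackSize, 'single');    % placeholder data
if useGpu
    f = gpuArray(f);              % later arithmetic on f then runs on the GPU
end
```

If useGpu comes out false on a machine that has a GPU, processing the stack in smaller blocks or reducing its size may help; how Sparse-SIM itself handles this case is not covered by this sketch.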
To make GPU acceleration more effective, we generated CUDA C code (.cu) and compiled it into binary .mex files. Specifically, the routines that consume the most computing resources were converted into CUDA mex files: back_diff_cuda.mexw64 and forward_diff_cuda.mexw64. These CUDA mex files were compiled on our local workstations. Because they depend on the GPU model (e.g., Titan RTX), the operating system (e.g., Windows 10), and the CUDA version (e.g., 10.0), we did not include them in the released version of Sparse-SIM. GPU acceleration in the released version is nevertheless feasible and effective.
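To illustrate the kind of operation the mex file names suggest, the sketch below computes forward and backward finite differences with built-in MATLAB functions only. It is not the Sparse-SIM implementation: the names forward_diff_sketch and backward_diff_sketch, the unit step, and the periodic boundary handling are all assumptions made for illustration. Because circshift and subtraction accept gpuArray inputs, the same code runs on the GPU without any compiled mex file.

```matlab
% Hypothetical stand-ins, NOT the released forward_diff/back_diff routines.
% Periodic (wrap-around) boundaries are an assumption of this sketch.
forward_diff_sketch  = @(x, dim) circshift(x, -1, dim) - x;   % x(i+1) - x(i)
backward_diff_sketch = @(x, dim) x - circshift(x, 1, dim);    % x(i) - x(i-1)

x = single([1 4 9 16]);
% To run on a GPU instead, uncomment the next line:
% x = gpuArray(x);

fwd = forward_diff_sketch(x, 2);    % [3 5 7 -15]
bwd = backward_diff_sketch(x, 2);   % [-15 3 5 7]
```

When x is a gpuArray, gather(fwd) copies the result back to host memory.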
This software has been tested on:
- MATLAB R2017b on Windows 10 (128 GB RAM; NVIDIA Titan Xp, 12 GB; CUDA 9.1);
- MATLAB R2019b on Windows 10 (128 GB RAM; NVIDIA Titan RTX, 24 GB; CUDA 10.0);
- MATLAB R2019b on Windows 10 (16 GB RAM; NVIDIA GTX 1050 Ti, 4 GB; CUDA 10.2);
- MATLAB R2015b on CentOS 7 (64 GB RAM; NVIDIA Tesla K40, 12 GB; CUDA 9.0);
- MATLAB R2018b on Ubuntu 18.04 (16 GB RAM; NVIDIA Titan Xp, 12 GB; CUDA 10.1);
- MATLAB R2017b on macOS 10 (8 GB RAM; without GPU acceleration).
One or more GPUs with large memory are recommended for fast execution (see the MATLAB documentation for supported GPU models).
If you find a bug, please open an issue on this GitHub repository.