Commit 36e79dc

jokr91 and drroe committed
Interaction Energy calculation on GPU for GIST (#787)
* Added GPU Compatibility to GIST

Changed the Makefile of the CUDA code so that the GIST CUDA implementation is compiled correctly. The GPU is called through a separate function: NonbondCuda is called instead of NonbondEnergy. Because NonbondCuda can also calculate the order parameters directly, the call to Order is likewise replaced by the single call to NonbondCuda. Every change is wrapped in #ifdef statements, so the changes only affect the code when it is compiled with CUDA. Compiled without CUDA, the code is exactly the same as in the cpptraj implementation.

Some new functions were added so that NonbondCuda can be called easily from the rest of the code: freeGPUMemory, which frees the GPU memory from the main program, and copyToGPU, which copies data from host memory to the CUDA-capable device. A few variables were also added for copying data to and from the GPU and for keeping track of atom parameters (charge, atom type, ...).

Host memory mostly uses the vector class from the STL, but not for the solvent_ array. The reason is that vector<bool> is not an array of boolean values but a bit string that stores one true or false value per bit. Such a bit string cannot be copied to the device directly, so solvent_ remains an allocated array in host memory.

The source files for the device code were implemented in the same way as in GIGIST (compare www.github.com/liedllab/gigist), but they were all moved together into cuda_kernels.

RunTest.sh supplies a different test for the CUDA version. Everything stays the same except for the comparison with Eww_ij, which is not calculated on the GPU for memory reasons and therefore cannot be tested against. This is done by using the CheckEnv function to recognize the cuda and notcuda keywords for the Requires and CheckFor statements.

The reason Eww_ij is not implemented on the GPU is simple. Consider a grid of 100 x 100 x 100 voxels with a grid spacing of 0.5 Angstrom, which is just a box of 50 x 50 x 50 Angstrom. The full Eww_ij matrix compares each voxel with every other voxel, so it holds (10^6)^2 = 10^12 data points. Each data point stores an energy value; in ASCII this will probably take about 10 characters, resulting in a final file size of about 10 TB (and this does not even consider the 4 TB of memory that would be needed).

* Added skipS flag

A flag to skip the entropy calculation was added. Also, the indentation was fixed for all files.

* Updated Documentation, Change Version

Added a paragraph with a couple of tips for the GPU implementation and a statement concerning the new skipS keyword. Changed the version number of cpptraj in src/Version.h to 4.25.0.

* Changes to comply with reviews by @drroe

- Removed ConstantsG.cuh
- Fixed printf error
- Added gist to the CUDA enabled commands

Co-authored-by: Daniel R. Roe <daniel.r.roe@gmail.com>
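The vector<bool> remark in the commit message can be illustrated with a minimal C++ sketch. It is not code from this commit: make_solvent_flags and copy_flags are hypothetical names, and std::memcpy only stands in for a host-to-device copy, which likewise needs a raw pointer to contiguous elements.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <type_traits>
#include <vector>

// std::vector<bool> is a packed specialization: operator[] yields a proxy
// object rather than a bool&, so there is no contiguous bool buffer to copy.
static_assert(!std::is_same<std::vector<bool>::reference, bool&>::value,
              "vector<bool> hands out proxy references");
static_assert(std::is_same<std::vector<int>::reference, int&>::value,
              "other vector types hand out real references");

// Hypothetical stand-in for a host-to-device copy: it requires a raw
// pointer to n contiguous bool values, which vector<bool> cannot provide.
void copy_flags(bool* dst, const bool* src, std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    std::memcpy(dst, src, n * sizeof(bool));
}

// Builds a plain bool array with one entry per atom.
// Illustrative rule (an assumption): atoms from first_solvent on are solvent.
std::unique_ptr<bool[]> make_solvent_flags(std::size_t n_atoms,
                                           std::size_t first_solvent) {
    std::unique_ptr<bool[]> flags(new bool[n_atoms]);
    for (std::size_t i = 0; i < n_atoms; ++i) {
        flags[i] = (i >= first_solvent);
    }
    return flags;
}
```

A plain array (here owned by a std::unique_ptr<bool[]>) exposes one addressable bool per atom, which is what a byte-wise copy expects.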
1 parent 060a3b4 commit 36e79dc
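The storage estimate given in the commit message for the full Eww_ij matrix can be checked with a few lines of C++. This is a sketch for the arithmetic only; EijFootprint and eij_footprint are hypothetical names, and the 4 bytes per in-memory value is an assumption (single precision) that reproduces the 4 TB figure.

```cpp
#include <cstdint>

// Sizes involved in storing a full voxel-by-voxel Eww_ij matrix.
struct EijFootprint {
    std::uint64_t voxels;       // total number of grid voxels
    std::uint64_t pairs;        // entries in the voxel-by-voxel matrix
    std::uint64_t ascii_bytes;  // file size at chars_per_value bytes per entry
    std::uint64_t memory_bytes; // in-memory size at 4 bytes per entry (assumed)
};

// Computes the footprint for an nx x ny x nz grid.
EijFootprint eij_footprint(std::uint64_t nx, std::uint64_t ny,
                           std::uint64_t nz, std::uint64_t chars_per_value) {
    const std::uint64_t bytes_per_value = 4; // assumption: single precision
    EijFootprint f;
    f.voxels = nx * ny * nz;
    f.pairs = f.voxels * f.voxels; // every voxel compared with every voxel
    f.ascii_bytes = f.pairs * chars_per_value;
    f.memory_bytes = f.pairs * bytes_per_value;
    return f;
}
```

For a 100 x 100 x 100 grid and 10 characters per value, this gives 10^6 voxels, 10^12 matrix entries, 10^13 bytes (10 TB) on disk, and 4 x 10^12 bytes (4 TB) in memory.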

12 files changed: +1659 -183 lines changed

doc/cpptraj.lyx

Lines changed: 30 additions & 1 deletion
@@ -1174,6 +1174,10 @@ closest
 watershell
 \end_layout
 
+\begin_layout LyX-Code
+gist
+\end_layout
+
 \begin_layout Section
 General Concepts
 \end_layout
@@ -22490,7 +22494,7 @@ gist
 
 \end_inset
 
-[doorder] [doeij] [skipE] [refdens <rdval>] [temp <tval>]
+[doorder] [doeij] [skipS] [skipE] [refdens <rdval>] [temp <tval>]
 \end_layout
 
 \begin_layout LyX-Code
@@ -22543,6 +22547,10 @@ literal "true"
 ).
 \end_layout
 
+\begin_layout Description
+[skipS] Skip all entropy calculations.
+\end_layout
+
 \begin_layout Description
 
 \series bold
@@ -22922,6 +22930,27 @@ refdens
 keyword, instead of allowing GIST to supply the default value.
 \end_layout
 
+\begin_layout Standard
+For GIST, a GPU accelerated version is available, in which the interaction
+energy is calculated using CUDA.
+When using the GPU accelerated version of GIST, the
+\series bold
+doeij
+\series default
+keyword is not available.
+It is recommended to use a grid covering the entire box, when using the
+GPU implementation.
+You may also choose a smaller grid, but all interaction energies, i.e., each
+atom with each atom, will always be calculated independent of the chosen
+grid.
+This ensures optimum performance when calculating the interaction energies.
+Thus, the additional time required to calculate the order parameters (
+\series bold
+doorder
+\series default
+) is negligible.
+\end_layout
+
 \begin_layout Subsubsection*
 \paragraph_spacing other 3
 \noindent
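The added documentation paragraph states that interaction energies are evaluated for each atom with each atom, regardless of the chosen grid. The following C++ sketch shows what such an all-pairs evaluation looks like on the host. It is not the CUDA kernel from this commit: Atom and all_pairs_coulomb are hypothetical names, only a bare Coulomb term q_i * q_j / r is used, and periodic imaging, cutoffs, Lennard-Jones terms, and unit conversion are all omitted.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical atom record: position and partial charge.
struct Atom {
    double x, y, z;
    double charge;
};

// Returns, for every atom, its interaction energy with all other atoms.
// The double loop visits each pair once, so the cost grows with the square
// of the atom count and does not depend on any grid definition.
std::vector<double> all_pairs_coulomb(const std::vector<Atom>& atoms) {
    std::vector<double> energy(atoms.size(), 0.0);
    for (std::size_t i = 0; i < atoms.size(); ++i) {
        for (std::size_t j = i + 1; j < atoms.size(); ++j) {
            const double dx = atoms[i].x - atoms[j].x;
            const double dy = atoms[i].y - atoms[j].y;
            const double dz = atoms[i].z - atoms[j].z;
            const double r = std::sqrt(dx * dx + dy * dy + dz * dz);
            assert(r > 0.0); // overlapping atoms are not handled here
            const double e_ij = atoms[i].charge * atoms[j].charge / r;
            energy[i] += e_ij; // pair energy counted for atom i
            energy[j] += e_ij; // and for atom j
        }
    }
    return energy;
}
```

Because every pair is visited anyway, restricting the grid to a small region would not reduce the work in a loop of this shape, which is consistent with the recommendation to use a grid covering the entire box.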
