Enabling release memory (device memory deallocations) mode after each run from the Execution Plan #444
Description
TornadoVM fully manages device memory, in a way similar to Java memory management. TornadoVM has a hard limit on the maximum amount of device memory it can use, and the runtime can allocate as many buffers as fit within that region. Thus, the memory in use grows until the maximum limit is reached.
In addition, TornadoVM maintains a list of free and used buffers. When an execution plan finishes, its device buffers are marked as free but are never released (e.g., with clReleaseMemObject in OpenCL); instead, they remain available so that other task-graphs can reuse the already allocated areas. If compaction is needed, TornadoVM deallocates the buffers and allocates a new contiguous region. This whole process is fully transparent to the programmer.
However, there may be cases in which programmers would like the TornadoVM runtime to free all resources after an execution plan has finished. This PR adds support for this feature.
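The difference between pooling buffers and releasing them after every execution plan can be illustrated with a small, self-contained Java model. This is only a sketch under stated assumptions: the class `DeviceBufferPoolSketch`, its methods, and its counters are hypothetical names invented for illustration, no real device memory is touched, and compaction is not modelled. It is not TornadoVM's actual allocator.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical model of a device buffer pool; not TornadoVM's real implementation. */
public class DeviceBufferPoolSketch {

    /** Stand-in for a device buffer; only its size is tracked. */
    public static final class Buffer {
        public final long sizeInBytes;
        Buffer(long sizeInBytes) { this.sizeInBytes = sizeInBytes; }
    }

    private final long maxBytes;          // hard limit on device memory (assumed)
    private final boolean reuseBuffers;   // true: keep buffers pooled; false: release after each plan
    private final List<Buffer> usedBuffers = new ArrayList<>();
    private final List<Buffer> freeBuffers = new ArrayList<>();
    private long allocatedBytes = 0;
    private int deviceAllocations = 0;    // counts simulated device allocations
    private int deviceReleases = 0;       // counts simulated device deallocations

    public DeviceBufferPoolSketch(long maxBytes, boolean reuseBuffers) {
        this.maxBytes = maxBytes;
        this.reuseBuffers = reuseBuffers;
    }

    /** Returns a pooled buffer that is large enough, or allocates a new one. */
    public Buffer request(long sizeInBytes) {
        Iterator<Buffer> it = freeBuffers.iterator();
        while (it.hasNext()) {
            Buffer candidate = it.next();
            if (candidate.sizeInBytes >= sizeInBytes) {
                it.remove();                 // move from the free list to the used list
                usedBuffers.add(candidate);
                return candidate;
            }
        }
        if (allocatedBytes + sizeInBytes > maxBytes) {
            throw new IllegalStateException("Device memory limit exceeded");
        }
        Buffer fresh = new Buffer(sizeInBytes);
        allocatedBytes += sizeInBytes;
        deviceAllocations++;
        usedBuffers.add(fresh);
        return fresh;
    }

    /** Called when an execution plan finishes. */
    public void onExecutionPlanFinished() {
        if (reuseBuffers) {
            freeBuffers.addAll(usedBuffers); // mark as free, keep the device memory
        } else {
            for (Buffer b : usedBuffers) {   // give the memory back to the device
                allocatedBytes -= b.sizeInBytes;
                deviceReleases++;
            }
        }
        usedBuffers.clear();
    }

    public int deviceAllocations() { return deviceAllocations; }
    public int deviceReleases() { return deviceReleases; }
    public long allocatedBytes() { return allocatedBytes; }

    public static void main(String[] args) {
        for (boolean reuse : new boolean[] {true, false}) {
            DeviceBufferPoolSketch pool = new DeviceBufferPoolSketch(1024, reuse);
            for (int run = 0; run < 3; run++) {
                pool.request(256);
                pool.request(512);
                pool.onExecutionPlanFinished();
            }
            System.out.println("reuse=" + reuse
                    + " allocations=" + pool.deviceAllocations()
                    + " releases=" + pool.deviceReleases()
                    + " allocatedBytes=" + pool.allocatedBytes());
        }
    }
}
```

In this model, three runs with pooling enabled perform two allocations and no releases, while three runs with pooling disabled perform six allocations and six releases. That is the trade-off involved: releasing after each plan returns memory promptly at the cost of repeated allocation work.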
If the flag -Dtornado.reuse.device.buffers=false is set, TornadoVM allocates and deallocates device buffers every time an execution plan is launched. By default, it is set to true (to reuse buffers as much as possible).
Behaviour
To check all JNI calls, including allocations and deallocations, we need to enable the LOG_JNI macro:
$ tornado-test --printKernel --jvm="-Dtornado.reuse.device.buffers=false" -V uk.ac.manchester.tornado.unittests.foundation.TestFloats#testVectorFloatAdd
OpenCL:
Level Zero:
PTX:
Problem description
n/a.
Backend/s tested
Mark the backends affected by this PR.
OS tested
Mark the OS where this PR is tested.
Did you check on FPGAs?
If it is applicable, check your changes on FPGAs.
How to test the new patch?
Run any test with the flag -Dtornado.reuse.device.buffers=false.