Autoclosable Execution Plans for automatic memory management (free resources) #384

Merged: 9 commits, Apr 23, 2024


jjfumero
Member

Description

Having too many execution plans “open” without freeing their resources is a potential problem. TornadoVM cannot free an execution plan's resources until the user explicitly calls freeDeviceMemory on that plan. This is by design, since an execution plan can reuse buffers and the code cache.

However, there are cases in which developers want to free resources using Java's try-with-resources statement. In this case, the execution plan is created within a scope, and as soon as that syntactic scope ends, all resources held by that execution plan instance are freed (e.g., all memory buffers).

TornadoVM Try-With-Resources

From now on, developers can instantiate an execution plan using the try-with-resources statement. For example:

TaskGraph taskGraph = new TaskGraph("stress" + dataSizeFactor) //
       .transferToDevice(DataTransferMode.EVERY_EXECUTION, inputArray) //
       .task("moveData", TestStressDeviceMemory::moveData, inputArray, outputArray) //
       .transferToHost(DataTransferMode.EVERY_EXECUTION, outputArray);


ImmutableTaskGraph immutableTaskGraph = taskGraph.snapshot();
try (TornadoExecutionPlan executionPlan = new TornadoExecutionPlan(immutableTaskGraph)) {
   executionPlan.execute();
} catch (TornadoExecutionPlanException e) {
   // Handle or log the exception; left empty here for brevity.
}

When resources are freed, the TornadoVM runtime invokes the free device memory function, which calls an internal function to mark all the memory buffers used by the plan as free and ready to be used by other execution plans.
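The mechanism described above can be illustrated without TornadoVM on the classpath. The following is a minimal sketch in plain Java, not the TornadoVM implementation: `BufferPoolSketch`, `PlanSketch`, and `AutoCloseSketch` are hypothetical names, and the "device buffers" are ordinary arrays. It only shows the pattern in which `close()` marks a plan's buffers as free so that later plans can reuse them.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a device buffer pool (an assumption, not TornadoVM code).
final class BufferPoolSketch {
    private final Deque<long[]> freeBuffers = new ArrayDeque<>();

    BufferPoolSketch(int buffers, int elements) {
        for (int i = 0; i < buffers; i++) {
            freeBuffers.push(new long[elements]);
        }
    }

    // Hands out a free buffer, or fails if every buffer is still held by a plan.
    long[] acquire() {
        if (freeBuffers.isEmpty()) {
            throw new IllegalStateException("no free buffers left");
        }
        return freeBuffers.pop();
    }

    // Marks a buffer as free and ready to be used by other plans.
    void release(long[] buffer) {
        freeBuffers.push(buffer);
    }

    int available() {
        return freeBuffers.size();
    }
}

// Hypothetical plan that holds buffers until it is closed.
final class PlanSketch implements AutoCloseable {
    private final BufferPoolSketch pool;
    private final List<long[]> held = new ArrayList<>();

    PlanSketch(BufferPoolSketch pool) {
        this.pool = pool;
    }

    void execute() {
        held.add(pool.acquire());
    }

    // Invoked automatically when the try-with-resources scope ends.
    @Override
    public void close() {
        for (long[] buffer : held) {
            pool.release(buffer);
        }
        held.clear();
    }
}

final class AutoCloseSketch {
    // Runs many short-lived plans against a pool that has a single buffer.
    static int runPlans(int plans) {
        BufferPoolSketch pool = new BufferPoolSketch(1, 16);
        for (int i = 0; i < plans; i++) {
            try (PlanSketch plan = new PlanSketch(pool)) {
                plan.execute();
            }
        }
        return pool.available();
    }

    public static void main(String[] args) {
        System.out.println(runPlans(100)); // prints 1
    }
}
```

Because each plan releases its buffer when its scope ends, 100 plans can run in sequence against a pool with one buffer. If `close()` were never called, the second call to `execute()` would fail with the "no free buffers left" error.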

Backend/s tested

Mark the backends affected by this PR.

  • OpenCL
  • PTX
  • SPIRV

OS tested

Mark the OS where this PR is tested.

  • Linux
  • OSx
  • Windows

Did you check on FPGAs?

If it is applicable, check your changes on FPGAs.

  • Yes
  • No

How to test the new patch?

$ tornado-test --fast --threadInfo --jvm="-Xmx12g -Dtornado.device.memory=4GB" -V uk.ac.manchester.tornado.unittests.memory.TestStressDeviceMemory

@jjfumero added the enhancement, runtime, and feature labels Apr 22, 2024
@jjfumero self-assigned this Apr 22, 2024
@jjfumero requested review from stratika and removed request for gigiblender April 22, 2024 11:17
Collaborator

@stratika stratika left a comment

Minor fix to be done. Please fix the memoryplan test in the tornado-test script.

tornado-assembly/src/bin/tornado-test
public void test02() {

long maxMemory = Runtime.getRuntime().maxMemory();
final long _12GB = (1024L * 1024 * 1024 * 12);

Code style error:

[ERROR] src/main/java/uk/ac/manchester/tornado/unittests/memory/TestStressDeviceMemory.java:[89,20] (naming) LocalFinalVariableName: Name '_12GB' must match pattern '^[a-z][a-zA-Z0-9]*$'.
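
The naming rule in the Checkstyle message above can be checked directly against the pattern it quotes. This is a small sketch using `java.util.regex`; the class name `LocalFinalNameCheck` and the replacement name `twelveGB` are hypothetical and are not taken from the merged fix.

```java
import java.util.regex.Pattern;

final class LocalFinalNameCheck {
    // Pattern copied verbatim from the Checkstyle error message.
    private static final Pattern LOCAL_FINAL_NAME = Pattern.compile("^[a-z][a-zA-Z0-9]*$");

    static boolean isValid(String name) {
        return LOCAL_FINAL_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid("_12GB"));    // false: starts with an underscore
        System.out.println(isValid("twelveGB")); // true: hypothetical compliant name

        // The same constant under a compliant name. The L suffix forces long
        // arithmetic; with int operands the product would overflow.
        final long twelveGB = 1024L * 1024 * 1024 * 12;
        System.out.println(twelveGB);            // 12884901888
    }
}
```

Any name that starts with a lowercase letter and contains only letters and digits satisfies the rule.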

Collaborator

@mairooni mairooni left a comment

LGTM

@jjfumero merged commit b8fdd54 into beehive-lab:develop Apr 23, 2024
2 checks passed
@jjfumero deleted the fix/memory/fill branch April 23, 2024 11:21
jjfumero added a commit to jjfumero/TornadoVM that referenced this pull request Apr 30, 2024
Improvements
~~~~~~~~~~~~~~~~~~

- [beehive-lab#369](beehive-lab#369): Introduction of Tensor types in TornadoVM API and interoperability with ONNX Runtime.
- [beehive-lab#370](beehive-lab#370): Array concatenation operation for TornadoVM native arrays.
- [beehive-lab#371](beehive-lab#371): TornadoVM installer script ported for Windows 10/11.
- [beehive-lab#372](beehive-lab#372): Add support for ``HalfFloat`` (``Float16``) in vector types.
- [beehive-lab#374](beehive-lab#374): Support for TornadoVM array concatenations from the constructor-level.
- [beehive-lab#375](beehive-lab#375): Support for TornadoVM native arrays using slices from the Panama API.
- [beehive-lab#376](beehive-lab#376): Support for lazy copy-outs in the batch processing mode.
- [beehive-lab#377](beehive-lab#377): Expand the TornadoVM profiler with power metrics for NVIDIA GPUs (OpenCL and PTX backends).
- [beehive-lab#384](beehive-lab#384): Auto-closable Execution Plans for automatic memory management.

Compatibility
~~~~~~~~~~~~~~~~~~

- [beehive-lab#386](beehive-lab#386): OpenJDK 17 support removed.
- [beehive-lab#390](beehive-lab#390): SapMachine OpenJDK 21 supported.
- [beehive-lab#395](beehive-lab#395): OpenJDK 22 and GraalVM 22.0.1 supported.
- TornadoVM tested with Apple M3 chips.

Bug Fixes
~~~~~~~~~~~~~~~~~~

- [beehive-lab#367](beehive-lab#367): Fix for Graal/Truffle languages in which some Java modules were not visible.
- [beehive-lab#373](beehive-lab#373): Fix for data copies of the ``HalfFloat`` types for all backends.
- [beehive-lab#378](beehive-lab#378): Fix free memory markers when running multi-thread execution plans.
- [beehive-lab#379](beehive-lab#379): Refactoring package of vector api unit-tests.
- [beehive-lab#380](beehive-lab#380): Fix event list sizes to accommodate profiling of large applications.
- [beehive-lab#385](beehive-lab#385): Fix code check style.
- [beehive-lab#387](beehive-lab#387): Fix TornadoVM internal events in OpenCL, SPIR-V and PTX for running multi-threaded execution plans.
- [beehive-lab#388](beehive-lab#388): Fix of expected and actual values of tests.
- [beehive-lab#392](beehive-lab#392): Fix installer for using existing JDKs.
- [beehive-lab#389](beehive-lab#389): Fix ``DataObjectState`` for multi-thread execution plans.
- [beehive-lab#396](beehive-lab#396): Fix JNI code for the CUDA NVML library access with OpenCL.
Labels
enhancement, feature, runtime
4 participants