Support lazy copy-out for batch processing #376

Merged: 6 commits, Apr 16, 2024

mairooni
Collaborator

Description

This PR adds support for lazy copy-out in batch processing. The fix is related to issue #333.

Problem description

When the copy-out mode was set to UNDER_DEMAND on a TaskGraph that uses batches, transferToHost copied only the first batch of the output data. With this PR, the stream-out is invoked with the appropriate offset for each batch, so that all the data is written to the host buffer.
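
The per-batch offset idea can be sketched in plain Java. This is only an illustration and not the TornadoVM implementation: the class `BatchCopyOutSketch` and its methods are hypothetical names, the "device" is simulated with a temporary array, and the kernel is assumed to simply double each element.

```java
import java.util.Arrays;

public class BatchCopyOutSketch {

    // Hypothetical stand-in for a device kernel: doubles the elements of one batch
    // and returns them in a batch-sized "device" buffer.
    static int[] runKernelOnBatch(int[] input, int offset, int length) {
        int[] deviceBuffer = new int[length];
        for (int i = 0; i < length; i++) {
            deviceBuffer[i] = 2 * input[offset + i];
        }
        return deviceBuffer;
    }

    // Processes the input in batches and copies each batch result into the
    // host array at the offset that corresponds to that batch.
    static int[] processInBatches(int[] input, int batchSize) {
        int[] host = new int[input.length];
        for (int offset = 0; offset < input.length; offset += batchSize) {
            // The last batch may be smaller when the input size is not a multiple of batchSize.
            int length = Math.min(batchSize, input.length - offset);
            int[] deviceBuffer = runKernelOnBatch(input, offset, length);
            // Writing at `offset` instead of always at 0 is what keeps every batch in the host buffer.
            System.arraycopy(deviceBuffer, 0, host, offset, length);
        }
        return host;
    }

    public static void main(String[] args) {
        int[] input = { 1, 2, 3, 4, 5, 6, 7 };
        int[] host = processInBatches(input, 3);
        System.out.println(Arrays.toString(host));
    }
}
```

If the copy always targeted host offset 0, or ran only once, the host array would hold the result of a single batch, which resembles the behaviour reported in the problem description.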

Backend/s tested

Mark the backends affected by this PR.

  • OpenCL
  • PTX
  • SPIRV

OS tested

Mark the OS where this PR is tested.

  • Linux
  • OSx
  • Windows

Did you check on FPGAs?

If it is applicable, check your changes on FPGAs.

  • Yes
  • No

How to test the new patch?

Run the following unit tests:
tornado-test -V --fast uk.ac.manchester.tornado.unittests.batches.TestBatches#test100MBSmallLazy
tornado-test -V --fast uk.ac.manchester.tornado.unittests.batches.TestBatches#test100MBLazy
tornado-test -V --fast uk.ac.manchester.tornado.unittests.batches.TestBatches#test300MBLazy
tornado-test -V --fast uk.ac.manchester.tornado.unittests.batches.TestBatches#test512MBLazy
tornado-test -V --fast uk.ac.manchester.tornado.unittests.batches.TestBatches#testBatchNotEven2Lazy

@mairooni added the runtime and fix labels on Apr 12, 2024
@mairooni self-assigned this on Apr 12, 2024
@jjfumero
Member

I get segfaults with the PTX backend. Can you reproduce it?

tornado-test -V --fast uk.ac.manchester.tornado.unittests.batches.TestBatches
tornado --jvm "-Xmx6g -Dtornado.recover.bailout=False -Dtornado.unittests.verbose=True "  -m  tornado.unittests/uk.ac.manchester.tornado.unittests.tools.TornadoTestRunner  --params "uk.ac.manchester.tornado.unittests.batches.TestBatches"
WARNING: Using incubator modules: jdk.incubator.vector
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x00007fb630f8d2c0, pid=14114, tid=14115
#
# JRE version: Java(TM) SE Runtime Environment (21.0.2+13) (build 21.0.2+13-LTS-58)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (21.0.2+13-LTS-58, mixed mode, tiered, jvmci, parallel gc, linux-amd64)
# Problematic frame:
# C  [libcuda.so.1+0x18d2c0]
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/lib/systemd/systemd-coredump %P %u %g %s %t %c %h" (or dumping to /home/juan/tornadovm/TornadoVM/core.14114)
#
# An error report file with more information is saved as:
# /home/juan/tornadovm/TornadoVM/hs_err_pid14114.log
#
# If you would like to submit a bug report, please visit:
#   https://bugreport.java.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#

@jjfumero
Member

This is the one breaking:

tornado-test -V --fast uk.ac.manchester.tornado.unittests.batches.TestBatches#test300MBLazy

@jjfumero
Member

OpenCL and SPIR-V are passing.

@jjfumero left a comment
Member

The code looks good to me. Let's investigate the error with the PTX backend.

@mairooni
Collaborator Author

There was a bug in the PTXMemorySegmentWrapper class; I just pushed a fix. Please check again.

@jjfumero
Member

Thanks. It works now.

@mikepapadim left a comment
Member

LGTM. Can we document this in another PR?

@jjfumero merged commit 4204e3b into beehive-lab:develop on Apr 16, 2024
2 checks passed
@mairooni deleted the fix/batch_copyout branch on April 16, 2024, 08:04
@jjfumero added a commit to jjfumero/TornadoVM that referenced this pull request on Apr 30, 2024
Improvements
~~~~~~~~~~~~~~~~~~

- [beehive-lab#369](beehive-lab#369): Introduction of Tensor types in TornadoVM API and interoperability with ONNX Runtime.
- [beehive-lab#370](beehive-lab#370): Array concatenation operation for TornadoVM native arrays.
- [beehive-lab#371](beehive-lab#371): TornadoVM installer script ported for Windows 10/11.
- [beehive-lab#372](beehive-lab#372): Add support for ``HalfFloat`` (``Float16``) in vector types.
- [beehive-lab#374](beehive-lab#374): Support for TornadoVM array concatenations from the constructor-level.
- [beehive-lab#375](beehive-lab#375): Support for TornadoVM native arrays using slices from the Panama API.
- [beehive-lab#376](beehive-lab#376): Support for lazy copy-outs in the batch processing mode.
- [beehive-lab#377](beehive-lab#377): Expand the TornadoVM profiler with power metrics for NVIDIA GPUs (OpenCL and PTX backends).
- [beehive-lab#384](beehive-lab#384): Auto-closable Execution Plans for automatic memory management.

Compatibility
~~~~~~~~~~~~~~~~~~

- [beehive-lab#386](beehive-lab#386): OpenJDK 17 support removed.
- [beehive-lab#390](beehive-lab#390): SapMachine OpenJDK 21 supported.
- [beehive-lab#395](beehive-lab#395): OpenJDK 22 and GraalVM 22.0.1 supported.
- TornadoVM tested with Apple M3 chips.

Bug Fixes
~~~~~~~~~~~~~~~~~~

- [beehive-lab#367](beehive-lab#367): Fix for Graal/Truffle languages in which some Java modules were not visible.
- [beehive-lab#373](beehive-lab#373): Fix for data copies of the ``HalfFloat`` types for all backends.
- [beehive-lab#378](beehive-lab#378): Fix free memory markers when running multi-thread execution plans.
- [beehive-lab#379](beehive-lab#379): Refactoring package of vector api unit-tests.
- [beehive-lab#380](beehive-lab#380): Fix event list sizes to accommodate profiling of large applications.
- [beehive-lab#385](beehive-lab#385): Fix code check style.
- [beehive-lab#387](beehive-lab#387): Fix TornadoVM internal events in OpenCL, SPIR-V and PTX for running multi-threaded execution plans.
- [beehive-lab#388](beehive-lab#388): Fix of expected and actual values of tests.
- [beehive-lab#392](beehive-lab#392): Fix installer for using existing JDKs.
- [beehive-lab#389](beehive-lab#389): Fix ``DataObjectState`` for multi-thread execution plans.
- [beehive-lab#396](beehive-lab#396): Fix JNI code for the CUDA NVML library access with OpenCL.