-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[feat] Query Memory Consumption and expanded profiler with memory consumption #448
Conversation
tornado-api/src/main/java/uk/ac/manchester/tornado/api/profiler/ProfilerType.java
Outdated
Show resolved
Hide resolved
public long getTotalBytesTransferred() { | ||
return taskGraph.getTotalBytesTransferred(); | ||
} | ||
|
||
public long getTotalDeviceMemoryUsage() { | ||
return taskGraph.getTotalDeviceMemoryUsage(); | ||
} | ||
|
||
public long getCurrentMemoryUsage() { | ||
return taskGraph.getCurrentMemoryUsage(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's remove the public
access modifier for those methods, since we do not want to expose them from the TaskGraph.
public long getTotalBytesTransferred() { | ||
return taskGraphImpl.getTotalBytesTransferred(); | ||
} | ||
|
||
public long getTotalDeviceMemoryUsage() { | ||
return taskGraphImpl.getTotalDeviceMemoryUsage(); | ||
} | ||
|
||
public long getCurrentMemoryUsage() { | ||
return taskGraphImpl.getCurrentMemoryUsage(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, let's remove the public for those methods.
@@ -514,6 +524,14 @@ long getDeviceKernelTime() { | |||
return immutableTaskGraphList.stream().map(ImmutableTaskGraph::getDeviceKernelTime).mapToLong(Long::longValue).sum(); | |||
} | |||
|
|||
long getTotalBytesCopyIn() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add Java docs to describe what is the functionality of each method. The description should make clear when each method should be used.
@@ -124,6 +124,7 @@ __TEST_THE_WORLD__ = [ | |||
TestEntry("uk.ac.manchester.tornado.unittests.api.TestDevices"), | |||
TestEntry("uk.ac.manchester.tornado.unittests.tensors.TestTensorTypes"), | |||
TestEntry("uk.ac.manchester.tornado.unittests.tensors.TestTensorAPIWithOnnx"), | |||
TestEntry("uk.ac.manchester.tornado.unittests.memory.TestStressDeviceMemory"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we add the new unit-test?
MemoryConsumptionTest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
} | ||
|
||
@Override | ||
public synchronized int deallocate(DeviceBufferState deviceBufferState) { | ||
public synchronized long deallocate(DeviceBufferState deviceBufferState) { | ||
long memoryRegionDellocated = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
refactor the variable to deallocatedSpace
to keep consistency with the allocate
method. Also, let's use the same variable names across all backends:
PTXTornadoDevice
, SPIRVTornadoDevice
.
} | ||
|
||
@Override | ||
public synchronized int deallocate(DeviceBufferState deviceBufferState) { | ||
public synchronized long deallocate(DeviceBufferState deviceBufferState) { | ||
long spaceDeallocated = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactor variable to deallocatedSpace
} | ||
|
||
@Override | ||
public synchronized int deallocate(DeviceBufferState deviceBufferState) { | ||
public synchronized long deallocate(DeviceBufferState deviceBufferState) { | ||
long spaceDeallocated = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refactor variable to deallocatedSpace
…r/ProfilerType.java Co-authored-by: Thanos Stratikopoulos <34061419+stratika@users.noreply.github.com>
…' into feat/query/memory
All comments applied |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @jjfumero. LGTM
Description
This PR expands the profiler and the TornadoVM Execution Plan API to obtain the current memory consumption per task-graph, total number of bytes transferred to the device back and forth per execution plan, and the total memory usage. This feature is requested from the GAIA project.
a)
getTotalBytesTransferred
This expands the profiler to obtain the total number of bytes transferred in every execution plan launch. The number of bytes transferred depends on READ_ONLY, WRITE_ONLY and READ_WRITE data buffers.
b)
getTotalDeviceMemoryUsage
The total device memory usage registers all allocations needed to run an execution plan. This value can be queried using the TornadoVM profiler.
c)
getCurrentMemoryUsage
Value to query at the execution plan level, to obtain the actual memory usage at any point during execution.
Problem description
n/ a.
Backend/s tested
Mark the backends affected by this PR.
OS tested
Mark the OS where this PR is tested.
Did you check on FPGAs?
If it is applicable, check your changes on FPGAs.
How to test the new patch?