Add expected behavior to unified memory #3617

matyas-streamhpc · 2024-09-19T09:12:16Z

No description provided.

docs/how-to/hip_runtime_api/memory_management/unified_memory.rst

randyh62 · 2024-10-30T16:59:29Z

docs/how-to/hip_runtime_api.rst

+********************************************************************************
+
+The HIP runtime API provides C and C++ functionalities to manage event, stream,
+and memory on GPUs. On AMD ROCm software, the HIP runtime uses :doc:`Common


Suggested change

and memory on GPUs. On AMD ROCm software, the HIP runtime uses :doc:`Common

and memory on GPUs. On AMD ROCm software, the HIP runtime uses :doc:`Compute

Apparently CLR is supposed to be Compute Language Runtime

randyh62 · 2024-10-30T17:17:10Z

docs/how-to/hip_runtime_api/memory_management.rst

+The runtime offers functions for allocating, freeing, and copying device memory,
+along with transferring data between host and device memory.
+
+Here are the various memory management techniques:


It seems like these techniques could use a brief description?

randyh62 · 2024-10-30T17:35:43Z

docs/how-to/hip_runtime_api/memory_management/host_memory.rst

+.. figure:: ../../../data/how-to/hip_runtime_api/memory_management/pageable_pinned.svg
+
+The pageable and pinned memory allow you to exercise direct control over
+memory operations, which is known as explicit memory management. When using the


In memory_management.rst we list four memory management techniques, none of which is "explicit memory management". Is it possible to relate this explicit memory management to one of the four techniques? Or should it be listed as one of the techniques?

randyh62 · 2024-10-30T18:37:25Z

docs/how-to/hip_runtime_api/memory_management/host_memory.rst

+      // Run the kernel
+      // ...
+
+      HIP_CHECK(hipMemcpy(device_input, host_input, element_number * sizeof(int), hipMemcpyHostToDevice));


This doesn't seem right to me. We have device_output and host_output, shouldn't those come into play here?

randyh62 · 2024-10-30T19:41:54Z

docs/how-to/hip_runtime_api/memory_management/host_memory.rst

+Pinned memory
+================================================================================
+
+Pinned memory or page-locked memory is stored in pages that are locked in specific sectors in RAM and can't be migrated. The pointer can be used on both


Seems like we should note that pinned memory is allocated by hipMalloc() and hipHostMalloc().

randyh62 · 2024-10-30T19:42:45Z

docs/how-to/hip_runtime_api/memory_management/host_memory.rst

+      // Run the kernel
+      // ...
+
+      HIP_CHECK(hipMemcpy(device_input, host_input, element_number * sizeof(int), hipMemcpyHostToDevice));


Shouldn't this be DeviceToHost?

randyh62 · 2024-10-30T19:57:40Z

docs/how-to/hip_runtime_api/memory_management/device_memory.rst

+as HBM2e. Device memory can be allocated as global memory, constant, texture or
+surface memory.
+
+Global memory


I suggest making this a bulleted list rather than four separate subheads.

docs/how-to/hip_runtime_api/memory_management/device_memory.rst

randyh62 · 2024-10-30T23:39:20Z

docs/how-to/hip_runtime_api/memory_management/coherence_control.rst

+:sup:`1` The :cpp:func:`hipHostMalloc` memory allocation coherence mode can be 
+affected by the ``HIP_HOST_COHERENT`` environment variable, if the
+``hipHostMallocCoherent``, ``hipHostMallocNonCoherent``, and
+``hipHostMallocMapped`` are unset. If neither these flags nor the


Here the footnote to the table references hipHostMallocMapped flag, but the table above does not show this flag. This also is mentioned in the following note.

docs/how-to/hip_runtime_api/memory_management/virtual_memory.rst

randyh62 · 2024-11-01T03:03:44Z

docs/how-to/hip_runtime_api/memory_management/virtual_memory.rst

+functions. To unmap the memory, use :cpp:func:`hipMemUnmap`. To release the
+virtual address range, use :cpp:func:`hipMemAddressFree`.  Finally, to release
+the physical memory, use :cpp:func:`hipMemRelease`. A side effect of these
+functions is the lack of synchronization when memory is released. If you call


If you call :cpp:func:hipFree when you have multiple streams running in parallel, it
synchronizes the device. This causes worse resource usage and performance.

This seems like it could be an important note?

docs/how-to/hip_runtime_api/memory_management/virtual_memory.rst

randyh62 · 2024-11-01T03:32:34Z

docs/how-to/hip_runtime_api/memory_management/stream_ordered_allocator.rst

On line 24: - Learning curve: Requires additional effort to understand and utilize SOMA effectively.

randyh62

added comments

randyh62

Looks good to me.

g-h-c · 2024-11-26T08:06:11Z

I think many people would be caught up by the new/delete example not working, as its ability to work is not only dependent on the device attributes but also the kernel HMM support. I would add some comments to the actual code, e.g.

// for this example to work the environment variable HSA_XNACK needs to be set to 1, and the kernel needs HMM support

We can potentially indicate that a way to check for HMM support is setting the environment variable, AMD_LOG_LEVEL e.g. to 3 or 4. For instance on my instinct accelerator it prints:

:3:rocdevice.cpp :1790: 10080897639915 us: HMM support: 1, XNACK: 1, Direct host access: 0

when set to 3.
But this last thing about AMD_LOG_LEVEL does not necessarily belong to this page

MKKnorr · 2024-11-26T12:13:32Z

Thanks for the review, I agree with your points.

I think many people would be caught up by the new/delete example not working, as its ability to work is not only dependent on the device attributes but also the kernel HMM support. I would add some comments to the actual code, e.g.

I clarified that a bit more under the system requirements, and moved the first mention of HMM support there.

// for this example to work the environment variable HSA_XNACK needs to be set to 1, and the kernel needs HMM support

Added a comment similar to this to the example

We can potentially indicate that a way to check for HMM support is setting the environment variable, AMD_LOG_LEVEL e.g. to 3 or 4. For instance on my instinct accelerator it prints:

:3:rocdevice.cpp :1790: 10080897639915 us: HMM support: 1, XNACK: 1, Direct host access: 0

when set to 3. But this last thing about AMD_LOG_LEVEL does not necessarily belong to this page

Agreed, this is a bit overkill, as AMD_LOG_LEVEL=3 also prints a lot of other stuff, which might make it hard to look for the HMM and XNACK support. Also similar information is listed on https://rocm.docs.amd.com/en/latest/conceptual/gpu-memory.html#xnack, which is being linked several times in this document.

matyas-streamhpc self-assigned this Sep 19, 2024

matyas-streamhpc requested review from gandryey, mangupta and rakesroy as code owners September 19, 2024 09:12

neon60 marked this pull request as draft September 19, 2024 09:43

neon60 force-pushed the update-unified-memory branch from da1233f to 267b2f2 Compare September 26, 2024 10:23

neon60 force-pushed the docs/develop branch from 98f9599 to 73814d1 Compare September 26, 2024 10:30

neon60 force-pushed the update-unified-memory branch 3 times, most recently from 7e27ba9 to 84f9097 Compare September 29, 2024 17:13

neon60 force-pushed the docs/develop branch from 0049c2b to e7486be Compare September 30, 2024 19:25

neon60 force-pushed the update-unified-memory branch 4 times, most recently from 934e4d3 to 4baf272 Compare October 9, 2024 09:59

neon60 force-pushed the update-unified-memory branch 3 times, most recently from 546b8d9 to 6244fc2 Compare October 15, 2024 05:37

neon60 changed the base branch from docs/develop to hip_runtime_api_how-to October 15, 2024 08:23

neon60 marked this pull request as ready for review October 15, 2024 08:23

neon60 requested review from chrispaquot and saleelk as code owners October 15, 2024 08:23

neon60 requested a review from MKKnorr October 15, 2024 08:24

MKKnorr force-pushed the update-unified-memory branch from f4ab47d to 74ebf79 Compare October 25, 2024 11:03

neon60 reviewed Oct 25, 2024

View reviewed changes

docs/how-to/hip_runtime_api/memory_management/unified_memory.rst Outdated Show resolved Hide resolved

neon60 approved these changes Oct 25, 2024

View reviewed changes

neon60 force-pushed the hip_runtime_api_how-to branch from 38d2787 to 8d5b771 Compare October 30, 2024 09:13

neon60 force-pushed the update-unified-memory branch from 6879e02 to ab44e41 Compare October 31, 2024 13:16

neon60 changed the base branch from hip_runtime_api_how-to to docs/develop October 31, 2024 14:30

randyh62 reviewed Oct 31, 2024

View reviewed changes

randyh62 reviewed Nov 1, 2024

View reviewed changes

docs/how-to/hip_runtime_api/memory_management/virtual_memory.rst Show resolved Hide resolved

randyh62 reviewed Nov 1, 2024

View reviewed changes

docs/how-to/hip_runtime_api/memory_management/virtual_memory.rst Show resolved Hide resolved

randyh62 reviewed Nov 1, 2024

View reviewed changes

neon60 force-pushed the docs/develop branch from de21da8 to 68b03fe Compare November 5, 2024 07:49

neon60 force-pushed the update-unified-memory branch from 78c8a5a to aa3584c Compare November 5, 2024 07:56

neon60 changed the base branch from docs/develop to hip_runtime_api_how-to November 5, 2024 07:57

neon60 force-pushed the update-unified-memory branch from aa3584c to 7c6949e Compare November 5, 2024 08:10

neon60 force-pushed the hip_runtime_api_how-to branch 2 times, most recently from eb41cb1 to 7cb026c Compare November 7, 2024 12:00

MKKnorr force-pushed the update-unified-memory branch 2 times, most recently from 0c4abec to 07a7fdc Compare November 8, 2024 10:41

randyh62 approved these changes Nov 8, 2024

View reviewed changes

MKKnorr force-pushed the update-unified-memory branch from 07a7fdc to af9064b Compare November 15, 2024 09:29

neon60 force-pushed the hip_runtime_api_how-to branch from 0c52420 to 31fa125 Compare November 18, 2024 16:16

neon60 deleted the branch docs/develop November 18, 2024 17:39

neon60 closed this Nov 18, 2024

neon60 reopened this Nov 18, 2024

MKKnorr changed the base branch from hip_runtime_api_how-to to docs/develop November 19, 2024 09:25

MKKnorr force-pushed the update-unified-memory branch 3 times, most recently from 890f0a5 to bbb00d0 Compare November 25, 2024 09:44

MKKnorr force-pushed the update-unified-memory branch from bbb00d0 to 24856f9 Compare November 26, 2024 12:00

Docs: Update unified memory documentation

b6da83e

MKKnorr force-pushed the update-unified-memory branch from 24856f9 to b6da83e Compare November 26, 2024 12:48

MKKnorr merged commit 923f80a into docs/develop Nov 26, 2024
4 checks passed

neon60 deleted the update-unified-memory branch December 5, 2024 12:23

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add expected behavior to unified memory #3617

Add expected behavior to unified memory #3617

matyas-streamhpc commented Sep 19, 2024

randyh62 Oct 30, 2024

randyh62 Oct 30, 2024

randyh62 Oct 30, 2024

randyh62 Oct 30, 2024

randyh62 Oct 30, 2024

randyh62 Oct 30, 2024

randyh62 Oct 30, 2024

randyh62 Oct 30, 2024

randyh62 Oct 30, 2024

randyh62 Nov 1, 2024

randyh62 Nov 1, 2024

randyh62 left a comment

randyh62 left a comment

g-h-c commented Nov 26, 2024

MKKnorr commented Nov 26, 2024

	and memory on GPUs. On AMD ROCm software, the HIP runtime uses :doc:`Common
	and memory on GPUs. On AMD ROCm software, the HIP runtime uses :doc:`Compute

Add expected behavior to unified memory #3617

Add expected behavior to unified memory #3617

Conversation

matyas-streamhpc commented Sep 19, 2024

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

randyh62 left a comment

Choose a reason for hiding this comment

randyh62 left a comment

Choose a reason for hiding this comment

g-h-c commented Nov 26, 2024

MKKnorr commented Nov 26, 2024