Memory Resource / View #158

mzient · 2021-04-24T19:14:34Z

Design: NVIDIA/cccl#967 (proposal from a comment)

This work builds on #105

The resource is defined in terms of memory kind.
resource_view (a glorified pointer with elaborate conversion logic) is defined in terms of the properties of the allocated memory. This makes it possible, for example, to define a function that expects a resource_view<memory_access::host> and accepts both memory_resource<host> and memory_resource<managed>, even though these types are unrelated.
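A minimal sketch of that idea, not the PR's implementation: every name in namespace `sketch` is hypothetical, the kind-to-property table is an assumption, and all resources are backed by `malloc` so the example runs without CUDA. It shows a view that converts from a resource only when the resource's kind provides every requested property.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>

namespace sketch {

// Memory kind tags, named after the kinds listed in this PR.
namespace memory_kind { struct host; struct device; struct pinned; struct managed; }

// Illustrative property tags; the PR's real property set is richer.
namespace memory_access { struct host; struct device; }

// Hypothetical, conservative kind -> property table.
template <typename Kind, typename Property>
struct kind_has_property : std::false_type {};
template <> struct kind_has_property<memory_kind::host,    memory_access::host>   : std::true_type {};
template <> struct kind_has_property<memory_kind::pinned,  memory_access::host>   : std::true_type {};
template <> struct kind_has_property<memory_kind::pinned,  memory_access::device> : std::true_type {};
template <> struct kind_has_property<memory_kind::managed, memory_access::host>   : std::true_type {};
template <> struct kind_has_property<memory_kind::managed, memory_access::device> : std::true_type {};
template <> struct kind_has_property<memory_kind::device,  memory_access::device> : std::true_type {};

// True when Kind provides every property in the list.
template <typename Kind, typename... Properties>
struct kind_has_all : std::true_type {};
template <typename Kind, typename Head, typename... Tail>
struct kind_has_all<Kind, Head, Tail...>
    : std::integral_constant<bool, kind_has_property<Kind, Head>::value &&
                                       kind_has_all<Kind, Tail...>::value> {};

// Kind-agnostic interface that the view points at.
struct resource_base {
  virtual ~resource_base() = default;
  virtual void* allocate(std::size_t bytes) = 0;
  virtual void deallocate(void* ptr, std::size_t bytes) = 0;
};

// Stand-in resource: every kind uses malloc/free in this sketch.
template <typename Kind>
struct memory_resource : resource_base {
  void* allocate(std::size_t bytes) override { return std::malloc(bytes); }
  void deallocate(void* ptr, std::size_t) override { std::free(ptr); }
};

// Converts only from resources whose kind has all requested properties.
template <typename... Properties>
class resource_view {
public:
  template <typename Kind,
            typename = typename std::enable_if<
                kind_has_all<Kind, Properties...>::value>::type>
  resource_view(memory_resource<Kind>* resource) : ptr_(resource) {}

  resource_base* operator->() const { return ptr_; }

private:
  resource_base* ptr_;
};

// Accepts any resource whose memory is host-accessible.
inline bool fill_on_host(resource_view<memory_access::host> view, std::size_t bytes) {
  assert(bytes > 0);
  auto* data = static_cast<unsigned char*>(view->allocate(bytes));
  if (data == nullptr) return false;
  for (std::size_t i = 0; i < bytes; ++i) data[i] = 0x2a;
  const bool ok = data[bytes - 1] == 0x2a;
  view->deallocate(data, bytes);
  return ok;
}

}  // namespace sketch
```

With this table, `fill_on_host` accepts `&host_resource` and `&managed_resource`, while a `memory_resource<memory_kind::device>*` does not convert to the view at all.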

Summary (from #105):
* cuda::memory_kind (namespace)
  * Groups kinds of memory allocated
  * Possible values: device, managed, pinned, host
* cuda::memory_resource<memory_kind, context>
  * Synchronous (de)allocation of storage of the specified memory_kind
* cuda::stream_ordered_memory_resource<memory_kind>
  * Asynchronous, stream-ordered (de)allocation of storage of the specified memory_kind
  * The semantics of stream ordered memory allocation are as defined here
  * stream_ordered_memory_resource inherits from memory_resource and provides a default implementation of allocate/deallocate by allocating on the default stream and synchronizing.
* cuda::stream_view
  * A non-owning wrapper for cudaStream_t
* cuda::cuda_error
  * Exception thrown on CUDA runtime API errors
* throw_on_cuda_error utility for checking result of CUDA runtime API calls
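The default synchronous behaviour described for stream_ordered_memory_resource can be sketched as follows. This is an illustration under stated assumptions, not the PR's code: the `stream_view` here is a stand-in that only counts synchronizations, the alignment parameter is omitted, and `malloc_async_resource` is a hypothetical concrete resource.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

namespace sketch {

// Stand-in for cuda::stream_view. A real wrapper would hold a cudaStream_t
// and synchronize it; this one only counts calls to wait().
struct stream_view {
  int* waits;
  void wait() const {
    assert(waits != nullptr);
    ++*waits;
  }
};

// Synchronous interface.
class memory_resource {
public:
  virtual ~memory_resource() = default;
  void* allocate(std::size_t bytes) { return do_allocate(bytes); }
  void deallocate(void* ptr, std::size_t bytes) { do_deallocate(ptr, bytes); }

private:
  virtual void* do_allocate(std::size_t bytes) = 0;
  virtual void do_deallocate(void* ptr, std::size_t bytes) = 0;
};

// Stream-ordered interface with a default synchronous implementation:
// allocate on the default stream, then synchronize that stream.
class stream_ordered_memory_resource : public memory_resource {
public:
  void* allocate_async(std::size_t bytes, stream_view stream) {
    return do_allocate_async(bytes, stream);
  }
  void deallocate_async(void* ptr, std::size_t bytes, stream_view stream) {
    do_deallocate_async(ptr, bytes, stream);
  }
  int default_stream_waits() const { return default_stream_waits_; }

private:
  void* do_allocate(std::size_t bytes) override {
    stream_view stream{&default_stream_waits_};  // plays the default stream
    void* ptr = do_allocate_async(bytes, stream);
    stream.wait();
    return ptr;
  }
  void do_deallocate(void* ptr, std::size_t bytes) override {
    stream_view stream{&default_stream_waits_};
    do_deallocate_async(ptr, bytes, stream);
    stream.wait();
  }
  virtual void* do_allocate_async(std::size_t bytes, stream_view stream) = 0;
  virtual void do_deallocate_async(void* ptr, std::size_t bytes, stream_view stream) = 0;

  int default_stream_waits_ = 0;
};

// Hypothetical concrete resource; malloc/free stand in for stream-ordered calls.
class malloc_async_resource final : public stream_ordered_memory_resource {
  void* do_allocate_async(std::size_t bytes, stream_view) override {
    return std::malloc(bytes);
  }
  void do_deallocate_async(void* ptr, std::size_t, stream_view) override {
    std::free(ptr);
  }
};

}  // namespace sketch
```

A concrete resource only has to provide the asynchronous pair; calling `allocate` or `deallocate` on it goes through the default stream and synchronizes once per call.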

New (wrt #105):
* cuda::basic_resource_view<resource_pointer, properties...> - a pointer-like object that can be used in place of memory_resource and is parameterized in terms of memory properties instead of memory kind.
* cuda::memory_resource_base, cuda::stream_ordered_memory_resource_base - base classes with a common interface for all memory resources regardless of their memory kind. The bases are private so that the user can't blindly declare a function parameter as cuda::memory_resource_base and unsafely pass a resource of any kind there. Instead, resource_view should be used.
* resource_view, stream_ordered_resource_view - template aliases for basic_resource_view, with pointers to the base resource types substituted for resource_pointer. These types are friends of (stream_ordered_)memory_resource, so they are allowed to access the private base class. There's no way to obtain the base pointer except by calling operator->, which is intended for exposing the resource interface.
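The private-base-plus-friend arrangement can be illustrated with a small sketch. The names below are hypothetical and the view is deliberately not parameterized on properties; the point is only that user code cannot convert a resource pointer to the base, while the friend view can.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <type_traits>

namespace sketch {

namespace memory_kind { struct host; }

namespace detail {
// Common interface shared by every resource, regardless of memory kind.
class resource_base {
public:
  virtual ~resource_base() = default;
  virtual void* allocate(std::size_t bytes) = 0;
  virtual void deallocate(void* ptr, std::size_t bytes) = 0;
};
}  // namespace detail

template <typename Kind>
class memory_resource;

// Non-template view, for brevity.
class resource_view {
public:
  // The derived-to-private-base conversion compiles only because
  // resource_view is a friend of memory_resource<Kind>.
  template <typename Kind>
  resource_view(memory_resource<Kind>* resource) : base_(resource) {}

  // The only way to reach the base interface.
  detail::resource_base* operator->() const { return base_; }

private:
  detail::resource_base* base_;
};

template <typename Kind>
class memory_resource : private detail::resource_base {
  friend class resource_view;

public:
  void* allocate(std::size_t bytes) override { return std::malloc(bytes); }
  void deallocate(void* ptr, std::size_t) override { std::free(ptr); }
};

// Outside the friend, the base is inaccessible, so this conversion is rejected.
static_assert(!std::is_convertible<memory_resource<memory_kind::host>*,
                                   detail::resource_base*>::value,
              "user code cannot obtain the base pointer directly");

// Exercises the resource interface through the view.
inline bool round_trip(resource_view view, std::size_t bytes) {
  assert(bytes > 0);
  void* ptr = view->allocate(bytes);
  if (ptr == nullptr) return false;
  view->deallocate(ptr, bytes);
  return true;
}

}  // namespace sketch
```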

TODO: Add tests for resource_view

harrism

I ❤️ resource_view. It provides the missing je ne sais quoi that I could feel but not articulate in memory_resource with just memory_kind.

I have some concerns on some of the kind and property naming, nothing major. And a few typo / doc improvements.

harrism · 2021-06-02T10:57:03Z

include/cuda/memory_resource

+namespace memory_kind {
+  /*!
+   * \brief Ordinary host memory
+   */
+  struct host;
+
+  /*!
+   * \brief Device memory, as allocated by cudaMalloc.
+   */
+  struct device;
+
+  /*!
+   * \brief Device-accessible host memory.
+   */
+  struct pinned;
+
+  /*!
+   * \brief Virtual memory that is automatically migrated between the host and devices.
+   */
+  struct managed;
+};


If we are going to call non-pageable host memory "pinned", I would like to discuss whether we should call it host_pinned. Similarly, should we call non-pageable device memory device_pinned? Since the advent of CUDA unified memory, I believe internally our engineers use this terminology to a certain extent.

A similar discussion is probably warranted for "managed". (While the APIs call it "managed", official CUDA terminology is "Unified Memory".) But since "managed" reflects CUDA API names (unlike "pinned"), it's probably OK.

I have no strong opinions on pinned/host_pinned.
I'd keep device to spare our users some typing.
Regarding managed - well, unified is a bit dangerous due to possible confusion with UVA.

harrism · 2021-06-02T11:00:29Z

include/cuda/memory_resource

+struct oversubscribable;
+
+/*!
+ * \brief A memory property tag type indicating that the memory has a backing physical
+ *        storage in the target location at all times.
+ */
+struct resident;


I don't like that these two things which are opposites do not have names indicating they are opposites. Would the names "virtual_memory" and "physical_memory" work?

Virtual memory and physical memory have a very strong association with the layers in memory management to me, and I think we should avoid that.

I really like "resident"; is there a word that hints at being the opposite of it somehow? Maybe "resident" and "pageable" or something like that would make a good pair? (Though admittedly the strict meaning of "pageable" isn't really what people associate with it the most, so it's probably also a problematic word to use here.)

include/cuda/memory_resource

jrhemstad · 2021-06-02T12:57:34Z

include/cuda/memory_resource

+}
+
+
+class memory_resource_base {


memory_resource_base should probably be in detail::. I don't believe we expect users to be interacting with this class directly.

jrhemstad · 2021-06-02T13:18:56Z

include/cuda/memory_resource

+  virtual void do_deallocate(void *__mem, size_t __bytes, size_t __alignment) = 0;
+};
+
+class stream_ordered_memory_resource_base : public virtual memory_resource_base {


Same here, detail namespace.

jrhemstad · 2021-06-02T13:20:16Z

include/cuda/memory_resource

+template <typename _MemoryKind, typename _Context = any_context>
+class memory_resource : private virtual memory_resource_base, private detail::__get_context_impl<_Context> {
+public:
+  using memory_kind = _MemoryKind;


Wasn't there an issue with name collision of using memory_kind for the typedef here? I recall I just used kind.

There was a type/value clash when memory_kind was an enum. Now that it is a tag type, the problem is gone.

include/cuda/memory_resource

harrism · 2021-07-22T11:07:48Z

Must you force push? As you can see, doing so disassociates the comment history from the code.

mzient · 2021-07-26T13:39:47Z

> Must you force push? As you can see, doing so disassociates the comment history from the code.

Sorry... I did it only because I rebased on main prior to merging the workaround from #183

wmaxey · 2021-07-26T17:04:04Z

> > Must you force push? As you can see, doing so disassociates the comment history from the code.

> Sorry... I did it only because I rebased on main prior to merging the workaround from #183

Yeah, this is an unfortunate drawback to GitHub's PR workflow. We should probably set up some kind of etiquette or something for how we should handle this in the future.

harrism · 2021-07-27T01:32:30Z

Generally, rebasing rather than merging is problematic with GitHub. If you are worried about keeping the history clean, the repo should use squash merge commits when merging PRs.

harrism · 2021-08-03T04:59:50Z

@mzient did you see my comment here: #158 (comment)

I ask because the code I mentioned still segfaults with your latest commits.

// segfault
if (cuda::resource_view<cuda::memory_access::device>{nullptr} ==
      cuda::resource_view<cuda::memory_access::device>{nullptr})
    printf("equal\n");

The fix I suggested in that comment stops the segfault. But do we want two nullptr views to be considered equal, or unequal?
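One way to make the comparison safe for null views is to compare the raw pointers before dereferencing anything. The sketch below is an assumption about how that could look, with hypothetical types; it also picks one answer to the open question by treating two null views as equal.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Stand-in resource; equality defaults to identity.
struct resource {
  virtual ~resource() = default;
  virtual bool is_equal(const resource& other) const noexcept {
    return this == &other;
  }
};

class resource_view {
public:
  resource_view(std::nullptr_t = nullptr) noexcept : ptr_(nullptr) {}
  resource_view(resource* r) noexcept : ptr_(r) {}

  explicit operator bool() const noexcept { return ptr_ != nullptr; }

  friend bool operator==(const resource_view& a, const resource_view& b) noexcept {
    if (a.ptr_ == b.ptr_) return true;     // same resource, or both null
    if (!a.ptr_ || !b.ptr_) return false;  // exactly one is null
    return a.ptr_->is_equal(*b.ptr_);      // both non-null: safe to dereference
  }
  friend bool operator!=(const resource_view& a, const resource_view& b) noexcept {
    return !(a == b);
  }

private:
  resource* ptr_;
};

}  // namespace sketch
```

If null views should instead compare unequal, only the first branch changes: it would have to check for null before testing pointer identity.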

harrism

Couple comments on __do_as_kind.

include/cuda/memory_resource

harrism

Missing an inline keyword.

include/cuda/detail/__cuda_util

gonzalobg · 2021-10-15T10:31:43Z

include/cuda/memory_resource

+  /*!
+   * \brief Device memory, as allocated by cudaMalloc.
+   */
+  struct device;


I'm not sure about the wording.

host is device accessible on P9+V100, so this should explain what the difference is there between host and pinned

device is "pinned" to a device, and is host accessible in P9+V100

I think our matrix is:

where does the memory reside? host, device, migratable

which threads can access the memory? host, device, both

So IIUC, the current:

host -> host resident, host accessible in general, but host resident, both accessible in P9+V100

device -> device resident, device accessible in general, but device resident, both accessible in P9+V100

pinned -> host resident, both accessible - always

managed -> migratable, both accessible - always

I'm not really sure we need or should expose all four. Maybe these three would suffice?

managed -> migratable, both accessible - always

host_pinned -> host resident, both accessible - always

device_pinned -> device resident, both accessible on P9+V100, otherwise device-accessible only

Alternatively, maybe we could expose the full matrix:

template <MemoryResidency, MemoryAccessible> struct memory_resource;

with

enum class MemoryResidency { Host, Device, Migratable };
enum class MemoryAccessible { Host, Device, Both };

@gonzalobg We've been there - we can't have multiple parameters describing memory resources. That's why we have resource_view.
Regarding what happens on a particular system architecture - unless we can know it at compile time, there's nothing we can do and we have to assign the traits conservatively.
If you look more closely at resource_view properties, you'll find memory_location, memory_access and more.

include/cuda/memory_resource

harrism · 2021-10-27T22:17:52Z

> > > Must you force push? As you can see, doing so disassociates the comment history from the code.

> > Sorry... I did it only because I rebased on main prior to merging the workaround from #183

> Yeah, this is an unfortunate drawback to GitHub's PR workflow. We should probably set up some kind of etiquette or something for how we should handle this in the future.

Really need to start merging rather than rebasing, and then use squash merging into the main/release branch. Then you get the clean git history benefit of rebasing without screwing up the PR history and disassociating reviews from the code. FWIW, this is what RAPIDS does and it works great (we didn't used to use squash merges, so our git history was messy. But now we do and all is good).

harrism · 2021-11-16T00:21:31Z

include/cuda/memory_resource

+
+
+/*!
+ * \brief Evaluaes to the a memory kind tag form _Properties (if listed as is_kind)


Suggested change

- * \brief Evaluaes to the a memory kind tag form _Properties (if listed as is_kind)
+ * \brief Evaluates to the a memory kind tag from _Properties (if listed as is_kind)

harrism · 2021-11-24T04:13:09Z

include/cuda/memory_resource

+  using __kind_from_properties_helper_t = typename __kind_from_properties_helper<_Properties...>::type;
+
+  template <typename _Kind, typename... _Tail>
+  struct __kind_from_properties_helper<is_kind<_Kind>, _Tail...> {


This code does not compile for me anymore after this latest change -- is_kind is not defined until later in the file.

deps/libcudacxx-src/include/cuda/memory_resource:129:40: error: 'is_kind' was not declared in this scope
  129 |   struct __kind_from_properties_helper<is_kind<_Kind>, _Tail...> {
      |                                        ^~~~~~~
_deps/libcudacxx-src/include/cuda/memory_resource:129:53: error: template argument 1 is invalid
  129 |   struct __kind_from_properties_helper<is_kind<_Kind>, _Tail...> {
      |                                                     ^
_deps/libcudacxx-src/include/cuda/memory_resource:149:3: error: wrong number of template arguments (2, should be 3)
  149 |   >;

Fix for this appears to be to move the struct is_kind definition above this.
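A reduced sketch of the ordering problem, using simplified hypothetical names rather than the PR's reserved identifiers. It assumes a forward declaration of `is_kind` is acceptable as an alternative to moving the whole definition, since a partial specialization only needs the template to be declared in order to name `is_kind<Kind>`.

```cpp
#include <type_traits>

namespace sketch {

// Declared before first use. Without this line, the partial specialization
// below fails with "'is_kind' was not declared in this scope".
template <typename Kind>
struct is_kind;

namespace detail {

// No is_kind<...> in the list: the kind is void.
template <typename... Properties>
struct kind_from_properties_helper {
  using type = void;
};

// Skip a property that is not is_kind<...>.
template <typename Head, typename... Tail>
struct kind_from_properties_helper<Head, Tail...>
    : kind_from_properties_helper<Tail...> {};

// Found is_kind<Kind>: this more specialized match wins.
template <typename Kind, typename... Tail>
struct kind_from_properties_helper<is_kind<Kind>, Tail...> {
  using type = Kind;
};

}  // namespace detail

template <typename... Properties>
using kind_from_properties_t =
    typename detail::kind_from_properties_helper<Properties...>::type;

// The definition can stay later in the file.
template <typename Kind>
struct is_kind {};

// Stand-in tags for the checks below.
struct host;
struct some_property;

static_assert(std::is_same<kind_from_properties_t<some_property, is_kind<host>>,
                           host>::value,
              "kind is extracted from is_kind");
static_assert(std::is_same<kind_from_properties_t<some_property>, void>::value,
              "no is_kind listed yields void");

}  // namespace sketch
```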

harrism · 2021-11-24T04:23:51Z

include/cuda/memory_resource

+struct kind_from_properties {
+  using type = _CUDA_VSTD::conditional_t<
+    _CUDA_VSTD::is_same<detail::__kind_from_properties_helper_t<_Properties...>, void>,
+    detail::__kind_from_properties_helper_t<_Properties...>
+  >;


Not sure how this is supposed to compile. conditional_t takes 3 template parameters, not two, and the first one is a non-type parameter (so use is_same_v). Unfortunately I don't know what the T parameter for conditional_t should be (the case where the properties list is void). Should it be void as well?:

Suggested change

-struct kind_from_properties {
-  using type = _CUDA_VSTD::conditional_t<
-    _CUDA_VSTD::is_same<detail::__kind_from_properties_helper_t<_Properties...>, void>,
-    detail::__kind_from_properties_helper_t<_Properties...>
-  >;
+struct kind_from_properties {
+  using type = _CUDA_VSTD::conditional_t<
+    _CUDA_VSTD::is_same_v<detail::__kind_from_properties_helper_t<_Properties...>, void>,
+    void,
+    detail::__kind_from_properties_helper_t<_Properties...>
+  >;

It compiles if I write it as above, but I'm not sure if it's correct.
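A standalone sketch of the three-argument form, written against `std::` rather than `_CUDA_VSTD::` and with a hypothetical alias name. It uses `is_same<...>::value`, which is equivalent to the C++17 `is_same_v` shorthand, so it also builds as C++14. One observation that follows from the checks: with `void` as the second argument, this expression maps every type to itself, so it is equivalent to using the helper type directly. Whether that is the intended meaning is left open in the thread.

```cpp
#include <type_traits>

namespace sketch {

struct host;  // stand-in memory kind tag

// conditional_t takes three arguments: a bool value, the type chosen when
// it is true, and the type chosen when it is false.
template <typename Helper>
using kind_or_void =
    std::conditional_t<std::is_same<Helper, void>::value, void, Helper>;

static_assert(std::is_same<kind_or_void<void>, void>::value,
              "void stays void");
static_assert(std::is_same<kind_or_void<host>, host>::value,
              "a kind passes through unchanged");

}  // namespace sketch
```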

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

…red_resource. Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

harrism · 2021-12-13T22:51:44Z

include/cuda/stream_view

+  /**
+   * \brief Synchronizes the wrapped stream.
+   *
+   * \throws cuda::cuda_error if synchronization fails.
+   *
+   */
+  void wait() const {
+    detail::__throw_on_cuda_error(::cudaStreamSynchronize(get()),
+                                  "Failed to synchronize stream.");
+  }


We should consider adding a non-throwing version of this (wait_no_throw()?) -- there is sometimes a need to sync a stream in a destructor of another class.

Let's hold off on this for the moment. @brycelelbach and I are discussing alternative ways for you to opt into not having a function throw.
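For context, the destructor use case could look like the sketch below. Everything here is hypothetical: `status_t` stands in for `cudaError_t`, the synchronization is faked so the example runs without CUDA, and returning a status code is only one possible shape for a non-throwing variant. The thread leaves the actual opt-out mechanism undecided.

```cpp
#include <cassert>
#include <stdexcept>

namespace sketch {

using status_t = int;            // stand-in for cudaError_t
constexpr status_t success = 0;  // stand-in for cudaSuccess

struct stream_view {
  status_t sync_result;  // what the fake synchronization reports

  // Hypothetical non-throwing variant: failure is reported by return value.
  status_t wait_no_throw() const noexcept { return sync_result; }

  // Throwing variant, layered on the non-throwing one.
  void wait() const {
    if (wait_no_throw() != success) {
      throw std::runtime_error("Failed to synchronize stream.");
    }
  }
};

// A class that must synchronize in its destructor. Destructors are noexcept
// by default, so an exception escaping wait() would terminate the program.
class stream_user {
public:
  stream_user(stream_view stream, status_t* last_status)
      : stream_(stream), last_status_(last_status) {
    assert(last_status != nullptr);
  }
  ~stream_user() { *last_status_ = stream_.wait_no_throw(); }

private:
  stream_view stream_;
  status_t* last_status_;
};

}  // namespace sketch
```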

raphaelreinauer · 2022-03-26T22:27:57Z

I wanted to ask what the current status is. I want to use the memory resource for my project.

jrhemstad · 2023-02-20T16:19:59Z

Superseded by #309

mzient requested review from brycelelbach, griwes and jrhemstad April 24, 2021 19:14

mzient mentioned this pull request Apr 25, 2021

Memory resource / view jrhemstad/libcudacxx#2

Open

harrism suggested changes Jun 2, 2021

jrhemstad reviewed Jun 2, 2021

mzient force-pushed the memres_view branch 2 times, most recently from 5e19b23 to 319db3e June 7, 2021 17:27

mzient requested a review from jrhemstad June 7, 2021 17:28

jrhemstad mentioned this pull request Jun 11, 2021

[FEA] Add host_buffer class rapidsai/rmm#260

Open

harrism suggested changes Jun 22, 2021

harrism suggested changes Jun 22, 2021

harrism reviewed Jul 14, 2021

mzient force-pushed the memres_view branch from 1c64895 to 9110638 July 21, 2021 13:12

brycelelbach added this to the 2.0.0 milestone Jul 26, 2021

wmaxey mentioned this pull request Jul 28, 2021

How to use features requiring RTTI and exceptions? #183

Closed

harrism suggested changes Aug 4, 2021

harrism suggested changes Aug 5, 2021

mzient mentioned this pull request Aug 31, 2021

Use libcu++ interfaces. NVIDIA/DALI#3297

Merged

23 tasks

jrhemstad mentioned this pull request Sep 20, 2021

Using external memory allocator with PyTorch pytorch/pytorch#43144

Closed

mzient force-pushed the memres_view branch from b630b87 to 863f11a September 24, 2021 16:44

gonzalobg reviewed Oct 15, 2021

harrism suggested changes Oct 20, 2021

mzient added 14 commits October 27, 2021 10:33

Fix problems with resource view. Add tests for resource_view.

720ae7e

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Add a test checking that the interface of the resource pointer is properly propagated to the resource view via operator->.

91b7abc

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Review issues + comparison

698a502

* Fix typos.
* Move base interfaces to namespace detail.
* Add resource_view comparison.
* Add a view_resource with default property (is_kind).

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Fix pmr wrapper.

696be93

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Fix clang build warnings.

fda1658

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Fix comparison - step 1.

c8dbdf8

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Workaround dynamic cast for private bases.

b43ce65

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

* Allow comparing views of null resources

af7bd53

* Remove superfluous override
* Make __do_as_kind final

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Add missing inline and visibility specifiers to `__throw_on_cuda_error`.

377a1e6

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Add missing inline.

2285a65

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Fix C++17 build without PMR headers.

64728e0

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Enable comparison between resource_views and memory_resource pointers.

45aeec4

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Remove compatibility check from view comparison.

484c23d

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Add memory_resource reference comparison.

03ee4f8

Add operator bool in resource view. Disallow construction of resource_view from integer (0). Add context to as_kind.

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

mzient force-pushed the memres_view branch from 3ae680e to 03ee4f8 October 27, 2021 08:39

Remove context.

67624c5

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

harrism suggested changes Nov 11, 2021

Remove unused typedefs.

0a6b39e

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

harrism reviewed Nov 16, 2021

harrism reviewed Nov 24, 2021

mzient added 3 commits December 7, 2021 15:59

Add property_list.

545d87b

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Rename resource_view to resource_ptr.

199d584

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

Move memory resource to namespace pmr. Drop 'memory' from stream_ordered_resource.

64f087f

Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>

harrism reviewed Dec 13, 2021

jrhemstad mentioned this pull request Aug 30, 2022

Implement {async_}resource_ref #309

Merged

3 tasks

jrhemstad closed this Feb 20, 2023
