-
Notifications
You must be signed in to change notification settings - Fork 187
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I ❤️ resource_view
. It provides the missing je ne sais quoi that I could feel but not articulate in memory_resource
with just memory_kind
.
I have some concerns on some of the kind and property naming, nothing major. And a few typo / doc improvements.
include/cuda/memory_resource
Outdated
namespace memory_kind { | ||
/*! | ||
* \brief Ordinary host memory | ||
*/ | ||
struct host; | ||
|
||
/*! | ||
* \brief Device memory, as allocated by cudaMalloc. | ||
*/ | ||
struct device; | ||
|
||
/*! | ||
* \brief Device-accessible host memory. | ||
*/ | ||
struct pinned; | ||
|
||
/*! | ||
* \brief Virtual memory that is automatically migrated between the host and devices. | ||
*/ | ||
struct managed; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are going to call non-pageable host memory "pinned", I would like to discuss whether we should call it host_pinned
. Similarly, should we call non-pageable device memory device_pinned
? Since the advent of CUDA unified memory, I believe internally our engineers use this terminology to a certain extent.
A similar discussion is probably warranted for "managed". (While the APIs call it "managed", official CUDA terminology is "Unified Memory".) But since "managed" reflects CUDA API names (unlike "pinned"), it's probably OK.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no strong opinions on pinned/host_pinned.
I'd keep device to spare our users some typing.
Regarding managed - well, unified
is a bit dangerous due to possible confusion with UVA.
struct oversubscribable; | ||
|
||
/*! | ||
* \brief A memory property tag type indicating that the memory has a backing physical | ||
* storage in the target location at all times. | ||
*/ | ||
struct resident; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like that these two things which are opposites do not have names indicating they are opposites. Would the names "virtual_memory" and "physical_memory" work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Virtual memory and physical memory have a very strong association with the layers in memory management to me, and I think we should avoid that.
I really like "resident"; is there a word that hints at being the opposite of it somehow? Maybe "resident" and "pageable" or something like that would make a good pair? (Though admittedly the strict meaning of "pageable" isn't really what people associate with it the most, so it's probably also a problematic word to use here.)
} | ||
|
||
|
||
class memory_resource_base { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
memory_resource_base
should probably be in detail::
. I don't believe we expect users to be interacting with this class directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
include/cuda/memory_resource
Outdated
virtual void do_deallocate(void *__mem, size_t __bytes, size_t __alignment) = 0; | ||
}; | ||
|
||
class stream_ordered_memory_resource_base : public virtual memory_resource_base { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here, detail namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
template <typename _MemoryKind, typename _Context = any_context> | ||
class memory_resource : private virtual memory_resource_base, private detail::__get_context_impl<_Context> { | ||
public: | ||
using memory_kind = _MemoryKind; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wasn't there an issue with name collision of using memory_kind
for the typedef here? I recall I just used kind
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a type/value clash when memory_kind was an enum. Now that it is a tag type, the problem is gone.
5e19b23
to
319db3e
Compare
Must you force push? As you can see, doing so disassociates the comment history from the code. |
Sorry... I did it only because I rebased on main prior to merging the workaround from #183 |
Yeah, this is an unfortunate drawback to Github's PR workflow. We should probably setup some kind of etiquette or something for how we should handle this in the future. |
Generally rebasing rather than merging is problematic with github. If you are worried about keeping the history clean, the repo should use squash merge commits when merging PRs. |
@mzient did you see my comment here: #158 (comment) I ask because the code I mentioned still segfaults with your latest commits.
The fix I suggested in that comment stops the segfault. But do we want two nullptr views to be considered equal, or unequal? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple comments on __do_as_kind.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing an inline keyword.
/*! | ||
* \brief Device memory, as allocated by cudaMalloc. | ||
*/ | ||
struct device; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about the wording.
host
is device accessible on P9+V100, so this should explain what the difference is there betweenhost
andpinned
device
is "pinned" to a device, and is host accessible in P9+V100
I think our matrix is:
- where does the memory reside? host, device, migratable
- which threads can access the memory? host, device, both
So IIUC, the current:
- host -> host resident, host accessible in general, but host resident, both accessible in P9+V100
- device -> device resident, device accessible in general, but device resident, both accessible in P9+V100
- pinned -> host resident, both accessible - always
- managed -> migratable, both accessible - always
I'm not really sure we need or should expose all for. Maybe these three would suffice?
- managed -> migratable, both accessible - always
- host_pinned -> host resident, both accessible -always
- device_pinned -> device_resident, both accessible on P9+V100, otherwise device-accessible only
Alternatively, maybe we could expose the full matrix:
template <enum MemoryResidency, enum MemoryAccessible>
struct memory_resource;
with
enum class MemoryResidency {
Host,
Device,
Migratable
};
enum class Memory Accessible {
Host,
Device,
Both,
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gonzalobg We've been there - we can't have multiple parameters describing memory resources. That's why we have resource_view
.
Regarding what happens on a particular system architecture - unless we can know it at compile time, there's nothing we can do and we have to assign the traits conservatively.
If you look more closely at resource_view properties, you'll find memory_location
, memory_access
and more.
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
properly propagated to the resource view via operator->. Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
* Fix typos. * Move base interfaces to namespace detail. * Add resource_view comparison. * Add a view_resource with default property (is_kind). Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
* Remove superfluous override * Make __do_as_kind final Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
…ror`. Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Add operator bool in resource view. Disallow construction of resource_view from integer (0). Add context to as_kind. Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Really need to start merging rather than rebasing, and then use squash merging into the main/release branch. Then you get the clean git history benefit of rebasing without screwing up the PR history and disassociating reviews from the code. FWIW, this is what RAPIDS does and it works great (we didn't used to use squash merges, so our git history was messy. But now we do and all is good). |
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
include/cuda/memory_resource
Outdated
|
||
|
||
/*! | ||
* \brief Evaluaes to the a memory kind tag form _Properties (if listed as is_kind) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
* \brief Evaluaes to the a memory kind tag form _Properties (if listed as is_kind) | |
* \brief Evaluates to the a memory kind tag from _Properties (if listed as is_kind) |
include/cuda/memory_resource
Outdated
using __kind_from_properties_helper_t = typename __kind_from_properties_helper<_Properties...>::type; | ||
|
||
template <typename _Kind, typename... _Tail> | ||
struct __kind_from_properties_helper<is_kind<_Kind>, _Tail...> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This code does not compile for me anymore after this latest change -- is_kind
is not defined until later in the file.
deps/libcudacxx-src/include/cuda/memory_resource:129:40: error: 'is_kind' was not declared in this scope
129 | struct __kind_from_properties_helper<is_kind<_Kind>, _Tail...> {
| ^~~~~~~
_deps/libcudacxx-src/include/cuda/memory_resource:129:53: error: template argument 1 is invalid
129 | struct __kind_from_properties_helper<is_kind<_Kind>, _Tail...> {
| ^
_deps/libcudacxx-src/include/cuda/memory_resource:149:3: error: wrong number of template arguments (2, should be 3)
149 | >;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix for this appears to be to move the struct is_kind
definition above this.
include/cuda/memory_resource
Outdated
struct kind_from_properties { | ||
using type = _CUDA_VSTD::conditional_t< | ||
_CUDA_VSTD::is_same<detail::__kind_from_properties_helper_t<_Properties...>, void>, | ||
detail::__kind_from_properties_helper_t<_Properties...> | ||
>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure how this is supposed to compile. conditional_t
takes 3 template parameters, not two, and the first one is a non-type parameter (so use is_same_v
). Unfortunately I don't know what the T
parameter for conditional_t
should be (the case where the properties list is void). Should it be void as well?:
struct kind_from_properties { | |
using type = _CUDA_VSTD::conditional_t< | |
_CUDA_VSTD::is_same<detail::__kind_from_properties_helper_t<_Properties...>, void>, | |
detail::__kind_from_properties_helper_t<_Properties...> | |
>; | |
struct kind_from_properties { | |
using type = _CUDA_VSTD::conditional_t< | |
_CUDA_VSTD::is_same_v<detail::__kind_from_properties_helper_t<_Properties...>, void>, | |
void, | |
detail::__kind_from_properties_helper_t<_Properties...> | |
>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It compiles if I write it as above, but I'm not sure if it's correct.
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
…red_resource. Signed-off-by: Michał Zientkiewicz <mzient@gmail.com>
/** | ||
* \brief Synchronizes the wrapped stream. | ||
* | ||
* \throws cuda::cuda_error if synchronization fails. | ||
* | ||
*/ | ||
void wait() const { | ||
detail::__throw_on_cuda_error(::cudaStreamSynchronize(get()), | ||
"Failed to synchronize stream."); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should consider adding a non-throwing version of this (wait_no_throw()
?) -- there is sometimes a need to sync a stream in a destructor of another class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's hold off on this for the moment. @brycelelbach and I are discussing alternative ways for you to opt into not having a function throw.
I wanted to ask what the current status is. I want to use the memory resource for my project. |
Superseded by #309 |
Design: NVIDIA/cccl#967 (proposal from a comment)
This work builds on #105
The resource is defined in terms of memory kind.
resource_view
(a glorifieid pointer with elaborate conversion logic) is defined in terms of the properties of the memory allocated. This allows, for example, to define a function that expects aresource_view<memory_access::host>
to takememory_resource<host>
andmemory_resource<managed>
, even though these types are not related.Summary (from #105):
cuda::memory_kind
(namespace)Groups kinds of memory allocated
Possible values: device, managed, pinned, host
cuda::memory_resource<memory_kind, context>
Synchronous (de)allocation of storage of the specified memory_kind
cuda::stream_ordered_memory_resource<memory_kind>
Asynchronous, stream-ordered (de)allocation of storage of the specified memory_kind
The semantics of stream ordered memory allocation are as defined here
stream_ordered_memory_resource inherits from memory_resource and provides a default implementation of allocate/deallocate by allocating on the default stream and synchronizing.
cuda::stream_view
A non-owning wrapper for cudaStream_t
cuda::cuda_error
Exception thrown on CUDA runtime API errors
throw_on_cuda_error utility for checking result of CUDA runtime API calls
New (wrt #105):
cuda::basic_resource_view<resource_pointer, properties...>
a pointer-like object that can be used in place of memory_resource and pameterized in terms of memory properties instead of memory kind.cuda::memory_resource_base
,cuda::stream_ordered_memory_resource_base
- base classes with common interface for all memory resources regardless of their memory kind. The bases are private so that the user can't blindly declare a function parameter ascuda::memory_resource_base
and unsafely pass a resource of any kind there. Instead, resource_view should be used.resource_view
,stream_ordererd_resource_view
- template aliases forbasic_resource_view
, with pointers the base resource types substituted for the resource_pointer. These types are befriended with(stream_ordered_)memory_resource
and therefore the access to the private base class is allowed. There's no way to obtain the base pointer except for callingoperator->
which is intended for exposing the resource interface.TODO: Add tests for resource_view