-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implement LdgIterator abstraction #996
Conversation
Do we want to do a single PR that includes both the new LDG implementation and its use in |
I'd do both in the same PR. The iterator implementation should be ready for review, I'll update the collection. |
src/corecel/data/LdgIterator.hh
Outdated
{ | ||
using value_type = T; | ||
using pointer = std::add_pointer_t<value_type const>; | ||
using reference = std::add_lvalue_reference_t<value_type const>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since __ldg
should only ever return constant primitive type it should be cheaper to return a copy instead of const ref in most cases?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would think the compiler would optimize to the same thing if it returns an lvalue or not... but isn't the signature T __ldg(T*)
i.e. it always returns an rvalue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, a function call that returns a non-reference is rvalue. In that case:
T const& f(T const* p)
{
return __ldg(p);
}
I'd assume we bind a reference to temporary which results in UB?
src/corecel/data/LdgIterator.hh
Outdated
using pointer = std::add_pointer_t<value_type const>; | ||
using reference = value_type; | ||
|
||
CELER_CONSTEXPR_FUNCTION static reference read(pointer p) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this isn't trying to conform to any existing interface, perhaps LdgLoader
with operator()
? Accessing with an instance could make sense if we extend LDG loading to "read member data from a struct" (e.g., for struct T
, the loader could hold U T::*
as class data and read member data from it using ldg?) I think this and the traits class should be in data/detail/LdgIteratorImpl.hh
since they're more than a couple of lines.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's easier to make this class stateless otherwise the LdgIterator has to make sure they both stay consistent.
661e070
to
4589421
Compare
auto ldg_start = LdgIterator<TestId>(some_data.data()); | ||
auto ldg_end = LdgIterator<TestId>(some_data.data() + n); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe use make_ldg_iterator
in one place and CTAD in the other? And then add
EXPECT_TRUE((std::is_same_v<decltype(ldg_start), decltype(ldg_end)>));
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we use CTAD here? We have to somehow tell if the pointer points to regular builtin type or if it should reconstruct an OpaqueId.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤔 but that's just about how it's loaded under the hood right? Both should be LdgIterator<T>
and the under-the-hood loader handles the reconstruction implementation (and value type).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh wait, I see: your underlying data is size_type
not OpaqueId
. We have some collections of OpaqueId
itself, and I was thinking we __ldg
the underlying data. (Do we need to add a data()
accessor to OpaqueId, or is it permissible to use a cast?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see, if we want to do that we need the address of OpaqueId::value_
which we probably don't want to expose using an accessor, maybe declare the LdgLoader as friend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good idea. As I said, I'm fine postponing that stuff to another PR if it increases the scope too much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it's actually not that much change, done in f41f64a so it's probably fine to do it here. Maybe separate the collection update in another PR.
We could also postpone the OpaqueId stuff until later: being able to do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM now! Do you want to merge with the test as is (backing out the Collection header include) or implement an LdgSpan
for Collection<T, device, const_reference>
?
Also, do we want to require LdgIterator
work with LDG-compatible data (const and fundamental type) or fall back to not using LDG? I'm thinking the former.
Let's do the |
I'm done pushing to this branch unless you have more comments. |
Realised that I was missing comparison operators while working on the LdgSpan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great! Even though the const
requirement makes testing a little more awkward, it's the right thing to do from a device perspective: we never want to accidentally use LDG on non-const data.
Implement iterator for read-only device data using
__ldg
primitive