Support for loading values through the texture cache (__ldg) #158
Perhaps something like |
Yeah I think I am going to try out the |
Hi, let me ask this: will this change make it possible to do something like the tex2d function? |
Sorry, no. This just exposes the texture cache; proper texture memory isn't supported yet. |
But will it enable one to read device array memory, and then do the interpolation by
hand, with performance comparable to texture fetches using tex2d?
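For concreteness, "interpolation by hand" on a plain device array could look roughly like the sketch below. It is only an illustration under stated assumptions: the names `LOAD_RO`, `HOST_DEVICE`, `clamp_index` and `bilinear` are hypothetical and not part of this PR or any library, the array is assumed to be row-major, and out-of-range coordinates are simply clamped to the edge. It does not claim to reproduce the coordinate conventions or filtering precision of hardware texture fetches, and it says nothing about relative performance. When compiled with nvcc for compute capability 3.5 or later, the loads go through `__ldg`; elsewhere they fall back to ordinary loads.

```cpp
#include <math.h>

// Hypothetical helper macro: in device code on sm_35+ route the load through
// the read-only (texture) cache with __ldg, otherwise do a plain load.
#if defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 350
#define LOAD_RO(ptr) __ldg(ptr)
#else
#define LOAD_RO(ptr) (*(ptr))
#endif

// Hypothetical helper macro: lets the same function build with nvcc or a
// plain C++ compiler.
#if defined(__CUDACC__)
#define HOST_DEVICE __host__ __device__
#else
#define HOST_DEVICE
#endif

// Clamp an index into [0, n - 1] (assumed edge handling for this sketch).
HOST_DEVICE inline int clamp_index(int i, int n)
{
    return i < 0 ? 0 : (i >= n ? n - 1 : i);
}

// Bilinear interpolation of a row-major width x height array at the
// fractional position (x, y), given in element units.
// The array must not be written to while the kernel runs, because the
// read-only cache is not coherent with global-memory writes.
HOST_DEVICE inline float bilinear(const float* data, int width, int height,
                                  float x, float y)
{
    const float fx = floorf(x);
    const float fy = floorf(y);
    const float tx = x - fx;  // horizontal weight in [0, 1)
    const float ty = y - fy;  // vertical weight in [0, 1)

    const int x0 = clamp_index((int)fx, width);
    const int x1 = clamp_index((int)fx + 1, width);
    const int y0 = clamp_index((int)fy, height);
    const int y1 = clamp_index((int)fy + 1, height);

    // Four neighbouring samples, each read through the read-only path.
    const float v00 = LOAD_RO(&data[y0 * width + x0]);
    const float v10 = LOAD_RO(&data[y0 * width + x1]);
    const float v01 = LOAD_RO(&data[y1 * width + x0]);
    const float v11 = LOAD_RO(&data[y1 * width + x1]);

    // Interpolate along x on both rows, then along y between the rows.
    const float top = v00 + tx * (v10 - v00);
    const float bottom = v01 + tx * (v11 - v01);
    return top + ty * (bottom - top);
}
```

On the host, calling `bilinear(img, 2, 2, 0.5f, 0.5f)` on the 2x2 array `{0, 1, 2, 3}` evaluates to 1.5, the average of the four samples.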
|
Sorry, missed your reply.
Hard to answer: I don't know how this interpolation is performed in hardware (I can't find much on the web). From a quick glance it looks like
For my own reference (if I ever have time to implement this feature):
|
I'm not convinced by the design, this probably needs to be part of a buffer hierarchy. But let's keep it as a part of the history, for future reference.
Thanks for the thoughtful answer. I myself am trying to play with the ldg branch, though I'm still dealing with other unrelated issues. I'll post any results that I get. |
Great. Since I'm not too happy about the way this hooks into |
This branch packs a couple of improvements, as well as initial support for `__ldg` for loading values through the texture cache. See the CUDA docs for more info, but in short: the texture cache is faster than the global cache (which caches global values automatically), but it is a non-coherent cache, which implies that the array should be read-only for the entire duration of the kernel.

This PR only features the compiler support, no real front-end yet. Ideas/suggestions? Keyword argument to `getindex`? Constness typevar for CuArray/CuDeviceArray? Even though we'd need proper alias analysis (ref JuliaLang/julia#25890), this should already be usable in e.g. non-mutating `broadcast`.

cc @vchuravy @MikeInnes