Tiled matrix multiplication example demonstrating shared memory api and usage #25

arhik · 2024-03-27T04:22:01Z

Shared memory can now be declared inside a kernel. Both the element type and the length must be given, which differs from CUDA.jl.
A private memory interface could follow the same lines.

Using shared memory is now possible. The worked example is tiled matrix multiplication. It makes the shared memory usage clear, but it is about the interface rather than correctness: the multiplication itself still needs work.
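Since the multiplication in the kernel still needs work, a CPU reference for the same tiling scheme may help when checking it. The following is a minimal sketch in plain Julia, not the WGPUCompute.jl kernel from this PR: `tiled_matmul` and `TILE` are hypothetical names, and the two scratch matrices only stand in for shared memory by having a fixed element type and a fixed length.

```julia
# CPU-only sketch of tiled matrix multiplication in plain Julia.
# `tiled_matmul` and `TILE` are hypothetical names; nothing here is WGPUCompute.jl API.

const TILE = 4  # tile edge length; plays the role of the workgroup size

function tiled_matmul(A::Matrix{T}, B::Matrix{T}) where {T}
    m, k = size(A)
    k2, n = size(B)
    k == k2 || throw(DimensionMismatch("inner dimensions must match"))
    C = zeros(T, m, n)

    # Scratch tiles with a fixed element type and length, standing in for shared memory.
    tileA = zeros(T, TILE, TILE)
    tileB = zeros(T, TILE, TILE)

    for i0 in 1:TILE:m, j0 in 1:TILE:n       # one (i0, j0) block ~ one workgroup
        for p0 in 1:TILE:k                    # walk tiles along the inner dimension
            # Load phase: copy one tile of A and one of B, zero-padding past the edges.
            for ti in 1:TILE, tj in 1:TILE
                i, p = i0 + ti - 1, p0 + tj - 1
                tileA[ti, tj] = (i <= m && p <= k) ? A[i, p] : zero(T)
                p, j = p0 + ti - 1, j0 + tj - 1
                tileB[ti, tj] = (p <= k && j <= n) ? B[p, j] : zero(T)
            end
            # A GPU kernel would need a workgroup barrier between the two phases.
            # Compute phase: accumulate the partial products held in the tiles.
            for ti in 1:TILE, tj in 1:TILE
                i, j = i0 + ti - 1, j0 + tj - 1
                (i <= m && j <= n) || continue
                acc = zero(T)
                for tp in 1:TILE
                    acc += tileA[ti, tp] * tileB[tp, tj]
                end
                C[i, j] += acc
            end
        end
    end
    return C
end
```

Comparing `tiled_matmul(A, B)` against `A * B` on small matrices whose sizes are not multiples of `TILE` exercises the edge handling, which is where tiled kernels tend to go wrong.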

Update tiled_matmul_kernel.jl: this example demonstrates the shared memory API and its usage.

arhik

LGTM

arhik added 3 commits March 26, 2024 21:36

partial shared memory template

78bd53d

Merge branch 'JuliaWGPU:main' into main

187fe99

Update tiled_matmul_kernel.jl

e933882

arhik commented Mar 27, 2024

arhik merged commit 176e3dd into JuliaWGPU:main Mar 27, 2024
3 checks passed
