[Triton SMEM] Add not-yet-landed usage of Triton SMEM feature with autotuning #72

plotfi · 2024-08-22T06:36:06Z

NOTE: This is an experimental draft. Do not review.

This change requires a private patchset that is not yet available outside of plotfi/triton#4.

This patch uses shared memory, via the tl.local_copy and tl.gather operations, for the TW (time bias) and PW (position bias) tensors in the forward pass kernel.

Autotuning is also hooked up to these shared memory operators.
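
For readers without access to the private patchset, the general idea can be sketched in plain Python with no Triton dependency. Everything here is an assumption made for illustration: `bias_direct`, `bias_staged`, `run`, `autotune`, and the `use_smem` config key are hypothetical names, and the local list only stands in for shared memory. It is not the tl.local_copy / tl.gather API and not the code in this patch.

```python
import time

def bias_direct(table, idx):
    # Baseline path: read each bias value straight from the "global" table.
    return [table[i] for i in idx]

def bias_staged(table, idx):
    # Staged path: copy the table once into a local buffer (a stand-in for
    # shared memory), then gather from that buffer by index.
    local = list(table)
    return [local[i] for i in idx]

def run(config, table, idx):
    # Dispatch on the hypothetical `use_smem` flag of a config.
    fn = bias_staged if config["use_smem"] else bias_direct
    return fn(table, idx)

def autotune(configs, table, idx, repeats=5):
    # Time every candidate config and return the fastest one.
    best_cfg, best_time = None, float("inf")
    for cfg in configs:
        start = time.perf_counter()
        for _ in range(repeats):
            run(cfg, table, idx)
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_cfg, best_time = cfg, elapsed
    return best_cfg

# Hypothetical search space: the same lookup with and without staging.
CONFIGS = [{"use_smem": False}, {"use_smem": True}]
table = [0.1 * k for k in range(64)]        # toy bias table (e.g. TW or PW)
idx = [(7 * k) % 64 for k in range(1024)]   # toy bucket indices
best = autotune(CONFIGS, table, idx)
```

Both paths return the same values, so the flag changes only where the table is read from. That is what makes it safe to expose as a tuning knob and let measured time decide.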


facebook-github-bot added the CLA Signed label Aug 22, 2024
