
[MatMul] Pipe through global memory cache ops as additional scheduler options #1978

Merged: 5 commits into misc_minor_codgen_change from cache_op_interface on Feb 9, 2023

Conversation

@shmsong commented on Sep 13, 2022

This PR starts integrating assembly-level cache operators into LoadStoreOps. The most useful ones in matmul kernels are probably ca, cg, and cs, so for now it starts with cg and ca on cp.async (a minimal sketch follows the TODO list).

TODO:

Would need some massive cleanup to enable this on a wider scale:

  1. Extend LoadStoreOp to support ld.global and st.global, regardless of vectorization.
  2. Could maybe still use UnaryOp::SET as an alias for backward compatibility.
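For reference, here is what these cache operators look like at the PTX level. This is a hand-written illustration, not code generated by this PR, and the helper names are made up. In PTX, ca caches at all levels (L1 and L2), cg caches only at L2 (bypassing L1), and cs marks the access as streaming (evict-first). A minimal CUDA sketch of cp.async with the ca/cg operators, plus a cache-hinted ld.global of the kind TODO item 1 refers to:

```cuda
// Illustrative only: hand-written PTX wrappers for the cache operators
// discussed above (not this PR's codegen). cp.async requires sm_80+,
// e.g. nvcc -arch=sm_80 -c cache_ops_sketch.cu

// cp.async with .ca: the 16-byte copy is cached at all levels (L1 and L2).
__device__ inline void cp_async_ca_16B(void* smem_dst, const void* gmem_src) {
  unsigned smem = static_cast<unsigned>(__cvta_generic_to_shared(smem_dst));
  asm volatile("cp.async.ca.shared.global [%0], [%1], 16;\n"
               :: "r"(smem), "l"(gmem_src));
}

// cp.async with .cg: cached at L2 only, bypassing L1 (16-byte copies only).
__device__ inline void cp_async_cg_16B(void* smem_dst, const void* gmem_src) {
  unsigned smem = static_cast<unsigned>(__cvta_generic_to_shared(smem_dst));
  asm volatile("cp.async.cg.shared.global [%0], [%1], 16;\n"
               :: "r"(smem), "l"(gmem_src));
}

// Commit the outstanding cp.async group and wait for it to complete.
__device__ inline void cp_async_wait_all() {
  asm volatile("cp.async.commit_group;\n");
  asm volatile("cp.async.wait_group 0;\n");
}

// Cache-hinted global load (ld.global with the .cg operator, L2 only),
// the kind of op TODO item 1 would extend LoadStoreOp to cover.
__device__ inline float ld_global_cg(const float* gmem_src) {
  float v;
  asm volatile("ld.global.cg.f32 %0, [%1];" : "=f"(v) : "l"(gmem_src));
  return v;
}
```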

@shmsong changed the title from "WIP: [Not ready for review] Pipe through global memory cache ops as additional scheduler options" to "Pipe through global memory cache ops as additional scheduler options" on Sep 21, 2022
@naoyam (Collaborator) left a comment:


LGTM

@csarofeen changed the title from "Pipe through global memory cache ops as additional scheduler options" to "[MatMul] Pipe through global memory cache ops as additional scheduler options" on Oct 19, 2022
@zasdfgbnm zasdfgbnm mentioned this pull request Feb 8, 2023
@zasdfgbnm (Collaborator) commented:

This PR has been cherry-picked into devel.

@zasdfgbnm merged commit 93c2ad4 into misc_minor_codgen_change on Feb 9, 2023
@zasdfgbnm deleted the cache_op_interface branch on Feb 9, 2023 at 05:45