Matmul benchmarking: case without tile quantization #1980
Conversation
Force-pushed from 97c6ee9 to d0c3fc9
Force-pushed from da4b1b7 to 4d8daea
@@ -20,6 +20,7 @@ if(USE_CUDA)
softmax_backward.cpp
Note to myself: I have split this file out and merged it separately in #2007.
This file is no longer needed here.
benchmarks/cpp/nvfuser/matmul.cpp
Outdated
@@ -0,0 +1,356 @@
#include <torch/csrc/jit/codegen/cuda/arith.h>
Note to myself: I have split this file out and merged it separately in #2007.
This file is no longer needed here.
After rebasing, this PR is just a trivial PR adding a test; I will merge it now at the bottom of the stack.
This is the benchmarking PR in this series; it tracks the performance achieved by this stack of PRs.
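For context on the PR title: tile quantization happens when a matmul's problem dimensions are not exact multiples of the kernel's tile size, so the tiles along the edges of the output are only partially filled and waste compute; the "case without tile quantization" benchmarks sizes where every tile is full. The sketch below is illustrative only, not the PR's benchmark code: the 128x128x32 CTA tile shape and the `noTileQuantization` helper are assumptions.

```cpp
// Minimal sketch, NOT the PR's actual benchmark code. The CTA tile shape
// (128 x 128 x 32) is assumed for illustration only.
#include <cstdio>

struct TileShape {
  int m, n, k;
};

// A problem size suffers no tile quantization when every dimension is an
// exact multiple of the tile, so no tile along an edge is partially full.
bool noTileQuantization(int M, int N, int K, const TileShape& cta) {
  return M % cta.m == 0 && N % cta.n == 0 && K % cta.k == 0;
}

int main() {
  const TileShape cta{128, 128, 32};  // assumed tile shape

  // 2048 x 3456 x 2048 divides evenly: every 128x128 output tile is full.
  std::printf("2048x3456x2048: %s\n",
              noTileQuantization(2048, 3456, 2048, cta) ? "no quantization"
                                                        : "quantized");

  // 2050 x 3456 x 2048 leaves a partially filled row of tiles along M.
  std::printf("2050x3456x2048: %s\n",
              noTileQuantization(2050, 3456, 2048, cta) ? "no quantization"
                                                        : "quantized");
  return 0;
}
```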
Most recent run on A100: