tf32 vs fp32 fix (#157)

I originally made this part of my "tf32 vs fp32" PR, but somehow it is no longer there. It is essential for letting a user choose between tf32 and fp32. Without it, our internal CI fails with numerical issues, since the vanilla matmuls run in fp32 but grouped GEMM incorrectly runs in tf32.

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
pyg-team · Dec 5, 2022 · 08d02e9
1 parent a767c5b
Showing 1 changed file with 1 addition and 0 deletions.
diff --git a/pyg_lib/csrc/ops/cuda/matmul_kernel.cu b/pyg_lib/csrc/ops/cuda/matmul_kernel.cu
@@ -2,6 +2,7 @@
 #include <ATen/cuda/CUDAContext.h>
 #include <cutlass/util/host_tensor.h>
 #include <torch/library.h>
+#include <torch/version.h>
 #include "cutlass/cutlass.h"
 #include "cutlass/gemm/device/gemm_grouped.h"
 #include "cutlass/gemm/device/gemm_universal.h"
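
The hunk above only shows the new include, not how it is used. One plausible reading, offered here as an assumption rather than as the actual kernel code, is that `<torch/version.h>` supplies `TORCH_VERSION_MAJOR` / `TORCH_VERSION_MINOR` so the kernel can pick which global precision setting to consult: the float32 matmul precision setting introduced in PyTorch 1.12, or the older allow-TF32 flag on earlier releases. The sketch below models only that decision logic. The helper names and the `PrecisionSettings` struct are hypothetical, and the fallback version macros exist solely so the snippet compiles without PyTorch installed.

```cpp
// Use the real header when building against PyTorch; otherwise fall back to
// placeholder values so this sketch compiles standalone (assumption).
#if defined(__has_include)
#if __has_include(<torch/version.h>)
#include <torch/version.h>
#endif
#endif
#ifndef TORCH_VERSION_MAJOR
#define TORCH_VERSION_MAJOR 1   // placeholder, not a real build setting
#define TORCH_VERSION_MINOR 13  // placeholder, not a real build setting
#endif

// Hypothetical helper: true when the float32 matmul precision setting
// (added in PyTorch 1.12) is available for the given version.
constexpr bool has_float32_matmul_precision(int major, int minor) {
  return major > 1 || (major == 1 && minor >= 12);
}

// Hypothetical stand-in for the two global settings a real kernel would read,
// presumably via at::globalContext() (assumption, not shown in the diff).
struct PrecisionSettings {
  bool allow_tf32_legacy;     // older allow-TF32 flag
  bool precision_is_highest;  // newer setting equals "highest" (full fp32)
};

// Decide whether the grouped GEMM should run in tf32 instead of fp32.
constexpr bool use_tf32(int major, int minor, PrecisionSettings s) {
  return has_float32_matmul_precision(major, minor) ? !s.precision_is_highest
                                                    : s.allow_tf32_legacy;
}

// Compile-time sanity checks on the version gate.
static_assert(has_float32_matmul_precision(1, 12), "1.12 has the new setting");
static_assert(!has_float32_matmul_precision(1, 11), "1.11 predates it");
```

In a real build, the version check would more likely be a preprocessor `#if` on the macros, since the newer setting does not exist at all in older headers and a runtime branch would not compile there. The `constexpr` form is used here only to keep the logic testable in isolation.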