Commit
tf32 vs fp32 fix (#157)
I originally made this part of my tf32 vs fp32 PR, but somehow it is
no longer there. It is an essential part of letting a user choose
between tf32 and fp32. Without it, our internal CI fails with numerical
issues, since the vanilla matmuls run in fp32 but the grouped GEMM
incorrectly runs in tf32.
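
To illustrate why mixing the two modes can trip a numerical comparison, here is a minimal host-side sketch. It is not the pyg-lib or CUTLASS code path. The helpers `to_tf32`, `dot_fp32`, and `dot_tf32` are hypothetical names, and the sketch assumes TF32 inputs can be approximated by truncating an IEEE-754 float to its top 10 explicit mantissa bits (real hardware may round rather than truncate).

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper: approximate TF32 input precision by clearing the
// low 13 of the 23 explicit mantissa bits of a float (simple truncation).
inline float to_tf32(float x) {
  std::uint32_t bits;
  std::memcpy(&bits, &x, sizeof(bits));
  bits &= 0xFFFFE000u;  // keep sign, exponent, and top 10 mantissa bits
  float out;
  std::memcpy(&out, &bits, sizeof(out));
  return out;
}

// Reference dot product with full fp32 inputs and fp32 accumulation.
inline float dot_fp32(const float* a, const float* b, int n) {
  float acc = 0.0f;
  for (int i = 0; i < n; ++i) acc += a[i] * b[i];
  return acc;
}

// Same dot product, but each input is first reduced to TF32-like precision;
// accumulation stays in fp32.
inline float dot_tf32(const float* a, const float* b, int n) {
  float acc = 0.0f;
  for (int i = 0; i < n; ++i) acc += to_tf32(a[i]) * to_tf32(b[i]);
  return acc;
}
```

Under these assumptions, an input such as `1.0f + 2^-11` survives in `dot_fp32` but collapses to `1.0f` in `dot_tf32`, so a test comparing one path against the other with a tight fp32 tolerance can fail.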

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
puririshi98 and pre-commit-ci[bot] authored Dec 5, 2022
1 parent a767c5b commit 08d02e9
Showing 1 changed file with 1 addition and 0 deletions.
1 change: 1 addition & 0 deletions pyg_lib/csrc/ops/cuda/matmul_kernel.cu
@@ -2,6 +2,7 @@
 #include <ATen/cuda/CUDAContext.h>
 #include <cutlass/util/host_tensor.h>
 #include <torch/library.h>
+#include <torch/version.h>
 #include "cutlass/cutlass.h"
 #include "cutlass/gemm/device/gemm_grouped.h"
 #include "cutlass/gemm/device/gemm_universal.h"