-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Memory corruption of Pascal Titan X for some shapes #8
Labels
Comments
Fixed in ba1520a. |
ptillet
pushed a commit
that referenced
this issue
Aug 21, 2017
Fix the bug of vector subscript out of range
ptillet
added a commit
that referenced
this issue
Sep 12, 2022
ThomasRaoux
pushed a commit
that referenced
this issue
Mar 13, 2024
There are two tests that failed under AddressSanitizer: * test/TritonGPU/loop-pipeline.mlir * python/test/regression/test_functional_regressions.py with an error: ``` ==8475==ERROR: AddressSanitizer: heap-use-after-free on address 0x50c000bd0be0 at pc 0x557b03278847 bp 0x7ffd69b2c4a0 sp 0x7ffd69b2c498 READ of size 8 at 0x50c000bd0be0 thread T0 #0 0x557b03278846 in getNextOperandUsingThisValue [third_party/llvm/llvm-project/mlir/include/mlir/IR/UseDefLists.h:43](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/mlir/include/mlir/IR/UseDefLists.h?l=43&ws=aliia/3018&snapshot=215):58 #1 0x557b03278846 in operator++ [third_party/llvm/llvm-project/mlir/include/mlir/IR/UseDefLists.h:322](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/mlir/include/mlir/IR/UseDefLists.h?l=322&ws=aliia/3018&snapshot=215):39 #2 0x557b03278846 in mlir::ResultRange::UseIterator::operator++() [third_party/llvm/llvm-project/mlir/lib/IR/OperationSupport.cpp:614](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/mlir/lib/IR/OperationSupport.cpp?l=614&ws=aliia/3018&snapshot=215):5 #3 0x557affde38c4 in operator++ [third_party/llvm/llvm-project/llvm/include/llvm/ADT/iterator.h:281](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/llvm/include/llvm/ADT/iterator.h?l=281&ws=aliia/3018&snapshot=215):5 #4 0x557affde38c4 in createAsyncCopy [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp:117](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp?l=117&ws=aliia/3018&snapshot=215):26 #5 0x557affde38c4 in createAsyncLoad [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp:135](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp?l=135&ws=aliia/3018&snapshot=215):3 #6 0x557affde38c4 in createAsynOps [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp:501](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp?l=501&ws=aliia/3018&snapshot=215):5 #7 0x557affde38c4 in mlir::triton::preProcessLoopAndGetSchedule(mlir::scf::ForOp&, int, mlir::triton::PipeliningOption&) [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp:740](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp?l=740&ws=aliia/3018&snapshot=215):7 #8 0x557affe01c0c in pipelineLoop [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/SoftwarePipeliner.cpp:76](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/SoftwarePipeliner.cpp?l=76&ws=aliia/3018&snapshot=215):19 ... ``` This is likely happening due to iterator being invalidated after `alloc.erase()`. This PR moves erases of allocations outside of a loop and fixes heap-use-after-free issue. Do you know if there is an easy way to run the tests under sanitizers upstream? It would be handy if we can automate it, so we catch this kind of errors early on.
htyu
pushed a commit
to htyu/triton
that referenced
this issue
Mar 20, 2024
There are two tests that failed under AddressSanitizer: * test/TritonGPU/loop-pipeline.mlir * python/test/regression/test_functional_regressions.py with an error: ``` ==8475==ERROR: AddressSanitizer: heap-use-after-free on address 0x50c000bd0be0 at pc 0x557b03278847 bp 0x7ffd69b2c4a0 sp 0x7ffd69b2c498 READ of size 8 at 0x50c000bd0be0 thread T0 #0 0x557b03278846 in getNextOperandUsingThisValue [third_party/llvm/llvm-project/mlir/include/mlir/IR/UseDefLists.h:43](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/mlir/include/mlir/IR/UseDefLists.h?l=43&ws=aliia/3018&snapshot=215):58 triton-lang#1 0x557b03278846 in operator++ [third_party/llvm/llvm-project/mlir/include/mlir/IR/UseDefLists.h:322](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/mlir/include/mlir/IR/UseDefLists.h?l=322&ws=aliia/3018&snapshot=215):39 triton-lang#2 0x557b03278846 in mlir::ResultRange::UseIterator::operator++() [third_party/llvm/llvm-project/mlir/lib/IR/OperationSupport.cpp:614](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/mlir/lib/IR/OperationSupport.cpp?l=614&ws=aliia/3018&snapshot=215):5 triton-lang#3 0x557affde38c4 in operator++ [third_party/llvm/llvm-project/llvm/include/llvm/ADT/iterator.h:281](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/llvm/include/llvm/ADT/iterator.h?l=281&ws=aliia/3018&snapshot=215):5 triton-lang#4 0x557affde38c4 in createAsyncCopy [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp:117](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp?l=117&ws=aliia/3018&snapshot=215):26 triton-lang#5 0x557affde38c4 in createAsyncLoad [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp:135](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp?l=135&ws=aliia/3018&snapshot=215):3 triton-lang#6 0x557affde38c4 in createAsynOps [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp:501](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp?l=501&ws=aliia/3018&snapshot=215):5 triton-lang#7 0x557affde38c4 in mlir::triton::preProcessLoopAndGetSchedule(mlir::scf::ForOp&, int, mlir::triton::PipeliningOption&) [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp:740](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp?l=740&ws=aliia/3018&snapshot=215):7 triton-lang#8 0x557affe01c0c in pipelineLoop [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/SoftwarePipeliner.cpp:76](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/SoftwarePipeliner.cpp?l=76&ws=aliia/3018&snapshot=215):19 ... ``` This is likely happening due to iterator being invalidated after `alloc.erase()`. This PR moves erases of allocations outside of a loop and fixes heap-use-after-free issue. Do you know if there is an easy way to run the tests under sanitizers upstream? It would be handy if we can automate it, so we catch this kind of errors early on.
ptillet
added a commit
that referenced
this issue
Apr 1, 2024
karupayun
referenced
this issue
in openxla/triton
Apr 3, 2024
There are two tests that failed under AddressSanitizer: * test/TritonGPU/loop-pipeline.mlir * python/test/regression/test_functional_regressions.py with an error: ``` ==8475==ERROR: AddressSanitizer: heap-use-after-free on address 0x50c000bd0be0 at pc 0x557b03278847 bp 0x7ffd69b2c4a0 sp 0x7ffd69b2c498 READ of size 8 at 0x50c000bd0be0 thread T0 #0 0x557b03278846 in getNextOperandUsingThisValue [third_party/llvm/llvm-project/mlir/include/mlir/IR/UseDefLists.h:43](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/mlir/include/mlir/IR/UseDefLists.h?l=43&ws=aliia/3018&snapshot=215):58 #1 0x557b03278846 in operator++ [third_party/llvm/llvm-project/mlir/include/mlir/IR/UseDefLists.h:322](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/mlir/include/mlir/IR/UseDefLists.h?l=322&ws=aliia/3018&snapshot=215):39 #2 0x557b03278846 in mlir::ResultRange::UseIterator::operator++() [third_party/llvm/llvm-project/mlir/lib/IR/OperationSupport.cpp:614](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/mlir/lib/IR/OperationSupport.cpp?l=614&ws=aliia/3018&snapshot=215):5 #3 0x557affde38c4 in operator++ [third_party/llvm/llvm-project/llvm/include/llvm/ADT/iterator.h:281](https://cs.corp.google.com/piper///depot/google3/third_party/llvm/llvm-project/llvm/include/llvm/ADT/iterator.h?l=281&ws=aliia/3018&snapshot=215):5 #4 0x557affde38c4 in createAsyncCopy [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp:117](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp?l=117&ws=aliia/3018&snapshot=215):26 #5 0x557affde38c4 in createAsyncLoad [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp:135](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp?l=135&ws=aliia/3018&snapshot=215):3 #6 0x557affde38c4 in createAsynOps [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp:501](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp?l=501&ws=aliia/3018&snapshot=215):5 #7 0x557affde38c4 in mlir::triton::preProcessLoopAndGetSchedule(mlir::scf::ForOp&, int, mlir::triton::PipeliningOption&) [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp:740](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/MatmulLoopPipeline.cpp?l=740&ws=aliia/3018&snapshot=215):7 #8 0x557affe01c0c in pipelineLoop [third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/SoftwarePipeliner.cpp:76](https://cs.corp.google.com/piper///depot/google3/third_party/triton/lib/Dialect/TritonGPU/Transforms/Pipeliner/SoftwarePipeliner.cpp?l=76&ws=aliia/3018&snapshot=215):19 ... ``` This is likely happening due to iterator being invalidated after `alloc.erase()`. This PR moves erases of allocations outside of a loop and fixes heap-use-after-free issue. Do you know if there is an easy way to run the tests under sanitizers upstream? It would be handy if we can automate it, so we catch this kind of errors early on.
jlebar
pushed a commit
that referenced
this issue
Jun 21, 2024
When running [convert_blocked1d_to_slice0](https://github.com/triton-lang/triton/blob/0ba5f0c3cd029d5c3d1f01b9bf29dac32c27345e/test/Conversion/tritongpu_to_llvm.mlir#L924) Triton ends up computing a rank of a matrix with 0 columns during linear layout lowering, which trips up f2reduce, and causes undefined behavior, detectable through [UBSAN](https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html). Fix this by returning the rank (0) early in these cases, without calling f2reduce. <details><summary>Stack trace</summary> <p> ``` third_party/triton/third_party/f2reduce/f2reduce.cpp:421:30: runtime error: shift exponent 18446744073709551615 is too large for 64-bit type 'unsigned long long' #0 0x556ee2fea3be in inplace_rref_small third_party/triton/third_party/f2reduce/f2reduce.cpp:421:30 #1 0x556ee2fea3be in f2reduce::inplace_rref_strided(unsigned long*, unsigned long, unsigned long, unsigned long) third_party/triton/third_party/f2reduce/f2reduce.cpp:470:9 #2 0x556ee2ea70da in getMatrixRank third_party/triton/lib/Tools/LinearLayout.cpp:125:3 #3 0x556ee2ea70da in mlir::triton::LinearLayout::checkInvariants(bool) third_party/triton/lib/Tools/LinearLayout.cpp:299:7 #4 0x556ee2ea656d in mlir::triton::LinearLayout::tryCreate(llvm::MapVector<mlir::StringAttr, std::__u::vector<std::__u::vector<int, std::__u::allocator<int>>, std::__u::allocator<std::__u::vector<int, std::__u::allocator<int>>>>, llvm::DenseMap<mlir::StringAttr, unsigned int, llvm::DenseMapInfo<mlir::StringAttr, void>, llvm::detail::DenseMapPair<mlir::StringAttr, unsigned int>>, llvm::SmallVector<std::__u::pair<mlir::StringAttr, std::__u::vector<std::__u::vector<int, std::__u::allocator<int>>, std::__u::allocator<std::__u::vector<int, std::__u::allocator<int>>>>>, 0u>>, llvm::ArrayRef<std::__u::pair<mlir::StringAttr, int>>, bool) third_party/triton/lib/Tools/LinearLayout.cpp:190:41 #5 0x556ee2eb2150 in mlir::triton::LinearLayout::divideRight(mlir::triton::LinearLayout const&) third_party/triton/lib/Tools/LinearLayout.cpp:654:51 #6 0x556ee2ee1c39 in mlir::cvtNeedsSharedMemory(mlir::RankedTensorType, mlir::RankedTensorType) third_party/triton/lib/Analysis/Utility.cpp:652:14 #7 0x556ee2cf38fd in mlir::triton::getRepShapeForCvtLayout(mlir::triton::gpu::ConvertLayoutOp) third_party/triton/lib/Analysis/Allocation.cpp:66:8 #8 0x556ee2cf3efa in mlir::triton::getScratchConfigForCvtLayout(mlir::triton::gpu::ConvertLayoutOp, unsigned int&, unsigned int&) third_party/triton/lib/Analysis/Allocation.cpp:95:19 #9 0x556ee2cf6057 in mlir::triton::AllocationAnalysis::getScratchValueSize(mlir::Operation*) third_party/triton/lib/Analysis/Allocation.cpp:272:24 #10 0x556ee2cf5499 in operator() third_party/triton/lib/Analysis/Allocation.cpp:343:7 #11 0x556ee2cf5499 in void llvm::function_ref<void (mlir::Operation*)>::callback_fn<mlir::triton::AllocationAnalysis::getValuesAndSizes()::'lambda'(mlir::Operation*)>(long, mlir::Operation*) third_party/llvm/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12 #12 0x556edeeee7a9 in operator() third_party/llvm/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12 #13 0x556edeeee7a9 in void mlir::detail::walk<mlir::ForwardIterator>(mlir::Operation*, llvm::function_ref<void (mlir::Operation*)>, mlir::WalkOrder) third_party/llvm/llvm-project/mlir/include/mlir/IR/Visitors.h:174:5 #14 0x556edeeee87c in void mlir::detail::walk<mlir::ForwardIterator>(mlir::Operation*, llvm::function_ref<void (mlir::Operation*)>, mlir::WalkOrder) third_party/llvm/llvm-project/mlir/include/mlir/IR/Visitors.h:182:9 #15 0x556ee2cf49e7 in walk<(mlir::WalkOrder)0, mlir::ForwardIterator, (lambda at third_party/triton/lib/Analysis/Allocation.cpp:341:42), mlir::Operation *, void> third_party/llvm/llvm-project/mlir/include/mlir/IR/Visitors.h:313:10 #16 0x556ee2cf49e7 in walk<(mlir::WalkOrder)0, mlir::ForwardIterator, (lambda at third_party/triton/lib/Analysis/Allocation.cpp:341:42), void> third_party/llvm/llvm-project/mlir/include/mlir/IR/Operation.h:794:12 #17 0x556ee2cf49e7 in mlir::triton::AllocationAnalysis::getValuesAndSizes() third_party/triton/lib/Analysis/Allocation.cpp:341:16 #18 0x556ee2cf4852 in run third_party/triton/lib/Analysis/Allocation.cpp:182:5 #19 0x556ee2cf4852 in AllocationAnalysis third_party/triton/lib/Analysis/Allocation.cpp:169:5 #20 0x556ee2cf4852 in mlir::Allocation::run(llvm::DenseMap<mlir::FunctionOpInterface, mlir::Allocation, llvm::DenseMapInfo<mlir::FunctionOpInterface, void>, llvm::detail::DenseMapPair<mlir::FunctionOpInterface, mlir::Allocation>>&) third_party/triton/lib/Analysis/Allocation.cpp:627:3 #21 0x556ee1677402 in operator() third_party/triton/include/triton/Analysis/Allocation.h:227:26 #22 0x556ee1677402 in void mlir::CallGraph<mlir::Allocation>::doWalk<(mlir::WalkOrder)0, (mlir::WalkOrder)1, mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::CallOpInterface, mlir::FunctionOpInterface), mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::FunctionOpInterface)>(mlir::FunctionOpInterface, llvm::DenseSet<mlir::FunctionOpInterface, llvm::DenseMapInfo<mlir::FunctionOpInterface, void>>&, mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::CallOpInterface, mlir::FunctionOpInterface), mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::FunctionOpInterface)) third_party/triton/include/triton/Analysis/Utility.h:350:7 #23 0x556ee16756b3 in walk<(mlir::WalkOrder)0, (mlir::WalkOrder)1, (lambda at third_party/triton/include/triton/Analysis/Allocation.h:222:9), (lambda at third_party/triton/include/triton/Analysis/Allocation.h:224:9)> third_party/triton/include/triton/Analysis/Utility.h:242:7 #24 0x556ee16756b3 in mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp) third_party/triton/include/triton/Analysis/Allocation.h:220:5 #25 0x556ee2c2bf18 in (anonymous namespace)::AllocateSharedMemory::runOnOperation() third_party/triton/lib/Conversion/TritonGPUToLLVM/AllocateSharedMemory.cpp:26:22 ... UndefinedBehaviorSanitizer: invalid-shift-exponent third_party/triton/third_party/f2reduce/f2reduce.cpp:421:30 ``` </p> </details>
oraluben
pushed a commit
to oraluben/triton
that referenced
this issue
Sep 11, 2024
…ent (triton-lang#8) * [BACKEND][CPU] Make it buildable and runnable in a different environment * Revert seemingly inconsistent python code formatting
pawelszczerbuk
pushed a commit
that referenced
this issue
Nov 7, 2024
* add simple python -> ptx test for sm_100
gglin001
pushed a commit
to gglin001/triton
that referenced
this issue
Nov 13, 2024
…ent (triton-lang#8) * [BACKEND][CPU] Make it buildable and runnable in a different environment * Revert seemingly inconsistent python code formatting
bertmaher
pushed a commit
to bertmaher/triton
that referenced
this issue
Dec 10, 2024
When running [convert_blocked1d_to_slice0](https://github.com/triton-lang/triton/blob/0ba5f0c3cd029d5c3d1f01b9bf29dac32c27345e/test/Conversion/tritongpu_to_llvm.mlir#L924) Triton ends up computing a rank of a matrix with 0 columns during linear layout lowering, which trips up f2reduce, and causes undefined behavior, detectable through [UBSAN](https://clang.llvm.org/docs/UndefinedBehaviorSanitizer.html). Fix this by returning the rank (0) early in these cases, without calling f2reduce. <details><summary>Stack trace</summary> <p> ``` third_party/triton/third_party/f2reduce/f2reduce.cpp:421:30: runtime error: shift exponent 18446744073709551615 is too large for 64-bit type 'unsigned long long' #0 0x556ee2fea3be in inplace_rref_small third_party/triton/third_party/f2reduce/f2reduce.cpp:421:30 triton-lang#1 0x556ee2fea3be in f2reduce::inplace_rref_strided(unsigned long*, unsigned long, unsigned long, unsigned long) third_party/triton/third_party/f2reduce/f2reduce.cpp:470:9 triton-lang#2 0x556ee2ea70da in getMatrixRank third_party/triton/lib/Tools/LinearLayout.cpp:125:3 triton-lang#3 0x556ee2ea70da in mlir::triton::LinearLayout::checkInvariants(bool) third_party/triton/lib/Tools/LinearLayout.cpp:299:7 triton-lang#4 0x556ee2ea656d in mlir::triton::LinearLayout::tryCreate(llvm::MapVector<mlir::StringAttr, std::__u::vector<std::__u::vector<int, std::__u::allocator<int>>, std::__u::allocator<std::__u::vector<int, std::__u::allocator<int>>>>, llvm::DenseMap<mlir::StringAttr, unsigned int, llvm::DenseMapInfo<mlir::StringAttr, void>, llvm::detail::DenseMapPair<mlir::StringAttr, unsigned int>>, llvm::SmallVector<std::__u::pair<mlir::StringAttr, std::__u::vector<std::__u::vector<int, std::__u::allocator<int>>, std::__u::allocator<std::__u::vector<int, std::__u::allocator<int>>>>>, 0u>>, llvm::ArrayRef<std::__u::pair<mlir::StringAttr, int>>, bool) third_party/triton/lib/Tools/LinearLayout.cpp:190:41 triton-lang#5 0x556ee2eb2150 in mlir::triton::LinearLayout::divideRight(mlir::triton::LinearLayout const&) third_party/triton/lib/Tools/LinearLayout.cpp:654:51 triton-lang#6 0x556ee2ee1c39 in mlir::cvtNeedsSharedMemory(mlir::RankedTensorType, mlir::RankedTensorType) third_party/triton/lib/Analysis/Utility.cpp:652:14 triton-lang#7 0x556ee2cf38fd in mlir::triton::getRepShapeForCvtLayout(mlir::triton::gpu::ConvertLayoutOp) third_party/triton/lib/Analysis/Allocation.cpp:66:8 triton-lang#8 0x556ee2cf3efa in mlir::triton::getScratchConfigForCvtLayout(mlir::triton::gpu::ConvertLayoutOp, unsigned int&, unsigned int&) third_party/triton/lib/Analysis/Allocation.cpp:95:19 triton-lang#9 0x556ee2cf6057 in mlir::triton::AllocationAnalysis::getScratchValueSize(mlir::Operation*) third_party/triton/lib/Analysis/Allocation.cpp:272:24 triton-lang#10 0x556ee2cf5499 in operator() third_party/triton/lib/Analysis/Allocation.cpp:343:7 triton-lang#11 0x556ee2cf5499 in void llvm::function_ref<void (mlir::Operation*)>::callback_fn<mlir::triton::AllocationAnalysis::getValuesAndSizes()::'lambda'(mlir::Operation*)>(long, mlir::Operation*) third_party/llvm/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:45:12 triton-lang#12 0x556edeeee7a9 in operator() third_party/llvm/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:68:12 triton-lang#13 0x556edeeee7a9 in void mlir::detail::walk<mlir::ForwardIterator>(mlir::Operation*, llvm::function_ref<void (mlir::Operation*)>, mlir::WalkOrder) third_party/llvm/llvm-project/mlir/include/mlir/IR/Visitors.h:174:5 triton-lang#14 0x556edeeee87c in void mlir::detail::walk<mlir::ForwardIterator>(mlir::Operation*, llvm::function_ref<void (mlir::Operation*)>, mlir::WalkOrder) third_party/llvm/llvm-project/mlir/include/mlir/IR/Visitors.h:182:9 triton-lang#15 0x556ee2cf49e7 in walk<(mlir::WalkOrder)0, mlir::ForwardIterator, (lambda at third_party/triton/lib/Analysis/Allocation.cpp:341:42), mlir::Operation *, void> third_party/llvm/llvm-project/mlir/include/mlir/IR/Visitors.h:313:10 triton-lang#16 0x556ee2cf49e7 in walk<(mlir::WalkOrder)0, mlir::ForwardIterator, (lambda at third_party/triton/lib/Analysis/Allocation.cpp:341:42), void> third_party/llvm/llvm-project/mlir/include/mlir/IR/Operation.h:794:12 triton-lang#17 0x556ee2cf49e7 in mlir::triton::AllocationAnalysis::getValuesAndSizes() third_party/triton/lib/Analysis/Allocation.cpp:341:16 triton-lang#18 0x556ee2cf4852 in run third_party/triton/lib/Analysis/Allocation.cpp:182:5 triton-lang#19 0x556ee2cf4852 in AllocationAnalysis third_party/triton/lib/Analysis/Allocation.cpp:169:5 triton-lang#20 0x556ee2cf4852 in mlir::Allocation::run(llvm::DenseMap<mlir::FunctionOpInterface, mlir::Allocation, llvm::DenseMapInfo<mlir::FunctionOpInterface, void>, llvm::detail::DenseMapPair<mlir::FunctionOpInterface, mlir::Allocation>>&) third_party/triton/lib/Analysis/Allocation.cpp:627:3 triton-lang#21 0x556ee1677402 in operator() third_party/triton/include/triton/Analysis/Allocation.h:227:26 triton-lang#22 0x556ee1677402 in void mlir::CallGraph<mlir::Allocation>::doWalk<(mlir::WalkOrder)0, (mlir::WalkOrder)1, mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::CallOpInterface, mlir::FunctionOpInterface), mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::FunctionOpInterface)>(mlir::FunctionOpInterface, llvm::DenseSet<mlir::FunctionOpInterface, llvm::DenseMapInfo<mlir::FunctionOpInterface, void>>&, mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::CallOpInterface, mlir::FunctionOpInterface), mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp)::'lambda'(mlir::FunctionOpInterface)) third_party/triton/include/triton/Analysis/Utility.h:350:7 triton-lang#23 0x556ee16756b3 in walk<(mlir::WalkOrder)0, (mlir::WalkOrder)1, (lambda at third_party/triton/include/triton/Analysis/Allocation.h:222:9), (lambda at third_party/triton/include/triton/Analysis/Allocation.h:224:9)> third_party/triton/include/triton/Analysis/Utility.h:242:7 triton-lang#24 0x556ee16756b3 in mlir::ModuleAllocation::ModuleAllocation(mlir::ModuleOp) third_party/triton/include/triton/Analysis/Allocation.h:220:5 triton-lang#25 0x556ee2c2bf18 in (anonymous namespace)::AllocateSharedMemory::runOnOperation() third_party/triton/lib/Conversion/TritonGPUToLLVM/AllocateSharedMemory.cpp:26:22 ... UndefinedBehaviorSanitizer: invalid-shift-exponent third_party/triton/third_party/f2reduce/f2reduce.cpp:421:30 ``` </p> </details>
stephen-huan
pushed a commit
to stephen-huan/triton
that referenced
this issue
Dec 24, 2024
…ent (triton-lang#8) * [BACKEND][CPU] Make it buildable and runnable in a different environment * Revert seemingly inconsistent python code formatting
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
It seems like some kernels tend to corrupt memory. This problem does not show on the benchmarks but breaks the BLAS3-test (despite computing the correct matrix).
The text was updated successfully, but these errors were encountered: