Conversation

@tomtor tomtor commented Aug 10, 2025

Fix #153156

See also #152028

@Patryk27 Patryk27 left a comment


cc @benshi001 as well


tomtor commented Aug 10, 2025

@Patryk27 The regression is back?!

Edit: it's just that Rust nightly-2025-08-10-x86_64-unknown-linux-gnu fails again.

@Patryk27
Huh, weird - from what I've gathered, the issue is that Rust's LLVM fork generates a somewhat longer assembly output, right?

tomtor commented Aug 10, 2025

Huh, weird - from what I've gathered, the issue is that Rust's LLVM fork generates a somewhat longer assembly output, right?

Yes, starting 2025-08-07. But this morning nightly was ok again.

62735d2 introduced the fix after introducing the bug 3 days earlier (see the commit message), so it is not really a Rust issue; it is in the LLVM used by Rust nightly, which apparently is very up-to-date.

__mulqi3 was no longer generated for an 8-bit modulo operation; instead __mulhi3 was emitted (the longer assembly), because the 8-bit value was wrongly extended to 16 bits. I have no clue why Rust nightly is wrong again.
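The failing pattern can be sketched in C (a hypothetical reconstruction based on the IR test in this PR; the exact source in the linked issue may differ):

```c
#include <stdint.h>

char c;

/* Hypothetical C reconstruction of the regressing pattern: recursively
   emit the digits of a 16-bit value. The remainder of the division by 10
   fits in 8 bits, so on AVR the multiply in the strength-reduced
   "v - (v / 10) * 10" should lower to the 8-bit __mulqi3 helper rather
   than the 16-bit __mulhi3 one. */
void mod(uint16_t v) {
    if (v > 9) {
        mod(v / 10);        /* recurse into the higher digits first */
        v %= 10;            /* remainder: always 0..9, fits in 8 bits */
    }
    c = (char)(v | 48);     /* store the ASCII digit '0'..'9' */
}
```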

This PR was just rebased on current LLVM and it runs fine.

tgross35 commented Aug 11, 2025

62735d2 introduced the fix after introducing the bug 3 days earlier (see the commit message), so it is not really a Rust issue; it is in the LLVM used by Rust nightly, which apparently is very up-to-date.

Rustc stays pretty up to date: major version bumps are usually merged as soon as we know LLVM's release date will fall before Rust's stable release date for that cycle (after a few months of testing), and minor bumps happen immediately. But that commit and the one that fixed it are still too new to have made it into our fork https://github.com/rust-lang/llvm-project/commits/rustc/21.1-2025-08-01/.

Maybe it regressed with the LLVM bump but got fixed by a later codegen change on Rust's side?

(probably not worth all that much investigation if it's fixed now)

tomtor commented Aug 11, 2025

@tgross35

Thanks for the feedback, but the nightly build has been failing from 2025-08-07 until today:

searched nightlies: from nightly-2025-08-01 to nightly-2025-08-08
regressed nightly: nightly-2025-08-07
searched commit range: rust-lang/rust@ec7c026...7d82b83
regressed commit: rust-lang/rust@dc0bae1

@tgross35
Ah sorry! I thought you were saying it was fixed in rustc (not just llvm)

tomtor commented Aug 11, 2025

Ah sorry! I thought you were saying it was fixed in rustc (not just llvm)

@tgross35 It's confusing, but I just tested the 2025-08-xx nightlies again:

06 ok
07 fail
08 ok
09 ok
10 ok

So only 2025-08-07 failed, and it is indeed fixed now. Sorry for my inconsistent messaging; yesterday it looked as if it was failing again. I probably mixed up release and debug builds when checking.

Still strange that only the 07 nightly failed, but as you said, not worth the trouble of investigating if it is ok now. Thanks!

@tomtor tomtor marked this pull request as draft August 12, 2025 09:10

tomtor commented Aug 12, 2025

It is a regression in LLVM:

#153156

This PR needs a different test (and a fix :-) ).

(And it IS present in current Rust, but my test script compared binary sizes instead of the actual generated code :/ )

@tomtor tomtor changed the title [AVR] Add extra codegen test [AVR] Mulhi3/mulqi3 regression test Aug 12, 2025
@tomtor tomtor changed the title [AVR] Mulhi3/mulqi3 regression test [AVR] 8 bit trunc regression Aug 22, 2025

tomtor commented Aug 22, 2025

Fix #153156

@tomtor tomtor marked this pull request as ready for review August 22, 2025 10:56
@tomtor tomtor requested a review from nikic as a code owner August 22, 2025 10:56
@llvmbot llvmbot added llvm:instcombine Covers the InstCombine, InstSimplify and AggressiveInstCombine passes llvm:analysis Includes value tracking, cost tables and constant folding llvm:transforms labels Aug 22, 2025

llvmbot commented Aug 22, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Tom Vijlbrief (tomtor)

Changes

Fix #153156

See also #152028


Full diff: https://github.com/llvm/llvm-project/pull/152902.diff

6 Files Affected:

  • (modified) llvm/lib/Analysis/InlineCost.cpp (+3)
  • (modified) llvm/lib/Target/AVR/AVRTargetTransformInfo.h (+6)
  • (modified) llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp (+7-2)
  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+4)
  • (added) llvm/test/CodeGen/AVR/issue-151080-mod.ll (+58)
  • (added) llvm/test/CodeGen/AVR/issue-153156.ll (+90)
diff --git a/llvm/lib/Analysis/InlineCost.cpp b/llvm/lib/Analysis/InlineCost.cpp
index 757f68999691e..1b0710e09b0f9 100644
--- a/llvm/lib/Analysis/InlineCost.cpp
+++ b/llvm/lib/Analysis/InlineCost.cpp
@@ -1766,6 +1766,9 @@ bool CallAnalyzer::visitGetElementPtr(GetElementPtrInst &I) {
 // This handles the case only when the Cmp instruction is guarding a recursive
 // call that will cause the Cmp to fail/succeed for the recursive call.
 bool CallAnalyzer::simplifyCmpInstForRecCall(CmpInst &Cmp) {
+  // FIXME Regression on AVR: github.com/llvm/llvm-project/issues/153156
+  if (!DL.isLegalInteger(32) && DL.isLegalInteger(8))
+    return false;
   // Bail out if LHS is not a function argument or RHS is NOT const:
   if (!isa<Argument>(Cmp.getOperand(0)) || !isa<Constant>(Cmp.getOperand(1)))
     return false;
diff --git a/llvm/lib/Target/AVR/AVRTargetTransformInfo.h b/llvm/lib/Target/AVR/AVRTargetTransformInfo.h
index 0daeeb8f11cfe..e6862d8743bbe 100644
--- a/llvm/lib/Target/AVR/AVRTargetTransformInfo.h
+++ b/llvm/lib/Target/AVR/AVRTargetTransformInfo.h
@@ -44,6 +44,12 @@ class AVRTTIImpl final : public BasicTTIImplBase<AVRTTIImpl> {
 
   bool isLSRCostLess(const TargetTransformInfo::LSRCost &C1,
                      const TargetTransformInfo::LSRCost &C2) const override;
+
+  TypeSize
+  getRegisterBitWidth(TargetTransformInfo::RegisterKind K) const override {
+    // default is 32, so change it to 16
+    return TypeSize::getFixed(16);
+  }
 };
 
 } // end namespace llvm
diff --git a/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp b/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
index 6477141ab095f..88801a22ffa51 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombinePHI.cpp
@@ -1632,12 +1632,17 @@ Instruction *InstCombinerImpl::visitPHINode(PHINode &PN) {
     return replaceInstUsesWith(PN, &IdenticalPN);
   }
 
-  // If this is an integer PHI and we know that it has an illegal type, see if
+  // For 8/16 bit CPUs prefer 8 bit registers
+  bool preferByteRegister = !DL.isLegalInteger(32);
+
+  // If this is an integer PHI and we know that it has an illegal type,
+  // (or 16 bit on 8/16 bit CPUs), see if
   // it is only used by trunc or trunc(lshr) operations.  If so, we split the
   // PHI into the various pieces being extracted.  This sort of thing is
   // introduced when SROA promotes an aggregate to a single large integer type.
   if (PN.getType()->isIntegerTy() &&
-      !DL.isLegalInteger(PN.getType()->getPrimitiveSizeInBits()))
+      ((!DL.isLegalInteger(PN.getType()->getPrimitiveSizeInBits())) ||
+       (preferByteRegister && PN.getType()->getPrimitiveSizeInBits() == 16)))
     if (Instruction *Res = SliceUpIllegalIntegerPHI(PN))
       return Res;
 
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index 5ee3bb1abe86e..bb4144ad109ca 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -311,6 +311,10 @@ bool InstCombinerImpl::shouldChangeType(unsigned FromWidth,
   bool FromLegal = FromWidth == 1 || DL.isLegalInteger(FromWidth);
   bool ToLegal = ToWidth == 1 || DL.isLegalInteger(ToWidth);
 
+  // For 8/16 bit CPUs prefer 8 bit.
+  if (!DL.isLegalInteger(32) && ToWidth == 16)
+    ToLegal = false;
+
   // Convert to desirable widths even if they are not legal types.
   // Only shrink types, to prevent infinite loops.
   if (ToWidth < FromWidth && isDesirableIntType(ToWidth))
diff --git a/llvm/test/CodeGen/AVR/issue-151080-mod.ll b/llvm/test/CodeGen/AVR/issue-151080-mod.ll
new file mode 100644
index 0000000000000..e2981236482e6
--- /dev/null
+++ b/llvm/test/CodeGen/AVR/issue-151080-mod.ll
@@ -0,0 +1,58 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: llc < %s -O=3 -mtriple=avr-none -mcpu=attiny85 -verify-machineinstrs | FileCheck %s
+
+@c = dso_local local_unnamed_addr global i8 0, align 1
+define dso_local void @mod(i16 noundef %0) local_unnamed_addr addrspace(1) {
+; CHECK-LABEL: mod:
+; CHECK:       ; %bb.0:
+; CHECK-NEXT:    push r14
+; CHECK-NEXT:    push r16
+; CHECK-NEXT:    push r17
+; CHECK-NEXT:    cpi r24, 10
+; CHECK-NEXT:    cpc r25, r1
+; CHECK-NEXT:    brlo .LBB0_2
+; CHECK-NEXT:  ; %bb.1:
+; CHECK-NEXT:    ldi r18, 205
+; CHECK-NEXT:    ldi r19, 204
+; CHECK-NEXT:    ldi r20, 0
+; CHECK-NEXT:    ldi r21, 0
+; CHECK-NEXT:    movw r22, r24
+; CHECK-NEXT:    mov r14, r24
+; CHECK-NEXT:    movw r24, r20
+; CHECK-NEXT:    rcall __mulsi3
+; CHECK-NEXT:    movw r16, r24
+; CHECK-NEXT:    lsr r17
+; CHECK-NEXT:    ror r16
+; CHECK-NEXT:    lsr r17
+; CHECK-NEXT:    ror r16
+; CHECK-NEXT:    lsr r17
+; CHECK-NEXT:    ror r16
+; CHECK-NEXT:    movw r24, r16
+; CHECK-NEXT:    rcall mod
+; CHECK-NEXT:    mov r24, r16
+; CHECK-NEXT:    ldi r22, -10
+; CHECK-NEXT:    rcall __mulqi3
+; CHECK-NEXT:    add r24, r14
+; CHECK-NEXT:  .LBB0_2:
+; CHECK-NEXT:    ori r24, 48
+; CHECK-NEXT:    sts c, r24
+; CHECK-NEXT:    pop r17
+; CHECK-NEXT:    pop r16
+; CHECK-NEXT:    pop r14
+; CHECK-NEXT:    ret
+  %2 = icmp ugt i16 %0, 9
+  %3 = trunc i16 %0 to i8
+  br i1 %2, label %4, label %9
+4:                                                ; preds = %1
+  %5 = udiv i16 %0, 10
+  %6 = trunc i16 %5 to i8
+  %7 = mul i8 %6, -10
+  tail call addrspace(1) void @mod(i16 noundef %5)
+  %8 = add i8 %7, %3
+  br label %9
+9:                                                ; preds = %4, %1
+  %10 = phi i8 [ %3, %1 ], [ %8, %4 ]
+  %11 = or disjoint i8 %10, 48
+  store i8 %11, ptr @c, align 1
+  ret void
+}
diff --git a/llvm/test/CodeGen/AVR/issue-153156.ll b/llvm/test/CodeGen/AVR/issue-153156.ll
new file mode 100644
index 0000000000000..f9d08fc095d3f
--- /dev/null
+++ b/llvm/test/CodeGen/AVR/issue-153156.ll
@@ -0,0 +1,90 @@
+; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
+; RUN: opt -Os -mtriple=avr-none < %s | llc -mtriple=avr-none -mcpu=attiny85 -verify-machineinstrs | FileCheck %s
+
+@c = dso_local global i8 0, align 1
+@ti = dso_local global i16 0, align 1
+
+define dso_local void @mod(i16 noundef %0) local_unnamed_addr addrspace(1) {
+; CHECK-LABEL: mod:
+; CHECK:       ; %bb.0:
+; CHECK-NEXT:    push r14
+; CHECK-NEXT:    push r16
+; CHECK-NEXT:    push r17
+; CHECK-NEXT:    cpi r24, 10
+; CHECK-NEXT:    cpc r25, r1
+; CHECK-NEXT:    brlo .LBB0_2
+; CHECK-NEXT:  ; %bb.1: ; %tailrecurse.preheader
+; CHECK-NEXT:    ldi r18, 205
+; CHECK-NEXT:    ldi r19, 204
+; CHECK-NEXT:    ldi r20, 0
+; CHECK-NEXT:    ldi r21, 0
+; CHECK-NEXT:    movw r22, r24
+; CHECK-NEXT:    mov r14, r24
+; CHECK-NEXT:    movw r24, r20
+; CHECK-NEXT:    rcall __mulsi3
+; CHECK-NEXT:    movw r16, r24
+; CHECK-NEXT:    lsr r17
+; CHECK-NEXT:    ror r16
+; CHECK-NEXT:    lsr r17
+; CHECK-NEXT:    ror r16
+; CHECK-NEXT:    lsr r17
+; CHECK-NEXT:    ror r16
+; CHECK-NEXT:    movw r24, r16
+; CHECK-NEXT:    rcall mod
+; CHECK-NEXT:    mov r24, r16
+; CHECK-NEXT:    ldi r22, -10
+; CHECK-NEXT:    rcall __mulqi3
+; CHECK-NEXT:    add r24, r14
+; CHECK-NEXT:  .LBB0_2: ; %tailrecurse._crit_edge
+; CHECK-NEXT:    ori r24, 48
+; CHECK-NEXT:    sts c, r24
+; CHECK-NEXT:    pop r17
+; CHECK-NEXT:    pop r16
+; CHECK-NEXT:    pop r14
+; CHECK-NEXT:    ret
+  %2 = alloca i16, align 1
+  store i16 %0, ptr %2, align 1
+  %3 = load i16, ptr %2, align 1
+  %4 = icmp ugt i16 %3, 9
+  br i1 %4, label %5, label %10
+
+5:                                                ; preds = %1
+  %6 = load i16, ptr %2, align 1
+  %7 = udiv i16 %6, 10
+  call addrspace(1) void @mod(i16 noundef %7)
+  %8 = load i16, ptr %2, align 1
+  %9 = urem i16 %8, 10
+  call addrspace(1) void @mod(i16 noundef %9)
+  br label %14
+
+10:                                               ; preds = %1
+  %11 = load i16, ptr %2, align 1
+  %12 = add i16 48, %11
+  %13 = trunc i16 %12 to i8
+  store volatile i8 %13, ptr @c, align 1
+  br label %14
+
+14:                                               ; preds = %10, %5
+  ret void
+}
+
+define dso_local void @t(i16 noundef %0) addrspace(1) {
+; CHECK-LABEL: t:
+; CHECK:       ; %bb.0:
+; CHECK-NEXT:    ldi r22, 57
+; CHECK-NEXT:    rcall __mulqi3
+; CHECK-NEXT:    mov r25, r24
+; CHECK-NEXT:    lsl r25
+; CHECK-NEXT:    sbc r25, r25
+; CHECK-NEXT:    sts ti+1, r25
+; CHECK-NEXT:    sts ti, r24
+; CHECK-NEXT:    ret
+  %2 = alloca i16, align 1
+  store i16 %0, ptr %2, align 1
+  %3 = load i16, ptr %2, align 1
+  %4 = mul nsw i16 57, %3
+  %5 = trunc i16 %4 to i8
+  %6 = sext i8 %5 to i16
+  store i16 %6, ptr @ti, align 1
+  ret void
+}
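For reference, the second IR test (@t) corresponds to source along these lines (a hypothetical C sketch; the PR itself only contains the IR):

```c
#include <stdint.h>

int16_t ti;

/* Hypothetical C equivalent of the @t IR test: the 16-bit product is
   truncated to i8 and sign-extended back, so only the low byte of the
   multiply matters and AVR can use the 8-bit __mulqi3 libcall instead
   of a full 16-bit multiply. */
void t(int16_t x) {
    ti = (int16_t)(int8_t)(57 * x);
}
```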


llvmbot commented Aug 22, 2025

@llvm/pr-subscribers-llvm-analysis

Author: Tom Vijlbrief (tomtor)

(The diff is identical to the one quoted in the previous comment.)

@nikic nikic left a comment


The AVR TTI change is fine, but the rest is not acceptable.


Development

Successfully merging this pull request may close these issues.

[AVR] code generation regression (mulhi3 instead of mulqi3)