Skip to content

Conversation

@LeiWang1999
Copy link
Member

@LeiWang1999 LeiWang1999 commented Nov 10, 2025

  • Changed Max computation for AbsMax type to use absolute values of lhs and rhs.
  • Removed unused shared memory reduction logic and related checks for buffer dimensions and thread extents, simplifying the Lower method.
  • Added a fatal log for unsupported buffer scope reductions.

Summary by CodeRabbit

  • Refactor
    • Updated AbsMax reduction computation.
    • Replaced direct reduction call with a macro-backed reduction flow that handles fragment vs shared buffer combinations and performs necessary intermediate copies.
    • Removed shared-memory reduction lowering; only fragment-local reductions are supported and other buffer-scope combinations now raise an error.

…ion and remove unused shared memory reduction logic

* Changed Max computation for AbsMax type to use absolute values of lhs and rhs.
* Removed unused shared memory reduction logic and related checks for buffer dimensions and thread extents, simplifying the Lower method.
* Added a fatal log for unsupported buffer scope reductions.
@github-actions
Copy link

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

@coderabbitai
Copy link
Contributor

coderabbitai bot commented Nov 10, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

AbsMax reduction now computes Max(abs(lhs), abs(rhs)). The shared-memory reduction lowering block in src/op/reduce.cc was removed, causing shared/shared.dyn scope paths to no longer be lowered. tilelang/language/reduce.py adds a macro-backed reduce that branches on buffer scopes and uses fragment allocation and copies.

Changes

Cohort / File(s) Summary
C++ reducer removal
src/op/reduce.cc
Rewrote AbsMax to compute Max(abs(lhs), abs(rhs)); removed the shared/shared.dyn shared-memory reduction lowering path (buffer checks, dim/thread validations, AllReduce/run_hopper logic, accumulation back to dst). Those scope combinations now fall through to a fatal "Reduce for buffers in scope (...) is not implemented."
Python reduce macro & scope handling
tilelang/language/reduce.py
Replaced direct tl.reduce intrinsic call with a reduce_macro that branches on input/output buffer scopes (is_shared / is_fragment), allocates fragment buffers, performs renames and intermediate copies via IRBuilder, invokes reduction on the fragment, and copies results back when needed; raises on unsupported scope combos. Added imports supporting fragment allocation and renaming.
Builder eval loosened for Buffer
tilelang/language/v2/builder.py
Adjusted eval so encountering a tvm.tir.Buffer no longer triggers the prior generic TypeError branch; Buffer values are effectively skipped in that path while other type checks remain unchanged.

Sequence Diagram(s)

sequenceDiagram
    participant Caller
    participant reduce_macro_py as reduce_macro (py)
    participant FragmentBuf as fragment_buf
    participant Intrinsic as intrinsic_reduce
    participant MakeReduceC as MakeReduce (C++)

    Caller->>reduce_macro_py: reduce(input_buf, output_buf, op)
    alt supported scope combos (fragment / shared→fragment / fragment→shared)
        reduce_macro_py->>FragmentBuf: alloc/rename/copy as needed
        reduce_macro_py->>Intrinsic: call reduce on fragment
        Intrinsic->>MakeReduceC: invoke backend lowering
        MakeReduceC-->>Intrinsic: fragment-local lowering path
        reduce_macro_py->>output_buf: copy result back if required
        reduce_macro_py-->>Caller: return result
    else unsupported (shared/shared.dyn)
        reduce_macro_py-->>Caller: raise ValueError or MakeReduceC will fatal
    end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

  • Pay attention to:
    • AbsMax semantics (NaN, signed zero, overflow) and test coverage.
    • Call sites that expected shared-memory lowering and migration/compatibility.
    • Correctness of fragment allocation/renaming, copy placement, lifetimes, and aliasing in the new macro.
    • Builder eval change: ensure skipping Buffer here doesn't mask errors.

Possibly related PRs

Poem

🐰 I hopped through code both old and new,
AbsMax trimmed to a clearer view,
Shared paths vanished, fragments play,
I copy, rename, and bound away,
A tiny thump for reductions today 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title clearly identifies the main changes: refactoring the reduce operation and adding support for shared memory as input/output buffers, which aligns with the substantial changes in both the C++ and Python implementation files.

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 570c481 and 814eed7.

📒 Files selected for processing (1)
  • tilelang/language/v2/builder.py (1 hunks)

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

📜 Review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e663246 and 570c481.

📒 Files selected for processing (1)
  • tilelang/language/reduce.py (2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tilelang/language/reduce.py (4)
tilelang/language/copy.py (1)
  • copy (11-87)
tilelang/language/tir/entry.py (1)
  • macro (66-117)
tilelang/language/allocate.py (1)
  • alloc_fragment (59-70)
tilelang/utils/language.py (2)
  • is_shared (25-39)
  • is_fragment (81-91)
🪛 Ruff (0.14.3)
tilelang/language/reduce.py

100-100: Avoid specifying long messages outside the exception class

(TRY003)

Comment on lines +50 to +88
copy(buffer, red_frag_in)
tir.call_intrin(
"handle",
tir.op.Op.get("tl.reduce"),
red_frag_in.access_ptr("r"),
red_frag_out.access_ptr("w"),
reduce_type,
dim,
clear,
)
copy(red_frag_out, out)
elif is_shared(buffer) and is_fragment(out):
red_frag_in = alloc_fragment(buffer.shape, buffer.dtype)
IRBuilder.name(buffer.name + "_frag", red_frag_in)

copy(buffer, red_frag_in)
tir.call_intrin(
"handle",
tir.op.Op.get("tl.reduce"),
red_frag_in.access_ptr("r"),
out.access_ptr("w"),
reduce_type,
dim,
clear,
)
elif is_fragment(buffer) and is_shared(out):
red_frag_out = alloc_fragment(out.shape, out.dtype)
IRBuilder.name(out.name + "_frag", red_frag_out)

tir.call_intrin(
"handle",
tir.op.Op.get("tl.reduce"),
buffer.access_ptr("r"),
red_frag_out.access_ptr("w"),
reduce_type,
dim,
clear,
)
copy(red_frag_out, out)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🔴 Critical

Preserve destination contents when clear=False.

When out lives in shared memory, we stage it through red_frag_out, but we never preload its existing values. For reduce_* calls with clear=False, that staging buffer must start from the prior accumulator; otherwise we zero out (or leave undefined) red_frag_out, call tl.reduce, and lose the intended accumulation semantics. This regresses any caller relying on clear=False.

Please seed red_frag_out from out before the intrinsic whenever clear is false, in both shared→shared and fragment→shared branches, e.g.:

         red_frag_out = alloc_fragment(out.shape, out.dtype)
         IRBuilder.name(out.name + "_frag", red_frag_out)

         copy(buffer, red_frag_in)
+        if not clear:
+            copy(out, red_frag_out)
         tir.call_intrin(
@@
         red_frag_out = alloc_fragment(out.shape, out.dtype)
         IRBuilder.name(out.name + "_frag", red_frag_out)

+        if not clear:
+            copy(out, red_frag_out)
         tir.call_intrin(
🤖 Prompt for AI Agents
In tilelang/language/reduce.py around lines 50 to 88, when staging a
shared-memory destination through red_frag_out we never preload its existing
values, so reduce calls with clear=False lose prior accumulator state; fix by,
after allocating and naming red_frag_out (both in the shared->shared branch and
the fragment->shared branch), conditionally copy(out, red_frag_out) when clear
is False before invoking tir.call_intrin so the fragment starts seeded from the
current out contents.

* Changed the type check for eval values to raise a TypeError for unsupported types, specifically excluding instances of tvm.tir.Buffer. This improves error handling and clarity in the Builder class.
@LeiWang1999 LeiWang1999 merged commit 47039f0 into tile-ai:main Nov 10, 2025
1 of 3 checks passed
RubiaCx pushed a commit to RubiaCx/tilelang that referenced this pull request Nov 24, 2025
…ile-ai#1219)

* [Refactor] Update ReduceOpNode to use absolute values in Max computation and remove unused shared memory reduction logic

* Changed Max computation for AbsMax type to use absolute values of lhs and rhs.
* Removed unused shared memory reduction logic and related checks for buffer dimensions and thread extents, simplifying the Lower method.
* Added a fatal log for unsupported buffer scope reductions.

* reduce fix

* [Fix] Update type check for eval value in Builder class

* Changed the type check for eval values to raise a TypeError for unsupported types, specifically excluding instances of tvm.tir.Buffer. This improves error handling and clarity in the Builder class.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant