Member

@LeiWang1999 LeiWang1999 commented Oct 29, 2025

This pull request refactors the let-binding logic in the TLVectorizer class to improve correctness and maintainability when vectorizing loops, especially those involving let-bound variables. It introduces new data structures to track variable mappings and their bound values, updates substitution and scalarization logic to handle nested let-bindings, and adds a new convenience entry point for vectorization. Additionally, a new test is introduced to verify correct vectorization of let-bound loads.

Vectorizer refactor and let-binding improvements

  • Replaced the old let_binding_ map with two new maps: let_var_map_ (tracks variable mapping) and let_value_binding_ (tracks mapping from new variables to their bound values), and updated all relevant logic to use these new structures for handling let-bindings in vectorization. [1] [2] [3]
  • Enhanced the Scalarize method to substitute let-bound variables correctly in the presence of nested or reused let-bindings, ensuring that all relevant variables are properly scoped and substituted during scalarization.
  • Updated the vectorization entry point by adding a static Vectorize method to TLVectorizer, simplifying how vectorization is invoked from call sites. [1] [2]
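As a rough illustration of the two-map split described in the list above, the following self-contained Python sketch models the bookkeeping with plain dictionaries. It is a toy, not the code in vectorize_loop.cc: the `Var` and `LetTracker` classes, the `bind` method, and the string used as a bound value are all hypothetical names introduced here, and the rule of creating a new variable only when the lane count changes is an assumption about the pass.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Var:
    """Toy stand-in for a TIR variable, identified by name and lane count."""
    name: str
    lanes: int = 1


@dataclass
class LetTracker:
    # Original let variable -> variable used after vectorization
    # (hypothetical analogue of let_var_map_).
    let_var_map: dict = field(default_factory=dict)
    # Variable used after vectorization -> value it is bound to
    # (hypothetical analogue of let_value_binding_).
    let_value_binding: dict = field(default_factory=dict)

    def bind(self, var: Var, value: str, value_lanes: int) -> Var:
        """Record one let-binding and return the variable the body should use."""
        # Assumption: a fresh variable is only needed when the value was widened.
        new_var = var if value_lanes == var.lanes else Var(var.name, value_lanes)
        self.let_var_map[var] = new_var
        self.let_value_binding[new_var] = value
        return new_var


tracker = LetTracker()
b = Var("b")
b_vec = tracker.bind(b, "A[0, 0:4]", value_lanes=4)
```

Keeping the two lookups separate means a later pass can ask "which variable replaced `b`?" and "what is that variable bound to?" independently, which is the separation the description attributes to the refactor.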

Code cleanup and correctness

  • Renamed and replaced variable references from let_binding_ to let_var_map_ throughout the class to reflect the new data structure and its purpose. [1] [2] [3]
  • Minor code cleanups, such as removing commented-out code and improving variable naming for clarity (e.g., changing ".s" to "_s" in scalarization index variable names). [1] [2]

Testing

  • Added a new test test_let_vectorize_load in testing/python/language/test_tilelang_language_let.py to verify that let-bound loads are correctly vectorized, by asserting that the generated CUDA code contains the expected vectorized variable declaration.

Summary by CodeRabbit

  • Bug Fixes

    • Improved loop vectorization to correctly handle let-bound variables and scalarized regions, fixing generation issues in vectorized code.
  • New Features

    • Exposed a static vectorization entry point (TLVectorizer::Vectorize) that simplifies how loop bodies are handed to the vectorizer.
  • Tests

    • Added test coverage verifying let-binding behavior and generated vectorized code shapes.

@github-actions

👋 Hi! Thank you for contributing to the TileLang project.

Please remember to run pre-commit run --all-files in the root directory of the project to ensure your changes are properly linted and formatted. This will help ensure your contribution passes the format check.

We appreciate you taking this step! Our team will review your contribution, and we look forward to your awesome work! 🚀

@coderabbitai
Contributor

coderabbitai bot commented Oct 29, 2025

Walkthrough

The vectorizer gains a public static entry point TLVectorizer::Vectorize(var, var_lanes, body), replaces internal let_binding_ with let_var_map_ and let_value_binding_, and updates scalarization to rebind let-bound variables around scalarized regions. A new test validates vectorized code generation with let bindings.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Vectorizer refactor: src/transform/vectorize_loop.cc | Added static Stmt Vectorize(const Var&, const PrimExpr&, Stmt); replaced let_binding_ with let_var_map_ and introduced let_value_binding_; updated LetStmt/LetExpr handling to populate binding maps and wrap scalarized code with necessary LetStmt bindings; adjusted scalarization to return a single scalarized_stmt; minor naming tweak for scalar index. |
| New language test: testing/python/language/test_tilelang_language_let.py | Added a Python test that builds a TileLang prim_func using vectorized loads/stores, compiles for CUDA, and asserts the generated CUDA source contains the expected float4 b vectorized code. |

Sequence Diagram

sequenceDiagram
    participant LoopVec as LoopVectorizer
    participant Entry as TLVectorizer::Vectorize()
    participant TV as TLVectorizer (state)
    participant Scalar as Scalarization

    LoopVec->>Entry: Vectorize(var, var_lanes, body)
    activate Entry

    Entry->>TV: initialize `let_var_map_` and `let_value_binding_`
    activate TV

    TV->>TV: visit LetStmt / LetExpr\npopulate binding maps
    note right of TV #E3F2FD: bindings recorded for\nlater rebinding

    TV->>Scalar: perform scalarization\n(detect used let-bound vars)
    activate Scalar
    note right of Scalar #E8F5E9: create local `scalarized_stmt`
    Scalar->>TV: return scalarized_stmt
    deactivate Scalar

    TV->>TV: wrap scalarized_stmt with\nLetStmt bindings from `let_value_binding_`
    TV->>Entry: return vectorized Stmt
    deactivate TV

    Entry->>LoopVec: result
    deactivate Entry

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

  • Areas to inspect closely:
    • Correctness and completeness of replacing let_binding_ with let_var_map_ and let_value_binding_
    • LetStmt/LetExpr visitor updates and any edge cases (nested lets, shadowing)
    • Scalarization path ensuring the single-return scalarized_stmt contains all necessary rebindings
    • Integration in LoopVectorizer::VisitStmt calling the new static entry
    • The new test's assumptions about generated CUDA code and coverage of let scenarios

Poem

🐰 I hopped through binds and lanes so neat,

I chased each let to make it meet.
Maps tucked values, rebinds took flight,
Vectorized bytes danced in the night.
Tests clap softly — code hops right. 🥕

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 58.33% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title Check | ✅ Passed | The PR title "[Bugfix] Enhance LetStmt handling in Vectorize Loop Pass" is directly aligned with the main objective of the changeset. The summary clearly indicates that the PR focuses on refactoring let-binding logic in TLVectorizer and enhancing LetStmt and LetExpr handling to improve correctness when vectorizing loops with let-bound variables. The title is specific and descriptive, clearly identifying both what is being enhanced (LetStmt handling) and where (Vectorize Loop Pass), without being vague or using generic terms. While the title doesn't enumerate every change (such as the new static Vectorize entry point or the test additions), this level of detail is not expected for a PR title and is appropriately summarized in the main objective. |

📜 Recent review details

Configuration used: CodeRabbit UI

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a78e640 and dce4cfd.

📒 Files selected for processing (1)
  • testing/python/language/test_tilelang_language_let.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • testing/python/language/test_tilelang_language_let.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Test for Python 3.12 with CUDA-12.8 (on self-hosted-nvidia)
  • GitHub Check: Test for Python 3.12 with ROCm-6.3 (on self-hosted-amd)
  • GitHub Check: Test for Python 3.12 with Metal (on macos-latest)


Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
testing/python/language/test_tilelang_language_let.py (1)

5-5: Remove unused import.

The map_torch_type import is not used anywhere in this test file.

Apply this diff to remove the unused import:

-from tilelang.utils.tensor import map_torch_type
📜 Review details

📥 Commits

Reviewing files that changed from the base of the PR and between feef9ef and a78e640.

📒 Files selected for processing (2)
  • src/transform/vectorize_loop.cc (9 hunks)
  • testing/python/language/test_tilelang_language_let.py (1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
testing/python/language/test_tilelang_language_let.py (2)
tilelang/utils/tensor.py (1)
  • map_torch_type (21-38)
tilelang/jit/__init__.py (1)
  • compile (30-79)
🪛 Ruff (0.14.2)
testing/python/language/test_tilelang_language_let.py

12-12: Loop control variable blockIdx not used within loop body

Rename unused blockIdx to _blockIdx

(B007)


13-13: Loop control variable threadIdx not used within loop body

Rename unused threadIdx to _threadIdx

(B007)

🔇 Additional comments (9)
testing/python/language/test_tilelang_language_let.py (2)

12-13: Thread binding variables: static analysis false positive.

The static analysis tool flags blockIdx and threadIdx as unused. In TVM/TileLang, thread binding variables establish the execution context and do not require explicit references within the loop body. This is expected behavior.


7-19: Test correctly validates vectorized let-binding code generation.

The test verifies that a let-bound vectorized load (b: T.float32x4) is correctly emitted as a CUDA float4 declaration in the generated source. The test focuses on code generation correctness, which aligns with the PR's objective to enhance LetStmt handling in the vectorizer.
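The kind of source-level check this comment describes can be sketched without a GPU toolchain. The example below is an illustration only: `declares_vector` is a hypothetical helper, and `sample` is a hand-written stand-in for generated CUDA, not actual TileLang output. It shows why matching on a word boundary is slightly safer than a bare substring test for a declaration such as `float4 b`.

```python
import re


def declares_vector(source: str, ctype: str, name: str) -> bool:
    """Return True if `source` contains `<ctype> <name>` as whole words."""
    pattern = rf"\b{re.escape(ctype)}\s+{re.escape(name)}\b"
    return re.search(pattern, source) is not None


# Hand-written stand-in for generated CUDA source (assumption, not real output).
sample = """
extern "C" __global__ void main_kernel(float* __restrict__ A) {
  float4 b = *(float4*)(A + 0);
  *(float4*)(A + 4) = b;
}
"""

found = declares_vector(sample, "float4", "b")
```

A plain `"float4 b" in source` check would also accept a declaration such as `float4 bb`, which the word-boundary version rejects.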

src/transform/vectorize_loop.cc (7)

36-36: LGTM: Include added for new set usage.

The <unordered_set> include is correctly added to support the used_let_bound_vars set introduced in the enhanced Scalarize method.


212-218: LGTM: Clean static entry point improves API.

The new static Vectorize method provides a clean, single-entry API that encapsulates the construction-then-invocation pattern. This improves usability at call sites.
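The construction-then-invocation pattern mentioned here can be shown in miniature. The sketch below is a toy in Python rather than the C++ in vectorize_loop.cc: `ToyVectorizer`, its token-list body, and the `ramp(0, 1, lanes)` string are assumptions made for illustration, and only the shape of the static entry point mirrors the description.

```python
class ToyVectorizer:
    """Toy rewriter that replaces a loop-variable token with a ramp token."""

    def __init__(self, var: str, lanes: int):
        self.var = var
        self.lanes = lanes

    def rewrite(self, body: list) -> list:
        # Every occurrence of the loop variable becomes a ramp over its lanes.
        ramp = f"ramp(0, 1, {self.lanes})"
        return [ramp if tok == self.var else tok for tok in body]

    @staticmethod
    def vectorize(var: str, lanes: int, body: list) -> list:
        """Single entry point: build the rewriter, then apply it to the body."""
        return ToyVectorizer(var, lanes).rewrite(body)


result = ToyVectorizer.vectorize("i", 4, ["A", "[", "i", "]"])
```

With this shape, call sites never hold a rewriter instance, so per-invocation state such as binding maps cannot leak between two vectorized loops.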


225-235: LGTM: Clearer scalarization flow.

Extracting the scalarized statement into a local variable improves readability and makes the control flow more explicit.


414-415: LGTM: Let-binding refactor improves correctness.

The split of let_binding_ into let_var_map_ (variable substitution) and let_value_binding_ (value tracking) provides better separation of concerns and enables correct scalarization of let-bound variables. The implementation consistently maintains both maps across LetExpr and LetStmt visitors.

Also applies to: 544-557, 667-680


704-727: LGTM: Enhanced scalarization correctly handles let-bound variables.

The enhanced Scalarize method correctly:

  1. Identifies all let-bound variables used within the statement
  2. Substitutes the loop variable with the scalar index
  3. Re-establishes let bindings by wrapping the scalarized statement with LetStmt nodes, using the scalarized bound values

The key insight is that Scalarize operates on the original (pre-vectorization) statement (line 229), ensuring that variable lookups in let_value_binding_ retrieve the correct unvectorized expressions for substitution.

Minor improvement: the variable suffix changed from ".s" to "_s", which is clearer and follows conventional naming.
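The three steps listed in this comment can be modeled on a toy IR of nested tuples. This is a sketch under stated assumptions, not the pass itself: the tuple encoding, the helper names `free_vars`, `substitute`, and `scalarize`, and the reliance on dictionary insertion order to recover binding order are all choices made for this example. It also ignores shadowed or reused variable names, which the real pass is described as handling.

```python
def free_vars(node, acc=None):
    """Collect every str leaf of a nested-tuple node (node[0] is the op tag)."""
    acc = set() if acc is None else acc
    if isinstance(node, str):
        acc.add(node)
    elif isinstance(node, tuple):
        for child in node[1:]:
            free_vars(child, acc)
    return acc


def substitute(node, mapping):
    """Replace str leaves according to `mapping`, leaving op tags untouched."""
    if isinstance(node, str):
        return mapping.get(node, node)
    if isinstance(node, tuple):
        return (node[0],) + tuple(substitute(c, mapping) for c in node[1:])
    return node


def scalarize(stmt, loop_var, lanes, let_value_binding):
    # Step 1: find let-bound variables the statement uses, following
    # bindings transitively so nested lets are included.
    used, pending = set(), [stmt]
    while pending:
        for name in free_vars(pending.pop()):
            if name in let_value_binding and name not in used:
                used.add(name)
                pending.append(let_value_binding[name])
    # Step 2: replace the loop variable with a scalar index named "<var>_s".
    idx = loop_var + "_s"
    body = substitute(stmt, {loop_var: idx})
    # Step 3: re-establish the bindings around the statement. Bindings are
    # assumed to be recorded outermost-first, so wrap them in reverse order.
    for name in reversed(list(let_value_binding)):
        if name in used:
            value = substitute(let_value_binding[name], {loop_var: idx})
            body = ("let", name, value, body)
    return ("for", idx, lanes, body)


bindings = {"x": ("mul", "i", 2), "y": ("add", "x", 1), "z": ("add", "i", 7)}
loop = scalarize(("store", "A", "i", "y"), "i", 4, bindings)
```

Here `y` is used directly and `x` only through `y`, so both are rebound inside the scalar loop with `x` outermost, while the unused `z` is left out.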


742-746: LGTM: Member variables properly documented.

The member variable declarations are clearly documented and consistent with the refactored let-binding mechanism.


844-844: LGTM: Updated to use new static entry point.

The integration correctly uses the new TLVectorizer::Vectorize static method, providing a cleaner call site.
