[Bugfix] Fix layout inference for free fragment buffer #443

LeiWang1999 · 2025-04-29T08:34:50Z

This pull request introduces several enhancements and bug fixes across multiple components of the TileLang compiler and runtime. The changes focus on improving error handling, adding configuration flexibility, optimizing performance, and enhancing code maintainability. Below is a categorized summary of the most important changes:

Error Handling Improvements:

Added TILELANG_CHECK macros in both cuda/common.h and hip/common.h to standardize error checking for CUDA and HIP API calls. These macros capture and log errors with detailed information, improving debugging capabilities.
Enhanced kernel launch error handling in tilelang/jit/adapter/wrapper.py by adding checks for CUDA errors after kernel execution. Errors are logged with function-specific details, and execution halts on failure.
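
The check-then-halt pattern described above can be sketched in Python. This is only an illustration, not the TILELANG_CHECK macro or the code in wrapper.py: `KernelLaunchError` and `check_status` are hypothetical names, and the only assumption carried over from CUDA and HIP is that a status of 0 means success.

```python
class KernelLaunchError(RuntimeError):
    """Raised when a (hypothetical) launch status reports failure."""


def check_status(status: int, kernel_name: str) -> None:
    """Raise with kernel-specific context when status is non-zero.

    Assumes 0 means success, as cudaSuccess and hipSuccess do.
    """
    if status != 0:
        raise KernelLaunchError(
            f"kernel '{kernel_name}' failed with status {status}")
```

Calling `check_status` right after each launch keeps the failing function's name in the message, which is the property the description emphasizes.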

Layout and Loop Optimization:

Updated the LoopPartitioner class in loop_partition.cc to handle fragment buffers more effectively. Introduced logic to avoid replicating loop layouts for fragment buffers, improving performance for certain workloads.
Modified the InferLayout function in parallel.cc to prioritize non-replicated buffers for layout inference, enhancing accuracy.
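
The buffer-selection idea can be sketched as follows. This is a simplified Python model under stated assumptions, not the C++ logic in parallel.cc: `FragmentInfo`, `replicate_extent`, and `pick_source_buffer` are hypothetical names, and a replicate extent of 1 is assumed to mean "not replicated".

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class FragmentInfo:
    """Hypothetical stand-in for a fragment buffer whose layout is known."""
    name: str
    replicate_extent: int  # assumed: 1 means the layout is not replicated


def pick_source_buffer(
        candidates: Sequence[FragmentInfo]) -> Optional[FragmentInfo]:
    """Prefer a non-replicated buffer; otherwise fall back to the first one."""
    for buf in candidates:
        if buf.replicate_extent == 1:
            return buf
    return candidates[0] if candidates else None


candidates = [FragmentInfo("row_max", 4), FragmentInfo("acc", 1)]
chosen = pick_source_buffer(candidates)
```

Here `chosen` is the `acc` entry, because it is the only candidate that is not replicated. If every candidate were replicated, the sketch would fall back to the first one rather than fail.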

Configuration and Flexibility Enhancements:

Introduced a new PassConfigKey class in tilelang/transform/pass_config.py to centralize and document configuration options for TileLang compiler passes. This includes options for enabling/disabling specific optimizations.
Updated tilelang/engine/phase.py to allow passing a PassContext object to functions like allow_tma_and_warp_specialized and allow_vectorize, enabling more flexible configuration management.
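
A minimal sketch of how centralized keys and an optional context argument might fit together. Everything here is an assumption for illustration: `ExamplePassConfigKey` and its key strings are hypothetical (the real names live in tilelang/transform/pass_config.py), a plain mapping stands in for the PassContext config, and this `allow_vectorize` only mirrors the name of the function in phase.py, not its implementation.

```python
from enum import Enum
from typing import Mapping, Optional


class ExamplePassConfigKey(str, Enum):
    """Hypothetical keys, defined once so call sites avoid raw strings."""
    DISABLE_VECTORIZE = "example.disable_vectorize"
    DISABLE_WARP_SPECIALIZED = "example.disable_warp_specialized"


def allow_vectorize(config: Optional[Mapping[str, object]] = None) -> bool:
    """Return True unless the supplied config explicitly disables vectorization.

    `config` stands in for the config mapping of a pass context; passing
    None models the case where the caller supplies no context.
    """
    config = config or {}
    key = ExamplePassConfigKey.DISABLE_VECTORIZE.value
    return not bool(config.get(key, False))
```

Accepting the configuration as a parameter, instead of always reading a global, lets a caller query the decision for a specific context, which appears to be the flexibility the change is after.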

Codebase Simplification and Maintenance:

Replaced direct imports of tvm.transform.PassContext with a unified import in tilelang/transform/__init__.py, ensuring consistency and reducing redundancy.
Refactored _load_tile_lang_lib in tilelang/__init__.py to include PassConfigKey, aligning it with new configuration management practices.

Minor Fixes:

Fixed indentation in the PREDEF_HOST_FUNC template in tilelang/jit/adapter/wrapper.py to align with coding standards.

* [Enhancement] Improve layout inference accuracy in ParallelOp (tile-ai#441)
  * Added logic to use non-replicated buffers as source buffers for more accurate layout inference.
  * Enhanced comments to clarify the rationale behind buffer selection in layout inference process.
* [Enhancement] Add error handling macros and refactor loop partitioning logic
  * Introduced TILELANG_CHECK macro for improved error handling in CUDA and HIP code, providing detailed error messages for kernel launches.
  * Enhanced loop partitioning logic to handle fragment buffers more effectively, ensuring correct replication based on thread extent.
  * Added logging for thread range in PlanLoopPartition to aid in debugging and performance analysis.
  * Updated pass configuration management to streamline vectorization control in the optimization process.
* lint fix
* remove debug print

LeiWang1999 added 5 commits April 28, 2025 09:15

[Enhancement] Improve layout inference accuracy in ParallelOp (tile-a…

3f18073

…i#441) * Added logic to use non-replicated buffers as source buffers for more accurate layout inference. * Enhanced comments to clarify the rationale behind buffer selection in layout inference process.

lint fix

f55341c

Merge branch 'main' of https://github.com/tile-ai/tilelang into enhan…

0ed90a4

…ce_2025_fragment

remove debug print

d367b6b

LeiWang1999 merged commit 9fd936c into tile-ai:main Apr 29, 2025
3 checks passed
