[Refactor] Update KernelLaunch to clarify block name #441

LeiWang1999 · 2025-04-27T04:35:41Z

This pull request refactors the KernelLaunch function in src/ir.cc to improve clarity and consistency in handling CPU and GPU kernel launches. The most significant changes include reorganizing comments, ensuring proper annotations for blocks, and replacing empty block names with a more descriptive default.

Improvements to code clarity:

Moved the comment for launching the CPU kernel so that it aligns with the relevant code block, improving readability.

Consistency in block handling:

Replaced empty block names (Block("")) with a more descriptive default (Block("root")) to enhance code clarity and maintain consistency. This change applies both when attributes are defined and in the fallback case.
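The block-naming change can be illustrated with a minimal, self-contained sketch. This is not the code from src/ir.cc: `BlockFrame`, `Attrs`, and `MakeRootBlock` are hypothetical stand-ins introduced here for illustration, assuming only what the description states, namely that the block is named "root" both when attributes are defined and in the fallback case.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical stand-in for a named block that carries annotations.
// It is NOT the real Block type used in src/ir.cc.
using Attrs = std::map<std::string, std::string>;

struct BlockFrame {
  std::string name;
  Attrs annotations;
};

// Hypothetical helper: builds the block with the descriptive default
// name "root" instead of an empty name, in both branches.
BlockFrame MakeRootBlock(const std::optional<Attrs>& attrs) {
  BlockFrame block{"root", {}};  // before the change, the name would be ""
  if (attrs.has_value()) {
    // Attributes are defined: attach them to the root block as annotations.
    block.annotations = *attrs;
  }
  // Fallback case: the block is still named "root", with no annotations.
  return block;
}
```

Under these assumptions, calling `MakeRootBlock(std::nullopt)` and `MakeRootBlock(Attrs{{"key", "value"}})` both return a block named "root", so later code that looks at block names sees the same name on either path.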

…ogic
* Added comments to distinguish between CPU and GPU kernel launch sections for better code readability.
* Changed the creation of empty blocks to use a consistent "root" identifier, enhancing clarity in frame management.

…i#441)
* Added logic to use non-replicated buffers as source buffers for more accurate layout inference.
* Enhanced comments to clarify the rationale behind buffer selection in layout inference process.

…442)
* Added logic to use non-replicated buffers as source buffers for more accurate layout inference.
* Enhanced comments to clarify the rationale behind buffer selection in layout inference process.

* [Enhancement] Improve layout inference accuracy in ParallelOp (#441)
  * Added logic to use non-replicated buffers as source buffers for more accurate layout inference.
  * Enhanced comments to clarify the rationale behind buffer selection in layout inference process.
* [Enhancement] Add error handling macros and refactor loop partitioning logic
  * Introduced TILELANG_CHECK macro for improved error handling in CUDA and HIP code, providing detailed error messages for kernel launches.
  * Enhanced loop partitioning logic to handle fragment buffers more effectively, ensuring correct replication based on thread extent.
  * Added logging for thread range in PlanLoopPartition to aid in debugging and performance analysis.
  * Updated pass configuration management to streamline vectorization control in the optimization process.
* lint fix
* remove debug print

* [Enhancement] Improve layout inference accuracy in ParallelOp (#441)
  * Added logic to use non-replicated buffers as source buffers for more accurate layout inference.
  * Enhanced comments to clarify the rationale behind buffer selection in layout inference process.
* [Enhancement] Add error handling macros and refactor loop partitioning logic
  * Introduced TILELANG_CHECK macro for improved error handling in CUDA and HIP code, providing detailed error messages for kernel launches.
  * Enhanced loop partitioning logic to handle fragment buffers more effectively, ensuring correct replication based on thread extent.
  * Added logging for thread range in PlanLoopPartition to aid in debugging and performance analysis.
  * Updated pass configuration management to streamline vectorization control in the optimization process.
* lint fix
* remove debug print
* [Refactor] Update legalize_safe_memory_access.cc to improve memory access handling
  * Replaced Apache License header with MIT License.
  * Added logic to handle local buffer conditions in memory access.
  * Introduced IsLocalBuffer function to check buffer scope.
  * Enhanced comments for clarity on memory access operations.

LeiWang1999 merged commit 6fc627e into tile-ai:main Apr 27, 2025
2 of 3 checks passed
