@LeiWang1999

This pull request adds new examples for dequantization and flash attention, refactors existing example scripts so their logic lives in a main function, and updates the tilelang library to support these operations.

New Examples and Features:

Dequantization Enhancements:

  • Added a new example, examples/dequantize_gemm/example_dequant_gemv_fp16xint4.py, for GEMV with dequantized weights. It supports several configurations, including fast decoding and scaling, and provides the dequantize_gemv function together with a main function that can be called from tests.
  • Refactored examples/dequantize_gemm/example_dequant_gemm_fp4_hopper.py to wrap its logic in a main function, improving modularity. Parameters can now be passed to main directly or supplied as command-line arguments.
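
For readers unfamiliar with the fp16 x int4 pattern, here is a minimal NumPy sketch of what a dequantized GEMV computes. It is not the tilelang kernel from this PR. The helper names (pack_int4, unpack_int4, dequantize_gemv_reference), the packing layout (two 4-bit values per byte, low nibble first), the zero point of 8, and the per-row scale are all assumptions chosen for illustration.

```python
import numpy as np


def pack_int4(w):
    """Pack unsigned 4-bit values (0..15) two per byte, low nibble first (assumed layout)."""
    assert w.shape[-1] % 2 == 0, "last dimension must be even to pack pairs"
    w = w.astype(np.uint8)
    return (w[..., 0::2] | (w[..., 1::2] << 4)).astype(np.uint8)


def unpack_int4(packed):
    """Inverse of pack_int4: recover one 4-bit value per output element."""
    out = np.empty(packed.shape[:-1] + (packed.shape[-1] * 2,), dtype=np.uint8)
    out[..., 0::2] = packed & 0x0F          # low nibble
    out[..., 1::2] = (packed >> 4) & 0x0F   # high nibble
    return out


def dequantize_gemv_reference(x, packed_w, scale, zero_point=8):
    """Hypothetical reference: y[n] = sum_k x[k] * (q[n, k] - zero_point) * scale[n]."""
    q = unpack_int4(packed_w).astype(np.float32)
    w = (q - zero_point) * scale.astype(np.float32)[:, None]
    # Accumulate in float32, then cast the result back to fp16.
    return (w @ x.astype(np.float32)).astype(np.float16)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.integers(0, 16, size=(4, 8))            # N=4 output rows, K=8 inputs
    x = rng.standard_normal(8).astype(np.float16)
    scale = np.full(4, 0.5, dtype=np.float16)
    print(dequantize_gemv_reference(x, pack_int4(q), scale))
```

A reference like this is mainly useful as a correctness check for a fused kernel, which would unpack and scale the weights on the fly instead of materializing the full fp32 matrix.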

Flash Attention Enhancements:

  • Introduced a new example in examples/flash_attention/example_mha_fwd_bhsd_wgmma_pipelined.py for flash attention with pipelined execution. This implementation includes kernel macros for matrix multiplication, softmax, and rescaling, along with support for autotuning configurations.
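
As background on the softmax and rescaling steps mentioned above, the following NumPy sketch shows the block-wise ("online") softmax that flash attention relies on. It is not the kernel added in this PR: it handles a single head, applies no mask, and says nothing about WGMMA or pipelining. The function names and the block_n parameter are hypothetical.

```python
import numpy as np


def attention_reference(q, k, v):
    """Plain softmax(q k^T / sqrt(d)) v, computed in one shot."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = scores - scores.max(axis=-1, keepdims=True)
    p = np.exp(scores)
    return (p / p.sum(axis=-1, keepdims=True)) @ v


def attention_blockwise(q, k, v, block_n=4):
    """Same result, processing K/V in blocks with a running max and running sum."""
    seq_q, dim = q.shape
    scale = 1.0 / np.sqrt(dim)
    acc = np.zeros((seq_q, v.shape[-1]))   # unnormalized output accumulator
    row_max = np.full(seq_q, -np.inf)      # running row-wise maximum of the scores
    row_sum = np.zeros(seq_q)              # running softmax denominator

    for start in range(0, k.shape[0], block_n):
        k_blk = k[start:start + block_n]
        v_blk = v[start:start + block_n]
        s = (q @ k_blk.T) * scale

        new_max = np.maximum(row_max, s.max(axis=-1))
        # Rescale factor for everything accumulated under the old maximum.
        correction = np.exp(row_max - new_max)
        p = np.exp(s - new_max[:, None])

        row_sum = row_sum * correction + p.sum(axis=-1)
        acc = acc * correction[:, None] + p @ v_blk
        row_max = new_max

    return acc / row_sum[:, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    q = rng.standard_normal((5, 8))
    k = rng.standard_normal((10, 8))
    v = rng.standard_normal((10, 8))
    print(np.allclose(attention_blockwise(q, k, v), attention_reference(q, k, v)))
```

The rescaling by `correction` is what lets each K/V block be consumed once without ever holding the full score matrix, which is the property a pipelined kernel can exploit.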

Testing and Licensing:

  • Added a new test suite in examples/dequantize_gemm/test_example_dequantize_gemm.py to validate the functionality of the dequantization examples for different configurations.
  • Standardized licensing headers across files, adding or updating the MIT license notice where applicable.

Library Updates:

  • Updated tilelang/quantize/__init__.py to expose additional utility functions and intrinsics for quantization, such as interleave_weight and get_lop3_intrin_group.
  • Added a missing import for DataType in tilelang/__init__.py to resolve potential issues with type handling.

LeiWang1999 merged commit 9cfa724 into tile-ai:main on May 14, 2025
2 of 3 checks passed
lucifer1004 pushed a commit to lucifer1004/tilelang that referenced this pull request May 16, 2025
…for dequant gemm example (tile-ai#494)

* Remove deprecated example_dequant_gemm.py and add DataType import in __init__.py

* lint fix

* lint fix

* Refactor dequantization examples to use tilelang imports and update data type handling in quantization utilities

* lint fix
LeiWang1999 added a commit to LeiWang1999/tilelang that referenced this pull request Jul 18, 2025
LeiWang1999 added a commit to LeiWang1999/tilelang that referenced this pull request Jul 20, 2025