
Move moe tune to csrc#1790

Merged
valarLip merged 12 commits into main from move_moe_tune_to_csrc on Jan 12, 2026

Conversation

@yzhou103 (Contributor) commented Jan 8, 2026

Motivation

Technical Details

Test Plan

Test Result

Submission Checklist


Copilot AI left a comment


Pull request overview

This PR consolidates MoE (Mixture of Experts) tuning functionality by removing the gfx950-specific tune.py file and merging its functionality into the central gemm_moe_tune.py file. The changes enable both gfx942 and gfx950 architectures to use a unified tuning interface with architecture-specific logic handled through conditional branches.

Key changes:

  • Removed the gfx950-specific FmoeTuner950 class and integrated its logic into the base FmoeTuner class
  • Extended the get_1stage_file_info method to support both gfx950 and gfx942, with per_1x32 quantization support on gfx950
  • Added comprehensive README documentation for the unified tuning workflow
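The unified-tuner pattern described above can be sketched as follows. This is a hypothetical illustration, not the actual implementation: the names `get_gfx`, `FmoeTuner`, and `get_1stage_file_info` come from the PR description, but the method bodies, the `SUPPORTED` tuple, and the returned dict fields are assumptions made for the sketch.

```python
# Sketch of a single tuner class whose architecture-specific behavior is
# selected at runtime, as the PR describes. Bodies are illustrative only.

def get_gfx() -> str:
    """Stand-in for the real device query; assume gfx942 here."""
    return "gfx942"

class FmoeTuner:
    SUPPORTED = ("gfx942", "gfx950")

    def __init__(self) -> None:
        self.arch = get_gfx()
        if self.arch not in self.SUPPORTED:
            raise ValueError(f"unsupported architecture: {self.arch}")

    def get_1stage_file_info(self, quant_type: str) -> dict:
        # Fields common to both architectures.
        info = {"arch": self.arch, "quant": quant_type}
        # Only gfx950 additionally supports per_1x32 quantization.
        if self.arch == "gfx950" and quant_type == "per_1x32":
            info["per_1x32"] = True
        return info

tuner = FmoeTuner()
print(tuner.get_1stage_file_info("per_token"))
```

With this shape, adding a new architecture means extending `SUPPORTED` and the conditional branches rather than maintaining a parallel subclass per chip.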

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| hsa/gfx950/fmoe_2stages/tune.py | Removed the gfx950-specific tuner implementation, now integrated into the main tuner |
| csrc/ck_gemm_moe_2stages_codegen/gemm_moe_tune.py | Extended to handle both gfx950 and gfx942 architectures with conditional logic in get_1stage_file_info |
| csrc/ck_gemm_moe_2stages_codegen/README.md | Added comprehensive documentation for the tuning workflow, including usage examples and configuration options |
Comments suppressed due to low confidence (3)

  • csrc/ck_gemm_moe_2stages_codegen/gemm_moe_tune.py:1316 — The get_1stage_file_info method does not handle the case where get_gfx() returns a value other than 'gfx950' or 'gfx942'. The function then implicitly returns None, which could lead to unexpected errors in calling code. Add an else clause that either raises an informative error or returns a default value.
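The fix suggested in that comment might look like the sketch below. The `get_gfx` stub and the placeholder return values are assumptions for illustration; the real function in gemm_moe_tune.py returns file information, not these dicts.

```python
# Sketch of an explicit else clause that fails loudly instead of
# falling through and implicitly returning None.

def get_gfx() -> str:
    return "gfx908"  # an architecture the tuner does not support

def get_1stage_file_info() -> dict:
    gfx = get_gfx()
    if gfx == "gfx950":
        return {"arch": gfx}
    elif gfx == "gfx942":
        return {"arch": gfx}
    else:
        # Surface the unsupported architecture immediately rather than
        # letting a None propagate into calling code.
        raise NotImplementedError(
            f"get_1stage_file_info: unsupported architecture {gfx!r}"
        )

try:
    get_1stage_file_info()
except NotImplementedError as err:
    print(err)
```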
  • csrc/ck_gemm_moe_2stages_codegen/gemm_moe_tune.py:2 — The copyright year is set to 2026. If the code was written earlier, the range should reflect that (e.g. 2024-2025); verify whether 2026 is intentional.
  • csrc/ck_gemm_moe_2stages_codegen/gemm_moe_tune.py:1316 — The get_1stage_file_info method has significant code duplication between the gfx950 and gfx942 branches. The only difference is the handling of QuantType.per_1x32 (lines 1295-1296), which exists only in the gfx950 branch. Consider extracting the common logic and branching only for the gfx950-specific per_1x32 case.
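The deduplication suggested in the last comment could take roughly this shape. Everything here is a sketch under assumptions: the quantization type names and the `supported_quants` set are invented for illustration, and only `per_1x32` and the architecture names come from the PR.

```python
# Sketch of factoring out the logic shared by gfx950 and gfx942, so that
# only the gfx950-specific per_1x32 case branches on architecture.

def build_1stage_file_info(gfx: str, quant_type: str) -> dict:
    # Quantization types common to both architectures (hypothetical names).
    supported_quants = {"per_token", "per_tensor"}
    # gfx950 additionally supports per_1x32; everything else is shared.
    if gfx == "gfx950":
        supported_quants.add("per_1x32")
    if quant_type not in supported_quants:
        raise ValueError(f"{quant_type!r} not supported on {gfx}")
    return {"arch": gfx, "quant": quant_type}

print(build_1stage_file_info("gfx950", "per_1x32"))
```

The point of the refactor is that a future change to the shared path is made once, instead of being mirrored across two near-identical branches.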


@valarLip valarLip merged commit 362044a into main Jan 12, 2026
17 checks passed
@valarLip valarLip deleted the move_moe_tune_to_csrc branch January 12, 2026 05:19
zhuyuhua-v pushed a commit that referenced this pull request Jan 14, 2026
* mv fmoe tune to csrc/ck_gemm_moe_2stages_codegen

* rename fmoe tune.py to gemm_moe_tune.py

* update tune readme for more usages

* update splitK info and fix splitK error

* Apply suggestions from code review

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* fix splitK error in cktile tune and clean mp_tuner log

* fix tune hang when get result error

* fix mp_tuner error

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

3 participants