Pull request overview
This PR consolidates MoE (Mixture of Experts) tuning functionality by removing the gfx950-specific tune.py file and merging its functionality into the central gemm_moe_tune.py file. The changes enable both gfx942 and gfx950 architectures to use a unified tuning interface with architecture-specific logic handled through conditional branches.
Key changes:
- Removed gfx950-specific FmoeTuner950 class and integrated its logic into the base FmoeTuner class
- Extended the get_1stage_file_info method to support both gfx950 and gfx942, with per_1x32 quantization handled only on gfx950
- Added comprehensive README documentation for the unified tuning workflow
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| hsa/gfx950/fmoe_2stages/tune.py | Removed gfx950-specific tuner implementation that is now integrated into the main tuner |
| csrc/ck_gemm_moe_2stages_codegen/gemm_moe_tune.py | Extended to handle both gfx950 and gfx942 architectures with conditional logic in get_1stage_file_info |
| csrc/ck_gemm_moe_2stages_codegen/README.md | Added comprehensive documentation for tuning workflow, including usage examples and configuration options |
Comments suppressed due to low confidence (3)
csrc/ck_gemm_moe_2stages_codegen/gemm_moe_tune.py:1316 - The get_1stage_file_info method does not handle the case where get_gfx() returns a value other than 'gfx950' or 'gfx942'. In that case the function implicitly returns None, which could lead to unexpected errors in calling code. Add an else clause that either raises an informative error or returns a default value.
csrc/ck_gemm_moe_2stages_codegen/gemm_moe_tune.py:2 - The copyright year is set to 2026. The copyright range should reflect when the code was actually written. Consider using 2024-2025, or verify that 2026 is intentional.
csrc/ck_gemm_moe_2stages_codegen/gemm_moe_tune.py:1316 - The get_1stage_file_info method has significant code duplication between the gfx950 and gfx942 branches. The only difference is the handling of QuantType.per_1x32 (lines 1295-1296), which exists only in the gfx950 branch. Consider extracting the common logic and branching only for the gfx950-specific per_1x32 case.
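The two comments on get_1stage_file_info could be addressed together along the following lines. This is a minimal sketch under stated assumptions, not the code in gemm_moe_tune.py: the function name `one_stage_file_info_sketch`, the `SUPPORTED_GFX` tuple, the `per_token` enum member, and the fields of the returned dictionary are hypothetical stand-ins. Only `QuantType.per_1x32` and the gfx942/gfx950 names come from the PR. Rejecting per_1x32 on gfx942 is likewise an assumption about the desired behavior.

```python
from enum import Enum


class QuantType(Enum):
    # Hypothetical stand-in for the project's QuantType enum.
    # Only per_1x32 is named in the review; per_token is a placeholder.
    per_token = "per_token"
    per_1x32 = "per_1x32"


# Architectures the sketch accepts (taken from the PR description).
SUPPORTED_GFX = ("gfx942", "gfx950")


def one_stage_file_info_sketch(gfx: str, q_type: QuantType) -> dict:
    """Illustrative only: shared logic first, one arch-specific branch after."""
    if gfx not in SUPPORTED_GFX:
        # Fail loudly instead of falling through and returning None.
        raise ValueError(
            f"unsupported architecture {gfx!r}; expected one of {SUPPORTED_GFX}"
        )

    # Logic common to both architectures (placeholder fields).
    info = {"gfx": gfx, "quant": q_type.name, "per_1x32": False}

    # The single architecture-specific case: per_1x32 on gfx950.
    if q_type is QuantType.per_1x32:
        if gfx != "gfx950":
            # Assumption: per_1x32 should be rejected outside gfx950.
            raise ValueError("QuantType.per_1x32 is only handled on gfx950")
        info["per_1x32"] = True

    return info
```

In the real method, the `gfx` argument would presumably come from get_gfx(), and the shared section would hold whatever the two existing branches currently duplicate.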
valarLip approved these changes on Jan 12, 2026
zhuyuhua-v pushed a commit that referenced this pull request on Jan 14, 2026
* mv fmoe tune to csrc/ck_gemm_moe_2stages_codegen
* rename fmoe tune.py to gemm_moe_tune.py
* update tune readme for more usages
* update splitK info and fix splitK error
* Apply suggestions from code review
  Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
  Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* Apply suggestions from code review
  Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
* fix splitK error in cktile tune and clean mp_tuner log
* fix tune hang when get result error
* fix mp_tuner error

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>