Fix A16W4 shuffle weight and scale for aiter/main #808
aiter/main: https://github.com/ROCm/aiter/blob/main/aiter/ops/shuffle.py
aiter/355_wip: https://github.com/ROCm/aiter/blob/355_wip/aiter/ops/shuffle.py
Fixes the differences between shuffle_mxfp4_scale and shuffle_scale_a16w4. However, the expected input tensor shapes for shuffle_weight_a16w4 (e, n, k) and shuffle_scale_a16w4 (e * n, k) are inconsistent with each other in aiter/main (ROCm/aiter#1341). We should probably fix this in aiter main.
Purpose
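To make the shape mismatch concrete, here is a minimal sketch of the bookkeeping a caller has to do when one function takes a 3-D (e, n, k) layout and the other takes a flattened (e * n, k) layout. It works on shape tuples only and does not call aiter. The helper names `flatten_expert_rows` and `unflatten_expert_rows` and the sizes are hypothetical, chosen for illustration. With real tensors the same step would be a `reshape` on the tensor.

```python
from typing import Tuple


def flatten_expert_rows(shape: Tuple[int, int, int]) -> Tuple[int, int]:
    """Collapse an (e, n, k) shape into the flattened (e * n, k) form."""
    e, n, k = shape
    return (e * n, k)


def unflatten_expert_rows(
    shape: Tuple[int, int], num_experts: int
) -> Tuple[int, int, int]:
    """Recover (e, n, k) from (e * n, k), given the number of experts e."""
    rows, k = shape
    if rows % num_experts != 0:
        raise ValueError(
            f"{rows} rows cannot be split evenly across {num_experts} experts"
        )
    return (num_experts, rows // num_experts, k)


# Hypothetical sizes, for illustration only.
weight_like_shape = (8, 256, 128)            # 3-D layout: (e, n, k)
flat_shape = flatten_expert_rows(weight_like_shape)  # 2-D layout: (e * n, k)

# Round-tripping needs the expert count, which the 2-D layout does not carry.
restored = unflatten_expert_rows(flat_shape, num_experts=8)
print(flat_shape, restored)
```

The round trip shows why the inconsistency is awkward: the flattened layout loses the expert count, so the caller has to carry it separately.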
Test Plan
Test Result