add a8w8 fp8 ck gemm tune support #1782

Merged: solinzby1 merged 10 commits into main from so/fp8_a8w8_ck on Jan 12, 2026
Conversation

@solinzby1 (Contributor)

Motivation

Technical Details

Test Plan

Test Result

Submission Checklist

@solinzby1 solinzby1 requested review from a team and Copilot January 7, 2026 07:31

Copilot AI left a comment


Pull request overview

This PR adds FP8 (8-bit floating point) support to the CK GEMM tuning infrastructure, expanding beyond the existing INT8 quantization. The changes enable tuning and execution of GEMM operations with FP8 quantized inputs while maintaining the existing INT8 functionality.

Key changes:

  • Extended instance generation to support both INT8 and FP8 kernel variants during tuning
  • Added FP8 data generation and reference computation paths in the Python tuning script
  • Modified the CUDA kernel dispatcher to handle both INT8 and FP8 input types with appropriate template parameters
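To make the FP8 data-generation and reference-computation path concrete, here is a minimal NumPy sketch of per-row FP8-style quantization and a dequantized reference GEMM. All helper names are hypothetical; the actual logic lives in gemm_a8w8_tune.py, and real FP8 rounding (float8_e4m3) is approximated here by clamping to the e4m3 range.

```python
import numpy as np

F8_E4M3_MAX = 448.0  # largest finite value representable in float8_e4m3

def quantize_fp8(x, axis):
    # Choose a per-row (or per-column) scale so values fit the FP8 range.
    # Note: real FP8 also rounds mantissas; this sketch only rescales/clamps.
    scale = np.abs(x).max(axis=axis, keepdims=True) / F8_E4M3_MAX
    xq = np.clip(x / scale, -F8_E4M3_MAX, F8_E4M3_MAX)
    return xq, scale.astype(np.float32)

def ref_gemm_a8w8(xq, wq, x_scale, w_scale):
    # Reference output: GEMM on quantized values, then rescale:
    # out = (xq @ wq.T) * x_scale * w_scale.T
    return (xq.astype(np.float32) @ wq.astype(np.float32).T) * x_scale * w_scale.T

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8)).astype(np.float32)   # activations
w = rng.standard_normal((16, 8)).astype(np.float32)  # weights
xq, xs = quantize_fp8(x, axis=1)
wq, ws = quantize_fp8(w, axis=1)
out = ref_gemm_a8w8(xq, wq, xs, ws)  # should closely match x @ w.T
```

Because this sketch skips mantissa rounding, the reference output matches the unquantized product up to float32 rounding; the real tuning script compares kernel output against a reference like this within a tolerance.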

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

Reviewed files:

  • csrc/ck_gemm_a8w8/gen_instances.py: refactored instance generation to create separate I8 and F8 kernel instances for tuning, with the appropriate dtype template parameters
  • csrc/ck_gemm_a8w8/gemm_a8w8_tune.py: added a quant_dtype parameter, FP8 data generation logic, and a reference computation that handles both I8 and FP8 quantization
  • csrc/ck_gemm_a8w8/gemm_a8w8_tune.cu: updated the kernel dispatcher to detect the input dtype (I8 vs. FP8) and dispatch to the matching kernel templates and scale dtypes
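The dispatch-on-input-dtype pattern described for gemm_a8w8_tune.cu can be sketched in Python for illustration (the real dispatcher is CUDA/C++; kernel names here are hypothetical, and float32 stands in for FP8 since NumPy has no native float8 dtype):

```python
import numpy as np

def tune_gemm_a8w8(x, w, x_scale, w_scale):
    """Pick the kernel family from the input dtype, mirroring the
    I8-vs-FP8 branch in the CUDA dispatcher (sketch only)."""
    if x.dtype == np.int8:
        kernel = "a8w8_i8_kernel"  # INT8 path: integer accumulate
        acc = x.astype(np.int32) @ w.astype(np.int32).T
    else:
        kernel = "a8w8_f8_kernel"  # FP8 path: floating-point accumulate
        acc = x.astype(np.float32) @ w.astype(np.float32).T
    # Apply per-row activation scales and per-row weight scales.
    out = acc.astype(np.float32) * x_scale * w_scale.T
    return kernel, out
```

The point of the branch is that the accumulator and scale dtypes must match the quantized input type, which is why the dispatcher selects different template instantiations for I8 and FP8.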


@solinzby1 solinzby1 requested review from yadaish and yzhou103 January 7, 2026 11:16
@yadaish (Contributor)

yadaish commented Jan 8, 2026

LGTM

yadaish previously approved these changes Jan 8, 2026
@yzhou103 (Contributor)

LGTM

@yadaish yadaish self-requested a review January 12, 2026 06:01
@ROCm ROCm deleted a comment from Copilot AI Jan 12, 2026
@ROCm ROCm deleted a comment from Copilot AI Jan 12, 2026
@solinzby1 solinzby1 merged commit 2e39dbe into main Jan 12, 2026
17 checks passed
@solinzby1 solinzby1 deleted the so/fp8_a8w8_ck branch January 12, 2026 06:24
zhuyuhua-v pushed a commit that referenced this pull request Jan 14, 2026
* add a8w8 fp8 tune support

* add q_dtype_w to deal with different type and refine config csv file

---------

Co-authored-by: solin <bingzhou@amd.com>
Co-authored-by: yzhou103 <Ying.Zhou2@amd.com>
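The second commit mentions adding q_dtype_w and refining the config CSV. A plausible reading is that tuned kernel selections are stored per (shape, quant dtype) key; the sketch below illustrates that idea with hypothetical column names and values (the actual CSV schema is not shown in this PR page):

```python
import csv
import io

# Hypothetical tuned-config rows: same GEMM shape can map to different
# best kernels depending on the weight quantization dtype (q_dtype_w).
rows = [
    {"M": 128, "N": 4096, "K": 4096, "q_dtype_w": "fp8", "kernel_id": 17},
    {"M": 128, "N": 4096, "K": 4096, "q_dtype_w": "int8", "kernel_id": 5},
]

buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["M", "N", "K", "q_dtype_w", "kernel_id"])
writer.writeheader()
writer.writerows(rows)

def lookup(csv_text, m, n, k, q_dtype_w):
    # Return the tuned kernel id for a (shape, dtype) key, or None if untuned.
    for r in csv.DictReader(io.StringIO(csv_text)):
        if (int(r["M"]), int(r["N"]), int(r["K"]), r["q_dtype_w"]) == (m, n, k, q_dtype_w):
            return int(r["kernel_id"])
    return None
```

Keying the config on the quant dtype avoids an INT8 tuning result being reused for FP8 inputs, whose fastest kernel instance may differ.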
5 participants