
Commit a9efa91

flip mx scaling enum default to RCEIL
Summary:

Industry experience tells us RCEIL is the better default; the benchmarks below are intentionally light, just to validate that we can measure the improvement.

Accuracy

* before

```
wikitext: {'alias': 'wikitext', 'word_perplexity,none': 7.609070006132819, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 1.4615491037668933, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 0.5474983002838458, 'bits_per_byte_stderr,none': 'N/A'}
winogrande: {'alias': 'winogrande', 'acc,none': 0.7292817679558011, 'acc_stderr,none': 0.012487904760626407}
```

* after

```
wikitext: {'alias': 'wikitext', 'word_perplexity,none': 7.605192917647689, 'word_perplexity_stderr,none': 'N/A', 'byte_perplexity,none': 1.4614098103053235, 'byte_perplexity_stderr,none': 'N/A', 'bits_per_byte,none': 0.547360797163005, 'bits_per_byte_stderr,none': 'N/A'}
winogrande: {'alias': 'winogrande', 'acc,none': 0.7355958958168903, 'acc_stderr,none': 0.012394724896983764}
```

A nice lift in both wikitext perplexity and winogrande accuracy.

Performance on norm -> linear benchmarks

* before: https://gist.github.com/vkuzo/e4eab53fc9a23c007585c2235a7c7088
* after: https://gist.github.com/vkuzo/4ac7cde8a3ec1cd8f4d66847df091f7e

A slight performance regression, but we have not optimized RCEIL performance at all and are not yet using the hardware intrinsics, so there is room to optimize.

Test Plan:

```
pytest test/prototype/mx_formats/ -s -x
```

Reviewers:

Subscribers:

Tasks:

Tags:

ghstack-source-id: f47b33f
ghstack-comment-id: 3608933956
Pull-Request: #3428
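For context on why RCEIL tends to help: in MX quantization, each block shares a power-of-2 scale derived from the block's absolute max. The sketch below illustrates the assumed difference between the two modes: FLOOR rounds the scale down (so the block max can overflow the element dtype and gets clamped), while RCEIL rounds the scale up so the block max always fits. The formulas and helper names here are illustrative assumptions, not torchao's actual kernels.

```python
import torch

# Illustrative sketch only: assumed semantics of the two scale
# calculation modes, not torchao's actual implementation.
F8E4M3_MAX = 448.0  # max magnitude representable in float8_e4m3fn

def scale_floor(amax: torch.Tensor) -> torch.Tensor:
    # FLOOR (assumed): derive the shared power-of-2 scale by flooring
    # log2 of the block amax. Rounding down means the largest value in
    # a block can exceed the element dtype's range and must be clamped.
    return torch.exp2(
        torch.floor(torch.log2(amax))
        - torch.floor(torch.log2(torch.tensor(F8E4M3_MAX)))
    )

def scale_rceil(amax: torch.Tensor) -> torch.Tensor:
    # RCEIL (assumed): round the ideal scale amax / elem_max *up* to
    # the next power of 2, so the block amax always fits unclamped.
    return torch.exp2(torch.ceil(torch.log2(amax / F8E4M3_MAX)))

# Tiny demo on one 32-element block (the MX block size).
block = torch.randn(32) * 100
amax = block.abs().max()
for name, scale in (("FLOOR", scale_floor(amax)), ("RCEIL", scale_rceil(amax))):
    q = (block / scale).clamp(-F8E4M3_MAX, F8E4M3_MAX).to(torch.float8_e4m3fn)
    err = (q.to(torch.float32) * scale - block).abs().mean()
    print(f"{name}: scale={scale.item():.6g}, mean abs err={err.item():.4f}")
```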
1 parent 1e3558c commit a9efa91

2 files changed: +4, -4 lines changed

torchao/prototype/mx_formats/README.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -230,7 +230,7 @@ Note: the accuracy results below are WIP and are not optimized yet.
 | recipe | wikitext word_perplexity | winogrande |
 | ------ | -------- | ---------- |
 | bfloat16 (baseline) | 7.5472105433748435 | 0.7426992896606156 |
-| mxfp8 | 7.609070006132819 | 0.7292817679558011 |
+| mxfp8 | 7.605192917647689 | 0.7355958958168903 |
 | nvfp4 | 8.44478255417328 | 0.7182320441988951 |

 To reproduce:
```

torchao/prototype/mx_formats/mx_tensor.py

Lines changed: 3 additions & 3 deletions

```diff
@@ -87,7 +87,7 @@
 class QuantizeTensorToMXKwargs(QuantizeTensorKwargs):
     elem_dtype: Union[torch.dtype, str] = torch.float8_e4m3fn
     block_size: int = 32
-    scaling_mode: ScaleCalculationMode = ScaleCalculationMode.FLOOR
+    scaling_mode: ScaleCalculationMode = ScaleCalculationMode.RCEIL
     kernel_preference: KernelPreference = KernelPreference.EMULATED
     is_swizzled_scales: bool = False

@@ -144,7 +144,7 @@ def to_mx(
     data_hp: torch.Tensor,
     elem_dtype: Union[torch.dtype, str],
     block_size: int,
-    scaling_mode: ScaleCalculationMode = ScaleCalculationMode.FLOOR,
+    scaling_mode: ScaleCalculationMode = ScaleCalculationMode.RCEIL,
     is_swizzled_scales: bool = False,
 ):
     """

@@ -533,7 +533,7 @@ def to_mx(
     data_hp: torch.Tensor,
     elem_dtype: Union[torch.dtype, str],
     block_size: int = BLOCK_SIZE_DEFAULT,
-    scaling_mode: ScaleCalculationMode = ScaleCalculationMode.FLOOR,
+    scaling_mode: ScaleCalculationMode = ScaleCalculationMode.RCEIL,
     # TODO(future PR): switch default gemm to cublas
     kernel_preference: KernelPreference = KernelPreference.EMULATED,
     act_quant_kwargs: Optional[QuantizeTensorToMXKwargs] = None,
```
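For downstream users, the effect of this flip is that `to_mx` and `QuantizeTensorToMXKwargs` now default to RCEIL, and passing the enum explicitly restores the old behavior. A hypothetical call site is sketched below; the import path and the `(scale, data)` return convention are assumptions based on this diff, not verified against the repo.

```python
import torch
from torchao.prototype.mx_formats.mx_tensor import ScaleCalculationMode, to_mx

x = torch.randn(256, 256, dtype=torch.bfloat16)

# After this commit, omitting scaling_mode uses ScaleCalculationMode.RCEIL.
scale, data_lp = to_mx(x, torch.float8_e4m3fn, block_size=32)

# Callers who want the previous default can opt back in explicitly.
scale_f, data_lp_f = to_mx(
    x,
    torch.float8_e4m3fn,
    block_size=32,
    scaling_mode=ScaleCalculationMode.FLOOR,
)
```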
