-
Notifications
You must be signed in to change notification settings - Fork 363
Commit 106e146
committed
Update on "[bc-breaking] Generalize FakeQuantizeConfig beyond intx"
**Summary:** The existing `FakeQuantizeConfig` performs only
intx quantization, but we plan to extend QAT to other dtypes
such as fp8 and nvfp4 in the near future. This is the necessary
refactor before that. Specifically:
```
# New abstract class
FakeQuantizeConfigBase
# Rename
FakeQuantizeConfig -> IntxFakeQuantizeConfig
```
In the future, we will have other types of `FakeQuantizeConfigBase`
for float dtypes that users can pass in instead of the existing
Intx one.
**BC-breaking notes:** For BC, we keep around the old names to
reference the new ones. However, this commit is still BC-breaking
in the sense that a few APIs now accept the abstract
`FakeQuantizeConfigBase` instead. For the most part, this abstract
class will be hidden from the user.
Before:
```
activation_config = FakeQuantizeConfig(torch.int8, "per_token", is_symmetric=False)
weight_config = FakeQuantizeConfig(torch.int4, group_size=32)
```
After:
```
activation_config = IntxFakeQuantizeConfig(torch.int8, "per_token", is_symmetric=False)
weight_config = IntxFakeQuantizeConfig(torch.int4, group_size=32)
```
**Test Plan:**
python test/quantization/test_qat.py
[ghstack-poisoned]1 parent cba7430 commit 106e146Copy full SHA for 106e146
File tree
Expand file treeCollapse file tree
0 file changed
+0
-0
lines changedOpen diff view settings
Filter options
Expand file treeCollapse file tree
0 file changed
+0
-0
lines changedOpen diff view settings
0 commit comments