Refactor Affine Quantized Tensor #1234

jainapurva · 2024-11-07T01:47:52Z

Move the dtype Layout / TensorImpl classes and the dispatcher check/impl functions into their respective layout files

affine_quantized_tensor_ops.py contains
- aten.ops implementations for aqt
- Dispatch registration and imports of the dtype kernel impl/check functions

Dtype layout files
- torchao/dtypes/floatx
  - float8_layout.py
  - floatx_tensor_core_layout.py
- torchao/dtypes/uintx
  - block_sparse_layout.py
  - marlin_sparse_layout.py
  - plain_layout.py
  - semi_sparse_layout.py
  - tensor_core_tiled_layout.py
  - uint4_layout.py
  - uintx_layout.py
  - marlin_qqq_layout.py
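
For readers unfamiliar with the check/impl split described above, here is a minimal sketch of the pattern in plain Python. It is not torchao's actual code: `register_dispatch`, `FakeQuantizedTensor`, `DemoPlainLayout`, and `linear` are hypothetical stand-ins, and lists of floats stand in for tensors. The idea it illustrates is that each layout file owns a predicate ("can I handle these arguments?") and a kernel, while the ops file only registers the pairs and walks the table.

```python
from dataclasses import dataclass
from typing import Any, Callable, List, Tuple

# Hypothetical registry: each entry pairs a check predicate with an impl kernel.
_DISPATCH_TABLE: List[Tuple[Callable[..., bool], Callable[..., Any]]] = []


def register_dispatch(check: Callable[..., bool], impl: Callable[..., Any]) -> None:
    """Record a (check, impl) pair; earlier registrations are tried first."""
    _DISPATCH_TABLE.append((check, impl))


# --- what a layout file might define (hypothetical names) ---
@dataclass(frozen=True)
class DemoPlainLayout:
    pass


@dataclass
class FakeQuantizedTensor:
    layout: object
    int_data: List[int]
    scale: float


def _plain_linear_check(x: List[float], w: Any) -> bool:
    # The kernel applies only to weights stored in the plain layout.
    return isinstance(w, FakeQuantizedTensor) and isinstance(w.layout, DemoPlainLayout)


def _plain_linear_impl(x: List[float], w: FakeQuantizedTensor) -> float:
    # Dequantize the weight, then take a dot product with the activation.
    dequantized = [v * w.scale for v in w.int_data]
    return sum(a * b for a, b in zip(x, dequantized))


# --- what the ops file might do: import the pair and register it ---
register_dispatch(_plain_linear_check, _plain_linear_impl)


def linear(x: List[float], w: Any) -> float:
    """Run the first registered kernel whose check accepts the arguments."""
    for check, impl in _DISPATCH_TABLE:
        if check(x, w):
            return impl(x, w)
    raise NotImplementedError("no registered kernel accepts these arguments")


weight = FakeQuantizedTensor(DemoPlainLayout(), int_data=[2, 4], scale=0.5)
result = linear([1.0, 2.0], weight)
```

With this split, adding a new layout means adding one file with its own check/impl pair and one registration line, without touching the kernels of other layouts.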

Test Plan: Pass all external and internal tests

pytorch-bot · 2024-11-07T01:47:56Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1234

❗ 1 Active SEVs

There is 1 currently active SEV. If your PR is affected, please view it below:

[DomainsOnly] Jobs fail with GLIBC version not found

✅ No Failures

As of commit d992432 with merge base 06ad55a:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

torchao/dtypes/affine_quantized_tensor_ops.py

torchao/dtypes/uintx/uint8.py

torchao/dtypes/uintx/uintx.py

Ruff format and lint on some high traffic files

Update pre-commit to match CI/CD

stack-info: PR: #1228, branch: drisspg/stack/19

* add module swap UX
* update
* fix typing. add small notes
* try NF4 support
* fix
* fix unpacking
* fix
* update nf4 integration
* update backward pass

torchao/dtypes/affine_quantized_tensor_ops.py

jerryzh168 · 2024-11-14T00:52:33Z

torchao/dtypes/floatx/__init__.py

@@ -12,4 +13,6 @@
    "to_scaled_tc_floatx",
    "from_scaled_tc_floatx",
    "_SPLIT_K_MAP",
+    "Float8AQTTensorImpl",


If this is not used by other people, maybe we can remove it; I think it is supposed to be mostly internal.
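
As background for this and the following export comments, a small sketch of what listing a name in a module-level export list does, assuming the list in these hunks is the package's `__all__`. The module `demo_floatx` is a throwaway built in memory; it only borrows two names from the diff and is not torchao's real module.

```python
import sys
import types

# Source of a throwaway module that borrows two names from the diff above.
source = '''
__all__ = ["to_scaled_tc_floatx"]

def to_scaled_tc_floatx(x):
    return x

class Float8AQTTensorImpl:
    pass
'''

# Build the module in memory and register it so it can be imported by name.
demo_module = types.ModuleType("demo_floatx")
exec(source, demo_module.__dict__)
sys.modules["demo_floatx"] = demo_module

# A star import only picks up the names listed in __all__ ...
namespace = {}
exec("from demo_floatx import *", namespace)
star_imported = sorted(name for name in namespace if not name.startswith("__"))

# ... but an unlisted name is still reachable through an explicit import.
from demo_floatx import Float8AQTTensorImpl

still_importable = Float8AQTTensorImpl.__name__
```

Dropping a name from the export list therefore does not break code that imports it explicitly; it only stops advertising the name as part of the public surface.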

jerryzh168 · 2024-11-14T00:53:45Z

torchao/dtypes/uintx/__init__.py

@@ -12,4 +31,11 @@
    "UintxAQTTensorImpl",
    "to_uintx",
    "_DTYPE_TO_BIT_WIDTH",
+    "_BIT_WIDTH_TO_DTYPE",
+    "UInt4Tensor",
+    "PlainAQTTensorImpl",


probably don't need to expose this one

jerryzh168 · 2024-11-14T00:54:08Z

torchao/dtypes/uintx/__init__.py

@@ -12,4 +31,11 @@
    "UintxAQTTensorImpl",


we can probably remove this one as well

jerryzh168 · 2024-11-14T00:57:47Z

torchao/quantization/quant_api.py

@@ -36,7 +36,7 @@
    to_affine_quantized_floatx_static,
    to_affine_quantized_intx,
 )
-from torchao.dtypes.uintx.uintx import UintxLayout
+from torchao.dtypes.uintx.uintx_layout import UintxLayout


we can import from torchao.dtypes for now I think
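
A minimal sketch of why importing from the package root can be more robust than importing from the file path, assuming the package `__init__.py` re-exports the class. The packages `demo_before` and `demo_after` are hypothetical and are generated in a temporary directory; only the module names `uintx` and `uintx_layout` mirror the rename in this hunk.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_package(root: Path, pkg_name: str, module_name: str) -> None:
    """Write a tiny package whose __init__ re-exports UintxLayout."""
    pkg = root / pkg_name
    pkg.mkdir()
    (pkg / f"{module_name}.py").write_text("class UintxLayout:\n    pass\n")
    (pkg / "__init__.py").write_text(f"from .{module_name} import UintxLayout\n")


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    # Same class, but the defining file is named differently in each package,
    # standing in for the layout before and after a file rename.
    make_package(root, "demo_before", "uintx")
    make_package(root, "demo_after", "uintx_layout")

    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        before = importlib.import_module("demo_before")
        after = importlib.import_module("demo_after")
    finally:
        sys.path.remove(tmp)

# The package-level attribute exists in both cases ...
package_level_ok = hasattr(before, "UintxLayout") and hasattr(after, "UintxLayout")
# ... even though the defining module path changed.
defining_modules = (before.UintxLayout.__module__, after.UintxLayout.__module__)
```

Callers that write `from demo_before import UintxLayout` style imports are unaffected by the rename, whereas callers that spell out the file path need to be updated each time a file moves.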

jerryzh168 · 2024-11-14T22:25:57Z

torchao/dtypes/__init__.py

@@ -1,13 +1,9 @@
+from . import affine_quantized_tensor_ops
+
+# from ..prototype.dtypes.uint2 import UInt2Tensor, BitnetTensor


nit: we can remove this commented-out import, I think

jerryzh168 · 2024-11-14T22:26:04Z

torchao/dtypes/__init__.py

+from .utils import (
+    Layout,
+    PlainLayout,
+)

 # from ..prototype.dtypes.uint2 import UInt2Tensor, BitnetTensor


same for this line

jerryzh168 · 2024-11-14T22:27:35Z

test/hqq/test_hqq_affine.py

@@ -1,12 +1,7 @@
 import unittest
 import torch
-from torchao.dtypes.affine_quantized_tensor import (
-    to_affine_quantized_intx,
+from torchao.quantization.quant_primitives import (


nit: maybe we can just import from torchao.quantization now

jerryzh168 · 2024-11-14T22:29:25Z

torchao/prototype/hqq/example.py

 )
+from torchao.quantization.quant_primitives import (


jerryzh168

Looks good, thanks! Please import to fbcode and check whether there are any internal failures.

facebook-github-bot · 2024-11-15T00:57:44Z