intx weight only linear quantizer for mps #1192

manuelcandales · 2024-10-29T18:31:38Z

Differential Revision: D65079774

pytorch-bot · 2024-10-29T18:31:41Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1192

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8e43385 with merge base c546c5c:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot · 2024-10-29T18:31:50Z

This pull request was exported from Phabricator. Differential Revision: D65079774

torchao/experimental/ops/mps/register.mm

torchao/experimental/quant_api.py

torchao/experimental/ops/mps/register.mm

Summary: Pull Request resolved: pytorch#1192 Differential Revision: D65079774


kimishpatel

Left a few comments

torchao/experimental/ops/mps/register.mm

torchao/experimental/quant_api.py

torchao/experimental/ops/mps/test/test_quantizer.py

kimishpatel · 2024-10-30T18:04:40Z

torchao/experimental/ops/mps/test/test_quantizer.py

+from torchao.experimental.quant_api import IntxWeightOnlyLinearQuantizer
+
+
+def parameterized(test_cases):


Isn't there something equivalent within torchao?

manuelcandales · 2024-11-13

Well, there is something in torch.testing._internal, but then I ran into dependency issues with 'expecttest'.

jerryzh168 · 2024-11-13

We use this in other places, e.g. ao/test/integration/test_integration.py, line 647 in c546c5c:

@parameterized.expand(COMMON_DEVICE_DTYPE)

Not sure why there is a dependency issue.

kimishpatel · 2024-11-15T15:49:10Z

Can we please use what Jerry is pointing to? Less code is better.

Unresolving the comment.
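For context, the body of the PR's `parameterized(test_cases)` helper is not shown in this thread. A minimal sketch of what such a hand-rolled decorator might look like, assuming it simply runs the test once per case. The implementation, the `TestQuantRange` class, and the chosen `nbit` values are illustrative assumptions, not the PR's code:

```python
import unittest


def parameterized(test_cases):
    """Hypothetical helper: run the decorated test once per tuple in test_cases."""

    def decorator(test_fn):
        def wrapper(self):
            for case in test_cases:
                # subTest reports each failing case separately.
                with self.subTest(case=case):
                    test_fn(self, *case)

        # Preserve the original name for introspection.
        wrapper.__name__ = test_fn.__name__
        return wrapper

    return decorator


class TestQuantRange(unittest.TestCase):
    # Illustrative cases only; the PR's actual list is not shown.
    cases = [(nbit,) for nbit in range(1, 5)]

    @parameterized(cases)
    def test_unsigned_range(self, nbit):
        qmax = (1 << nbit) - 1
        self.assertEqual(qmax + 1, 2**nbit)
```

The reviewers' preference is to drop a helper like this in favor of the existing `parameterized.expand` usage, which generates one named test per case and avoids maintaining extra code in the test file.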


kimishpatel

Left some comments

kimishpatel · 2024-11-15T15:49:10Z

torchao/experimental/ops/mps/test/test_quantizer.py

+from torchao.experimental.quant_api import IntxWeightOnlyLinearQuantizer
+
+
+def parameterized(test_cases):


Can we please use what Jerry is pointing to? Less code is better.

Unresolving the comment

kimishpatel · 2024-11-15T15:50:59Z

torchao/experimental/quant_api.py

+    if dtype == torch.int8:
+        qmin = -(1 << (nbit - 1))
+        qmax = (1 << (nbit - 1)) - 1
+    elif dtype == torch.uint8:
+        qmin = 0
+        qmax = (1 << nbit) - 1


Instead of overloading dtype=int8 to convey signed vs. unsigned, can you just do signed=True?
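A minimal sketch of the suggested signature, keeping the same range arithmetic as the diff above but selecting it with a boolean. The function name `quant_range` and its argument validation are assumptions for illustration, not the PR's code:

```python
from typing import Tuple


def quant_range(nbit: int, signed: bool = True) -> Tuple[int, int]:
    """Return (qmin, qmax) for an nbit integer grid."""
    if nbit < 1:
        raise ValueError(f"nbit must be >= 1, got {nbit}")
    if signed:
        # Two's-complement range, e.g. nbit=4 -> [-8, 7].
        return -(1 << (nbit - 1)), (1 << (nbit - 1)) - 1
    # Unsigned range, e.g. nbit=4 -> [0, 15].
    return 0, (1 << nbit) - 1


print(quant_range(4, signed=True))   # (-8, 7)
print(quant_range(4, signed=False))  # (0, 15)
```

With a boolean flag there is no third case to handle, whereas the `dtype` version silently leaves `qmin` and `qmax` undefined for any dtype other than `torch.int8` or `torch.uint8`.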

kimishpatel · 2024-11-15T16:09:58Z

torchao/experimental/quant_api.py

+        )
+
+
+def _replace_linear_with_quantized_linear_mps(module: nn.Module, kwargs={}):


Can you at least add todo to the effect

kimishpatel · 2024-11-15T16:15:32Z

torchao/experimental/quant_api.py

+            _replace_linear_with_quantized_linear_mps(child, kwargs)
+        else:
+            assert child.bias is None
+            qlinear = UIntxWeightOnlyQuantizedLinear(


Do you have a group size restriction? If so, where is that asserted? I would have expected group size to be a constructor argument, so that the constructor can check it and throw if the quantized linear does not support it. I don't exactly like exceptions in this scenario, but maybe that's the better choice, because then you cannot create an instance of quantized linear with an invalid group size.
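One way to read this request, as a rough sketch: validate the group size in the constructor so that an unsupported configuration can never be instantiated. The class name `QuantizedLinearSketch`, the set of supported group sizes, and the divisibility check are all assumptions for illustration; the thread does not state the kernel's actual restrictions.

```python
class QuantizedLinearSketch:
    """Hypothetical stand-in for a group-wise quantized linear layer."""

    # Assumed set of supported group sizes; the real kernel's list may differ.
    SUPPORTED_GROUP_SIZES = (32, 64, 128, 256)

    def __init__(self, in_features: int, nbit: int, group_size: int):
        if group_size not in self.SUPPORTED_GROUP_SIZES:
            raise ValueError(
                f"group_size must be one of {self.SUPPORTED_GROUP_SIZES}, "
                f"got {group_size}"
            )
        if in_features % group_size != 0:
            raise ValueError(
                f"in_features ({in_features}) must be divisible by "
                f"group_size ({group_size})"
            )
        self.in_features = in_features
        self.nbit = nbit
        self.group_size = group_size


layer = QuantizedLinearSketch(in_features=256, nbit=4, group_size=32)

try:
    QuantizedLinearSketch(in_features=256, nbit=4, group_size=48)
except ValueError as err:
    print(f"rejected: {err}")
```

Failing at construction time keeps the check in one place instead of relying on every caller of the module-replacement function to assert it.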

kimishpatel · 2024-11-15T16:15:48Z

torchao/experimental/quant_api.py

+                ),
+            )
+            setattr(module, name, qlinear)
+            getattr(module, name).quantize_and_pack_weights(


Suggested change

-            getattr(module, name).quantize_and_pack_weights(
+            qlinear.quantize_and_pack_weights(

kimishpatel · 2024-11-15T16:27:25Z

torchao/experimental/ops/mps/test/test_quantizer.py

+
+    @parameterized(cases)
+    def test_export(self, nbit):
+        model, group_size, k0, n = self._model_setup()


Are all tests using only a group size of 32? If so, we should test other group sizes as well, including those that should raise an exception.
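A sketch of the kind of coverage being asked for, using only the standard library. The validator `check_group_size`, the supported sizes, and the rejected sizes are hypothetical placeholders; a real test would call the quantizer itself rather than this stand-in.

```python
import unittest


def check_group_size(in_features: int, group_size: int) -> None:
    """Hypothetical validator; the supported sizes are an assumption."""
    if group_size not in (32, 64, 128, 256):
        raise ValueError(f"unsupported group_size {group_size}")
    if in_features % group_size != 0:
        raise ValueError(
            f"in_features {in_features} not divisible by group_size {group_size}"
        )


class TestGroupSizes(unittest.TestCase):
    def test_supported_group_sizes(self):
        for group_size in (32, 64, 128, 256):
            with self.subTest(group_size=group_size):
                # Should not raise.
                check_group_size(in_features=512, group_size=group_size)

    def test_rejected_group_sizes(self):
        for group_size in (0, 16, 48, 512):
            with self.subTest(group_size=group_size):
                with self.assertRaises(ValueError):
                    check_group_size(in_features=512, group_size=group_size)
```

Keeping the accepted and rejected sizes in separate tests makes it obvious from the report whether a regression loosened or tightened the restriction.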

facebook-github-bot added the CLA Signed label Oct 29, 2024

facebook-github-bot added the fb-exported label Oct 29, 2024

metascroy reviewed Oct 29, 2024

torchao/experimental/ops/mps/register.mm (outdated)

metascroy reviewed Oct 29, 2024

torchao/experimental/ops/mps/register.mm (outdated)

metascroy reviewed Oct 29, 2024

torchao/experimental/quant_api.py (outdated)

jerryzh168 reviewed Oct 30, 2024

torchao/experimental/quant_api.py (outdated)

kimishpatel reviewed Oct 30, 2024

torchao/experimental/ops/mps/register.mm (outdated)

manuelcandales added a commit to manuelcandales/ao that referenced this pull request Oct 30, 2024

intx weight only linear quantizer for mps (pytorch#1192)

ed83de7

Summary: Pull Request resolved: pytorch#1192 Differential Revision: D65079774

manuelcandales force-pushed the export-D65079774 branch from edd4a18 to ed83de7 on October 30, 2024 17:43

kimishpatel requested changes Oct 30, 2024


manuelcandales added a commit to manuelcandales/ao that referenced this pull request Oct 30, 2024

intx weight only linear quantizer for mps (pytorch#1192)

7dede6f

Summary: Pull Request resolved: pytorch#1192 Differential Revision: D65079774

manuelcandales force-pushed the export-D65079774 branch from ed83de7 to 7dede6f on October 30, 2024 21:37

manuelcandales added a commit to manuelcandales/ao that referenced this pull request Oct 30, 2024

intx weight only linear quantizer for mps (pytorch#1192)

df02572

Summary: Pull Request resolved: pytorch#1192 Differential Revision: D65079774

manuelcandales force-pushed the export-D65079774 branch from 7dede6f to df02572 on October 30, 2024 21:37

malfet reviewed Oct 31, 2024

torchao/experimental/ops/mps/test/test_quantizer.py (outdated)

manuelcandales added a commit to manuelcandales/ao that referenced this pull request Nov 13, 2024

intx weight only linear quantizer for mps (pytorch#1192)

b1d27e1

Summary: Pull Request resolved: pytorch#1192 Differential Revision: D65079774

manuelcandales force-pushed the export-D65079774 branch from df02572 to b1d27e1 on November 13, 2024 20:33

manuelcandales added a commit to manuelcandales/ao that referenced this pull request Nov 13, 2024

intx weight only linear quantizer for mps (pytorch#1192)

b93f3ea

Summary: Pull Request resolved: pytorch#1192 Differential Revision: D65079774

manuelcandales force-pushed the export-D65079774 branch from b1d27e1 to b93f3ea on November 13, 2024 20:36

intx weight only linear quantizer for mps (pytorch#1192)

8e43385

Summary: Pull Request resolved: pytorch#1192 Differential Revision: D65079774

manuelcandales added a commit to manuelcandales/ao that referenced this pull request Nov 13, 2024

intx weight only linear quantizer for mps (pytorch#1192)

faa0c6b

Summary: Pull Request resolved: pytorch#1192 Differential Revision: D65079774

manuelcandales force-pushed the export-D65079774 branch 2 times, most recently from faa0c6b to 8e43385, on November 13, 2024 20:37

manuelcandales added the topic: not user facing label Nov 13, 2024

manuelcandales requested a review from kimishpatel November 14, 2024 15:31

kimishpatel requested changes Nov 15, 2024


