intx weight only linear quantizer for mps #1192

Open
wants to merge 1 commit into
base: main

manuelcandales
Contributor

Differential Revision: D65079774


pytorch-bot bot commented Oct 29, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1192

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 8e43385 with merge base c546c5c (image):
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the CLA Signed label Oct 29, 2024
@facebook-github-bot

This pull request was exported from Phabricator. Differential Revision: D65079774

manuelcandales added a commit to manuelcandales/ao that referenced this pull request Oct 30, 2024
Summary: Pull Request resolved: pytorch#1192

Differential Revision: D65079774
Contributor

kimishpatel left a comment

Left a few comments

torchao/experimental/ops/mps/register.mm (outdated)
torchao/experimental/ops/mps/register.mm (outdated)
torchao/experimental/ops/mps/register.mm (outdated)
torchao/experimental/quant_api.py
torchao/experimental/quant_api.py (outdated)
torchao/experimental/ops/mps/test/test_quantizer.py (outdated)
from torchao.experimental.quant_api import IntxWeightOnlyLinearQuantizer


def parameterized(test_cases):
Contributor

Isn't there something equivalent within torchao?

Contributor Author

Well, there is something in torch.testing._internal, but I ran into dependency issues with 'expecttest' when I tried it.

Contributor

We use this in other places:

@parameterized.expand(COMMON_DEVICE_DTYPE)

Not sure why there is a dependency issue.

Contributor

Can we please use what Jerry is pointing to? Less code is better.

Unresolving the comment

torchao/experimental/ops/mps/test/test_quantizer.py (outdated)
torchao/experimental/ops/mps/test/test_quantizer.py (outdated)
manuelcandales added a commit to manuelcandales/ao that referenced this pull request Oct 30, 2024
Summary: Pull Request resolved: pytorch#1192

Differential Revision: D65079774
manuelcandales added a commit to manuelcandales/ao that referenced this pull request Oct 30, 2024
Summary: Pull Request resolved: pytorch#1192

Differential Revision: D65079774
manuelcandales added a commit to manuelcandales/ao that referenced this pull request Nov 13, 2024
Summary: Pull Request resolved: pytorch#1192

Differential Revision: D65079774
manuelcandales added a commit to manuelcandales/ao that referenced this pull request Nov 13, 2024
Summary: Pull Request resolved: pytorch#1192

Differential Revision: D65079774
manuelcandales added a commit to manuelcandales/ao that referenced this pull request Nov 13, 2024
Summary: Pull Request resolved: pytorch#1192

Differential Revision: D65079774
manuelcandales force-pushed the export-D65079774 branch 2 times, most recently from faa0c6b to 8e43385, November 13, 2024 20:37
manuelcandales added the topic: not user facing label Nov 13, 2024
Contributor

kimishpatel left a comment

Left some comments

Comment on lines +30 to +35
if dtype == torch.int8:
    qmin = -(1 << (nbit - 1))
    qmax = (1 << (nbit - 1)) - 1
elif dtype == torch.uint8:
    qmin = 0
    qmax = (1 << nbit) - 1
Contributor

Instead of overloading dtype=int8 to convey signed vs. unsigned, can you just do signed=True?
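
One way to read this suggestion, as a minimal sketch: take an explicit boolean instead of inferring signedness from a dtype. The function name `quant_range`, the `signed` parameter, and the 1 to 8 bit bound are assumptions made for illustration; none of them come from the PR.

```python
from typing import Tuple


def quant_range(nbit: int, signed: bool = True) -> Tuple[int, int]:
    """Return (qmin, qmax) for an nbit integer grid.

    Hypothetical helper: `signed` replaces the dtype check quoted above.
    """
    # Assumed bound: the quantized values are stored in 8-bit containers.
    if not 1 <= nbit <= 8:
        raise ValueError(f"nbit must be in [1, 8], got {nbit}")
    if signed:
        # Two's-complement range, e.g. nbit=4 gives [-8, 7].
        return -(1 << (nbit - 1)), (1 << (nbit - 1)) - 1
    # Unsigned range, e.g. nbit=4 gives [0, 15].
    return 0, (1 << nbit) - 1
```

With this shape the caller writes `quant_range(4, signed=False)` and there is no third branch to forget when a dtype other than int8 or uint8 is passed in.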

)


def _replace_linear_with_quantized_linear_mps(module: nn.Module, kwargs={}):
Contributor

Can you at least add todo to the effect

        _replace_linear_with_quantized_linear_mps(child, kwargs)
    else:
        assert child.bias is None
        qlinear = UIntxWeightOnlyQuantizedLinear(
Contributor

Do you have a group size restriction? If so, where is that asserted? I would have expected group size to be a constructor arg, so that the constructor can check and throw if the quantized linear does not support it. I don't exactly like exceptions in this scenario, but maybe that's the better choice, because then you cannot create an instance of quantized linear with an invalid group size.
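
A rough sketch of what constructor-time validation could look like. The class name `GroupwiseQuantConfig`, its attributes, and the list of supported group sizes are all hypothetical placeholders; the actual restriction has to come from the MPS kernels, and this is plain Python rather than the PR's module.

```python
class GroupwiseQuantConfig:
    """Hypothetical holder for group-wise quantization settings."""

    # Assumed set of kernel-supported group sizes; not taken from this PR.
    SUPPORTED_GROUP_SIZES = (32, 64, 128, 256)

    def __init__(self, nbit: int, group_size: int, in_features: int):
        # Reject unsupported group sizes before any object state exists,
        # so an invalid instance can never be created.
        if group_size not in self.SUPPORTED_GROUP_SIZES:
            raise ValueError(
                f"group_size must be one of {self.SUPPORTED_GROUP_SIZES}, "
                f"got {group_size}"
            )
        # Each row of the weight must split evenly into groups.
        if in_features % group_size != 0:
            raise ValueError(
                f"in_features ({in_features}) must be divisible by "
                f"group_size ({group_size})"
            )
        self.nbit = nbit
        self.group_size = group_size
        self.num_groups = in_features // group_size
```

The trade-off is the one raised in the comment: the check throws, but it throws at construction, before any weights are packed.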

),
)
setattr(module, name, qlinear)
getattr(module, name).quantize_and_pack_weights(
Contributor

Suggested change
- getattr(module, name).quantize_and_pack_weights(
+ qlinear.quantize_and_pack_weights(


@parameterized(cases)
def test_export(self, nbit):
    model, group_size, k0, n = self._model_setup()
Contributor

Are all tests using only a group size of 32? If so, we should test other group sizes as well, including ones that should result in an exception.
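
For illustration only, the coverage asked for here might look roughly like the following. It uses the standard library's `unittest` with `subTest` so it runs without extra dependencies. `validate_group_size` and the supported sizes are hypothetical stand-ins for whatever check the quantized linear ends up performing.

```python
import unittest

# Assumed set of supported group sizes; not taken from this PR.
SUPPORTED_GROUP_SIZES = (32, 64, 128, 256)


def validate_group_size(group_size: int, k: int) -> None:
    """Hypothetical check: raise ValueError for unusable group sizes."""
    if group_size not in SUPPORTED_GROUP_SIZES:
        raise ValueError(f"unsupported group_size: {group_size}")
    if k % group_size != 0:
        raise ValueError(f"k={k} is not divisible by group_size={group_size}")


class TestGroupSizes(unittest.TestCase):
    def test_supported_group_sizes(self):
        # Every supported size should pass for a compatible k.
        for group_size in SUPPORTED_GROUP_SIZES:
            with self.subTest(group_size=group_size):
                validate_group_size(group_size, k=512)

    def test_rejected_group_sizes(self):
        # Sizes outside the supported set should raise.
        for group_size in (0, 16, 48, 1024):
            with self.subTest(group_size=group_size):
                with self.assertRaises(ValueError):
                    validate_group_size(group_size, k=512)

    def test_k_not_divisible(self):
        # A supported size still fails if it does not divide k.
        with self.assertRaises(ValueError):
            validate_group_size(64, k=96)
```

The file can be run with `python -m unittest`. The same cases could be expressed with `@parameterized.expand` if that dependency is available.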

Labels
CLA Signed, fb-exported, topic: not user facing
6 participants