Fix the impl for `to` for int4 weight only use case #522

jerryzh168 · 2024-07-17T19:50:18Z

Summary:
Note that we can't do the following right now:

* initialize and quantize the model with int4_weight_only quant on cpu
* move the model to cuda

We'll enable this in a separate PR.

Test Plan:
python test/quantization/test_quant_api.py -k test_int4wo_quantized_model_to_device
Reviewers:

Subscribers:

Tasks:

Tags:

pytorch-bot · 2024-07-17T19:50:21Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/522

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 3ed1a46 with merge base 6dd82d8:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

larryliu0820 · 2024-07-17T19:51:35Z

test/quantization/test_quant_api.py

@@ -637,6 +637,22 @@ def test_quantized_model_to_device(self):
        cuda_res = m(*example_inputs_cuda)
        self.assertEqual(cuda_res.cpu(), ref)

+    # TODO: enable this test


Why disable the test?

jerryzh168 · 2024-07-17

cpu -> cuda does not work yet, so I changed the test to cuda -> cuda for now.
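
For readers following the thread, the limitation discussed above can be pictured with a small pure-Python toy. This is only a sketch under stated assumptions, not torchao's implementation: `ToyInt4Weight`, its fields, and the rule that `to` rejects moves between device types are all hypothetical, chosen to mirror the current state where cuda -> cuda works and cpu -> cuda does not. The thread does not say why cpu -> cuda fails, so the toy does not try to model a cause.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ToyInt4Weight:
    """Hypothetical stand-in for a packed int4 weight (not a torchao class)."""

    packed: bytes   # two 4-bit codes per byte
    numel: int      # number of real values (excludes padding)
    scale: float    # step between adjacent 4-bit codes
    min_val: float  # value represented by code 0
    device: str     # plain string such as "cpu", "cuda", "cuda:1"

    @classmethod
    def quantize(cls, values, device="cpu"):
        lo, hi = min(values), max(values)
        scale = (hi - lo) / 15 or 1.0  # avoid a zero scale for constant input
        codes = [round((v - lo) / scale) for v in values]
        if len(codes) % 2:
            codes.append(0)  # pad so codes pair up into whole bytes
        packed = bytes((a << 4) | b for a, b in zip(codes[::2], codes[1::2]))
        return cls(packed, len(values), scale, lo, device)

    def dequantize(self):
        codes = []
        for byte in self.packed:
            codes += [byte >> 4, byte & 0x0F]
        return [c * self.scale + self.min_val for c in codes[: self.numel]]

    def to(self, device):
        # Assumption for this toy only: moves within one device type are
        # supported, moves across device types are not supported yet.
        if self.device.split(":")[0] != device.split(":")[0]:
            raise NotImplementedError(
                f"{self.device} -> {device} is not supported yet"
            )
        # replace() copies every field, so the packed data and the
        # quantization parameters always travel together.
        return replace(self, device=device)


w = ToyInt4Weight.quantize([0.0, 1.0, 2.0, 3.0], device="cuda:0")
moved = w.to("cuda:1")  # same device type: allowed
print(moved.device, moved.dequantize() == w.dequantize())
```

In this toy, `ToyInt4Weight.quantize([0.0, 1.0]).to("cuda")` raises `NotImplementedError`, which is the analogue of the cpu -> cuda path that the disabled test would cover once it is enabled.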

jerryzh168 requested a review from msaroufim July 17, 2024 19:50

facebook-github-bot added the CLA Signed label Jul 17, 2024

jerryzh168 requested a review from HDCharles July 17, 2024 19:50

larryliu0820 reviewed Jul 17, 2024

larryliu0820 approved these changes Jul 17, 2024

jerryzh168 force-pushed the test-to branch 2 times, most recently from 2019b80 to 948b7ed July 17, 2024 20:40

Fix the impl for `to` for int4 weight only use case

3ed1a46

Summary:
Note that we can't do the following right now:

* initialize and quantize the model with int4_weight_only quant on cpu
* move the model to cuda

We'll enable this in a separate PR.

Test Plan: CI
Reviewers:
Subscribers:
Tasks:
Tags:

jerryzh168 force-pushed the test-to branch from 948b7ed to 3ed1a46 July 17, 2024 21:55

jerryzh168 merged commit d36de1b into pytorch:main Jul 17, 2024
13 checks passed

jerryzh168 deleted the test-to branch July 17, 2024 23:13
