
Add option to disable fake quant for 8da4w QAT #198

Merged: 4 commits into main on May 4, 2024

Conversation

@andrewor14 (Contributor) commented May 1, 2024

Summary: This feature helps with model convergence during QAT. The user can disable observation/fake quant for the first N steps and re-enable them later, allowing the activation and weight values to stabilize before applying quantization.

Test Plan:
python test/quantization/test_qat.py -k test_qat_8da4w_quantizer_disable_fake_quant
python test/quantization/test_qat.py -k test_qat_8da4w_quantizer_disable_fake_quant_backward

Reviewers: jerryzh168, cpuhrsch

Subscribers: jerryzh168, cpuhrsch, supriyar
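
For readers landing here from the docs, here is a minimal sketch of the intended flow. The import path and the disable_8da4w_fake_quant / enable_8da4w_fake_quant helper names are assumptions inferred from the test names in the test plan above, not confirmed by this PR text, so check the actual torchao QAT module for the exact API:

```python
# Sketch only: import path and helper names are assumed from the test names above.
import torch
import torch.nn as nn
from torchao.quantization.prototype.qat import (
    Int8DynActInt4WeightQATQuantizer,
    disable_8da4w_fake_quant,
    enable_8da4w_fake_quant,
)

# Toy model; in_features should be divisible by the int4 groupsize.
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 256))

quantizer = Int8DynActInt4WeightQATQuantizer()
model = quantizer.prepare(model)  # swap Linear layers for QAT (fake-quant) linears

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
warmup_steps, total_steps = 100, 1000  # "N" warmup steps, chosen for illustration

# Disable fake quant for the first N steps so activation/weight values stabilize.
model.apply(disable_8da4w_fake_quant)

for step in range(total_steps):
    if step == warmup_steps:
        # Re-enable fake quant for the remainder of training.
        model.apply(enable_8da4w_fake_quant)
    x = torch.randn(8, 256)
    loss = model(x).pow(2).mean()  # dummy loss for the sketch
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# After training, convert to the actual 8da4w quantized model.
model = quantizer.convert(model)
```

In short: prepare the model for QAT, train without fake quant for a warmup window, flip fake quant on for the rest of training, then convert.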

facebook-github-bot added the CLA Signed label on May 1, 2024
andrewor14 requested a review from jerryzh168 on May 1, 2024 20:50
@andrewor14 (Contributor, Author)

Actually, I had to remove enable_observer because saving the qparams increased memory usage significantly and caused OOM in my experiments. I think this is not really needed for 8da4w anyway, since there's no state in the observer.

@facebook-github-bot

@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot

@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot

@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot

@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@jerryzh168 (Contributor)

@andrewor14 maybe you can unlink the diff

@andrewor14 (Contributor, Author)

> @andrewor14 maybe you can unlink the diff

How can I do that?

@facebook-github-bot

@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

andrewor14 merged commit 2dc57a8 into main on May 4, 2024
16 of 17 checks passed
dbyoung18 pushed a commit to dbyoung18/ao that referenced this pull request Jul 31, 2024