
Add option to disable fake quant for 8da4w QAT #198

Merged: 4 commits into main on May 4, 2024

Conversation

@andrewor14 (Contributor) commented May 1, 2024

Summary: This feature helps with model convergence during QAT. The user can disable observation/fake quant for the first N steps and re-enable them later, allowing the activation and weight values to stabilize before applying quantization.

Test Plan:
python test/quantization/test_qat.py -k test_qat_8da4w_quantizer_disable_fake_quant
python test/quantization/test_qat.py -k test_qat_8da4w_quantizer_disable_fake_quant_backward

Reviewers: jerryzh168, cpuhrsch

Subscribers: jerryzh168, cpuhrsch, supriyar
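
For readers landing here from the docs, here is a minimal sketch of the intended flow. The import path and the disable_8da4w_fake_quant / enable_8da4w_fake_quant helper names are assumptions inferred from the test names in the test plan above, not confirmed by this PR text, so check the actual torchao QAT module for the exact API:

```python
# Sketch only: import path and helper names are assumed from the test names above.
import torch
import torch.nn as nn
from torchao.quantization.prototype.qat import (
    Int8DynActInt4WeightQATQuantizer,
    disable_8da4w_fake_quant,
    enable_8da4w_fake_quant,
)

# Toy model; in_features should be divisible by the int4 groupsize.
model = nn.Sequential(nn.Linear(256, 256), nn.ReLU(), nn.Linear(256, 256))

quantizer = Int8DynActInt4WeightQATQuantizer()
model = quantizer.prepare(model)  # swap Linear layers for QAT (fake-quant) linears

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)
warmup_steps, total_steps = 100, 1000  # "N" warmup steps, chosen for illustration

# Disable fake quant for the first N steps so activation/weight values stabilize.
model.apply(disable_8da4w_fake_quant)

for step in range(total_steps):
    if step == warmup_steps:
        # Re-enable fake quant for the remainder of training.
        model.apply(enable_8da4w_fake_quant)
    x = torch.randn(8, 256)
    loss = model(x).pow(2).mean()  # dummy loss for the sketch
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()

# After training, convert to the actual 8da4w quantized model.
model = quantizer.convert(model)
```

In short: prepare the model for QAT, train without fake quant for a warmup window, flip fake quant on for the rest of training, then convert.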

facebook-github-bot added the CLA Signed label on May 1, 2024
andrewor14 requested a review from jerryzh168 on May 1, 2024 20:50
@andrewor14 (Contributor, Author)

Actually, I had to remove enable_observer because saving the qparams increased memory usage significantly and caused OOM in my experiments. I think this is not really needed for 8da4w anyway, since there's no state in the observer.

@facebook-github-bot

@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot

@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot

@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@facebook-github-bot

@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@jerryzh168 (Contributor)

@andrewor14 maybe you can unlink the diff

@andrewor14 (Contributor, Author)

> @andrewor14 maybe you can unlink the diff

How can I do that?

@facebook-github-bot

@andrewor14 has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

andrewor14 merged commit 2dc57a8 into main on May 4, 2024
16 of 17 checks passed
dbyoung18 pushed a commit to dbyoung18/ao that referenced this pull request Jul 31, 2024