Update QAT README.md #2162


Merged · 1 commit · May 2, 2025

Conversation

SalmanMohammadi (Contributor, PR author) commented:

Closes #2155.


pytorch-bot bot commented May 2, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2162

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the CLA Signed label on May 2, 2025. (This label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed.)
@jerryzh168 merged commit 4850998 into pytorch:main on May 2, 2025
6 of 18 checks passed
@jerryzh168 (Contributor) commented:

cc @andrewor14

Review thread on the README example (suggested change to the embedding filter_fn):

    quantize_(
        m,
        IntXQuantizationAwareTrainingConfig(weight_config=weight_config),
    -   filter_fn=lambda m, _: isinstance(m, torch.nn.Embedding),
    +   filter_fn=lambda m, _: isinstance(m, torch.nn.Embedding) or _is_linear(m),
    )
Contributor:

Actually this is only if you want to use the same configuration for embedding and linear. I kept them as two separate calls because in the above example linear additionally has activation quantization.

SalmanMohammadi (Contributor, author):

Yeah, you're right, and this example won't work if you try to apply a config with activation quantization to both linear and embedding layers at the same time.
You can stack calls to quantize_, right? Would the right way to go about this be two quantize_ calls, one which filters for linear, then another which filters for embeddings?

Contributor:

Yeah, if you have slightly different quantization configurations for embedding and linear, the right way would be two separate quantize_ calls. This is by design: we don't want to complicate quantize_ to accept a dictionary of configs.
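For concreteness, here is a minimal sketch of that pattern: two stacked quantize_ calls, each with its own config and filter_fn. It assumes the FakeQuantizeConfig / IntXQuantizationAwareTrainingConfig API shown in the QAT README; the toy model, the int8/int4 settings, and the explicit isinstance filters are illustrative choices, not an excerpt from the README.

```python
import torch

from torchao.quantization import quantize_
from torchao.quantization.qat import (
    FakeQuantizeConfig,
    IntXQuantizationAwareTrainingConfig,
)

# Toy model standing in for `m` (illustrative only).
m = torch.nn.Sequential(
    torch.nn.Embedding(1024, 64),
    torch.nn.Linear(64, 64),
)

# Fake-quantization settings as in the README example:
# int8 dynamic per-token activations, int4 per-group weights.
activation_config = FakeQuantizeConfig(torch.int8, "per_token", is_symmetric=False)
weight_config = FakeQuantizeConfig(torch.int4, group_size=32)

# Call 1: linear layers get activation + weight fake quantization.
quantize_(
    m,
    IntXQuantizationAwareTrainingConfig(activation_config, weight_config),
    filter_fn=lambda mod, fqn: isinstance(mod, torch.nn.Linear),
)

# Call 2: embedding layers get weight-only fake quantization
# (no activation config, since activation fake quantization is not
# supported for embedding layers).
quantize_(
    m,
    IntXQuantizationAwareTrainingConfig(weight_config=weight_config),
    filter_fn=lambda mod, fqn: isinstance(mod, torch.nn.Embedding),
)
```

Since quantize_ modifies the model in place, the two calls compose: each one only touches the modules its filter_fn matches.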

Labels: CLA Signed
Successfully merging this pull request may close: QAT docs (#2155)
4 participants