Allow cpu and gpu in int4wo and int4wo-gptq quantizer #131

Merged
merged 1 commit into from
Apr 12, 2024
Conversation

jerryzh168
Contributor

Summary:
As the title says.

Test Plan:
verified in torchat

Reviewers:

Subscribers:

Tasks:

Tags:

@facebook-github-bot added the "CLA Signed" label Apr 12, 2024
@cpuhrsch requested a review from HDCharles April 12, 2024 06:18
@cpuhrsch
Contributor

Do we need to release a 0.1.1 for this?

@jerryzh168
Contributor Author

Do we need to release a 0.1.1 for this?

It's fine. This is for torchat, which will be using torchao-nightly. I'm still looking at a perf issue for this; I'll merge after that.

@@ -762,11 +762,15 @@ def _check_linear_int4_k(k, groupsize = 1, inner_k_tiles = None):
    return k_divisible_by_groupsize

def linear_forward_int4(x, weight_int4pack, scales_and_zeros, out_features, groupsize):
Contributor

So these conversions to bfloat16 are primarily needed because of _weight_int4pack_mm?

Contributor Author

Yes, that's correct.
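
For readers following the exchange above, here is a minimal sketch of the cast-in/cast-out pattern being discussed: convert the activation to bfloat16 for a kernel that only accepts that dtype, then convert the result back to the caller's dtype. It is not the code from this PR. `bf16_only_matmul` is a hypothetical stand-in for `_weight_int4pack_mm` (whose actual signature and packed-weight format are not shown in this thread), and `linear_forward_with_cast` is an illustrative name.

```python
import torch


def bf16_only_matmul(x: torch.Tensor, w: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for a kernel that only accepts bfloat16 inputs."""
    if x.dtype != torch.bfloat16 or w.dtype != torch.bfloat16:
        raise TypeError("kernel expects bfloat16 inputs")
    # Plain matmul against the transposed weight; the real op works on packed int4 weights.
    return torch.matmul(x, w.t())


def linear_forward_with_cast(x: torch.Tensor, w_bf16: torch.Tensor) -> torch.Tensor:
    """Illustrative wrapper: cast to bfloat16 for the kernel, then restore the input dtype."""
    orig_dtype = x.dtype
    orig_shape = x.shape
    # Flatten leading dimensions so the kernel sees a 2-D activation.
    x2d = x.reshape(-1, orig_shape[-1]).to(torch.bfloat16)
    out = bf16_only_matmul(x2d, w_bf16)
    # Cast back so callers get the dtype they passed in.
    out = out.to(orig_dtype)
    return out.reshape(*orig_shape[:-1], w_bf16.shape[0])


x = torch.randn(2, 3, 8, dtype=torch.float32)
w = torch.randn(4, 8).to(torch.bfloat16)
y = linear_forward_with_cast(x, w)
```

Here `y` comes back as float32 with the leading dimensions of `x` preserved, even though the stand-in kernel only ever saw bfloat16 tensors.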

@jerryzh168
Contributor Author

Looks like there are no perf issues, so I'll just merge.

@jerryzh168 merged commit b9beaf3 into main Apr 12, 2024
7 checks passed
dbyoung18 pushed a commit to dbyoung18/ao that referenced this pull request Jul 31, 2024