[BUG] Fix bug in cast in quantization #481

Merged — vadiklyutiy merged 3 commits into main from vadim/quant-cast on Dec 21, 2024
Conversation

vadiklyutiy
Collaborator

No description provided.

@vadiklyutiy vadiklyutiy self-assigned this Dec 21, 2024
@vadiklyutiy vadiklyutiy merged commit c0525b7 into main Dec 21, 2024
10 of 22 checks passed
@vadiklyutiy vadiklyutiy deleted the vadim/quant-cast branch December 21, 2024 02:02
vadiklyutiy added a commit that referenced this pull request Dec 21, 2024
vadiklyutiy added a commit that referenced this pull request Dec 21, 2024
vadiklyutiy pushed a commit that referenced this pull request Dec 26, 2024
Change the QxK^T accumulator from fp16 to fp32 in causal attention. The same change previously fixed the accuracy issue in masked attention: CentML/hidet#465

---------

Co-authored-by: Zhumakhan <nazirzhumakhan@gmail.com>
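
To illustrate why the referenced commit switches the QxK^T accumulator from fp16 to fp32: fp16 has roughly 3 decimal digits of precision, so rounding every partial sum to fp16 lets error grow with the reduction length. Below is a minimal NumPy sketch, not the hidet kernel code, with hypothetical shapes chosen only for illustration.

```python
import numpy as np

# Hypothetical sizes, for illustration only (not taken from the PR).
seq_len, head_dim = 64, 128
rng = np.random.default_rng(0)

q = rng.standard_normal((seq_len, head_dim)).astype(np.float16)
k = rng.standard_normal((seq_len, head_dim)).astype(np.float16)

# Reference: accumulate Q @ K^T in fp64.
ref = q.astype(np.float64) @ k.astype(np.float64).T

# fp32 accumulation (the behavior the commit switches to):
# cast the fp16 operands up and multiply-accumulate in fp32.
acc_fp32 = q.astype(np.float32) @ k.astype(np.float32).T

# Simulated fp16 accumulation: every partial sum is rounded back to
# fp16, so rounding error compounds over the head_dim reduction.
acc_fp16 = np.zeros((seq_len, seq_len), dtype=np.float16)
for d in range(head_dim):
    acc_fp16 = (acc_fp16 + np.outer(q[:, d], k[:, d])).astype(np.float16)

print("max |error|, fp32 accumulator:", np.abs(acc_fp32 - ref).max())
print("max |error|, fp16 accumulator:", np.abs(acc_fp16 - ref).max())
```

Running this typically shows the fp16-accumulated scores drifting noticeably further from the fp64 reference than the fp32-accumulated ones, which is the accuracy issue the commit addresses.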