Commit
[PHI] add int4 weight only quant kernel, add int4 weight only permute kernel (#64094)

* Add int4 quantzie kernel and permute kernel
* Update weight_quantize_kernel_gpu_impl.h
* dont reshape it version
* update kernel
* fix int4 quant kernel
* Update weight_quantize_kernel_gpu_impl.h
* fix conflicts
* fix int4 per channel quant row pack error
* fix int4 dequant launch kernel
* remove printf
* add int4 gpucpu check
* Update test_weight_only_linear.py
* Update weight_dequantize_kernel.cu
* fix compile error
* fix
* fix ci
* recommit
* fix code

---------

Co-authored-by: yuanlehome <yuanlehome@163.com>
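For readers unfamiliar with the terms in the commit message, the following is a minimal CPU-side sketch of what "int4 per channel quant" with "row pack" generally means. It is not the code from this commit: the function names `quantize_int4_per_channel` and `dequantize_int4_per_channel` are hypothetical, and the packing layout (two consecutive rows per byte, even row in the low nibble) is an assumption chosen for illustration. The GPU kernels in the commit also apply a separate permute step, which is not shown here.

```python
def quantize_int4_per_channel(weight):
    """Quantize a k x n weight matrix (list of rows) to packed int4.

    Assumed scheme: symmetric quantization with one scale per column
    (output channel), so the largest |w| in each column maps to 7.
    """
    k, n = len(weight), len(weight[0])

    # One scale per column; fall back to 1.0 for an all-zero column.
    scales = []
    for j in range(n):
        max_abs = max(abs(weight[i][j]) for i in range(k))
        scales.append(max_abs / 7.0 if max_abs > 0 else 1.0)

    # Round to the nearest integer and clamp to the signed int4 range.
    q = [[max(-8, min(7, round(weight[i][j] / scales[j]))) for j in range(n)]
         for i in range(k)]

    # Row packing (assumed layout): rows 2r and 2r+1 share one byte per
    # column, with the even row in the low nibble and the odd row in the
    # high nibble. An odd trailing row is padded with zero.
    packed = []
    for i in range(0, k, 2):
        row = []
        for j in range(n):
            lo = q[i][j] & 0xF
            hi = (q[i + 1][j] & 0xF) if i + 1 < k else 0
            row.append(lo | (hi << 4))
        packed.append(row)
    return packed, scales


def dequantize_int4_per_channel(packed, scales, k):
    """Unpack the nibbles and rescale back to floats (k original rows)."""
    def to_signed(nibble):
        # Interpret a 4-bit value as two's-complement.
        return nibble - 16 if nibble >= 8 else nibble

    n = len(scales)
    out = []
    for i in range(k):
        shift = 4 * (i % 2)  # low nibble for even rows, high for odd rows
        byte_row = packed[i // 2]
        out.append([to_signed((byte_row[j] >> shift) & 0xF) * scales[j]
                    for j in range(n)])
    return out
```

Packing halves the number of stored rows, which is why an indexing mistake in the row-pack step (as the "row pack error" fix in the commit message suggests) would corrupt every second row of the dequantized weight.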
1 parent 347bad6, commit 4991383

Showing 4 changed files with 448 additions and 42 deletions.