Hi @yzh119, thank you for your excellent work. Are there any current plans to support quantization, such as AWQ, SmoothQuant, KV Cache Int8, or KV Cache FP8? Thanks.
Hi @zhyncs, KV Cache Int8 and KV Cache FP8 are mentioned in #125.
Regarding AWQ and SmoothQuant, I suppose the most critical operators are fused dequant+gemv and fused dequant+gemm, and existing libraries already support them well. I want to avoid duplicating that work, and I'm glad to implement any operators that are missing from these libraries.
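For readers unfamiliar with the term, here is a rough sketch of what a fused dequant+gemv computes. It assumes group-wise weight-only quantization with one scale and one zero point per row and group; that layout, the function name `dequant_gemv`, and its parameters are illustrative assumptions, not the API of this project or of any existing library. The point is only that weights are dequantized inside the dot-product loop, so the full floating-point weight matrix is never materialized.

```python
from typing import List


def dequant_gemv(
    q: List[List[int]],          # quantized weights, shape [rows][cols]
    scales: List[List[float]],   # per-row, per-group scales, shape [rows][groups]
    zeros: List[List[int]],      # per-row, per-group zero points, shape [rows][groups]
    x: List[float],              # input vector, length cols
    group_size: int,             # number of columns sharing one scale/zero point
) -> List[float]:
    """Compute y = W @ x with W[r][c] = (q[r][c] - zeros[r][g]) * scales[r][g],
    where g = c // group_size, without building W explicitly."""
    if len(x) % group_size != 0:
        raise ValueError("len(x) must be a multiple of group_size")
    y = []
    for r, q_row in enumerate(q):
        acc = 0.0
        for g in range(len(x) // group_size):
            start = g * group_size
            # Accumulate (q - zero) * x within the group, then apply the
            # group's scale once instead of once per element.
            partial = 0.0
            for c in range(start, start + group_size):
                partial += (q_row[c] - zeros[r][g]) * x[c]
            acc += scales[r][g] * partial
        y.append(acc)
    return y


# Hypothetical toy data: 2 rows, 4 columns, 2 groups of 2 columns each.
q = [[1, 3, 0, 2], [2, 2, 1, 0]]
scales = [[0.5, 2.0], [1.0, 0.25]]
zeros = [[1, 1], [2, 0]]
x = [1.0, 2.0, 3.0, 4.0]
print(dequant_gemv(q, scales, zeros, x, group_size=2))  # [4.0, 0.75]
```

A real GPU kernel would additionally unpack sub-byte weights (for example, two 4-bit values per byte) and parallelize over rows and groups; the arithmetic per output element stays the same.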