Hi, I have verified that AWQ models can be supported (loaded in vLLM and converted to LowBitLinear in ipex-llm), but only the asym_int4 quantization format is supported.
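For context, here is a minimal sketch of the two pieces involved. The model path is a placeholder for any AWQ checkpoint, and the actual vLLM/ipex-llm integration wiring may differ until the supporting PRs land:

```python
# Sketch only; "path/to/awq-model" is a placeholder AWQ checkpoint.

# Piece 1: load the AWQ checkpoint in vLLM via its standard quantization flag.
from vllm import LLM, SamplingParams

llm = LLM(
    model="path/to/awq-model",  # placeholder AWQ checkpoint
    quantization="awq",         # tell vLLM the weights are AWQ-quantized
)
outputs = llm.generate(["Hello, world!"], SamplingParams(max_tokens=32))
print(outputs[0].outputs[0].text)

# Piece 2: load with ipex-llm so weights are converted to LowBitLinear.
# Note: asym_int4 is the only low-bit format verified for AWQ so far.
from ipex_llm.transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "path/to/awq-model",
    load_in_low_bit="asym_int4",  # asymmetric 4-bit, matching AWQ's scheme
    trust_remote_code=True,
)
```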
This feature will need some adaptation on both the vLLM side and the ipex-llm side. I will update this thread once the supporting PRs are merged.
I will provide the AWQ model from the customer, and the customer will evaluate FP8 and INT4 performance.