Hi,
From #430, it seems that 8da4w (int8 dynamic activations, int4 weights) is primarily intended for ExecuTorch and is set to be deprecated. Are there any plans to enable it for CUDA and CPU as well, so that int4 weights are converted to int8 just before computation?
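
To make the request concrete, here is a rough sketch of the flow I have in mind. It is plain Python rather than torchao code, every function name is made up for illustration, and the packing layout (two signed int4 values per byte, low nibble first) and the symmetric per-vector activation scale are my own assumptions, not necessarily what the library does.

```python
# Illustrative only: plain Python, not torchao's API. All names are hypothetical.

def pack_int4(values):
    """Pack signed int4 values (-8..7) two per byte, low nibble first (assumed layout)."""
    assert len(values) % 2 == 0
    packed = []
    for lo, hi in zip(values[0::2], values[1::2]):
        packed.append((lo & 0xF) | ((hi & 0xF) << 4))
    return packed

def unpack_int4_to_int8(packed):
    """Expand each byte into two sign-extended values that fit in int8."""
    out = []
    for byte in packed:
        for nibble in (byte & 0xF, byte >> 4):
            out.append(nibble - 16 if nibble >= 8 else nibble)
    return out

def quantize_activation_int8(x):
    """Dynamic symmetric int8 quantization of one activation vector."""
    max_abs = max(abs(v) for v in x)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    q = [max(-128, min(127, round(v / scale))) for v in x]
    return q, scale

def int8_dot(x, packed_w, w_scale):
    """Quantize x, unpack int4 weights to int8, accumulate in integers, then rescale."""
    qx, x_scale = quantize_activation_int8(x)
    qw = unpack_int4_to_int8(packed_w)  # the "convert just before computation" step
    acc = sum(a * b for a, b in zip(qx, qw))
    return acc * x_scale * w_scale
```

The point is that the weights stay packed as int4 in memory and are only widened to int8 inside the compute step, so the matmul itself could run on int8 kernels on CUDA or CPU.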
Thanks!
cc @jerryzh168