MIT released TinyChatEngine, which includes kernels for a range of platforms. To ensure wide availability, we should integrate its Metal kernels, which run quantized models on macOS.
Requirements:
- Run the Metal kernel from Python (maybe on the fly?)
- Automatically switch to the Metal kernel if `torch.backends.mps.is_available()` returns true; otherwise use CUDA.
- Figure out how to support both the GEMV and GEMM formats with Metal.
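The automatic switch in the requirements could look roughly like the sketch below. The names `select_backend` and `detect_backend` and the string labels `"metal"` and `"cuda"` are hypothetical and only for illustration; the only real APIs used are `torch.backends.mps.is_available()` and `torch.cuda.is_available()`. The CUDA check is an added assumption so that a machine with neither backend fails early instead of silently picking CUDA.

```python
def select_backend(mps_available: bool, cuda_available: bool) -> str:
    """Pick a kernel backend label from availability flags (hypothetical helper).

    Metal is preferred whenever MPS is available, as the requirement states;
    otherwise CUDA is used. Raising when neither is available is an assumption
    added for this sketch, not something the requirement specifies.
    """
    if mps_available:
        return "metal"
    if cuda_available:
        return "cuda"
    raise RuntimeError("Neither MPS (Metal) nor CUDA is available")


def detect_backend() -> str:
    """Query PyTorch for device availability and choose a backend."""
    # Imported lazily so select_backend can be used and tested without torch.
    import torch

    return select_backend(
        mps_available=torch.backends.mps.is_available(),
        cuda_available=torch.cuda.is_available(),
    )
```

Keeping the decision in a pure function separate from the PyTorch query makes the dispatch rule easy to test on machines that have neither device.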
Kernel:
https://github.com/mit-han-lab/TinyChatEngine/blob/main/kernels/metal/kernel/op.metal#L10