diff --git a/index.html b/index.html
index 2db05345..845527ce 100644
--- a/index.html
+++ b/index.html
@@ -79,11 +79,11 @@

This is enabled by two LLM compression techniques, SmoothQuant and AWQ (Activation-aware Weight Quantization), co-designed with TinyChatEngine, which implements the compressed low-precision model.

Demo on an NVIDIA GeForce RTX 4070 laptop:

-

chat_demo_gpu coding_demo_gpu

LLaMA Chat Code LLaMA
-

+

chat_demo_gpu coding_demo_gpu

+

Demo on an Apple MacBook Pro (M1, 2021):

-

chat_demo_m1 coding_demo_m1

LLaMA Chat Code LLaMA
-

Feel free to check out our slides for more details!

+

chat_demo_m1 coding_demo_m1

+

Feel free to check out our slides for more details!

Overview