From 2c95bc34f83a092e5df7b34c79d178907f670b5d Mon Sep 17 00:00:00 2001
From: RaymondWang0
Date: Mon, 19 Feb 2024 00:01:01 +0000
Subject: [PATCH] deploy: 2b93f2ca1861789c28b70a35ee6810cb00ab819d

---
 index.html | 74 +++++++++----------------------------------
 search/all_0.js | 2 +-
 search/all_10.js | 26 ++++++++-------
 search/all_11.js | 16 ++--------
 search/all_12.js | 6 ++--
 search/all_13.js | 11 ++++---
 search/all_14.js | 16 ++++------
 search/all_15.js | 21 ++++++------
 search/all_16.js | 4 +--
 search/all_17.js | 16 ----------
 search/all_18.js | 4 ---
 search/all_19.js | 4 ---
 search/all_2.js | 2 +-
 search/all_3.js | 2 +-
 search/all_4.js | 19 ++++++-----
 search/all_5.js | 16 ++++------
 search/all_6.js | 25 +++++++--------
 search/all_7.js | 17 ++++------
 search/all_8.js | 4 +--
 search/all_9.js | 2 +-
 search/all_a.js | 2 +-
 search/all_b.js | 51 +++++++++++++++++++++++++++--
 search/all_c.js | 70 ++++++++++++----------------------------
 search/all_d.js | 44 ++++++++++++-------------
 search/all_e.js | 26 ++-------------
 search/all_f.js | 15 +++++++--
 search/searchdata.js | 2 +-
 vlm_demo_m1.gif | Bin 5528520 -> 0 bytes
 28 files changed, 211 insertions(+), 286 deletions(-)
 delete mode 100644 search/all_17.js
 delete mode 100644 search/all_18.js
 delete mode 100644 search/all_19.js
 delete mode 100644 vlm_demo_m1.gif

diff --git a/index.html b/index.html
index 17a448e3..d8efbf5c 100644
--- a/index.html
+++ b/index.html
@@ -84,14 +84,11 @@

Code LLaMA Demo on an NVIDIA GeForce RTX 4070 laptop:

-VLM Demo on an Apple MacBook Pro (M1, 2021):

-

-

LLaMA Chat Demo on an Apple MacBook Pro (M1, 2021):

-

+

Overview

-

+

LLM Compression: SmoothQuant and AWQ

SmoothQuant: Smooth the activation outliers by migrating the quantization difficulty from activations to weights, using a mathematically equivalent transformation (for example, 100*1 = 10*10).

@@ -99,7 +96,7 @@

smoothquant_intuition
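
The SmoothQuant idea above can be illustrated with a small sketch in plain Python. This is only a toy under stated assumptions, not TinyChatEngine's implementation: the helper names `matmul` and `smooth`, the example matrices, and the choice `alpha=0.5` are all hypothetical. Each input channel `j` gets a scale `s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)`; activations are divided by it and the matching weight row is multiplied by it, so the layer output is unchanged while the activation outlier shrinks.

```python
# Toy sketch of SmoothQuant-style smoothing. All names and values here are
# hypothetical and chosen for illustration only.

def matmul(a, b):
    """Plain-Python matrix product of a [m][k] and b [k][n]."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def smooth(x, w, alpha=0.5):
    """Move quantization difficulty from activations x to weights w.

    x: activations, shape [tokens][in_channels]
    w: weights, shape [in_channels][out_channels]
    Assumes every channel has a non-zero activation and a non-zero weight.
    """
    n_in = len(w)
    scales = []
    for j in range(n_in):
        act_max = max(abs(row[j]) for row in x)   # per-channel activation range
        w_max = max(abs(v) for v in w[j])         # per-channel weight range
        scales.append(act_max ** alpha / w_max ** (1.0 - alpha))
    # Divide activations and multiply weights by the same per-channel scale.
    x_s = [[row[j] / scales[j] for j in range(n_in)] for row in x]
    w_s = [[v * scales[j] for v in w[j]] for j in range(n_in)]
    return x_s, w_s, scales

# Channel 0 carries an activation outlier (100) against a small weight (1).
x = [[100.0, 1.0], [-80.0, 0.5]]
w = [[1.0, -0.5], [2.0, 1.5]]
x_s, w_s, scales = smooth(x, w)
```

With these inputs the scale for channel 0 is 10, so the pair (100, 1) becomes (10, 10), which is the 100*1 = 10*10 intuition, and `matmul(x_s, w_s)` matches `matmul(x, w)` up to floating-point rounding.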

AWQ (Activation-aware Weight Quantization): Protect salient weight channels by analyzing activation magnitudes rather than the weights themselves.
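
A minimal sketch of the activation-aware selection described above, again in plain Python and under stated assumptions: the helpers `salient_channels`, `protect`, and `matmul`, the example data, and the fixed scale `s=2.0` are hypothetical (AWQ searches for the scale rather than fixing it), and no quantization step is shown. The sketch ranks input channels by mean absolute activation, then scales the selected weight rows up and the matching activations down so the output stays the same.

```python
# Toy sketch of AWQ-style salient-channel selection. All names and values
# here are hypothetical and chosen for illustration only.

def matmul(a, b):
    """Plain-Python matrix product of a [m][k] and b [k][n]."""
    return [[sum(a[i][t] * b[t][j] for t in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def salient_channels(x, k):
    """Return the indices of the k input channels with the largest mean |activation|."""
    n_in = len(x[0])
    mean_abs = [sum(abs(row[j]) for row in x) / len(x) for j in range(n_in)]
    ranked = sorted(range(n_in), key=lambda j: mean_abs[j], reverse=True)
    return sorted(ranked[:k])

def protect(x, w, channels, s=2.0):
    """Scale salient weight rows up by s and the matching activations down by s."""
    x_p = [[v / s if j in channels else v for j, v in enumerate(row)] for row in x]
    w_p = [[v * s for v in row] if j in channels else list(row)
           for j, row in enumerate(w)]
    return x_p, w_p

# Channel 1 has small weights but large activations; channel 0 is the opposite.
x = [[0.1, 8.0, 0.2], [0.3, -6.0, 0.1]]
w = [[5.0, -4.0], [0.2, 0.1], [1.0, 1.0]]

chosen = salient_channels(x, 1)       # picks channel 1, based on activations
x_p, w_p = protect(x, w, chosen)
```

Ranking by weight magnitude alone would pick channel 0 here, whereas the activation statistics pick channel 1, which is the distinction the description draws.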

-

+

LLM Inference Engine: TinyChatEngine