A pure JavaScript port of Karpathy's llama2.c with a simple UI.
-
Download Karpathy's Llama 2 parameters, pretrained on the TinyStories dataset (see the original llama2.c instructions):
wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories15M.bin
wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories42M.bin
wget https://huggingface.co/karpathy/tinyllamas/resolve/main/stories110M.bin
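To sanity-check a downloaded checkpoint before loading it in the UI, the header can be inspected directly. The sketch below assumes the legacy llama2.c checkpoint layout, where the file starts with seven little-endian 32-bit integers (the `Config` struct in llama2.c). `parseCheckpointHeader` and its field names are hypothetical and are not taken from this port's source.

```javascript
// Hypothetical helper (not part of this repository): reads the config header
// of a llama2.c checkpoint, assuming the legacy layout of seven int32 values.
function parseCheckpointHeader(buffer) {
  const FIELDS = [
    "dim",        // transformer embedding dimension
    "hiddenDim",  // feed-forward hidden dimension
    "nLayers",    // number of transformer layers
    "nHeads",     // number of query heads
    "nKvHeads",   // number of key/value heads
    "vocabSize",  // vocabulary size (sign carries a flag, see below)
    "seqLen",     // maximum sequence length
  ];
  const headerBytes = FIELDS.length * 4;
  if (buffer.byteLength < headerBytes) {
    throw new RangeError("buffer is too small to contain a checkpoint header");
  }

  const view = new DataView(buffer, 0, headerBytes);
  const config = {};
  FIELDS.forEach((name, i) => {
    // `true` selects little-endian byte order.
    config[name] = view.getInt32(i * 4, true);
  });

  // In llama2.c a negative vocab size signals that the classifier weights
  // are stored separately from the token embedding table.
  config.sharedWeights = config.vocabSize > 0;
  config.vocabSize = Math.abs(config.vocabSize);
  return config;
}
```

In a browser this could be used as `fetch("stories15M.bin").then(r => r.arrayBuffer()).then(parseCheckpointHeader).then(console.log)`, provided the file is served from the same directory as the page.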
-
Open run.html via a web server:
python -m http.server 8080
open http://localhost:8080/run.html
Tokens per second, measured on an Apple M1:
tok/s | 15M | 42M | 110M |
---|---|---|---|
🐢 | ~30 | ~13 | ~5 |
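For reproducing figures like these on other hardware, a throughput number can be obtained by timing a fixed number of generation steps. The following is a minimal sketch, not the measurement code used for the table above. `measureTokensPerSecond` is a hypothetical helper, and `nextToken` stands in for whatever function performs one forward pass plus sampling.

```javascript
// Hypothetical helper (not part of this repository): times `steps` calls to
// `nextToken` and returns the average throughput in tokens per second.
// `now` is injectable so the function can be tested with a fake clock; by
// default it uses performance.now(), which returns milliseconds.
function measureTokensPerSecond(nextToken, steps, now = () => performance.now()) {
  if (!Number.isInteger(steps) || steps <= 0) {
    throw new RangeError("steps must be a positive integer");
  }
  const start = now();
  for (let i = 0; i < steps; i++) {
    nextToken(i); // one generation step; its return value is ignored here
  }
  const elapsedMs = now() - start;
  // Guard against a zero reading from a coarse timer.
  return elapsedMs > 0 ? (steps * 1000) / elapsedMs : Infinity;
}
```

Timing many steps in one run and dividing once, rather than timing each step, keeps timer resolution from dominating the result for small models.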
License: MIT