zeptogpt

A minimal, approximate JAX/PyTorch implementation of GPT, based on Andrej Karpathy's 'Let's build GPT'. The goal is to learn frameworks (JAX, PyTorch), models (GPT, Llama, Gemma), evals (HellaSwag, MMLU), and more. The long-term vision is to train, fine-tune, and run inference on state-of-the-art small-to-medium models using freely available TPUs.