NanoGPT

Inspired by and learned from Andrej Karpathy.

GPT-2

This project reproduces the 124M-parameter GPT-2 model, which has 12 Transformer layers and a d_model (embedding size) of 768. The model is written using PyTorch and Hugging Face Transformers.
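
To make the "124M" figure concrete, here is a minimal sketch that counts the parameters implied by the published GPT-2 small hyperparameters. The names `GPTConfig` and `count_parameters` are illustrative assumptions and not necessarily what this repository uses. The vocabulary size (50257), context length (1024) and head count (12) come from the public GPT-2 release rather than from the text above. The sketch also assumes the output head shares its weights with the token embedding, as in the original GPT-2.

```python
from dataclasses import dataclass


@dataclass
class GPTConfig:
    # Hypothetical field names; values are the public GPT-2 124M hyperparameters.
    block_size: int = 1024   # maximum context length
    vocab_size: int = 50257  # GPT-2 BPE vocabulary size
    n_layer: int = 12        # number of Transformer blocks
    n_head: int = 12         # attention heads per block
    n_embd: int = 768        # d_model


def count_parameters(cfg: GPTConfig) -> int:
    """Count weights and biases, assuming the LM head is tied to the token embedding."""
    d = cfg.n_embd
    tok_emb = cfg.vocab_size * d                 # token embedding table
    pos_emb = cfg.block_size * d                 # learned position embedding table
    layer_norm = 2 * d                           # scale + bias
    attn = (d * 3 * d + 3 * d) + (d * d + d)     # QKV projection + output projection
    mlp = (d * 4 * d + 4 * d) + (4 * d * d + d)  # expand to 4*d, project back to d
    block = 2 * layer_norm + attn + mlp          # two LayerNorms per block
    return tok_emb + pos_emb + cfg.n_layer * block + layer_norm  # plus final LayerNorm


if __name__ == "__main__":
    cfg = GPTConfig()
    print(f"{count_parameters(cfg):,} parameters")  # prints "124,439,808 parameters"
```

Most of the total comes from the 12 Transformer blocks (about 85M) and the token embedding (about 38.6M).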