Chapter 5: Pretraining on Unlabeled Data

 

Main Chapter Code

 

Bonus Materials

  • 02_alternative_weight_loading contains code for loading the GPT model weights from alternative sources, in case the model weights become unavailable from OpenAI
  • 03_bonus_pretraining_on_gutenberg contains code to pretrain the LLM for longer on the whole Project Gutenberg book corpus
  • 04_learning_rate_schedulers contains code implementing a more sophisticated training function including learning rate schedulers and gradient clipping
  • 05_bonus_hparam_tuning contains an optional hyperparameter tuning script
  • 06_user_interface implements an interactive user interface for interacting with the pretrained LLM
  • 07_gpt_to_llama contains a step-by-step guide for converting a GPT architecture implementation to Llama 3.2 and loading pretrained weights from Meta AI
  • 08_memory_efficient_weight_loading contains a bonus notebook showing how to load model weights via PyTorch's load_state_dict method more efficiently
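
To give a flavor of the techniques named in 04_learning_rate_schedulers, here is a minimal, framework-free sketch of a linear-warmup/cosine-decay schedule and global-norm gradient clipping. The function names, the specific hyperparameter values, and the exact schedule shape are illustrative assumptions; the bonus notebook's PyTorch implementation may differ in its details.

```python
import math

def lr_at_step(step, total_steps, peak_lr=5e-4, min_lr=1e-5, warmup_steps=20):
    """Linear warmup to peak_lr, then cosine decay to min_lr.
    (A common schedule sketch; hyperparameter values here are placeholders.)"""
    if step < warmup_steps:
        # Ramp linearly from 0 up to peak_lr during warmup
        return peak_lr * (step + 1) / warmup_steps
    # Cosine decay from peak_lr down to min_lr over the remaining steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1 + math.cos(math.pi * progress))

def clip_global_norm(grads, max_norm=1.0):
    """Rescale all gradients if their combined L2 norm exceeds max_norm.
    Same idea as PyTorch's clip_grad_norm_, shown on plain lists of floats."""
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [[g * scale for g in vec] for vec in grads]
    return grads
```

In a training loop, `lr_at_step` would set the optimizer's learning rate at the start of each step, and `clip_global_norm` would be applied to the gradients after the backward pass, before the optimizer update.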