Sam Foreman 2023-12-20
Playing with words.
A set of simple, scalable and highly configurable tools for working with LLMs.
What started as some simple modifications to Andrej Karpathy's nanoGPT has now grown into the wordplay project.
If you’re curious…
While nanoGPT is a great project and an excellent resource, it is, by design, very minimal and limited in its flexibility.
Working through the code, I found myself making minor changes here and there to test new ideas and run variations on different experiments.
These changes eventually built to the point where my {goals, scope, code} for the project had diverged significantly from the original vision.
As a result, I figured it made more sense to move things to a new project, wordplay.
I’ve prioritized adding functionality that I have found to be useful or interesting, but I am absolutely open to input or suggestions for improvement.
Different aspects of this project have been motivated by some of my recent work on LLMs.
- Projects:
  - ezpz: Painless distributed training with your favorite {framework, backend} combo.
  - Megatron-DeepSpeed: Ongoing research training transformer language models at scale, including: BERT & GPT-2
- Collaboration(s):
- DeepSpeed4Science (2023-09)
- Loooooooong Sequence Lengths
- Project Website
- Preprint Song et al. (2023)
- Blog Post
- Tutorial
- GenSLMs:
- DeepSpeed4Science (2023-09)
- Talks / Workshops:
- DeepSpeed support (✅: 2024-01-03)
- Work with any 🤗 HuggingFace dataset
- Effortless distributed training using ezpz
- Improved (type-safe) and extensible configuration system (powered by hydra), see #config
- Automatic, detailed experiment + metric tracking with Weights & Biases
- Rich informative logging with enrich
- Fully Sharded Data Parallel (FSDP) support
- 3D Parallelism support via:
Grab-n-Go
The easiest way to get the most recent version is to:
python3 -m pip install "git+https://github.com/saforem2/wordplay.git"
Development
If you’d like to work with the project and run / change things yourself, I’d recommend installing from a local (editable) clone of this repository:
git clone "https://github.com/saforem2/wordplay"
cd wordplay
mkdir -p venv
python3 -m venv venv --system-site-packages
source venv/bin/activate
python3 -m pip install -e .
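After installing, it can be handy to confirm that the package is actually visible from the active environment. The snippet below is a minimal sketch of one way to check, using only the Python standard library. It assumes the installed distribution is named "wordplay" (inferred from the repository name, not confirmed here), and the helper `installed_version` is a hypothetical name introduced for illustration.

```python
from importlib import metadata
from typing import Optional


def installed_version(dist_name: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # ASSUMPTION: the distribution name matches the repository name, "wordplay".
    version = installed_version("wordplay")
    if version is None:
        print("wordplay is not installed in this environment")
    else:
        print(f"wordplay {version} is installed")
```

Run it with the virtual environment activated. If it reports that the package is missing, the most likely cause is that the `pip install` step ran under a different interpreter than the one in `venv`.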
Last Updated: 12/20/2023 @ 10:05:31
Song, Shuaiwen Leon, Bonnie Kruft, Minjia Zhang, Conglong Li, Shiyang Chen, Chengming Zhang, Masahiro Tanaka, et al. 2023. “DeepSpeed4Science Initiative: Enabling Large-Scale Scientific Discovery Through Sophisticated AI System Technologies.” https://arxiv.org/abs/2310.04610.