Commit 0feb245

Add neurips 2024 slides link
1 parent e0c1ddc commit 0feb245

1 file changed: +4 -0 lines changed

README.md

@@ -1,5 +1,9 @@
 # Distributed Training Guide
 
+<img src="https://lambdalabs.com/hubfs/distriubuted-training-guide.png" width="400px" />
+
+[Neurips 2024 presentation slides here](https://docs.google.com/presentation/d/1ANMmkOGaruYKTvhnsAbZgI9GrdMliNvibWGuNYw6HX8/edit?usp=sharing)
+
 Ever wondered how to train a large neural network across a giant cluster? Look no further!
 
 This is a comprehensive guide on best practices for distributed training, diagnosing errors, and fully utilizing all resources available. It is organized into sequential chapters, each with a `README.md` and a `train_llm.py` script in them. The readme will discuss both the high level concepts of distributed training, and the code changes introduced in that chapter.
