This repository hosts the materials for AI-Alignment-Cohort: pre-reads, session recordings, and assignments.
The BIRDS x Safety and Alignment groups at C4AI are running the AI Alignment Cohort from July 2024 through September 2024.
This cohort is based on the ARENA (Alignment Research Engineer Accelerator) curriculum designed by Callum McDougall, which draws heavily from Redwood Research’s Machine Learning for Alignment Bootcamp. It also overlaps with other material, most notably Neel Nanda’s excellent open-source material on mechanistic interpretability of transformers.
We aim to cover the foundational concepts of this curriculum so that beginners can start their alignment research journey. In two weekly sessions, leads present specific topics from the ARENA course, giving participants a chance to discuss the material, ask questions, and learn from one another.
All resources are organized by session:
- Pre-reads: Use these to go through the course material before the session.
- Session recordings and slides: If you missed the virtual meetings, use the recordings to stay up to date.
- Assignments: Use these to test your understanding. Feel free to use tools to help you understand or answer the questions; the ultimate aim is to grasp the underlying concepts.