This notebook is designed for the CA 4 assignment (Spring 2024 term) on Reinforcement Learning from Human Feedback (RLHF) for language models. It includes a mix of coding tasks and conceptual questions to deepen your understanding of RLHF, with practical applications and references for further exploration.
Reinforcement Learning from Human Feedback (RLHF) is a central technique in the development of language models, enabling systems to align more closely with human preferences: a reward model is trained from human preference comparisons, and the language model is then fine-tuned to maximize that reward. This notebook explores both the theory and the application of RLHF through practical coding exercises and written responses that reinforce your knowledge.
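To make the preference-learning idea concrete, reward models in RLHF are commonly trained on pairwise comparisons with a Bradley-Terry style loss. The sketch below is illustrative only and is not part of the assignment starter code; the function name and the toy reward values are assumptions.

```python
import torch
import torch.nn.functional as F

def preference_loss(chosen_rewards: torch.Tensor, rejected_rewards: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry pairwise loss: push the reward of the human-preferred
    response above the reward of the rejected response."""
    # -log sigmoid(r_chosen - r_rejected), averaged over the batch
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with made-up scalar rewards for a batch of 3 comparison pairs.
chosen = torch.tensor([1.2, 0.4, 0.9])
rejected = torch.tensor([0.3, 0.5, -0.1])
print(preference_loss(chosen, rejected))  # a single scalar loss value
```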
- Introduction to RLHF: An overview of RLHF, referencing key papers and educational resources (listed below) to support your learning.
- Import Libraries and Set Constants: Essential imports and constant definitions for the assignment (see the setup sketch after this list).
- Coding and Written Exercises: Exercises designed to apply RLHF concepts practically; make sure your code runs and reproduces the answers you report (see the KL-penalty sketch after the reference list).
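As a rough illustration of what the "Import Libraries and Set Constants" section might contain — the specific libraries, model name, and constant values below are assumptions, not the assignment's required setup:

```python
import random

import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical constants; the assignment may specify different values.
MODEL_NAME = "gpt2"          # base language model to fine-tune
SEED = 42                    # for reproducible sampling and training
MAX_NEW_TOKENS = 64          # generation length cap
LEARNING_RATE = 1e-5         # policy / reward-model learning rate
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Seed all relevant RNGs so audited runs reproduce the reported answers.
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
```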
Suggested references:
- OpenAI's Research on Fine-Tuning Language Models
- Learning to Summarize from Human Feedback
- Hugging Face Deep Reinforcement Learning Course
- Awesome RLHF (GitHub repository)
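For the coding exercises, a core piece of PPO-style RLHF is shaping the per-token reward with a KL penalty toward the frozen reference model, so the fine-tuned policy does not drift too far from it. The sketch below is only an illustration under assumed tensor shapes and a hypothetical function name; it is not the assignment's required implementation.

```python
import torch

def kl_penalized_rewards(reward: torch.Tensor,
                         policy_logprobs: torch.Tensor,
                         ref_logprobs: torch.Tensor,
                         kl_coef: float = 0.1) -> torch.Tensor:
    """Shape per-token rewards as in PPO-style RLHF.

    reward:          reward-model score per sequence, shape (batch,)
    policy_logprobs: log-probs of sampled tokens under the current policy, (batch, seq)
    ref_logprobs:    log-probs of the same tokens under the frozen reference model, (batch, seq)
    """
    # Per-token KL estimate between the policy and the reference model.
    kl = policy_logprobs - ref_logprobs          # (batch, seq)
    per_token = -kl_coef * kl                    # penalty on every token
    per_token[:, -1] += reward                   # reward-model score added at the final token
    return per_token

# Toy usage with random tensors (2 sequences of 5 sampled tokens).
torch.manual_seed(0)
policy_lp = torch.randn(2, 5)
ref_lp = torch.randn(2, 5)
seq_reward = torch.tensor([0.8, -0.2])
print(kl_penalized_rewards(seq_reward, policy_lp, ref_lp))
```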
- Disclosure of AI Assistance: If you use AI tools, disclose this in the final cell of the notebook, including any prompts you used.
- Code Verification: Notebooks may be randomly audited; all code cells must execute as expected and reproduce the answers you report. Plagiarism in code or written responses will result in a zero.
This README should guide you through the assignment requirements and provide necessary resources for completing the tasks. Good luck, and happy learning!