This notebook is designed for the CA 4 assignment (Spring 2024 term) on Reinforcement Learning from Human Feedback (RLHF) for language models. It includes a mix of coding tasks and conceptual questions to deepen your understanding of RLHF, with practical applications and references for further exploration.
Reinforcement Learning from Human Feedback (RLHF) is a central technique in the development of language models, enabling systems to align more closely with human preferences: a reward model is trained from human preference comparisons, and the language model is then fine-tuned to maximize that reward. This notebook explores both the theory and the application of RLHF through practical coding exercises and written responses that reinforce your knowledge.
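To make the preference-learning idea concrete, reward models in RLHF are commonly trained on pairwise comparisons with a Bradley-Terry style loss. The sketch below is illustrative only and is not part of the assignment starter code; the function name and the toy reward values are assumptions.

```python
import torch
import torch.nn.functional as F

def preference_loss(chosen_rewards: torch.Tensor, rejected_rewards: torch.Tensor) -> torch.Tensor:
    """Bradley-Terry pairwise loss: push the reward of the human-preferred
    response above the reward of the rejected response."""
    # -log sigmoid(r_chosen - r_rejected), averaged over the batch
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage with made-up scalar rewards for a batch of 3 comparison pairs.
chosen = torch.tensor([1.2, 0.4, 0.9])
rejected = torch.tensor([0.3, 0.5, -0.1])
print(preference_loss(chosen, rejected))  # a single scalar loss value
```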
- Introduction to RLHF: An overview of RLHF, referencing key papers and educational resources (listed below) to support your learning.
- Import Libraries and Set Constants: Essential imports and constant definitions for the assignment (see the setup sketch after this list).
- Coding and Written Exercises: Exercises designed to apply RLHF concepts practically; make sure your code runs and reproduces the answers you report (see the KL-penalty sketch after the reference list).
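As a rough illustration of what the "Import Libraries and Set Constants" section might contain — the specific libraries, model name, and constant values below are assumptions, not the assignment's required setup:

```python
import random

import numpy as np
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical constants; the assignment may specify different values.
MODEL_NAME = "gpt2"          # base language model to fine-tune
SEED = 42                    # for reproducible sampling and training
MAX_NEW_TOKENS = 64          # generation length cap
LEARNING_RATE = 1e-5         # policy / reward-model learning rate
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"

# Seed all relevant RNGs so audited runs reproduce the reported answers.
random.seed(SEED)
np.random.seed(SEED)
torch.manual_seed(SEED)
```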
Suggested references:
- OpenAI's Research on Fine-Tuning Language Models
- Learning to Summarize from Human Feedback
- Hugging Face Deep Reinforcement Learning Course
- Awesome RLHF (GitHub repository)
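For the coding exercises, a core piece of PPO-style RLHF is shaping the per-token reward with a KL penalty toward the frozen reference model, so the fine-tuned policy does not drift too far from it. The sketch below is only an illustration under assumed tensor shapes and a hypothetical function name; it is not the assignment's required implementation.

```python
import torch

def kl_penalized_rewards(reward: torch.Tensor,
                         policy_logprobs: torch.Tensor,
                         ref_logprobs: torch.Tensor,
                         kl_coef: float = 0.1) -> torch.Tensor:
    """Shape per-token rewards as in PPO-style RLHF.

    reward:          reward-model score per sequence, shape (batch,)
    policy_logprobs: log-probs of sampled tokens under the current policy, (batch, seq)
    ref_logprobs:    log-probs of the same tokens under the frozen reference model, (batch, seq)
    """
    # Per-token KL estimate between the policy and the reference model.
    kl = policy_logprobs - ref_logprobs          # (batch, seq)
    per_token = -kl_coef * kl                    # penalty on every token
    per_token[:, -1] += reward                   # reward-model score added at the final token
    return per_token

# Toy usage with random tensors (2 sequences of 5 sampled tokens).
torch.manual_seed(0)
policy_lp = torch.randn(2, 5)
ref_lp = torch.randn(2, 5)
seq_reward = torch.tensor([0.8, -0.2])
print(kl_penalized_rewards(seq_reward, policy_lp, ref_lp))
```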
- Disclosure of AI Assistance: If you use AI tools, disclose this in the final cell of the notebook, including any prompts you used.
- Code Verification: Notebooks may be randomly audited; all code cells must execute as expected and reproduce the answers you report. Plagiarism in code or written responses will result in a zero.
This README should guide you through the assignment requirements and provide necessary resources for completing the tasks. Good luck, and happy learning!