m-salmani78/RLHF-Text-Summarization

RLHF Assignment Notebook - Spring 2024

This notebook is the CA 4 assignment on Reinforcement Learning from Human Feedback (RLHF) for language models, Spring 2024 term. It combines coding tasks with conceptual questions to deepen your understanding of RLHF, and it points to references for further exploration.

Table of Contents

  1. Overview
  2. Structure of the Notebook
  3. Resources
  4. Academic Honesty and Submission Guidelines

Overview

Reinforcement Learning from Human Feedback (RLHF) is a widely used method for aligning language models with human preferences. The tasks in this notebook cover both the theory and the application of RLHF through coding exercises and written responses.
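As a rough illustration of what "aligning with human preferences" can mean in code, the sketch below computes a pairwise preference loss of the kind commonly used to train RLHF reward models (a Bradley-Terry formulation). It is not taken from the notebook: the function name `preference_loss` and the scalar reward inputs are assumptions made for this example, and the assignment may use a different formulation.

```python
import math


def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Hypothetical helper: -log(sigmoid(r_chosen - r_rejected)).

    Under a Bradley-Terry model this is the negative log-likelihood that
    the human-preferred response outranks the rejected one.
    """
    margin = reward_chosen - reward_rejected
    # -log(sigmoid(m)) == log(1 + exp(-m)); branch on the sign of the
    # margin so exp() never receives a large positive argument.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # The loss shrinks as the reward model scores the preferred response higher.
    for chosen, rejected in [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0)]:
        print(chosen, rejected, round(preference_loss(chosen, rejected), 4))
```

The loss is small when the reward model already ranks the preferred response higher and large when it ranks the pair the wrong way round, which is the signal that drives reward-model training.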


Structure of the Notebook

  1. Introduction to RLHF: This section provides an overview of RLHF, referencing key papers and educational resources to support your learning.

  2. Import Libraries and Set Constants: Essential imports and constant definitions for the assignment.

  3. Coding and Written Exercises: Exercises that apply RLHF concepts in practice. Make sure your code runs and reproduces the answers you submit.


Resources


Academic Honesty and Submission Guidelines

  • Disclosure of AI Assistance: If AI tools are used, disclose this in the final cell of the notebook, including any prompts used.
  • Code Verification: Submissions may be audited at random. All code cells must execute as expected and reproduce the answers you submitted. Plagiarism in code or written responses will result in a zero.

This README should guide you through the assignment requirements and provide necessary resources for completing the tasks. Good luck, and happy learning!
