Code for the paper: "Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods"

ozekri/SEPO

Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods

This repository contains the code for the SEPO algorithm presented in the paper:

Fine-Tuning Discrete Diffusion Models with Policy Gradient Methods.

SEPO is an efficient, broadly applicable, and theoretically justified policy gradient algorithm for fine-tuning discrete diffusion models over general rewards.

The code will be uploaded mid-February 2025...

In the meantime, enjoy this pretty GIF of a denoising diffusion process guided by SEPO, in the discrete case of language!
