
Add Value Function and corresponding example script to Diffuser implementation #884

Merged: 36 commits into huggingface:rl on Oct 21, 2022

Conversation

bglick13
Contributor

Builds upon #105 by adding the code necessary for the "value-guided planning" described in the paper.

Adds an implementation of the value function model that adheres to the Diffusers API, as well as an example script that shows how to perform value-guided planning.

Also includes a helper script that converts the pretrained models provided in the original repo into Diffusers models.
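For context, value-guided planning steers each denoising step with the gradient of the learned value function, classifier-guidance style. A minimal sketch of the idea - the `unet`, `value_function`, and `scheduler` names here are placeholders, not the final API:

```python
import torch

def guided_step(unet, value_function, scheduler, x, t, scale=0.1):
    # Nudge the noisy trajectory batch x toward high-value regions
    # using the gradient of the learned value function.
    with torch.enable_grad():
        x = x.detach().requires_grad_()
        value = value_function(x, t).sample
        grad = torch.autograd.grad(value.sum(), x)[0]
    x = x.detach() + scale * grad

    # Then take a standard denoising step on the guided sample.
    noise_pred = unet(x, t).sample
    return scheduler.step(noise_pred, t, x).prev_sample
```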

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Oct 17, 2022

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten
Contributor

patrickvonplaten commented Oct 18, 2022

Hey @bglick13,

Super cool PR :-) @natolambert, could you review here? :-)

@patrickvonplaten
Contributor

BTW @natolambert,

We can also just merge a unet_rl.py file as a first step, and then before we release we can merge the models and layers, so we don't block contributors here - what do you think?

@natolambert
Contributor

natolambert commented Oct 18, 2022

Yeah, that would also work. We may be able to make a value function out of a hacked UNet1D. I'm okay with trying to merge this into my PR, though it adds some complications to what I already have.

@patrickvonplaten my main question is whether the RL-only model is useful to you. It'll allow people to more easily use Diffusers to re-train this algorithm (rather than just run inference), but it's very niche (unlikely to be used outside RL).

We could consider having it in a script or a community pipeline?
I've been following the code and working with @bglick13.
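For concreteness, the "hacked unet1d" idea is roughly: keep the UNet1D down/mid blocks and replace the decoder with a small head that regresses a scalar return. A rough sketch with illustrative module names, not the merged API:

```python
import torch
import torch.nn as nn

class ValueHead(nn.Module):
    # Illustrative head that collapses pooled UNet1D mid-block features,
    # concatenated with the time embedding, into a scalar value estimate.
    def __init__(self, mid_dim, time_embed_dim):
        super().__init__()
        self.fc1 = nn.Linear(mid_dim + time_embed_dim, mid_dim // 2)
        self.act = nn.Mish()
        self.fc2 = nn.Linear(mid_dim // 2, 1)

    def forward(self, hidden, temb):
        out = torch.cat([hidden, temb], dim=-1)
        return self.fc2(self.act(self.fc1(out)))
```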

@bglick13
Contributor Author

Thanks @patrickvonplaten! Like Nathan said, the main question is where the RL stuff fits into the HF API. I'm cool with whatever makes the most sense. Just excited to get this code into people's hands.

I do think it would be cool to facilitate training instead of just inference, but that could be either part of the core API or a community pipeline, as Nathan suggested.

natolambert (Contributor) left a comment

Maybe we can move all of the scripts into a single folder with multiple scripts. Not sure if it's a pipeline or what.

OR

separate pipelines for training / eval / inference (that would also be nice).

.DS_Store
Contributor

@bglick13 to get the PR ready, let's remove things like hub.
We can leave *.mp4 if we add a pipeline, but add a comment saying it's for RL so people know.

convert_model.py Outdated
@@ -0,0 +1,59 @@

Contributor

Let's try to consolidate all of these files as much as possible.
Put the convert-model script into scripts/ with a longer, more descriptive path. (I can do the same for my RL models.)
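For reference, the conversion is mostly a state-dict key remapping from the original repo's module names to the Diffusers ones; a sketch under that assumption, with the actual prefix mapping left to be filled in per model:

```python
def remap_state_dict(original_state_dict, prefix_map):
    # prefix_map: {original_prefix: diffusers_prefix}, filled in per model.
    new_state_dict = {}
    for key, tensor in original_state_dict.items():
        for old, new in prefix_map.items():
            if key.startswith(old):
                key = new + key[len(old):]
                break
        new_state_dict[key] = tensor
    return new_state_dict
```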

@@ -0,0 +1,267 @@
import os
Contributor

I would guess just put these in the training file to minimize how many files things are spread across. Also, huge points for getting rid of as many functions as you can (I did some of that in Colab).

@@ -0,0 +1,155 @@
import numpy as np
Contributor

Probably the core of a pipeline or example.

bglick13 (Contributor, Author)

Do you want a pipeline to be part of this PR, or something we add in a small follow-up PR?

self.mid_down2 = Downsample1D(mid_dim // 4, use_conv=True)
##
fc_dim = mid_dim // 4
self.final_block = nn.ModuleList(
natolambert (Contributor) commented Oct 18, 2022

Good idea to support a get_output_block function or if-else statements. I've mentioned this to you, @patrickvonplaten.

bglick13 (Contributor, Author)

I'm okay with changing the final_block to something cleaner - whatever y'all decide
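One cleaner option, sketched here, would be to mirror the existing get_down_block(...) factory pattern so the output block is chosen by a config string; the OutConv1DBlock and OutValueFunctionBlock names are assumptions for illustration, presumed defined alongside the other 1D blocks:

```python
def get_out_block(out_block_type, fc_dim, embed_dim, out_channels, act_fn):
    # Select the model's final block from a config string instead of
    # hard-coding it in UNet1DModel.__init__.
    if out_block_type == "OutConv1DBlock":
        return OutConv1DBlock(fc_dim, out_channels, act_fn)
    if out_block_type == "ValueFunction":
        return OutValueFunctionBlock(fc_dim, embed_dim)
    raise ValueError(f"Unsupported out_block_type: {out_block_type}")
```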

@@ -283,7 +283,13 @@ def step(
noise = torch.randn(
    model_output.size(), dtype=model_output.dtype, layout=model_output.layout, generator=generator
).to(model_output.device)
variance = (self._get_variance(t, predicted_variance=predicted_variance) ** 0.5) * noise
if self.variance_type == "fixed_small_log":
Contributor

Can we change the _get_variance function to accommodate this? I think fixed_small_log was added for this.
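Folding it into _get_variance might look roughly like this - a sketch assuming the clamped log-variance form from the original Diffuser code, with alpha_prod_t and alpha_prod_t_prev as already computed in that function:

```python
# Inside DDPMScheduler._get_variance (sketch):
variance = self.betas[t] * (1 - alpha_prod_t_prev) / (1 - alpha_prod_t)

if self.variance_type == "fixed_small":
    variance = torch.clamp(variance, min=1e-20)
elif self.variance_type == "fixed_small_log":
    # Log-space parameterization: return sigma = exp(0.5 * log(var))
    # so step() can multiply the noise by it directly.
    variance = torch.exp(0.5 * torch.log(torch.clamp(variance, min=1e-20)))
```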

natolambert (Contributor) left a comment

This is on the right track. Can we delete the unneeded files and clean up some comments? I was going to do more, but I didn't really know how to clone the remote PR branch (PR-on-PR stuff)...

examples/community/pipeline.py Outdated (resolved)
for i in tqdm.tqdm(self.scheduler.timesteps):
    # create batch of timesteps to pass into model
    timesteps = torch.full((batch_size,), i, device=self.unet.device, dtype=torch.long)
    # 3. call the sample function
Contributor

The numbering starts at 3; can we remove the comments?

    prev_x = self.unet(x.permute(0, 2, 1), timesteps).sample.permute(0, 2, 1)
    x = self.scheduler.step(prev_x, i, x, predict_epsilon=False)["prev_sample"]

    # 4. apply conditions to the trajectory
Contributor

fix comment

examples/community/pipeline.py Outdated (resolved)
# y = network(x, timesteps).sample
return x, y

def __call__(self, obs, batch_size=64, planning_horizon=32, n_guide_steps=2, scale=0.1):
Contributor

Can we get some comments and spacing here?
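Something like the following, say - the parameter descriptions are one reading of the code, not final docstrings:

```python
def __call__(self, obs, batch_size=64, planning_horizon=32, n_guide_steps=2, scale=0.1):
    # obs: current environment observation the plan is conditioned on
    # batch_size: number of candidate trajectories sampled in parallel
    # planning_horizon: length of each generated trajectory, in env steps
    # n_guide_steps: value-gradient ascent steps per denoising step
    # scale: step size of the value-function guidance
    ...
```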

examples/diffuser/run_diffuser.py Outdated (resolved)

scheduler = DDPMScheduler(num_train_timesteps=100, beta_schedule="squaredcos_cap_v2")

# 3 different pretrained models are available for this task.
Contributor

clean up comments
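For reference, loading any of the three would be the usual from_pretrained call; the repo id below is a placeholder, not a real checkpoint name:

```python
from diffusers import UNet1DModel

# Placeholder repo id - substitute one of the three converted checkpoints.
unet = UNet1DModel.from_pretrained("<hub-org>/<env-name>-unet")
```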

examples/diffuser/run_diffuser.py Outdated (resolved)
examples/diffuser/run_diffuser_value_guided.py Outdated (resolved)
examples/diffuser/run_diffuser_value_guided.py Outdated (resolved)
natolambert merged commit 48a7414 into huggingface:rl on Oct 21, 2022
natolambert pushed a commit that referenced this pull request Nov 14, 2022
* re-add RL model code

* match model forward api

* add register_to_config, pass training tests

* fix tests, update forward outputs

* remove unused code, some comments

* add to docs

* remove extra embedding code

* unify time embedding

* remove conv1d output sequential

* remove sequential from conv1dblock

* style and deleting duplicated code

* clean files

* remove unused variables

* clean variables

* add 1d resnet block structure for downsample

* rename as unet1d

* fix renaming

* rename files

* add get_block(...) api

* unify args for model1d like model2d

* minor cleaning

* fix docs

* improve 1d resnet blocks

* fix tests, remove permuts

* fix style

* add output activation

* rename flax blocks file

* Add Value Function and corresponding example script to Diffuser implementation (#884)

* valuefunction code

* start example scripts

* missing imports

* bug fixes and placeholder example script

* add value function scheduler

* load value function from hub and get best actions in example

* very close to working example

* larger batch size for planning

* more tests

* merge unet1d changes

* wandb for debugging, use newer models

* success!

* turns out we just need more diffusion steps

* run on modal

* merge and code cleanup

* use same api for rl model

* fix variance type

* wrong normalization function

* add tests

* style

* style and quality

* edits based on comments

* style and quality

* remove unused var

* hack unet1d into a value function

* add pipeline

* fix arg order

* add pipeline to core library

* community pipeline

* fix couple shape bugs

* style

* Apply suggestions from code review

Co-authored-by: Nathan Lambert <nathan@huggingface.co>

* update post merge of scripts

* add mdiblock / outblock architecture

* Pipeline cleanup (#947)

* valuefunction code

* start example scripts

* missing imports

* bug fixes and placeholder example script

* add value function scheduler

* load value function from hub and get best actions in example

* very close to working example

* larger batch size for planning

* more tests

* merge unet1d changes

* wandb for debugging, use newer models

* success!

* turns out we just need more diffusion steps

* run on modal

* merge and code cleanup

* use same api for rl model

* fix variance type

* wrong normalization function

* add tests

* style

* style and quality

* edits based on comments

* style and quality

* remove unused var

* hack unet1d into a value function

* add pipeline

* fix arg order

* add pipeline to core library

* community pipeline

* fix couple shape bugs

* style

* Apply suggestions from code review

* clean up comments

* convert older script to using pipeline and add readme

* rename scripts

* style, update tests

* delete unet rl model file

* remove imports in src

Co-authored-by: Nathan Lambert <nathan@huggingface.co>

* Update src/diffusers/models/unet_1d_blocks.py

* Update tests/test_models_unet.py

* RL Cleanup v2 (#965)

* valuefunction code

* start example scripts

* missing imports

* bug fixes and placeholder example script

* add value function scheduler

* load value function from hub and get best actions in example

* very close to working example

* larger batch size for planning

* more tests

* merge unet1d changes

* wandb for debugging, use newer models

* success!

* turns out we just need more diffusion steps

* run on modal

* merge and code cleanup

* use same api for rl model

* fix variance type

* wrong normalization function

* add tests

* style

* style and quality

* edits based on comments

* style and quality

* remove unused var

* hack unet1d into a value function

* add pipeline

* fix arg order

* add pipeline to core library

* community pipeline

* fix couple shape bugs

* style

* Apply suggestions from code review

* clean up comments

* convert older script to using pipeline and add readme

* rename scripts

* style, update tests

* delete unet rl model file

* remove imports in src

* add specific vf block and update tests

* style

* Update tests/test_models_unet.py

Co-authored-by: Nathan Lambert <nathan@huggingface.co>

* fix quality in tests

* fix quality style, split test file

* fix checks / tests

* make timesteps closer to main

* unify block API

* unify forward api

* delete lines in examples

* style

* examples style

* all tests pass

* make style

* make dance_diff test pass

* Refactoring RL PR (#1200)

* init file changes

* add import utils

* finish cleaning files, imports

* remove import flags

* clean examples

* fix imports, tests for merge

* update readmes

* hotfix for tests

* quality

* fix some tests

* change defaults

* more mps test fixes

* unet1d defaults

* do not default import experimental

* defaults for tests

* fix tests

* fix-copies

* fix

* changes per Patrik's comments (#1285)

* changes per Patrik's comments

* update conversion script

* fix renaming

* skip more mps tests

* last test fix

* Update examples/rl/README.md

Co-authored-by: Ben Glickenhaus <benglickenhaus@gmail.com>
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023