
Add Value Function and corresponding example script to Diffuser implementation #884

Merged: 36 commits into huggingface:rl on Oct 21, 2022

Conversation

bglick13
Contributor

Builds upon #105 by adding the code necessary for the "value-guided planning" described in the paper.

Adds an implementation of the value function model that adheres to the Diffusers API, as well as an example script that shows how to perform value-guided planning.

Also includes a helper script that converts the pretrained models provided in the original repo into Diffusers models.
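For context, value-guided planning steers each denoising step with the gradient of the learned value function, classifier-guidance style. A minimal sketch of the idea - the `unet`, `value_function`, and `scheduler` names here are placeholders, not the final API:

```python
import torch

def guided_step(unet, value_function, scheduler, x, t, scale=0.1):
    # Nudge the noisy trajectory batch x toward high-value regions
    # using the gradient of the learned value function.
    with torch.enable_grad():
        x = x.detach().requires_grad_()
        value = value_function(x, t).sample
        grad = torch.autograd.grad(value.sum(), x)[0]
    x = x.detach() + scale * grad

    # Then take a standard denoising step on the guided sample.
    noise_pred = unet(x, t).sample
    return scheduler.step(noise_pred, t, x).prev_sample
```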

@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented Oct 17, 2022

The documentation is not available anymore as the PR was closed or merged.

@patrickvonplaten
Contributor

patrickvonplaten commented Oct 18, 2022

Hey @bglick13,

Super cool PR :-) @natolambert, could you review here? :-)

@patrickvonplaten
Contributor

BTW @natolambert,

We can also just merge a unet_rl.py file as a first step, and then before we release we can merge the models and layers, so we don't block contributors here - what do you think?

@natolambert
Contributor

natolambert commented Oct 18, 2022

Yeah, that would also work. We may be able to make a value function out of a hacked UNet1D. I'm okay with trying to merge this into my PR, though it adds some complications to what I already have.

@patrickvonplaten my main question is whether the RL-only model is useful to you. It'll allow people to more easily use Diffusers to re-train this algorithm (rather than just run inference), but it's very niche (unlikely to be used outside RL).

We could consider having it in a script or a community pipeline?
I've been following the code and working with @bglick13.
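For concreteness, the "hacked unet1d" idea is roughly: keep the UNet1D down/mid blocks and replace the decoder with a small head that regresses a scalar return. A rough sketch with illustrative module names, not the merged API:

```python
import torch
import torch.nn as nn

class ValueHead(nn.Module):
    # Illustrative head that collapses pooled UNet1D mid-block features,
    # concatenated with the time embedding, into a scalar value estimate.
    def __init__(self, mid_dim, time_embed_dim):
        super().__init__()
        self.fc1 = nn.Linear(mid_dim + time_embed_dim, mid_dim // 2)
        self.act = nn.Mish()
        self.fc2 = nn.Linear(mid_dim // 2, 1)

    def forward(self, hidden, temb):
        out = torch.cat([hidden, temb], dim=-1)
        return self.fc2(self.act(self.fc1(out)))
```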

@bglick13
Contributor Author

Thanks @patrickvonplaten! Like Nathan said, the main question is where the RL stuff fits into the HF API. I'm cool with whatever makes the most sense. Just excited to get this code into people's hands.

I do think it would be cool to facilitate training instead of just inference, but that could be either part of the core API or a community pipeline, as Nathan suggested.

natolambert (Contributor) left a comment

Maybe we can move all of the scripts into a single folder with multiple scripts. Not sure if it's a pipeline or what.

OR

separate pipelines for training / eval / inference (that would also be nice).

.DS_Store
Contributor

@bglick13 to get the PR ready, let's remove things like hub.
We can leave *.mp4 if we add a pipeline, but add a comment saying it's for RL so people know.

convert_model.py Outdated
@@ -0,0 +1,59 @@

Contributor

Let's try to consolidate all of these files as much as possible.
Put the convert-model script into scripts/ with a longer, more descriptive path. (I can do the same for my RL models.)
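For reference, the conversion is mostly a state-dict key remapping from the original repo's module names to the Diffusers ones; a sketch under that assumption, with the actual prefix mapping left to be filled in per model:

```python
def remap_state_dict(original_state_dict, prefix_map):
    # prefix_map: {original_prefix: diffusers_prefix}, filled in per model.
    new_state_dict = {}
    for key, tensor in original_state_dict.items():
        for old, new in prefix_map.items():
            if key.startswith(old):
                key = new + key[len(old):]
                break
        new_state_dict[key] = tensor
    return new_state_dict
```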

@@ -0,0 +1,267 @@
import os
Contributor

I would guess just put these in the training file to minimize how many files things are spread across. Also, huge points for getting rid of as many functions as you can (I did some of that in Colab).

@@ -0,0 +1,155 @@
import numpy as np
Contributor

Probably the core of a pipeline or example.

bglick13 (Contributor, Author)

Do you want a pipeline to be part of this PR, or something we add in a small follow-up PR?

self.mid_down2 = Downsample1D(mid_dim // 4, use_conv=True)
##
fc_dim = mid_dim // 4
self.final_block = nn.ModuleList(
natolambert (Contributor) commented Oct 18, 2022

Good idea to support a get_output_block function or if-else statements. I've mentioned this to you, @patrickvonplaten.

bglick13 (Contributor, Author)

I'm okay with changing the final_block to something cleaner - whatever y'all decide
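One cleaner option, sketched here, would be to mirror the existing get_down_block(...) factory pattern so the output block is chosen by a config string; the OutConv1DBlock and OutValueFunctionBlock names are assumptions for illustration, presumed defined alongside the other 1D blocks:

```python
def get_out_block(out_block_type, fc_dim, embed_dim, out_channels, act_fn):
    # Select the model's final block from a config string instead of
    # hard-coding it in UNet1DModel.__init__.
    if out_block_type == "OutConv1DBlock":
        return OutConv1DBlock(fc_dim, out_channels, act_fn)
    if out_block_type == "ValueFunction":
        return OutValueFunctionBlock(fc_dim, embed_dim)
    raise ValueError(f"Unsupported out_block_type: {out_block_type}")
```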

@@ -283,7 +283,13 @@ def step(
noise = torch.randn(
    model_output.size(), dtype=model_output.dtype, layout=model_output.layout, generator=generator
).to(model_output.device)
variance = (self._get_variance(t, predicted_variance=predicted_variance) ** 0.5) * noise
if self.variance_type == "fixed_small_log":
Contributor

Can we change the _get_variance function to accommodate this? I think fixed_small_log was added for this.
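Folding it into _get_variance might look roughly like this - a sketch assuming the clamped log-variance form from the original Diffuser code, with alpha_prod_t and alpha_prod_t_prev as already computed in that function:

```python
# Inside DDPMScheduler._get_variance (sketch):
variance = self.betas[t] * (1 - alpha_prod_t_prev) / (1 - alpha_prod_t)

if self.variance_type == "fixed_small":
    variance = torch.clamp(variance, min=1e-20)
elif self.variance_type == "fixed_small_log":
    # Log-space parameterization: return sigma = exp(0.5 * log(var))
    # so step() can multiply the noise by it directly.
    variance = torch.exp(0.5 * torch.log(torch.clamp(variance, min=1e-20)))
```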

natolambert (Contributor) left a comment

This is on the right track. Can we delete the unneeded files and clean up some comments? I was going to do more, but I didn't really know how to clone the remote PR branch (PR-on-PR stuff)...

examples/community/pipeline.py Outdated (resolved)
for i in tqdm.tqdm(self.scheduler.timesteps):
    # create batch of timesteps to pass into model
    timesteps = torch.full((batch_size,), i, device=self.unet.device, dtype=torch.long)
    # 3. call the sample function
Contributor

The numbering starts at 3; can we remove the comments?

    prev_x = self.unet(x.permute(0, 2, 1), timesteps).sample.permute(0, 2, 1)
    x = self.scheduler.step(prev_x, i, x, predict_epsilon=False)["prev_sample"]

    # 4. apply conditions to the trajectory
Contributor

fix comment

examples/community/pipeline.py Outdated (resolved)
# y = network(x, timesteps).sample
return x, y

def __call__(self, obs, batch_size=64, planning_horizon=32, n_guide_steps=2, scale=0.1):
Contributor

Can we get some comments and spacing here?
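Something like the following, say - the parameter descriptions are one reading of the code, not final docstrings:

```python
def __call__(self, obs, batch_size=64, planning_horizon=32, n_guide_steps=2, scale=0.1):
    # obs: current environment observation the plan is conditioned on
    # batch_size: number of candidate trajectories sampled in parallel
    # planning_horizon: length of each generated trajectory, in env steps
    # n_guide_steps: value-gradient ascent steps per denoising step
    # scale: step size of the value-function guidance
    ...
```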

examples/diffuser/run_diffuser.py Outdated (resolved)

scheduler = DDPMScheduler(num_train_timesteps=100, beta_schedule="squaredcos_cap_v2")

# 3 different pretrained models are available for this task.
Contributor

clean up comments
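For reference, loading any of the three would be the usual from_pretrained call; the repo id below is a placeholder, not a real checkpoint name:

```python
from diffusers import UNet1DModel

# Placeholder repo id - substitute one of the three converted checkpoints.
unet = UNet1DModel.from_pretrained("<hub-org>/<env-name>-unet")
```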

examples/diffuser/run_diffuser.py Outdated (resolved)
examples/diffuser/run_diffuser_value_guided.py Outdated (resolved)
examples/diffuser/run_diffuser_value_guided.py Outdated (resolved)
natolambert merged commit 48a7414 into huggingface:rl on Oct 21, 2022
natolambert pushed a commit that referenced this pull request Nov 14, 2022
* re-add RL model code

* match model forward api

* add register_to_config, pass training tests

* fix tests, update forward outputs

* remove unused code, some comments

* add to docs

* remove extra embedding code

* unify time embedding

* remove conv1d output sequential

* remove sequential from conv1dblock

* style and deleting duplicated code

* clean files

* remove unused variables

* clean variables

* add 1d resnet block structure for downsample

* rename as unet1d

* fix renaming

* rename files

* add get_block(...) api

* unify args for model1d like model2d

* minor cleaning

* fix docs

* improve 1d resnet blocks

* fix tests, remove permuts

* fix style

* add output activation

* rename flax blocks file

* Add Value Function and corresponding example script to Diffuser implementation (#884)

* valuefunction code

* start example scripts

* missing imports

* bug fixes and placeholder example script

* add value function scheduler

* load value function from hub and get best actions in example

* very close to working example

* larger batch size for planning

* more tests

* merge unet1d changes

* wandb for debugging, use newer models

* success!

* turns out we just need more diffusion steps

* run on modal

* merge and code cleanup

* use same api for rl model

* fix variance type

* wrong normalization function

* add tests

* style

* style and quality

* edits based on comments

* style and quality

* remove unused var

* hack unet1d into a value function

* add pipeline

* fix arg order

* add pipeline to core library

* community pipeline

* fix couple shape bugs

* style

* Apply suggestions from code review

Co-authored-by: Nathan Lambert <nathan@huggingface.co>

* update post merge of scripts

* add mdiblock / outblock architecture

* Pipeline cleanup (#947)

* valuefunction code

* start example scripts

* missing imports

* bug fixes and placeholder example script

* add value function scheduler

* load value function from hub and get best actions in example

* very close to working example

* larger batch size for planning

* more tests

* merge unet1d changes

* wandb for debugging, use newer models

* success!

* turns out we just need more diffusion steps

* run on modal

* merge and code cleanup

* use same api for rl model

* fix variance type

* wrong normalization function

* add tests

* style

* style and quality

* edits based on comments

* style and quality

* remove unused var

* hack unet1d into a value function

* add pipeline

* fix arg order

* add pipeline to core library

* community pipeline

* fix couple shape bugs

* style

* Apply suggestions from code review

* clean up comments

* convert older script to using pipeline and add readme

* rename scripts

* style, update tests

* delete unet rl model file

* remove imports in src

Co-authored-by: Nathan Lambert <nathan@huggingface.co>

* Update src/diffusers/models/unet_1d_blocks.py

* Update tests/test_models_unet.py

* RL Cleanup v2 (#965)

* valuefunction code

* start example scripts

* missing imports

* bug fixes and placeholder example script

* add value function scheduler

* load value function from hub and get best actions in example

* very close to working example

* larger batch size for planning

* more tests

* merge unet1d changes

* wandb for debugging, use newer models

* success!

* turns out we just need more diffusion steps

* run on modal

* merge and code cleanup

* use same api for rl model

* fix variance type

* wrong normalization function

* add tests

* style

* style and quality

* edits based on comments

* style and quality

* remove unused var

* hack unet1d into a value function

* add pipeline

* fix arg order

* add pipeline to core library

* community pipeline

* fix couple shape bugs

* style

* Apply suggestions from code review

* clean up comments

* convert older script to using pipeline and add readme

* rename scripts

* style, update tests

* delete unet rl model file

* remove imports in src

* add specific vf block and update tests

* style

* Update tests/test_models_unet.py

Co-authored-by: Nathan Lambert <nathan@huggingface.co>

* fix quality in tests

* fix quality style, split test file

* fix checks / tests

* make timesteps closer to main

* unify block API

* unify forward api

* delete lines in examples

* style

* examples style

* all tests pass

* make style

* make dance_diff test pass

* Refactoring RL PR (#1200)

* init file changes

* add import utils

* finish cleaning files, imports

* remove import flags

* clean examples

* fix imports, tests for merge

* update readmes

* hotfix for tests

* quality

* fix some tests

* change defaults

* more mps test fixes

* unet1d defaults

* do not default import experimental

* defaults for tests

* fix tests

* fix-copies

* fix

* changes per Patrik's comments (#1285)

* changes per Patrik's comments

* update conversion script

* fix renaming

* skip more mps tests

* last test fix

* Update examples/rl/README.md

Co-authored-by: Ben Glickenhaus <benglickenhaus@gmail.com>
yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023