[Dance Diffusion] Add dance diffusion #803

Merged: 49 commits, merged Oct 25, 2022

patrickvonplaten
Contributor

@patrickvonplaten patrickvonplaten commented Oct 11, 2022

cc @apolinario to monitor progress

Checkpoints are uploaded here: https://huggingface.co/harmonai

Maestro Pipeline can be tested with:

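from diffusers import DiffusionPipeline
import scipy.io.wavfile

# Load the Maestro checkpoint and move the pipeline to the GPU.
pipe = DiffusionPipeline.from_pretrained("harmonai/maestro-150k")
pipe = pipe.to("cuda")

# `audios` is a batch of clips with shape (batch, channels, samples).
audios = pipe(num_inference_steps=100, sample_length_in_s=4.0).audios

# scipy expects (samples, channels), so write each clip transposed.
for i, audio in enumerate(audios):
    scipy.io.wavfile.write(f"maestro_test_{i}.wav", pipe.unet.sample_rate, audio.transpose())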

It relies on the DanceDiffusionPipeline, IPNDMScheduler, and UNet1DModel classes.
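A minimal sketch of the save step, assuming the pipeline returns a NumPy array of shape (batch, channels, samples) with floats in [-1, 1]. The `write_wav` helper is hypothetical, the sine tone is a synthetic stand-in for `pipe(...).audios`, and the standard-library `wave` module is used here only to keep the sketch dependency-light:

```python
import io
import wave

import numpy as np


def write_wav(target, sample_rate, audio):
    """Write one clip of shape (channels, samples), floats in [-1, 1], as 16-bit PCM."""
    audio = np.asarray(audio, dtype=np.float32)
    # WAV frames interleave channels, so move samples to the first axis.
    frames = np.clip(audio.T, -1.0, 1.0)
    pcm = (frames * 32767.0).astype("<i2")
    with wave.open(target, "wb") as wav_file:
        wav_file.setnchannels(audio.shape[0])
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm.tobytes())


# Synthetic stand-in for `pipe(...).audios`: 2 clips, stereo, 0.25 s at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate // 4) / sample_rate
tone = 0.5 * np.sin(2 * np.pi * 440.0 * t)
audios = np.stack([np.stack([tone, tone])] * 2)  # (batch, channels, samples)

# Write every clip in the batch to its own in-memory WAV file.
buffers = []
for clip in audios:
    buffer = io.BytesIO()
    write_wav(buffer, sample_rate, clip)
    buffers.append(buffer)
```

Passing a file path instead of a `BytesIO` object to `write_wav` would write the clip to disk.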

TODO

  • Convert model weights and successfully port model
  • Add scheduler
  • Add pipeline


return self.main(input) + self.skip(input)


def get_down_block(down_block_type, c, c_prev):
Contributor Author

@natolambert - similar to unet_2d_blocks we'll have a new unet_1d_blocks.py file where you can define very customizable unet classes
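A rough sketch of what such a block factory could look like. The class names, the registry, and the `in_channels`/`out_channels` parameter names are all hypothetical and stand in for whatever ends up in unet_1d_blocks.py:

```python
class DownBlock1DSketch:
    """Hypothetical stand-in for a down-sampling block; it only records its channel sizes."""

    def __init__(self, in_channels, out_channels):
        self.in_channels = in_channels
        self.out_channels = out_channels


class AttnDownBlock1DSketch(DownBlock1DSketch):
    """Hypothetical variant that would additionally apply self-attention."""


# Registry mapping a config string to a block class (all names are illustrative).
DOWN_BLOCKS = {
    "DownBlock1D": DownBlock1DSketch,
    "AttnDownBlock1D": AttnDownBlock1DSketch,
}


def get_down_block(down_block_type, in_channels, out_channels):
    """Build the block named by `down_block_type`, failing loudly on unknown names."""
    try:
        block_cls = DOWN_BLOCKS[down_block_type]
    except KeyError:
        raise ValueError(f"{down_block_type} does not exist.") from None
    return block_cls(in_channels, out_channels)


# A model config can then describe the encoder as a list of strings.
block_types = ["DownBlock1D", "AttnDownBlock1D"]
channels = [32, 64, 128]
blocks = [
    get_down_block(name, c_in, c_out)
    for name, c_in, c_out in zip(block_types, channels[:-1], channels[1:])
]
```

Keeping the dispatch in one function means a new block type only needs a new class and one registry entry, while the model class itself stays unchanged.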

@natolambert
Contributor

This model is in concurrent development with #105.

@@ -70,8 +70,9 @@ def __init__(
         self.sample_size = sample_size

         # time
-        self.time_proj = GaussianFourierProjection(embedding_size=8)
-        del self.time_proj.W
+        self.time_proj = GaussianFourierProjection(
Contributor

Should we format this like in the 2d class?

Also, no embedding after projection?

if time_embedding_type == "fourier":
    self.time_proj = GaussianFourierProjection(embedding_size=block_out_channels[0], scale=16)
    timestep_input_dim = 2 * block_out_channels[0]
elif time_embedding_type == "positional":
    self.time_proj = Timesteps(block_out_channels[0], flip_sin_to_cos, freq_shift)
    timestep_input_dim = block_out_channels[0]
self.time_embedding = TimestepEmbedding(timestep_input_dim, time_embed_dim)
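To make the two values of `timestep_input_dim` concrete, here is a NumPy sketch of both projection styles. It is not the diffusers implementation: the function names, the `seed` argument, and `max_period` are assumptions, and options such as `flip_sin_to_cos` and `freq_shift` are left out. It only illustrates why the Fourier variant yields twice as many features as its `embedding_size`:

```python
import math

import numpy as np


def gaussian_fourier_features(timesteps, embedding_size, scale=16.0, seed=0):
    """Random Fourier features: fixed Gaussian frequencies, then sin and cos."""
    rng = np.random.default_rng(seed)
    weights = rng.standard_normal(embedding_size) * scale  # fixed, not trained
    angles = timesteps[:, None] * weights[None, :] * 2.0 * math.pi
    # sin and cos are concatenated, so the output width is 2 * embedding_size.
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)


def sinusoidal_features(timesteps, embedding_dim, max_period=10000.0):
    """Deterministic sinusoidal features with geometrically spaced frequencies."""
    half = embedding_dim // 2
    freqs = np.exp(-math.log(max_period) * np.arange(half) / half)
    angles = timesteps[:, None] * freqs[None, :]
    # Each half holds embedding_dim // 2 features, so the width is embedding_dim.
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=-1)


timesteps = np.array([0.1, 0.5, 0.9])
channels = 32
fourier = gaussian_fourier_features(timesteps, embedding_size=channels)
positional = sinusoidal_features(timesteps, embedding_dim=channels)
```

With `channels = 32`, `fourier` has 64 features per timestep and `positional` has 32, which matches the `2 * block_out_channels[0]` versus `block_out_channels[0]` distinction in the snippet above.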

Contributor Author

I think we should do this as soon as we support more than GaussianFourier; before that it's probably not necessary.

Contributor

The RL unet1d is different, was prepping for that.

@@ -132,24 +133,24 @@ def forward(
             otherwise a `tuple`. When returning a tuple, the first element is the sample tensor.
         """
         # 1. time
-        timestep_embed = self.time_proj(timestep[:, None])[..., None].repeat([1, 1, sample.shape[2]])
+        timestep_embed = self.time_proj(timestep)[..., None]
+        timestep_embed = timestep_embed.repeat([1, 1, sample.shape[2]])

         sample = torch.cat([sample, timestep_embed], dim=1)
Contributor

This logic should maybe go in the block rather than the forward?
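For reference, the shape bookkeeping of the time-conditioning step in the hunk above can be sketched in NumPy. The sizes are made up, the `arange` array is a stand-in for the output of `self.time_proj(timestep)`, and `np.tile` plays the role of the tensor `repeat` call:

```python
import numpy as np

batch, channels, length = 2, 2, 8
embed_channels = 16  # assumed width of the time projection output

sample = np.zeros((batch, channels, length), dtype=np.float32)
# Stand-in for `self.time_proj(timestep)`: one embedding vector per batch item.
timestep_embed = np.arange(batch * embed_channels, dtype=np.float32).reshape(batch, embed_channels)

# Add a trailing length axis, then copy the embedding to every position.
timestep_embed = timestep_embed[..., None]  # (batch, embed_channels, 1)
timestep_embed = np.tile(timestep_embed, (1, 1, sample.shape[2]))  # (batch, embed_channels, length)

# Concatenate along the channel axis so the first block sees signal and time features together.
conditioned = np.concatenate([sample, timestep_embed], axis=1)
```

The first block therefore receives `channels + embed_channels` input channels, which is one reason the question of where this logic lives (model forward or block) affects how the block's input width is configured.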


class UnetModel1DTests(unittest.TestCase):
    @slow
    def test_unet_1d_maestro(self):
Contributor Author

@natolambert this test needs to pass
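For context, a `@slow` marker of this kind is usually a thin wrapper around `unittest.skipUnless`. The sketch below is an illustration rather than the project's own helper: the `RUN_SLOW` flag name, the accepted truthy values, and the test class are assumptions.

```python
import os
import unittest


def slow(test_case):
    """Skip the decorated test unless RUN_SLOW is set to a truthy value (assumed flag name)."""
    run_slow = os.environ.get("RUN_SLOW", "0").lower() in {"1", "true", "yes"}
    return unittest.skipUnless(run_slow, "test is slow")(test_case)


class UNet1DSketchTests(unittest.TestCase):
    @slow
    def test_expensive_checkpoint(self):
        # A real slow test would load a checkpoint and compare output slices here.
        self.assertTrue(True)

    def test_cheap_shape_check(self):
        self.assertEqual(len([0.0] * 4), 4)


def run_suite():
    """Run the sketch tests in-process and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(UNet1DSketchTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Because the flag is read when the decorator is applied, it has to be set in the environment before the test module is imported.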

Contributor

@patil-suraj patil-suraj left a comment

Thanks a lot for adding this model! Looks very good, just left some comments mostly related to docs.

@patrickvonplaten patrickvonplaten merged commit 88fa6b7 into main Oct 25, 2022
@patrickvonplaten patrickvonplaten deleted the add_dance_diffusion branch October 25, 2022 16:39
@patrickvonplaten
Contributor Author

Ran the whole slow test suite and everything passed.

yoonseokjin pushed a commit to yoonseokjin/diffusers that referenced this pull request Dec 25, 2023
* start

* add more logic

* Update src/diffusers/models/unet_2d_condition_flax.py

* match weights

* up

* make model work

* making class more general, fixing missed file rename

* small fix

* make new conversion work

* up

* finalize conversion

* up

* first batch of variable renamings

* remove c and c_prev var names

* add mid and out block structure

* add pipeline

* up

* finish conversion

* finish

* upload

* more fixes

* Apply suggestions from code review

* add attr

* up

* uP

* up

* finish tests

* finish

* uP

* finish

* fix test

* up

* naming consistency in tests

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>

* remove hardcoded 16

* Remove bogus

* fix some stuff

* finish

* improve logging

* docs

* upload

Co-authored-by: Nathan Lambert <nol@berkeley.edu>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Nathan Lambert <nathan@huggingface.co>
Co-authored-by: Anton Lozhkov <anton@huggingface.co>
7 participants