Recommended number of training steps to achieve example results #63

Open

moih opened this issue Oct 5, 2019 · 2 comments

moih commented Oct 5, 2019

Hi!

I'm really excited about this implementation of generative audio and have just started training on my gaming laptop with a GTX 1060.

My focus is to train several WaveGANs on datasets of folk music from Africa and Latin America.

I'm also really interested in applications for live music: I'll be trying to generate sounds offline (and maybe in real time?) for a spatial piece, and I'd like to know whether I should use a cloud service to generate my different WaveGAN checkpoints.

Roughly at what training step did the example models start producing the posted results?

Thanks, and looking forward to seeing how this develops!

Update: I'm at roughly step 20k and the results are coming through quite nicely! Will post some audio soon.

chrisdonahue (Owner) commented
Thanks for your interest and sorry for the delay! I would love to hear some sound examples when you have the time.

You can probably run generation pretty quickly on a regular laptop; you shouldn't need a cloud service unless your application requires generating a ton of content. You can get an idea of how fast the model would run on your laptop by going to this web demo and pressing "Change" on one of the sounds: https://chrisdonahue.com/wavegan/

One thing you can do is take the trained WaveGAN and just generate a ton of sounds from it offline (e.g. 100k). Then you can just take a random sample from that set in your real-time application.
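For reference, a rough sketch of that offline pass. This assumes the infer.meta graph and the z:0 / G_z:0 tensor names from the Colab notebook, the default 100-dim latent space, and 16 kHz mono output; adjust the paths for your own run:

```python
# Sketch: render a large pool of sounds offline from a trained WaveGAN
# checkpoint; a real-time patch can then pick files from the pool at random.
import numpy as np
import tensorflow as tf
from scipy.io import wavfile

tf.reset_default_graph()
saver = tf.train.import_meta_graph('infer.meta')  # inference graph saved during training
graph = tf.get_default_graph()

with tf.Session() as sess:
    saver.restore(sess, 'model.ckpt')  # your checkpoint prefix
    z = graph.get_tensor_by_name('z:0')      # latent input, shape [None, 100]
    G_z = graph.get_tensor_by_name('G_z:0')  # audio output, [None, 16384, 1]

    n_total, batch_size = 100000, 64
    for start in range(0, n_total, batch_size):
        n = min(batch_size, n_total - start)
        _z = np.random.uniform(-1., 1., (n, 100)).astype(np.float32)
        audio = sess.run(G_z, {z: _z})
        for i in range(n):
            # float32 in [-1, 1]; scipy writes this as an IEEE-float WAV
            wavfile.write('pool_{:06d}.wav'.format(start + i),
                          16000, audio[i, :, 0])
```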

We trained the models for the posted examples to between 100k and 200k steps, but yes, we observed that even after only 10-20k steps the model was producing reasonable results.

Best of luck and let me know if you have more questions!

moih (Author) commented Oct 29, 2019

Thanks for your reply, @chrisdonahue.

Here are some of the results from my WaveGAN training experiment.

Here is the dataset, for reference: https://www.youtube.com/watch?v=wXV39pybgJU

And here are my results (stopped at checkpoint 30k or so): https://www.dropbox.com/sh/hp4jk6d7gzuy2qz/AAAfgWpGnuh30LI7EwEAfa8ya?dl=0

IMO they are of pretty good quality, and useful as samples for further processing in an electronic music production workflow.

Currently I'm experimenting with mixing heterogeneous datasets, i.e. using very different .wavs as training data to see whether the model actually "mixes" them in its output as it learns from them.

Another issue I'm having is loading my own checkpoints with the example code published in the Jupyter notebook.

  • Is it possible to implement it as a stand-alone .py script that can batch-output, for example, 100 .wavs from random or specific latent space vectors?

Currently I am generating each sample one by one with a generator script someone else posted in another issue thread, which is kind of tedious; what I have in mind is something like the sketch below.
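A minimal stand-alone sketch of that (again assuming the Colab notebook's graph and tensor names; z_batch.npy is a hypothetical file used here to pin down a specific set of vectors):

```python
# Sketch: stand-alone batch generation of 100 .wavs from random or
# specific (saved) latent vectors.
import numpy as np
import tensorflow as tf
from scipy.io import wavfile

N, Z_DIM = 100, 100  # Z_DIM: the default wavegan_latent_dim

tf.reset_default_graph()
saver = tf.train.import_meta_graph('infer.meta')
graph = tf.get_default_graph()

with tf.Session() as sess:
    saver.restore(sess, 'model.ckpt')
    # Draw random vectors and save them so this exact batch can be
    # regenerated later; swap in np.load('z_batch.npy') to render a
    # specific set of vectors instead.
    z_batch = np.random.uniform(-1., 1., (N, Z_DIM)).astype(np.float32)
    np.save('z_batch.npy', z_batch)
    audio = sess.run(graph.get_tensor_by_name('G_z:0'),
                     {graph.get_tensor_by_name('z:0'): z_batch})
    for i in range(N):
        wavfile.write('gen_{:03d}.wav'.format(i), 16000, audio[i, :, 0])
```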

Thanks again!

UPDATE: I managed to make a .py script for interpolating between latent space vectors, as in the Google Colab notebook. Here is one interpolation using the checkpoints generated above: https://soundcloud.com/h-e-x-o-r-c-i-s-m-o-s/espacio_latente?fbclid=IwAR239cENr7yFQQq7Xi8CaOar8_H1k2_yHi7pOiwSQ5QYrM_iGrXdVwMyo-k
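In case it helps anyone, the core of such a script is roughly this sketch (a plain linear interpolation between two random latent vectors, with the same assumed graph and tensor names as above):

```python
# Sketch: walk a straight line between two latent vectors, render each
# step, and concatenate the steps into a single morphing .wav.
import numpy as np
import tensorflow as tf
from scipy.io import wavfile

tf.reset_default_graph()
saver = tf.train.import_meta_graph('infer.meta')
graph = tf.get_default_graph()

with tf.Session() as sess:
    saver.restore(sess, 'model.ckpt')
    z_a = np.random.uniform(-1., 1., 100)
    z_b = np.random.uniform(-1., 1., 100)
    alphas = np.linspace(0., 1., 16)  # 16 interpolation steps
    z = np.stack([(1. - a) * z_a + a * z_b
                  for a in alphas]).astype(np.float32)
    audio = sess.run(graph.get_tensor_by_name('G_z:0'),
                     {graph.get_tensor_by_name('z:0'): z})  # [16, 16384, 1]
    wavfile.write('interp.wav', 16000, audio[:, :, 0].reshape(-1))
```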
