One individual cannot handle the continued development of this repo #303

34j · 2023-04-12T08:00:53Z

Maintainer

Lack of an initial model: QuickVC cannot be trained at all on an A4000 or similar; I tried half-precision training on a T4, but it was nowhere near fast enough; I have no money to rent a better GPU. (Looking for initial model donors.)
TPU support was too hard and IPU support is doomed; there are no resources to experiment with reducing the size of the model.
New models (such as 5.0 or RVC) are released day by day and implementation cannot keep up. I am looking for contributors, but I'm not even sure what the point is of implementing them after the fact.
Real-time support is no longer novel, since many repositories have already referenced our repo code and implemented it.
Is there anyone who wants to be a maintainer of this repository in this terrible situation?

sbersier · 2023-04-12T09:51:44Z

Regarding the training:
Have you considered lowering the sampling rate from 44100 to 22050? Wouldn't it help?
To be perfectly honest, I don't fully understand this choice (i.e. 44k), since the human voice extends mostly up to 10 kHz. So there is not much to gain from having a sampling rate above 22k, is there? In fact, I think that 16 kHz would already be quite good.

By doing this you would be able to fit twice as much data in VRAM.
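As a rough illustration of that arithmetic, here is a minimal sketch. The 10-second clip length is a made-up number, and it only counts raw waveform samples; how much VRAM is actually freed depends on the model and is not measured here.

```python
# Raw sample counts and Nyquist limits at different sampling rates.
# CLIP_SECONDS is a hypothetical value chosen only for illustration.
CLIP_SECONDS = 10


def samples_per_clip(sampling_rate: int, seconds: float = CLIP_SECONDS) -> int:
    """Number of raw audio samples in a clip of the given length."""
    return int(sampling_rate * seconds)


def nyquist_hz(sampling_rate: int) -> float:
    """Highest frequency that can be represented at this sampling rate."""
    return sampling_rate / 2


for sr in (44100, 22050, 16000):
    print(
        f"{sr} Hz: {samples_per_clip(sr)} samples per {CLIP_SECONDS}s clip, "
        f"content up to {nyquist_hz(sr):.0f} Hz"
    )
```

Halving the sampling rate halves the number of samples for the same duration, which is where the "twice as much data" estimate comes from.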

In fact, I'm wondering if the result couldn't be even better, since sampling frequency regions that don't contain much useful information might be counterproductive. Also, setting mel_fmax to 11025 Hz would increase the spectral resolution for the same number of mel channels. You might even be able to decrease n_mel_channels (if it helps training).
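To make the resolution argument concrete, here is a small sketch that compares how much of the mel scale each channel covers. It assumes the HTK-style mel formula and 80 mel channels; the repo's actual filterbank may use a different mel variant or channel count, so treat the numbers as indicative only.

```python
import math


def hz_to_mel(hz: float) -> float:
    """HTK-style mel scale (an assumption; other mel variants exist)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_per_channel(mel_fmax: float, n_mel_channels: int, mel_fmin: float = 0.0) -> float:
    """Width of the mel range covered by each channel (smaller = finer)."""
    return (hz_to_mel(mel_fmax) - hz_to_mel(mel_fmin)) / n_mel_channels


def channels_for_same_spacing(mel_fmax: float, target_spacing: float) -> int:
    """Channels needed up to mel_fmax to keep a given per-channel spacing."""
    return math.ceil(hz_to_mel(mel_fmax) / target_spacing)


N_MELS = 80  # assumed channel count, for illustration only
wide = mel_per_channel(22050, N_MELS)
narrow = mel_per_channel(11025, N_MELS)
print(f"mel_fmax=22050: {wide:.1f} mel per channel")
print(f"mel_fmax=11025: {narrow:.1f} mel per channel")
print("channels at 11025 Hz for the same spacing:", channels_for_same_spacing(11025, wide))
```

With the same channel count, the lower mel_fmax gives each channel a narrower band; conversely, fewer channels are enough to keep the original spacing.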

Could collaborative training be a solution?

Like in: https://huggingface.co/blog/collaborative-training

In fact, as I understand it, it is based on hivemind:
https://learning-at-home.github.io/

If you can put in place such a solution, I would be happy to give GPU time.
In this example, they trained a model for the Bengali language with 40 volunteers (each one having different hardware and bandwidth). I highly recommend you have a look at it if you haven't already.

Anyway, I hope your call for help will be heard.

Best regards

2 replies

ADR3-N Apr 30, 2023

I would be very interested to see results if you are able to get something running on 22k, since I have a very slow machine. If you do, or anyone else does, please @ me on twitter, OGAdrean. I really only need 22k files, and this fork is working great for me, but I imagine a slightly lower sample rate would not only accelerate training but allow me to increase batch size.

GarrettConway May 2, 2023
Maintainer

Just some technical notes for anyone interested:
You should be able to adjust the sampling rate in the config and train on it. Admittedly, I haven't devoted any time to personally tinkering with this part recently, but I don't believe any changes would have broken that functionality. (If someone runs into any problems, feel free to @ me and I can fix them). "svc pre-resample --sampling-rate" will also let you set that in the preprocessing.
Sidenote: The contentvec model is always fed audio sampled at 16k since that's what their pretrained model uses - the sampling rate for the svc model mostly affects the vocoder and the mel loss.
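For anyone who wants to try it, here is a minimal sketch of the config change. The key names (`data.sampling_rate`, `data.mel_fmax`) and the example values are assumptions about the config layout, so check them against your own config file; the helper `with_sampling_rate` is hypothetical and not part of the repo. Other settings (hop length, window sizes) are left untouched here and may also need adjusting.

```python
import copy
import json


def with_sampling_rate(config: dict, sampling_rate: int) -> dict:
    """Return a copy of the config with the sampling rate changed.

    Key names are assumed, not taken from the repo; adjust to your config.
    """
    new_config = copy.deepcopy(config)
    data = new_config.setdefault("data", {})
    data["sampling_rate"] = sampling_rate
    # Keep mel_fmax at the Nyquist frequency of the new sampling rate.
    data["mel_fmax"] = sampling_rate / 2
    return new_config


# Hypothetical excerpt of a config, for illustration only.
example = {"data": {"sampling_rate": 44100, "mel_fmax": 22050, "n_mel_channels": 80}}
updated = with_sampling_rate(example, 22050)
print(json.dumps(updated, indent=2))
```

The same value would then be passed to "svc pre-resample --sampling-rate" so the preprocessed audio matches the config.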

Cheers,
Garrett

Z3Coder · 2023-05-01T19:17:44Z

How long would it take to train initial models on an RTX 3090?
Is a dataset available for the models, and if yes, how big is it?
Depending on the answers, maybe I could train the models, or at least one.
Thank you for your work! @34j

0 replies

SamueleLorefice · 2023-05-06T01:32:40Z

What is needed for QuickVC training? Provided we have a dataset available I could go on and do initial training on my 4090 (or rent multiple 4090s for some hours).

I'd have to look deeper into this to try and maintain the repo, but lending a hand in my (albeit limited) free time is something I could do. Certainly not alone, but I'm up for discussion. And it looks like I'm not alone here either.

1 reply

34j May 6, 2023
Maintainer Author

Migrate to #579

Onako2 · 2024-01-27T09:58:42Z

Lack of an initial model: cannot train QuickVC at all with an A4000 or so; tried half-precision training on T4, but not fast enough at all; No money to rent a better GPU; (Looking for initial model donors.)

TPU support was too hard and IPU support is doomed; There are no resources to experiment with reducing the size of the model

New models (such as 5.0 or RVC) are released day by day and implementation is not in time. I am looking for contributors, but I'm not even sure what the point is to implement them after the fact.

Real-time support is no longer novel since many repositories have already referenced our repo code and implemented it.

Is there anyone who wants to be a maintainer of this repository in this terrible situation?

I would like to help but I don't know much programming. I am still in school and will stay for a long time and will complete school in 6 years :(

0 replies

This comment was marked as spam.
