How to train with custom dataset? #21
@primepake Thanks. But I was looking for the code to run for each step, with the directory structure and other related info required.
You should process your dataset carefully; it will affect your training.
Yeah, that's why I'm looking for a step-by-step guide from you. It would really help me. Can you provide a doc or README or something, so that anybody can just follow the steps and start training? I have already done training using the 96x96 Wav2Lip repo, but I'm looking for higher-resolution results.
Hi @primepake, thanks for your comments.
@donggeon I think he meant the file "color_syncnet_train.py". If you have figured out the previous steps, can you tell me what you did exactly?
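For context, launching the expert discriminator training with that script typically looks like the sketch below. This assumes this repo keeps the same CLI as the original Wav2Lip `color_syncnet_train.py`; the paths are placeholders, not from this repo.

```python
# Sketch only: assumes this repo keeps the original Wav2Lip CLI for
# color_syncnet_train.py; both paths below are placeholders.
import subprocess

subprocess.run([
    "python", "color_syncnet_train.py",
    "--data_root", "preprocessed/",              # root of the preprocessed clips
    "--checkpoint_dir", "checkpoints/syncnet/",  # where checkpoints get saved
], check=True)
```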
https://github.com/joonson/syncnet_python
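For anyone landing here: the usual workflow in that repo is `run_pipeline.py` (face detection, tracking, cropping) followed by `run_syncnet.py` (prints the AV offset, min distance, and confidence for the cropped track). A minimal sketch of driving it from Python, with placeholder paths and reference name:

```python
# Minimal sketch of driving joonson/syncnet_python from Python;
# the paths and reference name are placeholders.
import subprocess

video = "/data/clips/clip_0001.mp4"
ref = "clip_0001"            # per-clip reference name used by syncnet_python
out = "/data/syncnet_work"   # working directory for crops and results

# 1) detect, track and crop the face
subprocess.run(["python", "run_pipeline.py",
                "--videofile", video, "--reference", ref, "--data_dir", out],
               check=True)

# 2) score the cropped track: prints AV offset, min dist and confidence
subprocess.run(["python", "run_syncnet.py",
                "--videofile", video, "--reference", ref, "--data_dir", out],
               check=True)
```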
Thanks for your answer. I was wondering if you could give me some more detailed instructions.
@primepake Can you please share detailed instructions for training?
I will release the code; it's a ton of code.
Thanks @primepake, it'd be a great help.
Thanks 👍🏻
Do you also have your checkpoint from the AVSpeech runs, prior to running on your private dataset? I'm interested in comparing how it turned out on your end vs. training via the instructions you provide.
I will make the AVSpeech pretrained checkpoint public.
Great, thank you. Can you also leave an estimate of the GPU hardware and compute time it took for you to do the checkpoint training and fine-tuning?
I used 10 A6000 GPUs with nearly 200 GB of GPU memory.
The trial and error took just a day.
OK, great. For the public AVSpeech pretrained checkpoint, is it being put in the repo, as a link in the README, or just here in the issues?
For some reasons, I will make it public another day.
@primepake When can you upload detailed training instructions, with code for each preprocessing step?
Can you please give detailed instructions for https://github.com/joonson/syncnet_python? How do I use this repo?
Is this the correct order?
yes
Any update on the AVSpeech-only checkpoint?
Hi, I updated my preprocessing steps; sorry about the missing ordering.
How many videos did you need for the fine-tuning step, per the method you wrote?
Can you give more details about this 4th step (split videos to less than 5 s)? Is this step included in clean_data.py? And in the fifth step (using syncnet_python to filter the dataset to the range [-3, 3]), should I only filter the dataset based on the offset given by syncnet_python, or should I also correct the synchronization?
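If the [-3, 3] range refers to the AV offset reported by `run_syncnet.py`, a filtering pass could look like the illustrative sketch below; the `results` list is a placeholder you would populate yourself from syncnet_python's output.

```python
# Illustrative sketch: keep only clips whose AV offset lies in [-3, 3].
# `results` is a placeholder; populate it from syncnet_python's output.
results = [
    ("/data/clips/clip_0001.mp4", 1),    # (path, AV offset in frames)
    ("/data/clips/clip_0002.mp4", -7),
]

kept = [path for path, offset in results if -3 <= offset <= 3]
print(f"kept {len(kept)} of {len(results)} clips")
```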
@primepake What do you mean by "split video less than 5s"? Does it mean splitting longer videos into smaller videos with a duration of less than 5 seconds?
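On the splitting question: one common way to cut long videos into chunks of at most 5 s is ffmpeg's segment muxer. A sketch with placeholder paths (re-encoding keeps the cut points accurate, at the cost of speed):

```python
# Sketch: split one long video into <= 5 s chunks with ffmpeg's segment
# muxer; paths are placeholders. Re-encoding keeps cut points accurate.
import subprocess

src = "/data/raw/video_0001.mp4"
subprocess.run(["ffmpeg", "-i", src,
                "-f", "segment", "-segment_time", "5",
                "-reset_timestamps", "1",
                "-c:v", "libx264", "-c:a", "aac",
                "/data/clips/video_0001_%03d.mp4"],
               check=True)
```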
The lip-sync expert has many problems; you need to find them. As the author mentioned, it doesn't care about similarity between frames; you need to read the paper to understand more. It does not reflect real-world usage, as discussed before.
@primepake I want to buy your model. Can you please share details at: sylvie.nexus11@gmail.com
Just wondering if I could get this estimate for the fine-tuning after AVSpeech (videos and/or minutes of footage). Also, any updates on the AVSpeech checkpoint?
When I run syncnet_python I get the error below:
Does anybody know how to resolve this?
@primepake Thanks for the note about fine-tuning. Do you have any updates on the above? Also, I was wondering if fine-tuning generally needs to be done on a per-person basis, or was your proprietary data just a lot of people, all as one fine-tuned model?
@primepake I guess the issues are not clear yet; why did you close them?
This is a problem in your code; you have to figure it out yourself. Just take a screenshot and leave it here so we can solve it. Thank you.
I am using your exact code, haven't changed it.
No, I did not change it. I have kept it as it is. The only thing I tried changing is
You need to change args.img_size = 288 in inference.py.
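For reference, a sketch of where that change lands, assuming this repo keeps the structure of the original Wav2Lip inference.py (which hard-codes the image size right after argument parsing rather than exposing it as a flag):

```python
# Sketch of the relevant lines in inference.py, assuming the same structure
# as the original Wav2Lip script (img_size hard-coded after parsing).
import argparse

parser = argparse.ArgumentParser()
# ... the other inference flags go here ...
args = parser.parse_args()

args.img_size = 288  # was 96 in the original 96x96 Wav2Lip repo
```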
Thanks for your nice work. I want to ask why you "split videos to less than 5 s". What effect does it have on the results? I split videos to a maximum of 20 s; is that OK?
To understand more you should read the paper, but if a video is too long it can contain duplicated sounds, so with high probability a positive pair and a negative pair can end up being the same.
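To make that duplicate-sound point concrete, here is a simplified, illustrative sketch of how a SyncNet-style expert samples pairs (loosely modeled on Wav2Lip's training code; the names are not from this repo):

```python
# Simplified, illustrative sampling of positive/negative pairs for a
# SyncNet-style expert; loosely modeled on Wav2Lip's training code.
import random

def sample_pair(num_frames, window=5):
    """Return (frame_start, audio_start, label)."""
    assert num_frames > window, "clip must be longer than one window"
    start = random.randint(0, num_frames - window)
    if random.random() < 0.5:
        return start, start, 1.0               # positive: aligned audio
    wrong = random.randint(0, num_frames - window)
    while wrong == start:
        wrong = random.randint(0, num_frames - window)
    return start, wrong, 0.0                   # negative: misaligned audio

# In a long clip with repeated phrases, the misaligned window can carry the
# same sound as the aligned one, so the "negative" is not really negative;
# short (< 5 s) clips make such collisions much less likely.
```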
Thanks
How long did you train the expert syncnet and wav2lip using AVSpeech?
Hi, why do we need to split videos to less than 5 s to train the syncnet? What if I train with longer video clips of about 1 min?
Hello, I would like to know whether the "filter dataset in range [-3, 3]" you mentioned here refers to the offset, conf, or dist in the syncnet_python project. Is my understanding correct?
Hi, I am quite new to this. I am looking for a step-by-step guide to train on a custom dataset, or to train on the AVSpeech dataset and fine-tune for other videos. The steps could be:
I think such a guide will help a lot of people not get confused.
Thank you.