Hi,
I just wanted to say that I really appreciate the great work on this project! I’m curious if the training code will be open-sourced. If so, do you have an estimated timeline for its release?
Thanks!
Hi, sorry for the late reply; I've been really busy lately :(
Regarding your previous question, fine-tuning has to be done per scene (with a relevant video). As mentioned in the appendix of the paper, we later observed that initializing from the base CelebV-Text-trained model yielded results similar to those initialized from SINTEL & FlyingChairs. The fine-tuning approach also requires 1) careful data selection & preprocessing and 2) additional training. So based on our empirical results, the generalizability of the base models is usually more useful than the fine-tuning approach, which depends heavily on how the data is selected.
While we do plan to release the training code for reference, please understand that it could take a while (December or later), as we are working on other projects and are really busy these days :(
About the second question: yes, you can use the Pix2Pix codebase. We adapted it to run in the HF Accelerate environment with larger batches and multi-GPU support, but the Pix2Pix codebase is a good starting point.