Training concept Issue: Use of Repetition for Short Motion Sequences #31
Comments
Hi, thank you for your question. Yes, simply repeating the motion data may be one of the reasons for unsmooth motions. In the latest version of the model we removed data shorter than 4 s, which means the repeating strategy is no longer used during training. As for the two alternatives you mentioned, we have not tried them yet; feel free to experiment with them. Looking forward to your feedback.
Another strategy worth trying is to smooth the motions generated by LivePortrait during data preparation, inspired by this issue: KwaiVGI/LivePortrait#439
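For concreteness, a minimal sketch of such a smoothing step, assuming the per-frame motion coefficients extracted by LivePortrait are stored as a (T, C) float array; the Savitzky-Golay filter and its window settings are illustrative choices, not the exact method from that issue:

```python
import numpy as np
from scipy.signal import savgol_filter

def smooth_motion(motion: np.ndarray, window: int = 9, polyorder: int = 2) -> np.ndarray:
    """Smooth per-frame motion coefficients along the time axis.

    motion: (T, C) array of motion parameters, one row per frame.
    window/polyorder are illustrative defaults, not values from the repo.
    """
    if motion.shape[0] < window:
        return motion  # clip too short to filter; leave unchanged
    return savgol_filter(motion, window_length=window, polyorder=polyorder, axis=0)
```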
@xuyangcao Thanks for the reply. While testing a replacement for the repeated data, I switched to a new data layout: instead of cropping 200 frames, I use a sliding-window approach, since the videos I am using are quite long. I also add zero padding at the end of each sequence to make its length a multiple of 100 frames, similar to what we do at inference. With this setup the validation loss is converging. The expression smoothness loss, however, is rising, even though its value stays in the 1e-6 range, and the expression velocity loss shows a zig-zag pattern while still converging. Thank you
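For reference, a minimal sketch of this sliding-window preparation, assuming the motion data is a (T, C) numpy array; the 200-frame window and multiple-of-100 padding follow the comment above, while the function name and stride value are assumptions:

```python
import numpy as np

def make_windows(motion: np.ndarray, window_size: int = 200,
                 stride: int = 100, pad_multiple: int = 100) -> list[np.ndarray]:
    """Split a long (T, C) motion sequence into overlapping windows.

    The tail is zero-padded so the total length becomes a multiple of
    `pad_multiple` frames, mirroring the padding used at inference.
    """
    T, C = motion.shape
    padded_len = int(np.ceil(T / pad_multiple)) * pad_multiple
    padded = np.zeros((padded_len, C), dtype=motion.dtype)
    padded[:T] = motion
    return [padded[s:s + window_size]
            for s in range(0, padded_len - window_size + 1, stride)]
```

Setting window_size=50 and stride=25 would give the VASA-style augmentation mentioned below.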
In the VASA paper they use 50-frame windows with a stride of 25 to augment the data. Did removing the alignment mask make any difference for training?
Thank you again for the training script; I just have some doubts regarding the training approach.
In the current implementation, short motion sequences are handled by repeating the motion data to match the required length of 2 * n_motions. While this ensures a uniform batch size, it introduces potential issues with motion continuity and smoothness. Specifically, repeating the same motion clip creates a discontinuity at the seam between repetitions, where the first frame of the repeated clip may differ substantially from the last frame of the original clip.
Potential Issue:
When the repeated frames are processed during training, the model might struggle to maintain smooth transitions, resulting in artifacts or jitter in the generated motion. This discontinuity could negatively impact the model's ability to learn realistic and smooth motion sequences.
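To make the seam concrete, a minimal sketch of the repetition strategy described above, assuming a (T, C) motion array and a target length of 2 * n_motions; the function name is illustrative:

```python
import numpy as np

def repeat_to_length(motion: np.ndarray, target_len: int) -> np.ndarray:
    """Repeat a short (T, C) clip until it spans target_len frames.

    The discontinuity appears at every multiple of T: frame T-1 (end of
    the clip) is immediately followed by frame 0 (start of the clip),
    which may differ substantially in pose and expression.
    """
    T = motion.shape[0]
    reps = int(np.ceil(target_len / T))
    return np.tile(motion, (reps, 1))[:target_len]
```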
Instead of repeating the motion data, would one of the following approaches be better?
Zero Padding: pad the sequence with zeros and mask the padded frames out of the loss.
Neutral Source Motion Padding: use a predefined neutral motion state (e.g., a rest pose) for padding, together with an indicator that marks these frames as non-informative (see the sketch below). This would preserve the continuity of the original motion data and prevent the model from learning unrealistic transitions.
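A minimal sketch of this padding alternative, assuming a (T, C) motion array and a neutral pose vector; the mask convention (1 = real frame, 0 = padding) is an assumption, not the repo's actual API. Zero padding is the special case where `neutral` is the zero vector:

```python
import numpy as np

def pad_with_neutral(motion: np.ndarray, target_len: int,
                     neutral: np.ndarray) -> tuple[np.ndarray, np.ndarray]:
    """Pad a short (T, C) clip to target_len frames with a neutral pose.

    Returns the padded motion and a (target_len,) mask marking real
    frames with 1 and padded (non-informative) frames with 0, so the
    training losses can be masked out over the padding.
    """
    T, _ = motion.shape
    padded = np.repeat(neutral[None, :].astype(motion.dtype), target_len, axis=0)
    padded[:T] = motion
    mask = np.zeros(target_len, dtype=np.float32)
    mask[:T] = 1.0
    return padded, mask
```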
Questions:
What was the rationale behind using repetition instead of padding?
Have you tested zero padding or neutral-source padding and decided against them?
Looking forward to your thoughts on this!