[VITS] Fix init test #25945
Conversation
```diff
@@ -135,7 +147,7 @@ def create_and_check_model_forward(self, config, inputs_dict):
         attention_mask = inputs_dict["attention_mask"]

         result = model(input_ids, attention_mask=attention_mask)
-        self.parent.assertEqual(result.waveform.shape, (self.batch_size, 11008))
+        self.parent.assertEqual((self.batch_size, 624), result.waveform.shape)
```
Output dims change after modifying the HiFi GAN vocoder
The documentation is not available anymore as the PR was closed or merged.
Works for me.
I ran it 100 times and got 8 failures.
Maybe add something like:

`@is_flaky(max_attempts=3, description="...(some comment)...")`
Hi @sanchit-gandhi, I'm not sure how accumulation of numerical errors could appear during initialization; could you expand on this a bit? Also, would it be possible that the error comes from the
Thanks for fixing!
The failing module is actually a vanilla conv1d layer (rather than a conv transpose).
I haven't looked too deeply into it, but I presumed the error was occurring due to using the Kaiming-normal initialiser for the conv layers.
Feel free to dive into it more if you want to find the root cause! But based on the values we're getting, it just looks like an instance of flakiness (the mean weights are 1.03 instead of 1.0).
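A mean of 1.03 instead of 1.0 is consistent with plain sampling noise rather than a bug: Kaiming-normal weights have std `sqrt(2 / fan_in)`, and any statistic computed over a finite layer fluctuates around its target by roughly `1 / sqrt(2n)` in relative terms. A sketch with numpy (the layer size is hypothetical, not VITS's actual shapes):

```python
import numpy as np

# Kaiming-normal init for a conv layer: std = sqrt(2 / fan_in).
# Over a finite number of weights the empirical std fluctuates around
# the target, so a tight equality check on init statistics is flaky.
fan_in = 256
n_weights = 256 * 7          # hypothetical conv1d: in_channels * kernel_size
target_std = np.sqrt(2.0 / fan_in)

rng = np.random.default_rng(0)
ratios = []
for _ in range(1000):        # 1000 independent "initialisations"
    w = rng.normal(0.0, target_std, size=n_weights)
    ratios.append(w.std() / target_std)   # ideally exactly 1.0
ratios = np.asarray(ratios)

# The ratio is centred on 1.0 but individual seeds land a few percent off,
# which is exactly the 1.03-vs-1.0 kind of deviation seen in the test.
print(ratios.min(), ratios.max())
```

Smaller layers make the relative fluctuation larger, which is why the deep, narrow tail of the vocoder is where the check tends to trip.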
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
* [VITS] Fix init test
* add flaky decorator
* style
* max attempts
* style

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
What does this PR do?
Fixes #24085 (comment). In short, the VITS initialisation test was flaky: some of the parameters initialised with uniform values would exceed the initialiser range `[-1, 1]`. These violating parameters were always in the latter stages of the HiFi-GAN vocoder, so we can assume this was due to an accumulation of numerical errors.
This PR reduces the size of the HiFi-GAN vocoder in the test by a factor of 2, mitigating the accumulation of these errors. The test now passes over 20 consecutive iterations, but we should keep an eye on it in case it still proves flaky over a larger number of runs.
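Halving the vocoder helps because the chance of the test tripping grows with the parameter count: if each parameter independently lands outside the checked range with some small probability p, a model with n such parameters fails with probability 1 - (1 - p)^n. A rough sketch (p and the parameter counts are illustrative, not measured from VITS):

```python
import math


def failure_prob(p_per_param: float, n_params: int) -> float:
    """Probability that at least one of n parameters violates the range check."""
    return 1.0 - (1.0 - p_per_param) ** n_params


p = 1e-6                              # hypothetical per-parameter violation rate
full, halved = 2_000_000, 1_000_000   # illustrative parameter counts

# Halving n does not halve the failure rate, but it does reduce it
# noticeably (for large n, failure_prob ~ 1 - exp(-n * p)).
print(failure_prob(p, full))    # ~0.86
print(failure_prob(p, halved))  # ~0.63
```

This also explains why even a shrunken model can still fail occasionally: the failure probability shrinks with n but never reaches zero, hence the suggestion to keep watching the test.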