
Why is the model trained with audio channels = 2, even though the training data (LJSpeech) is mono? #4

Open · Ashigarg123 opened this issue Oct 2, 2024 · 1 comment

Comments

@Ashigarg123

Am I missing something?

@signofthefour
Collaborator

@Ashigarg123 Hi, sorry for the late reply.
That is my fault for being lazy: I reused the parameter names from PriorGrad. Here is why audio channels = 2.

FreGrad predicts wavelet features instead of the raw waveform, so the output is [low_freq, high_freq] rather than [waveform]. The number of output channels is therefore 2 (one each for the low- and high-frequency sub-bands) instead of 1 (a mono waveform).
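
To make this concrete, here is a minimal sketch (using PyWavelets, not the actual FreGrad code) of how a single-level discrete wavelet transform turns a mono waveform into the two sub-band channels the model predicts:

```python
import numpy as np
import pywt

# Hypothetical mono waveform: one second of audio at 22.05 kHz (LJSpeech's rate).
waveform = np.random.randn(22050).astype(np.float32)

# A single-level DWT splits the signal into a low-frequency (approximation)
# and a high-frequency (detail) sub-band, each half the original length.
low_freq, high_freq = pywt.dwt(waveform, "haar")

# Stacking the sub-bands yields a [2, T/2] array: the 2 here is the
# "audio channels = 2" the model sees, even though LJSpeech itself is mono.
wavelet_features = np.stack([low_freq, high_freq])
print(wavelet_features.shape)  # (2, 11025)
```

The inverse transform, `pywt.idwt(low_freq, high_freq, "haar")`, reconstructs the mono waveform, which is how the 2-channel prediction maps back to 1-channel audio.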

Please feel free to discuss further if anything remains unclear.

@signofthefour signofthefour pinned this issue Nov 12, 2024