add function of am streaming inference #84
base: main
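I tried it out, and the synthesized audio has some "popping" noises. Does the vocoder need any additional configuration?
@lancelee98 It worked fine when I tried it, with no popping noises. Which model version are you using?
@wawaa I'm using a model I fine-tuned myself. It may be that my model produces some noise even in non-streaming mode.
@lancelee98 I listened again, and mine has the popping noises too.
This PR only converts AM model inference to streaming. The vocoder needs a corresponding change before its output matches non-streaming quality.
Yes, the vocoder also pads the AM output first, but the popping noises are still there. Could you recommend chunk size and pad settings for the input to the vocoder?
Set pad to 12 frames or more, and make sure your vocoder uses causal CNN rather than regular CNN. The chunk size does not actually matter.
You can also run a test: feed all of the mel features generated by this script into the vocoder at once and check whether the popping noises are still there. That verifies whether the AM streaming inference part is working correctly. Please report back the result. The streaming conversion of the vocoder will be uploaded later.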
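To illustrate why a causal network plus enough left padding makes the chunk size irrelevant, here is a minimal sketch in plain Python. It is not the HiFi-GAN code from this repository: `causal_fir` is a hypothetical stand-in for a causal vocoder layer, and the 13-tap filter (12 frames of history plus the current frame) is an assumption chosen only to echo the 12-frame advice above.

```python
def causal_fir(frames, taps):
    """Toy causal filter: output[t] depends only on frames[t-k+1 .. t]."""
    k = len(taps)
    padded = [0.0] * (k - 1) + list(frames)  # zero-pad on the left only
    return [
        sum(taps[j] * padded[t + j] for j in range(k))
        for t in range(len(frames))
    ]


def chunked_with_left_pad(frames, taps, chunk_size, pad):
    """Process `frames` chunk by chunk, prepending up to `pad` past frames."""
    out = []
    for start in range(0, len(frames), chunk_size):
        ctx_start = max(0, start - pad)
        window = frames[ctx_start:start + chunk_size]
        y = causal_fir(window, taps)
        # Keep only the outputs of the current chunk, drop the pad region.
        out.extend(y[start - ctx_start:])
    return out


# Hypothetical data: 50 "mel frames" reduced to one scalar each.
frames = [float((i * 7) % 11) for i in range(50)]
taps = [1.0 / (j + 1) for j in range(13)]  # needs 12 frames of history

full = causal_fir(frames, taps)
enough_pad = chunked_with_left_pad(frames, taps, chunk_size=10, pad=12)
too_little = chunked_with_left_pad(frames, taps, chunk_size=10, pad=2)

print("pad=12 matches full pass:", enough_pad == full)
print("pad=2 matches full pass:", too_little == full)
```

In this toy setting, once the pad covers the filter's history, every chunk sees exactly the context it would see in a full pass, so any chunk size gives the same result. With too little pad, the first outputs of each chunk are computed from zeros instead of real history, which is one plausible source of audible discontinuities at chunk boundaries. A non-causal layer would additionally need future frames, which left padding alone cannot supply.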
Thanks for the reminder about causal CNN. After adjusting this:
Is the pad change made in hifigan.py under models\hifigan? Could you explain exactly how to change it? Thanks a lot!
Add a chunk_forward function to the FsmnEncoderV2 and MemoryBlockV2 modules. It is cache-based and implements streaming inference chunk by chunk.
Refactor the forward function of KanTtsSAMBERT: extract the common part into a pre_forward function that serves as a shared front end for the forward and forward_chunk functions, which reduces redundant code. chunk_forward implements frame-level streaming inference, and the mel length produced by each inference step is controlled by the mel_chunk_size parameter.
In the infer_sambert.py script, add the --inference_type and --mel_chunk_size parameters. --inference_type selects the AM inference method, and --mel_chunk_size sets the chunk size for streaming inference (it requires --inference_type == "streaming" to be specified as well).
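As a rough illustration of how the two options relate, the sketch below parses them with `argparse`. It is not the actual infer_sambert.py code: the `"normal"` value, the defaults, and the `build_parser` and `parse_args` helpers are assumptions made for the example. Only the two flag names and the `"streaming"` value come from the description above.

```python
import argparse


def build_parser():
    # Hypothetical parser; only the two flag names come from the PR text.
    parser = argparse.ArgumentParser(
        description="Sketch of the AM inference options"
    )
    parser.add_argument(
        "--inference_type",
        default="normal",  # assumed name for the non-streaming mode
        choices=["normal", "streaming"],
        help="AM inference method",
    )
    parser.add_argument(
        "--mel_chunk_size",
        type=int,
        default=None,
        help="mel frames per step (streaming only)",
    )
    return parser


def parse_args(argv=None):
    parser = build_parser()
    args = parser.parse_args(argv)
    # The chunk size only makes sense for streaming inference.
    if args.mel_chunk_size is not None:
        if args.inference_type != "streaming":
            parser.error("--mel_chunk_size requires --inference_type streaming")
        if args.mel_chunk_size <= 0:
            parser.error("--mel_chunk_size must be a positive integer")
    return args


if __name__ == "__main__":
    print(parse_args())
```

Rejecting `--mel_chunk_size` without streaming mode is a design choice of this sketch. The real script may simply ignore the value in that case.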
This is an incremental update: existing training and inference scripts and commands continue to run normally. The results of streaming and non-streaming inference have passed the consistency test, and the code has passed the pre-commit check.
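The idea behind a cache-based chunk_forward and the consistency test can be sketched with a toy causal memory block in plain Python. This is not the FsmnEncoderV2 or MemoryBlockV2 implementation: `ToyMemoryBlock`, its method signatures, and the `stream` helper are hypothetical, and the real modules may use lookahead context that this sketch leaves out.

```python
class ToyMemoryBlock:
    """Toy causal memory: y[t] = x[t] + sum_j w[j] * x[t - 1 - j]."""

    def __init__(self, weights):
        self.weights = list(weights)

    def forward(self, xs):
        # Non-streaming pass over the whole sequence, zero history at start.
        n = len(self.weights)
        padded = [0.0] * n + list(xs)
        return [
            padded[t + n]
            + sum(w * padded[t + n - 1 - j] for j, w in enumerate(self.weights))
            for t in range(len(xs))
        ]

    def init_cache(self):
        # The cache holds the last len(weights) input frames.
        return [0.0] * len(self.weights)

    def chunk_forward(self, chunk, cache):
        # Streaming pass: history comes from the cache, not from zeros.
        n = len(cache)
        history = list(cache) + list(chunk)
        out = [
            history[n + i]
            + sum(w * history[n + i - 1 - j] for j, w in enumerate(self.weights))
            for i in range(len(chunk))
        ]
        new_cache = history[-n:] if n else []
        return out, new_cache


def stream(block, xs, chunk_size):
    """Run chunk_forward over xs, threading the cache between chunks."""
    cache, out = block.init_cache(), []
    for start in range(0, len(xs), chunk_size):
        y, cache = block.chunk_forward(xs[start:start + chunk_size], cache)
        out.extend(y)
    return out


block = ToyMemoryBlock([0.5, 0.25, 0.125])
xs = [float(i) for i in range(20)]
reference = block.forward(xs)
# Consistency check: every chunk size should reproduce the full pass.
consistent = all(stream(block, xs, cs) == reference for cs in (1, 3, 7, 20))
print("streaming matches non-streaming:", consistent)
```

Because the cache carries exactly the frames that the non-streaming pass would read from its own history, the chunk size changes only latency, not the output. A check of this shape is one way to run a streaming versus non-streaming consistency test.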