Does chat.sample_audio_speaker(wav) extract a voice timbre from an audio file? Can it replace the speaker generated by chat.sample_random_speaker()? #782

Open
flystarhe opened this issue Oct 12, 2024 · 1 comment
Labels
algorithm Algorithm improvements & issues documentation Improvements or additions to documentation

flystarhe commented Oct 12, 2024

Code:

    filename = f"chattts-rand-speaker-{i:03d}.se.wav"

    wav, sample_rate = torchaudio.load(filename)
    wav = wav[0]

    speaker = chat.sample_audio_speaker(wav)

    params_infer_code = ChatTTS.Chat.InferCodeParams(
        # spk_emb=speaker,   # add sampled speaker
        temperature=0.3,   # using custom temperature
        top_P=0.7,         # top P decode
        top_K=20,          # top K decode
        show_tqdm=False,   # no tqdm
        manual_seed=1234,  # seed
    )

    params_refine_text = ChatTTS.Chat.RefineTextParams(
        prompt="[oral_0][laugh_0][break_6]",
        show_tqdm=False,
        manual_seed=1234,
    )

    wavs = chat.infer(
        [text],
        params_refine_text=params_refine_text,
        params_infer_code=params_infer_code,
    )

Error:

File /opt/conda/envs/dev-chattts/lib/python3.11/site-packages/ChatTTS/core.py:220, in Chat.infer(self, text, stream, lang, skip_refine_text, refine_text_only, use_decoder, do_text_normalization, do_homophone_replacement, params_refine_text, params_infer_code)
    218     return res_gen
    219 else:
--> 220     return next(res_gen)

StopIteration: 

2noise deleted a comment from rose07 on Oct 15, 2024

fumiama (Member) commented Oct 15, 2024

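No. The two are used differently and are encoded differently. chat.sample_random_speaker() generates speaker timbre information, and its length never changes. chat.sample_audio_speaker(wav) instead generates the token encoding of the audio, and its length grows with the length of the audio.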
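
Since the two values are not interchangeable, one way to keep them apart is to route each through its own parameter. The sketch below is a hypothetical helper, not part of ChatTTS: it only builds a dictionary of keyword arguments. It assumes that `InferCodeParams` accepts `spk_emb` for the fixed-length random speaker (as in the commented-out line of the original snippet) and `spk_smp` plus `txt_smp` for the audio-sampled tokens and the transcript of the reference clip. The last two field names do not appear in this thread, so check them against the installed ChatTTS version before relying on them.

```python
from typing import Any, Dict, Optional


def speaker_kwargs(
    random_speaker: Optional[Any] = None,
    audio_speaker: Optional[Any] = None,
    transcript: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: choose the keyword arguments for one speaker type."""
    # Exactly one of the two speaker values must be supplied.
    if (random_speaker is None) == (audio_speaker is None):
        raise ValueError("pass exactly one of random_speaker or audio_speaker")

    if random_speaker is not None:
        # Fixed-length value returned by chat.sample_random_speaker().
        return {"spk_emb": random_speaker}

    # Variable-length token encoding returned by chat.sample_audio_speaker(wav).
    # ASSUMPTION: the field names spk_smp and txt_smp are not confirmed here.
    return {"spk_smp": audio_speaker, "txt_smp": transcript}


# Intended use (left commented out because it needs a loaded ChatTTS model):
# params_infer_code = ChatTTS.Chat.InferCodeParams(
#     **speaker_kwargs(audio_speaker=speaker, transcript="text spoken in the clip"),
#     temperature=0.3,
# )
```

Keeping the choice in one place means an audio-sampled value cannot be passed as `spk_emb` by accident, which is the substitution the question asks about.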

fumiama added the documentation and algorithm labels on Oct 15, 2024