Misjudgment of is_multi_lingual When Loading Multilingual Model via model_path #3273

Merged: 2 commits, Nov 24, 2023

@TITC TITC commented Nov 20, 2023

Story

First problem: without model_name

The logic in api.py that decides whether a model is multilingual relies on model_name.

TTS/TTS/api.py

Lines 109 to 110 in 29dede2

if isinstance(self.model_name, str) and "xtts" in self.model_name:
    return True
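
To make the failure mode concrete, here is a minimal sketch of a name-based check. `looks_multilingual_by_name` is a hypothetical stand-in for the condition quoted above, not the library's actual method, and the unset name is assumed here to be either None or an empty string.

```python
from typing import Optional


def looks_multilingual_by_name(model_name: Optional[str]) -> bool:
    """Hypothetical stand-in for the name-based check quoted above."""
    return isinstance(model_name, str) and "xtts" in model_name


# Loaded by name: the substring check matches.
by_name = looks_multilingual_by_name("tts_models/multilingual/multi-dataset/xtts_v2")

# Loaded via model_path only: no usable name is available (assumed to be
# None or ""), so the check fails even though the checkpoint on disk is
# the same multilingual model.
by_path_none = looks_multilingual_by_name(None)
by_path_empty = looks_multilingual_by_name("")

print(by_name, by_path_none, by_path_empty)
```

The result depends only on the string passed in, which is why loading the same checkpoint through a path can give a different answer.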

Second problem: with model_name

However, if model_name is given, the ModelManager will try to download the model to output_prefix, which defaults to /home/yhtao/.local/share/tts on my system.

tts = TTS(model_path="/mnt/f/model/XTTS-v2_cache/tts_models--multilingual--multi-dataset--xtts_v2/",
          config_path="/mnt/f/model/XTTS-v2_cache/tts_models--multilingual--multi-dataset--xtts_v2//config.json").to(device)

def get_user_data_dir(appname):
    TTS_HOME = os.environ.get("TTS_HOME")
    XDG_DATA_HOME = os.environ.get("XDG_DATA_HOME")
    if TTS_HOME is not None:
        ans = Path(TTS_HOME).expanduser().resolve(strict=False)
    elif XDG_DATA_HOME is not None:
        ans = Path(XDG_DATA_HOME).expanduser().resolve(strict=False)
    elif sys.platform == "win32":
        import winreg  # pylint: disable=import-outside-toplevel

        key = winreg.OpenKey(
            winreg.HKEY_CURRENT_USER, r"Software\Microsoft\Windows\CurrentVersion\Explorer\Shell Folders"
        )
        dir_, _ = winreg.QueryValueEx(key, "Local AppData")
        ans = Path(dir_).resolve(strict=False)
    elif sys.platform == "darwin":
        ans = Path("~/Library/Application Support/").expanduser()
    else:
        ans = Path.home().joinpath(".local/share")
    return ans.joinpath(appname)

Of course, there is no model under /home/yhtao/.local/share/tts, so the ModelManager downloads the model again to this path.
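
The lookup order can be sketched in isolation to show that the download location comes from the environment, not from model_path. This is a simplified illustration under stated assumptions: `resolve_data_dir` and its `env` parameter are hypothetical names, and the win32 and darwin branches of the quoted function are left out.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_data_dir(appname: str, env: Optional[Mapping[str, str]] = None) -> Path:
    """Simplified sketch of the lookup order quoted above (no win32/darwin branches)."""
    env = os.environ if env is None else env
    # TTS_HOME takes precedence over XDG_DATA_HOME, as in the quoted function.
    for var in ("TTS_HOME", "XDG_DATA_HOME"):
        value = env.get(var)
        if value is not None:
            return Path(value).expanduser().resolve(strict=False).joinpath(appname)
    # Fallback used when neither variable is set.
    return Path.home().joinpath(".local/share", appname)


# With no variables set, the directory falls back to the home directory.
default_dir = resolve_data_dir("tts", env={})

# With TTS_HOME set, the directory moves under that location instead.
custom_dir = resolve_data_dir("tts", env={"TTS_HOME": "/mnt/f/model"})

print(default_dir)
print(custom_dir)
```

Nothing in this lookup consults the model_path passed to TTS, which is consistent with the re-download described above.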

TTS/TTS/utils/manage.py

Lines 378 to 402 in 29dede2

output_path = os.path.join(self.output_prefix, model_full_name)
if os.path.exists(output_path):
    if md5sum is not None:
        md5sum_file = os.path.join(output_path, "hash.md5")
        if os.path.isfile(md5sum_file):
            with open(md5sum_file, mode="r") as f:
                if not f.read() == md5sum:
                    print(f" > {model_name} has been updated, clearing model cache...")
                    self.create_dir_and_download_model(model_name, model_item, output_path)
                else:
                    print(f" > {model_name} is already downloaded.")
        else:
            print(f" > {model_name} has been updated, clearing model cache...")
            self.create_dir_and_download_model(model_name, model_item, output_path)
    # if the configs are different, redownload it
    # ToDo: we need a better way to handle it
    if "xtts" in model_name:
        try:
            self.check_if_configs_are_equal(model_name, model_item, output_path)
        except:
            pass
    else:
        print(f" > {model_name} is already downloaded.")
else:
    self.create_dir_and_download_model(model_name, model_item, output_path)

Usage

import torch
from TTS.api import TTS

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model_path = "/mnt/f/model/XTTS-v2_cache/tts_models--multilingual--multi-dataset--xtts_v2/"

tts = TTS(model_path="/mnt/f/model/XTTS-v2_cache/tts_models--multilingual--multi-dataset--xtts_v2/",
          config_path="/mnt/f/model/XTTS-v2_cache/tts_models--multilingual--multi-dataset--xtts_v2//config.json").to(device)


print("tts.is_multi_lingual", tts.is_multi_lingual)

speaker_wav = "/mnt/c/Users/yh.tao/Downloads/灿烂千阳-录音-声源.mp3"
text = "是的,本质上这是一个成本和代价的问题:1、如果追查者拥有很大的权力,例如国际刑警组织或者是国际CERT组织,可以协调全球各运营商,通过官方渠道一级一级追踪回去,只要时间足够,是可以追查到最后一级的真实接入的,凯文米特尼克就是这么被下村勉抓到的。2、如果追查者拥有很高的黑客技术,一级一级反向破解回去,入侵跳板机器并且监听链接,也是可以追查到最后一级的真实接入。3、有些跳板/VPN/代理/肉机等等本身就是陷阱,用了的话等于自我暴露。有些加密通道已经不再安全,例如SSL和tor,过于信任这种技术也会导致轻易暴露。4、大数据分析和行为模式分析有可能可以定位到具体的人,哪怕并不清楚这么多层的链接究竟是怎么构建的。"
file_path = "/mnt/c/Users/yh.tao/Downloads/灿烂千阳-录音-clone-知乎回答.mp3"

tts.tts_to_file(text=text, speaker_wav=speaker_wav,
                language="zh-cn", file_path=file_path)

@TITC changed the title from "Misjudgment of 'is_multi_lingual' When Loading Multilingual Model via 'model_path" to "Misjudgment of is_multi_lingual When Loading Multilingual Model via model_path" on Nov 20, 2023
erogol commented Nov 22, 2023

I understand the issue, but your solution assumes that the custom path has the model name as its last folder. That is not always true. Is there a different way to solve this? It is not obvious to me, though.
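
The concern can be illustrated with a short sketch. `name_from_last_folder` is a hypothetical helper showing the assumption described in this comment, not the PR's actual code, and the second path is an invented example of a renamed checkpoint directory.

```python
from pathlib import Path


def name_from_last_folder(model_path: str) -> str:
    """Hypothetical helper: guess the model name from the last path component."""
    return Path(model_path).name


# A cache-style directory happens to carry the model name in its last folder.
cache_style = name_from_last_folder(
    "/mnt/f/model/XTTS-v2_cache/tts_models--multilingual--multi-dataset--xtts_v2/"
)

# A user-chosen directory (invented path) carries no such hint.
renamed = name_from_last_folder("/data/checkpoints/my_finetune_run_3/")

print("xtts" in cache_style)
print("xtts" in renamed)
```

The guess only works when the directory keeps its original name, which is the case the reviewer says cannot be relied on.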

TITC commented Nov 22, 2023

There is a parameter in config.json:

"model": "xtts",

Is this always set? If it is, model_name could be derived from it when no name is given.

Besides this approach, we could change the is_multi_lingual logic: config.json contains a languages list, so is_multi_lingual would be true when that list has more than one element.

    "languages": [
        "en",
        "es",
        "fr",
        "de",
        "it",
        "pt",
        "pl",
        "tr",
        "ru",
        "nl",
        "cs",
        "ar",
        "zh-cn",
        "hu",
        "ko",
        "ja"
    ],
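
A minimal sketch of the second idea, assuming config.json is plain JSON with an optional "languages" list as shown above. `is_multi_lingual_from_config` is a hypothetical function name, and the two small config files are stand-ins written to a temporary directory, not real model configs.

```python
import json
import tempfile
from pathlib import Path


def is_multi_lingual_from_config(config_path: str) -> bool:
    """Hypothetical check: multilingual if config.json lists more than one language."""
    with open(config_path, encoding="utf-8") as f:
        config = json.load(f)
    # Treat a missing or empty "languages" entry as "not multilingual".
    languages = config.get("languages") or []
    return len(languages) > 1


with tempfile.TemporaryDirectory() as tmp:
    # Stand-in config with several languages, mirroring the excerpt above.
    multi = Path(tmp) / "multi.json"
    multi.write_text(
        json.dumps({"model": "xtts", "languages": ["en", "es", "zh-cn"]}),
        encoding="utf-8",
    )
    # Stand-in config with no "languages" key at all.
    single = Path(tmp) / "single.json"
    single.write_text(json.dumps({"model": "example_model"}), encoding="utf-8")

    multi_result = is_multi_lingual_from_config(str(multi))
    single_result = is_multi_lingual_from_config(str(single))

print(multi_result, single_result)
```

This check depends only on the config file that accompanies the checkpoint, so it does not care what the containing folder is called.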

@erogol erogol merged commit 4d0f53d into coqui-ai:dev Nov 24, 2023
53 checks passed