[Bug] HF transformers update breaks XTTS streaming #31

Closed
eginhard opened this issue May 28, 2024 · 3 comments · Fixed by #46
Labels: bug, good first issue, XTTS

Comments

@eginhard
Member

Describe the bug

huggingface/transformers#30624 broke the XTTS streaming code (https://github.com/idiap/coqui-ai-TTS/blob/df088e99dfda8976c47235626d3afb7d7d70fea2/TTS/tts/layers/xtts/stream_generator.py) as reported in huggingface/transformers#31040. The streaming code needs to be updated accordingly.

To Reproduce

https://coqui-tts.readthedocs.io/en/latest/models/xtts.html#streaming-manually

Expected behavior

No response

Logs

Loading model...
DEBUG:fsspec.local:open file: ~/.local/share/tts/tts_models--multilingual--multi-dataset--xtts_v2/model.pth
Computing speaker latents...
Inference...
~/applications/miniconda3/envs/coqui-3.12/lib/python3.12/site-packages/TTS/tts/layers/xtts/stream_generator.py:138: UserWarning: You have modified the pretrained model configuration to control generation. This is a deprecated strategy to control generation and will be removed soon, in a future version. Please use a generation configuration file (see https://huggingface.co/docs/transformers/main_classes/text_generation)
  warnings.warn(
Traceback (most recent call last):
  File "~/projects/testing/xtts.py", line 33, in <module>
    for i, chunk in enumerate(chunks):
  File "~/applications/miniconda3/envs/coqui-3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
               ^^^^^^^^^^^^^^
  File "~/applications/miniconda3/envs/coqui-3.12/lib/python3.12/site-packages/TTS/tts/models/xtts.py", line 657, in inference_stream
    gpt_generator = self.gpt.get_generator(
                    ^^^^^^^^^^^^^^^^^^^^^^^
  File "~/applications/miniconda3/envs/coqui-3.12/lib/python3.12/site-packages/TTS/tts/layers/xtts/gpt.py", line 602, in get_generator
    return self.gpt_inference.generate_stream(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/applications/miniconda3/envs/coqui-3.12/lib/python3.12/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "~/applications/miniconda3/envs/coqui-3.12/lib/python3.12/site-packages/TTS/tts/layers/xtts/stream_generator.py", line 186, in generate
    model_kwargs["attention_mask"] = self._prepare_attention_mask_for_generation(
                                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "~/applications/miniconda3/envs/coqui-3.12/lib/python3.12/site-packages/transformers/generation/utils.py", line 473, in _prepare_attention_mask_for_generation
    torch.isin(elements=inputs, test_elements=pad_token_id).any()
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: isin() received an invalid combination of arguments - got (test_elements=int, elements=Tensor, ), but expected one of:
 * (Tensor elements, Tensor test_elements, *, bool assume_unique, bool invert, Tensor out)
 * (Number element, Tensor test_elements, *, bool assume_unique, bool invert, Tensor out)
 * (Tensor elements, Number test_element, *, bool assume_unique, bool invert, Tensor out)
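
The TypeError above comes from passing a plain Python int as the `test_elements` keyword argument of `torch.isin`. The following is a minimal sketch of that failure mode and one possible workaround, assuming PyTorch is installed. `as_token_tensor` is a hypothetical helper written for illustration; it is not the code from the actual fix.

```python
import torch


def as_token_tensor(token_id, device=None):
    """Hypothetical helper: wrap an int (or list of ints) token ID in a long tensor.

    None and existing tensors are returned unchanged.
    """
    if token_id is None or isinstance(token_id, torch.Tensor):
        return token_id
    return torch.tensor(token_id, dtype=torch.long, device=device)


inputs = torch.tensor([[5, 7, 0, 0]])  # toy token IDs, 0 stands in for padding
pad_token_id = 0  # plain int, as the streaming code passed it

# With PyTorch 2.3.0 (the version in the report) this keyword call raises the
# TypeError from the traceback, because the int overload names its
# parameter `test_element`, not `test_elements`.
try:
    torch.isin(elements=inputs, test_elements=pad_token_id)
except TypeError as err:
    print("int pad_token_id rejected:", type(err).__name__)

# Converting the ID to a tensor first matches the Tensor/Tensor overload.
has_pad = torch.isin(elements=inputs, test_elements=as_token_tensor(pad_token_id)).any()
print("padding present:", bool(has_pad))
```

The sketch only illustrates why tensor-valued special tokens avoid the error; where exactly the conversion belongs in `stream_generator.py` is left to the actual patch.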

Environment

{
    "CUDA": {
        "GPU": [],
        "available": false,
        "version": "12.1"
    },
    "Packages": {
        "PyTorch_debug": false,
        "PyTorch_version": "2.3.0+cu121",
        "TTS": "0.24.0",
        "numpy": "1.26.4"
    },
    "System": {
        "OS": "Linux",
        "architecture": [
            "64bit",
            "ELF"
        ],
        "processor": "x86_64",
        "python": "3.12.3",
        "version": "#35~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Tue May  7 09:00:52 UTC 2"
    }
}

Additional context

No response

@eginhard added the bug and good first issue labels May 28, 2024
eginhard added a commit that referenced this issue May 28, 2024
transformers>=4.41 break XTTS streaming, see #31
@eginhard
Member Author

As a temporary fix, coqui-tts version 0.24.1 limits transformers to versions below 4.41 until someone properly updates the streaming code.
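
For installations that predate the fix, a quick check of the installed `transformers` version can tell whether XTTS streaming is likely to hit this error. This is a rough sketch using only the standard library; `parse_release` and `breaks_xtts_streaming` are hypothetical helper names, and the 4.41 threshold is taken from the commit message referenced above. Pre-releases such as `4.41.0.dev0` are treated as affected, which is a simplification.

```python
from importlib.metadata import PackageNotFoundError, version
from itertools import takewhile


def parse_release(version_string):
    """Return the leading numeric release components, e.g. '4.41.0.dev0' -> (4, 41, 0)."""
    parts = []
    for piece in version_string.split("."):
        digits = "".join(takewhile(str.isdigit, piece))
        if not digits:
            break
        parts.append(int(digits))
        if len(digits) < len(piece):
            # Stop at suffixes such as 'rc1' in '0rc1'.
            break
    return tuple(parts)


def breaks_xtts_streaming(transformers_version):
    """True if this transformers version is 4.41 or newer (assumed threshold)."""
    return parse_release(transformers_version) >= (4, 41)


try:
    installed = version("transformers")
except PackageNotFoundError:
    installed = None  # transformers is not installed in this environment

if installed is not None:
    status = "affected" if breaks_xtts_streaming(installed) else "not affected"
    print(f"transformers {installed}: {status}")
```

The check applies only to coqui-tts versions without the updated streaming code; once the fix from #46 is installed, newer transformers versions are expected to work.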

@pseudotensor

https://github.com/h2oai/h2ogpt/blob/main/docs/xtt.patch

@eginhard
Member Author

https://github.com/h2oai/h2ogpt/blob/main/docs/xtt.patch

@pseudotensor Cool, thanks! Do you want to send a PR?

@eginhard self-assigned this Jun 15, 2024
eginhard added a commit that referenced this issue Jun 16, 2024
….41.1

Fixes #31. The handling of special tokens in `transformers` was changed in
huggingface/transformers#30624 and
huggingface/transformers#30746. This updates the XTTS
streaming code accordingly.
eginhard added a commit that referenced this issue Jun 17, 2024
….41.1

Fixes #31. The handling of special tokens in `transformers` was changed in
huggingface/transformers#30624 and
huggingface/transformers#30746. This updates the XTTS
streaming code accordingly.
@eginhard added the XTTS label Dec 19, 2024