optimize seamless-m4t/vits model for text-to-speech generation #825

sywangyi · 2024-03-21T08:43:58Z

What does this PR do?

Fixes # (issue)

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you make sure to update the documentation with your changes?
Did you write any new necessary tests?

HuggingFaceDocBuilderDev · 2024-03-21T08:48:18Z

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

sywangyi · 2024-03-21T11:11:52Z

The pipeline runtime issue is addressed by huggingface/transformers#29722.

sywangyi · 2024-03-21T11:29:50Z

Performance (time in ms, lower is better) in my environment, using the command below:

Precision  A100   Gaudi2
BF16       528ms  250ms
FP32       410ms  275ms

python3 run_pipeline.py \
  --model_name_or_path facebook/hf-seamless-m4t-medium \
  --text "Hello, my dog is cooler than you!" \
  --use_hpu_graphs \
  --n_iterations 10
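
The timing code inside run_pipeline.py is not shown in this thread, so the following is only a minimal sketch of how per-iteration numbers like the ones above could be collected. The helper name `time_iterations` and its `warmup` parameter are hypothetical and are not taken from the example script. On an accelerator, the timed callable would also need to wait for the device to finish before the timer stops.

```python
import time
from statistics import mean


def time_iterations(fn, n_iterations=10, warmup=1):
    """Call fn() repeatedly and return the latency of each timed call in ms."""
    for _ in range(warmup):
        fn()  # warm-up calls are not timed (lazy init, caches, graph capture)
    latencies_ms = []
    for _ in range(n_iterations):
        start = time.perf_counter()
        fn()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return latencies_ms


if __name__ == "__main__":
    # Stand-in workload; a real run would call the text-to-speech pipeline here.
    latencies = time_iterations(lambda: sum(range(100_000)), n_iterations=10)
    print(f"mean time = {mean(latencies):.3f}ms over {len(latencies)} iterations")
```

Excluding warm-up calls matters most when the first call pays a one-time cost, which is why the sketch separates them from the timed loop.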

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

sywangyi · 2024-03-21T11:52:49Z

@libinta please help review.

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

sywangyi · 2024-03-25T02:10:06Z

Performance (time in ms, lower is better) in my environment, using the command below:

Precision  A100  Gaudi2
BF16       87ms  11ms
FP32       102ms 13ms

python3 run_pipeline.py \
  --model_name_or_path facebook/mms-tts-eng \
  --text "Hello, my dog is cooler than you!" \
  --use_hpu_graphs \
  --n_iterations 10
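
To make the two result tables in this thread easier to compare, the sketch below computes the A100-to-Gaudi2 time ratio for each model and precision. The numbers are copied from the tables; the names `RESULTS_MS` and `speedups` are made up for illustration.

```python
# Times in milliseconds, copied from the two tables posted in this thread.
RESULTS_MS = {
    "facebook/hf-seamless-m4t-medium": {
        "BF16": {"A100": 528, "Gaudi2": 250},
        "FP32": {"A100": 410, "Gaudi2": 275},
    },
    "facebook/mms-tts-eng": {
        "BF16": {"A100": 87, "Gaudi2": 11},
        "FP32": {"A100": 102, "Gaudi2": 13},
    },
}


def speedups(results):
    """Return A100 time divided by Gaudi2 time, rounded to two decimals."""
    return {
        model: {
            precision: round(times["A100"] / times["Gaudi2"], 2)
            for precision, times in by_precision.items()
        }
        for model, by_precision in results.items()
    }


if __name__ == "__main__":
    for model, by_precision in speedups(RESULTS_MS).items():
        for precision, ratio in by_precision.items():
            print(f"{model} {precision}: {ratio}x")
```

A ratio above 1 means the Gaudi2 run took less time than the A100 run for that configuration.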

skaulintel · 2024-04-18T19:19:08Z

@sywangyi Can you please add CI tests?

sywangyi · 2024-04-23T01:19:25Z

CI tests will be added after #834 is merged, since they will go in the same file and follow a similar style.

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

tests/test_pipeline.py

Makefile

examples/text-to-speech/run_pipeline.py

optimum/habana/transformers/generation/utils.py

tests/test_pipeline.py

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

ssarkar2 · 2024-05-31T21:30:23Z

Merged to master (with transformers 4.40) and ran:

python3 run_pipeline.py --model_name_or_path facebook/hf-seamless-m4t-medium --text "Hello, my dog is cooler than you!" --use_hpu_graphs --n_iterations 10

The run finished:

05/31/2024 21:29:17 - INFO - __main__ - speech = [{'audio': array([[-0.00172372, -0.00146204, -0.00160711, ...,  0.01203394,
         0.01085523,  0.00866978]], dtype=float32), 'sampling_rate': 16000}] time = 2362.2106075286865ms

However, the generated wav file doesn't sound right.
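
For checking the audio by ear, the logged output (a list of dicts with an `audio` array of shape (1, N) and a `sampling_rate` of 16000) can be written to a wav file. The sketch below uses only the Python standard library and is not how the example script saves audio. The helper name `save_wav` is hypothetical, and it assumes mono float samples in the range [-1.0, 1.0].

```python
import array
import sys
import wave


def save_wav(path, samples, sampling_rate):
    """Write mono float samples in [-1.0, 1.0] to a 16-bit PCM wav file."""
    pcm = array.array("h")  # signed 16-bit integers
    for sample in samples:
        clipped = max(-1.0, min(1.0, float(sample)))  # avoid integer overflow
        pcm.append(int(round(clipped * 32767)))
    if sys.byteorder == "big":
        pcm.byteswap()  # wav data is little-endian
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample
        wav_file.setframerate(sampling_rate)
        wav_file.writeframes(pcm.tobytes())
```

With the output shown in the log, a call could look like `save_wav("speech.wav", speech[0]["audio"][0], speech[0]["sampling_rate"])`. Passing the `sampling_rate` returned by the pipeline, rather than a hard-coded value, avoids one common source of distorted playback.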

sywangyi · 2024-06-01T12:18:40Z

Regarding "generated wav file doesn't sound right": this issue is fixed by #1034.

ssarkar2

lgtm

…ngface#825) Signed-off-by: Wang, Yi A <yi.a.wang@intel.com> Co-authored-by: Sayantan Sarkar <sasarkar@habana.ai>

sywangyi requested review from ssarkar2, bhargaveede, vivekgoe and regisss as code owners March 21, 2024 08:43

sywangyi marked this pull request as draft March 21, 2024 08:49

sywangyi force-pushed the seamless_m4t branch from 0a511c4 to 1bb3c38 March 21, 2024 11:09

sywangyi marked this pull request as ready for review March 21, 2024 11:09

optimize seamless-m4t model for text-to-speech generation

8da2db0

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

sywangyi force-pushed the seamless_m4t branch from 1bb3c38 to 8da2db0 March 21, 2024 11:33

libinta requested a review from jiminha March 21, 2024 18:20

add vits optimization

5094c3a

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

sywangyi changed the title ~~optimize seamless-m4t model for text-to-speech generation~~ optimize seamless-m4t/vits model for text-to-speech generation Mar 25, 2024

sywangyi added 5 commits April 24, 2024 09:28

Merge branch 'main' into seamless_m4t

d015056

add ci test for text-to-speech

a470049

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

add pipeline test to Makefile

c33fa04

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Merge branch 'main' into seamless_m4t

0fbcc0c

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Merge branch 'main' into seamless_m4t

ead05fa

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

libinta added the run-test label (Run CI for PRs from external contributors) May 7, 2024

skaulintel reviewed May 7, 2024

tests/test_pipeline.py (outdated, resolved)

regisss reviewed May 10, 2024

sywangyi and others added 3 commits May 11, 2024 00:26

update test and example

c37d0d8

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Merge branch 'main' into seamless_m4t

49f186a

style

efb25dd

Enable test

4401ea4

ssarkar2 reviewed Jun 5, 2024

ssarkar2 added the synapse1.16 label Jun 5, 2024

Merge branch 'main' into seamless_m4t

ab56123

regisss approved these changes Jun 6, 2024

regisss merged commit 4dd1507 into main Jun 6, 2024
9 checks passed

regisss deleted the seamless_m4t branch June 6, 2024 17:16
