How were the speaker consistency scores generated in the documentation? #178

0xDigest · 2024-12-18T18:24:36Z

Hi all,

The speaker consistency scores listed in INFERENCE.md appear to have been updated via pull request 143. How were these embeddings/scores generated? Is it possible to reproduce or validate this somehow?

0xDigest · 2024-12-18T19:09:32Z

Was this done by pulling `generation.decoder_hidden_states`, taking the last hidden state of each tuple, pooling, and then computing something like cosine similarity?
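
To make the question concrete, here is a minimal sketch of the pool-then-compare idea described above. It is only an illustration of the hypothesis in this comment, not the method confirmed for INFERENCE.md. The helper names `mean_pool` and `cosine_similarity` are hypothetical, and random NumPy arrays stand in for real hidden states of shape `(seq_len, dim)`.

```python
import numpy as np


def mean_pool(hidden, mask=None):
    """Average a (seq_len, dim) array of hidden states over the time axis.

    `mask` is an optional length-seq_len sequence of 0/1 flags marking
    which time steps to include (hypothetical helper, for illustration).
    """
    hidden = np.asarray(hidden, dtype=np.float64)
    if mask is None:
        return hidden.mean(axis=0)
    mask = np.asarray(mask, dtype=np.float64)[:, None]
    # Guard against an all-zero mask to avoid dividing by zero.
    return (hidden * mask).sum(axis=0) / np.maximum(mask.sum(), 1.0)


def cosine_similarity(a, b, eps=1e-12):
    """Cosine similarity between two 1-D vectors."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    denom = max(np.linalg.norm(a) * np.linalg.norm(b), eps)
    return float(a @ b / denom)


# Stand-in data: two "utterances" with different lengths, same hidden size.
rng = np.random.default_rng(0)
utt_a = rng.normal(size=(50, 16))
utt_b = rng.normal(size=(65, 16))

score = cosine_similarity(mean_pool(utt_a), mean_pool(utt_b))
print(f"similarity: {score:.3f}")
```

Mean pooling is just one choice here; whether decoder hidden states carry enough speaker information for this to be meaningful is exactly the open question.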

ylacombe · 2025-01-15T08:37:07Z

I used this script to create the per-sample speaker similarity dataset, and then computed the statistics with standard tools. Hope it helps.
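
For anyone trying to reproduce the second step: the script referenced above is not reproduced here, and the "standard tools" are not named, so the following is only a sketch of one way to summarize per-sample similarity scores. The `summarize` helper and the example scores are made up for illustration and are not values from INFERENCE.md.

```python
import numpy as np


def summarize(scores):
    """Return basic summary statistics for per-sample similarity scores."""
    scores = np.asarray(scores, dtype=np.float64)
    return {
        "count": int(scores.size),
        "mean": float(scores.mean()),
        # Sample standard deviation; undefined for a single value, so use 0.0.
        "std": float(scores.std(ddof=1)) if scores.size > 1 else 0.0,
        "min": float(scores.min()),
        "max": float(scores.max()),
    }


# Placeholder scores, one per generated sample (illustrative only).
example_scores = [0.5, 0.7, 0.9]
print(summarize(example_scores))
```

Comparing the resulting mean and spread against the documented numbers would be one way to validate a reproduction, assuming the same samples and the same speaker embedding model are used.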

0xDigest changed the title ~~How were the voice consistency scores generated in the documentation?~~ How were the speaker consistency scores generated in the documentation? Dec 18, 2024
