How were the speaker consistency scores generated in the documentation? #178

Open
0xDigest opened this issue Dec 18, 2024 · 2 comments

0xDigest commented Dec 18, 2024

Hi all,

The speaker consistency scores listed in INFERENCE.md appear to have been updated via pull request 143. How were these embeddings/scores generated? Is it possible to reproduce or validate this somehow?

0xDigest changed the title from "How were the voice consistency scores generated in the documentation?" to "How were the speaker consistency scores generated in the documentation?" on Dec 18, 2024

0xDigest commented Dec 18, 2024

Was this done by pulling the generation.decoder_hidden_states, using the last hidden state of each tuple, pooling, and doing something like cosine similarity?
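
The guess above (pool the last-layer hidden states over time, then compare with cosine similarity) can be sketched as follows. This only illustrates the idea in the question and is not confirmed as the method behind the scores in INFERENCE.md. The function names and the random stand-in data are hypothetical; real inputs would be the per-step hidden states taken from the model output.

```python
import math
import random

def mean_pool(hidden_states):
    """Average a (seq_len x hidden_dim) list of vectors over time into one vector."""
    n = len(hidden_states)
    return [sum(column) / n for column in zip(*hidden_states)]

def cosine_similarity(a, b, eps=1e-12):
    """Cosine of the angle between two vectors; eps guards against zero norms."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / max(norm_a * norm_b, eps)

def pooled_similarity(states_a, states_b):
    """Mean-pool each sequence, then compare the two pooled vectors."""
    return cosine_similarity(mean_pool(states_a), mean_pool(states_b))

# Hypothetical stand-ins for two generations' last-layer hidden states.
# Sequence lengths may differ; the hidden dimension must match.
rng = random.Random(0)
states_a = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(50)]
states_b = [[rng.gauss(0.0, 1.0) for _ in range(16)] for _ in range(70)]

score = pooled_similarity(states_a, states_b)
print(f"pooled cosine similarity: {score:.3f}")
```

Mean pooling is one choice among several. Whether decoder hidden states carry enough speaker identity for this score to be meaningful is an open assumption here.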

ylacombe (Contributor) commented

I used this script to create the per-sample speaker similarity dataset and then computed the stats with classic tools. Hope it helps.
