Extracting acoustic tokens #2

Closed
jpc opened this issue Feb 20, 2023 · 1 comment
Labels
goal Main sub-tasks of the project

Comments

jpc commented Feb 20, 2023

We have a notebook that shows how to extract acoustic tokens.

We are using the 1.5 kbps codec model for now, even though the speech quality is terrible. Google's generation examples sound much better. One explanation is that they trained a special-purpose VQ codec on the speech-only Libri-Light dataset, which gives better quality at lower bitrates than general-purpose audio codecs trained on speech, music and other sounds.
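
To make the bitrate concrete, here is a rough sketch of how a codec bitrate translates into acoustic tokens per second. It assumes a residual-VQ codec running at 75 frames per second with 1024-entry codebooks; those are the published figures for EnCodec's 24 kHz model, and the issue itself does not say which codec the notebook uses. The function names are made up for illustration.

```python
import math


def codebooks_for_bitrate(bitrate_bps: float, frame_rate_hz: float, codebook_size: int) -> int:
    """Number of residual codebooks needed to reach a target bitrate.

    Each codebook contributes log2(codebook_size) bits per frame.
    """
    bits_per_code = math.log2(codebook_size)
    return round(bitrate_bps / (frame_rate_hz * bits_per_code))


def tokens_per_second(bitrate_bps: float, frame_rate_hz: float, codebook_size: int) -> float:
    """Total acoustic tokens per second of audio, summed over all codebooks."""
    n_codebooks = codebooks_for_bitrate(bitrate_bps, frame_rate_hz, codebook_size)
    return n_codebooks * frame_rate_hz


# Assumed codec parameters (not stated in the issue).
FRAME_RATE_HZ = 75
CODEBOOK_SIZE = 1024

n_q = codebooks_for_bitrate(1500, FRAME_RATE_HZ, CODEBOOK_SIZE)
rate = tokens_per_second(1500, FRAME_RATE_HZ, CODEBOOK_SIZE)
print(f"{n_q} codebooks, {rate:.0f} tokens per second")
```

Under these assumptions a 1.5 kbps stream uses only two codebooks per frame, which keeps the token sequences short for training but leaves little capacity for fine acoustic detail.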

Sound quality is something we can fix with more training later on, once we prove that the whole pipeline works. For now we should cut everything else and focus on making training as easy as possible.

@jpc jpc added the goal Main sub-tasks of the project label Feb 20, 2023
@jpc jpc mentioned this issue Feb 28, 2023
9 tasks
jpc commented Mar 29, 2023

That works well; further work will be tracked in #10.

@jpc jpc closed this as completed Mar 29, 2023