The data for training Mimi #147

Open
1 task done
zruiii opened this issue Oct 30, 2024 · 2 comments
Labels
question Further information is requested

Comments

zruiii commented Oct 30, 2024

Due diligence

  • I have done my due diligence in trying to find the answer myself.

Topic

The paper

Question

It seems that Mimi was trained independently of Moshi, but I couldn’t find the dataset used to train Mimi. Did I miss something?

@zruiii zruiii added the question Further information is requested label Oct 30, 2024
@poupinel

Hi, and thanks to the Moshi team for the great work! I have the same question: is the data used to train Mimi similar to the 7 million hours used to pre-train Moshi, or closer to the data used to train EnCodec or SoundStream? @LaurentMazare @adefossez

Thanks!

isjwdu commented Nov 25, 2024

Same question! I only saw that the training data included 7 million hours of audio, but there is no specific information about the Mimi training dataset.

3 participants