add a basic trainer and dataset #1
manmay-nakhashi commented on Jul 12, 2024
- huggingface dataset
- basic trainer
to test or train just run
Note: need to add a path to vocab.json
@manmay-nakhashi Manmay! i remember you now from the natural speech work we did together some time ago. thanks for the PR! I will check it out tomorrow morning 😄
@manmay-nakhashi hey, looks good! 😄 do you want to try pulling and integrating the text as well?
Sure, I'll do that.
@lucidrains it's ready
i'll write an inference script next so we can do some quick experiments.
nice! it looks good, but in the paper they didn't use a tokenizer and just went character level. i was thinking we could just use utf character ids? (could remove the tokenizer and
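For illustration, the character-level idea suggested here could look roughly like the sketch below. It is not code from this PR: the function names are hypothetical, and "utf character ids" is read two ways, as Unicode code points (one id per character) or as UTF-8 byte values (ids always in 0..255), since the comment does not say which was meant.

```python
from typing import List


def text_to_ids(text: str) -> List[int]:
    # Hypothetical helper: one id per character, using its Unicode code point.
    return [ord(ch) for ch in text]


def ids_to_text(ids: List[int]) -> str:
    # Inverse of text_to_ids: map code points back to characters.
    return "".join(chr(i) for i in ids)


def text_to_byte_ids(text: str) -> List[int]:
    # Alternative reading: UTF-8 byte values, so the vocabulary is fixed at 256.
    return list(text.encode("utf-8"))


def byte_ids_to_text(ids: List[int]) -> str:
    # Inverse of text_to_byte_ids; raises if the bytes are not valid UTF-8.
    return bytes(ids).decode("utf-8")


print(text_to_ids("hi!"))      # [104, 105, 33]
print(text_to_byte_ids("é"))   # [195, 169]
```

Either mapping would drop the vocab.json dependency; the byte variant keeps the embedding table small, while the code-point variant gives one id per character at the cost of a much larger id range.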
@lucidrains changes are done
@lucidrains resolved all the suggestions
@manmay-nakhashi thank you Manmay!