Bloom tokenizer #1403
The preset test will fail until I have permission to make the model public on Kaggle. I tried loading the preset in a Kaggle notebook, since I have access to the link, and it worked fine.
@abuelnasr0 can you share it with me? My profile is https://www.kaggle.com/matthewdwatson. Public sharing is not rolled out on Kaggle quite yet.
Really awesome work! If you can share the one conversion you did with me, I will move it over to the Keras team.
We can probably just go with the one preset as we finish the model code, e.g. add the CausalLM. Once we have a full end-to-end demo with 560m, we can upload the rest.
@mattdangerw I have shared the model with you now.
69b8cdf to 62a2db5
@abuelnasr0 Thanks! I think this is all ready to go. I just copied the model you shared over to the Keras team. If tests look green, I will merge this in!
Huh, I need to figure out how to make the Keras model public. Will check on this tomorrow!
Finally pulling this in! Sorry about the delays. Can you add some generative stuff on top? E.g. the CausalLM class and co?
@mattdangerw of course. I will start by adding the preprocessor, then causal_lm_preprocessor and causal_lm.
The conversion script worked, and the backbone and tokenizer generate the same output as the Hugging Face implementation.
Here is a gist: https://colab.research.google.com/gist/abuelnasr0/a603fd2077b6cc579b51876b8cefecd9/bloom.ipynb
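For readers who want to repeat this kind of check, here is a minimal sketch of an output comparison. It is not the code from the gist: the helper name `outputs_match` and the `1e-4` tolerance are assumptions for illustration, and it works on plain flat lists of floats rather than real model tensors.

```python
import math


def outputs_match(ours, reference, atol=1e-4):
    """Return True when two flat lists of floats agree within `atol`.

    `ours` and `reference` stand in for flattened model outputs
    (hypothetical inputs; the tolerance is an assumed value).
    """
    if len(ours) != len(reference):
        return False
    return all(
        math.isclose(a, b, rel_tol=0.0, abs_tol=atol)
        for a, b in zip(ours, reference)
    )
```

With real outputs you would flatten both arrays first. Token IDs from the tokenizer can be compared with plain equality instead of a tolerance.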
I have also changed the preset names. @mattdangerw please take a look at them and tell me if you want to change them.
Regarding uploading the model variations to Kaggle, I have an issue. I can't upload from my local machine because I have a bad internet connection with a limited quota, and that quota would run out if I tried to upload or download models. I only work in Colab or Kaggle.
So I tried providing a link to a Google Drive folder containing the preset model, and it didn't work.
I also tried to add a variation from a notebook output, but Kaggle doesn't show the notebook that outputs the preset as an option for me, although it is public and has Output Data. The notebook link: https://www.kaggle.com/code/mohamedabuelnasr/bloom-model-saving
Any suggestions for uploading the preset?
UPDATE
I managed to upload the preset correctly after zipping the "assets" directory in the Kaggle output and copying its link, along with the links to the other files, from the Kaggle notebook output.
But it would be useful if someone could tell me why my notebook output doesn't appear in the "Notebook output" section when uploading the model.
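For anyone hitting the same problem, the zipping step could look roughly like the sketch below. It is not the exact code from the notebook: the helper name `zip_assets` and the directory layout (a preset directory that contains an "assets" subdirectory) are assumptions.

```python
import shutil
from pathlib import Path


def zip_assets(preset_dir, out_dir):
    """Zip the "assets" subdirectory of `preset_dir` into `out_dir`/assets.zip.

    Both paths are hypothetical; adjust them to wherever the preset was saved.
    Returns the path of the created archive as a string.
    """
    preset_dir = Path(preset_dir)
    if not (preset_dir / "assets").is_dir():
        raise FileNotFoundError(f"no 'assets' directory under {preset_dir}")
    # base_dir="assets" keeps the "assets/" prefix inside the archive.
    return shutil.make_archive(
        str(Path(out_dir) / "assets"),
        "zip",
        root_dir=str(preset_dir),
        base_dir="assets",
    )
```

In a Kaggle notebook the archive would typically be written somewhere under the notebook's output directory so that it gets its own downloadable link.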