Convert T5x models to PyTorch #15464
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
Think @stefan-it has a working script :-)
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored.
@stefan-it, can you share that script?
@stefan-it, hey, could you please tell me how exactly the conversion script works? I tried running the conversion script, and it seems like the config file in t5x is in .gin format while the script expects the config file to be in .json format. Hence I was stuck converting my t5x model to HF. Could you please show me how it's done and provide some details?
Hi @StephennFernandes, could you please try to use the steps mentioned in the corresponding PR? The config file needs to be in JSON format, yes :)
If you get any errors, please post them here, so we can try to find a solution 🤗
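If a config.json for this architecture is not already at hand, one way to create one (a minimal sketch, assuming the T5X run used the standard t5_1_1 base hyperparameters; adjust the fields if your .gin file differs) is to export the config of the matching Hub checkpoint:

```python
from transformers import T5Config

# Assumption: the T5X checkpoint matches the public t5_1_1 "base" architecture.
# If your .gin file changes d_model, num_layers, etc., edit the resulting JSON accordingly.
config = T5Config.from_pretrained("google/t5-v1_1-base")
config.save_pretrained("./")  # writes ./config.json for the conversion script to consume
```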
@stefan-it, thanks for replying. I followed the steps as instructed in #16853 and tried converting my pretrained t5_1_1_base model to Hugging Face, but I get the following error:
Hi @StephennFernandes could you try to install:
The |
@stefan-it, hey, I tried that but it didn't work for me; I still get the same error. I came across this issue in the t5x repo: #452. I am currently using Ubuntu 20.04 with Linux kernel 5.13.0
Hi @StephennFernandes, I think I have a working solution now. I installed everything in a fresh new virtual environment, but I got bazel errors during the build (hopefully Google will stop using bazel someday...). What I did then:

```bash
pip3 install --upgrade tensorstore
```

to install the latest version of tensorstore, and then ran the conversion:

```bash
python3 convert_t5x_checkpoint_to_flax.py --t5x_checkpoint_path ./t5_1_1_small --config_name ./config_1_1.json --flax_dump_folder_path ./t5x_1_1_exported
```

But note: running

```bash
realpath ./t5_1_1_small
```

returns something like /home/stefan/transformers/src/transformers/models/t5/t5_1_1_small, so use this absolute path for the --t5x_checkpoint_path argument.

I hope this works! It worked under my local setup. (Oh, and in case you get some strange |
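As a quick sanity check after the export (a sketch, assuming the dump folder from the command above and that Flax is installed), the converted checkpoint should load cleanly:

```python
from transformers import FlaxT5ForConditionalGeneration

# The export folder should now contain config.json and flax_model.msgpack
model = FlaxT5ForConditionalGeneration.from_pretrained("./t5x_1_1_exported")
print(model.config.d_model, model.config.num_layers)
```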
@stefan-it, it worked 🎉 Thanks a ton for all the help 🙏 Actually, I still have a couple of other questions:
@StephennFernandes Here is a link to a convenience script that I am using for creating the PyTorch and TF models: https://github.com/peregilk/north-t5/blob/main/create_pytorch_tf_and_vocab.py Do not expect it to run directly though. It was really not meant for the public. However, it should give you the basic idea about how to load the models and then save them in the correct format.
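For reference, a minimal sketch of that idea (not @peregilk's actual script; the paths and the SentencePiece file name are placeholders): load the exported Flax checkpoint with transformers and re-save it in PyTorch and TensorFlow formats, plus a tokenizer.

```python
from transformers import (
    T5ForConditionalGeneration,
    TFT5ForConditionalGeneration,
    T5Tokenizer,
)

flax_dump = "./t5x_1_1_exported"  # output folder of convert_t5x_checkpoint_to_flax.py
out_dir = "./t5_1_1_hf"           # placeholder output directory

# Load the Flax weights into a PyTorch model and write pytorch_model.bin + config.json
pt_model = T5ForConditionalGeneration.from_pretrained(flax_dump, from_flax=True)
pt_model.save_pretrained(out_dir)

# Load the PyTorch weights into a TF model and write tf_model.h5
tf_model = TFT5ForConditionalGeneration.from_pretrained(out_dir, from_pt=True)
tf_model.save_pretrained(out_dir)

# Build the tokenizer from the SentencePiece model used for pretraining (path is an assumption)
tokenizer = T5Tokenizer("spiece.model")
tokenizer.save_pretrained(out_dir)
```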
@peregilk, thanks for sharing. Actually, the link isn't available; I believe it's private. Could you please check and confirm?
@StephennFernandes Sorry about that. Now it is public. As a side note, especially to @patrickvonplaten: Wouldn't it be nice to put a wrapper around the great script that @stefan-it has made? A script that also loads the models in HuggingFace and saves them in PyTorch and TF format, as well as creating the necessary tokenizers. Maybe it could even copy over the training logs that are saved in the t5x checkpoint directory. I have done this manually on these models: https://huggingface.co/north/t5_large_NCC. As you can see, the tensorboard logs from t5x integrate nicely with the Training Metrics in HF.
I think this would indeed be a great idea! Maybe we can open a |
@stefan-it @patrickvonplaten Actually, I have pretrained an mt5-base model, but I am unable to convert it to Hugging Face. I tried several Hugging Face config.json files from t5-efficient-base, but none of them worked. The following is the error I get when converting:
Hi @StephennFernandes, really interesting. I haven't tried it with the scaled T5X models yet (the efficient T5 models that can be found on the Model Hub are converted from the TensorFlow checkpoints, because they were trained with the official T5 implementation and not with T5X). Please give me some time to investigate that :)
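One thing that might be worth trying in the meantime (an assumption, not something verified in this thread): since the checkpoint was pretrained as mt5-base, deriving the config from google/mt5-base rather than from the t5-efficient-* configs, because mT5 uses a different vocabulary size and feed-forward variant than the English T5 models.

```python
from transformers import MT5Config

# Assumption: the T5X run used the standard mt5-base architecture (mT5 vocab and gated activations)
config = MT5Config.from_pretrained("google/mt5-base")
config.save_pretrained("./")  # writes config.json that can be passed to the conversion script
```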
Does this script support the transformation of XL or XXL models? |
@joytianya I have been using this script a lot for converting both XL and XXL models. Works fine.
@peregilk, thanks for your answer. I tried it and it generated the following files in /content/flan_t5x_xl_exported. I then used the code below (T5ForConditionalGeneration) to load the directory and got an error (Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found). How do I solve it?

```python
model = T5ForConditionalGeneration.from_pretrained("/content/flan_t5x_xl_exported", from_flax=True)
# Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in
# directory /content/flan_t5x_xl_exported. /content/flan_t5x_xl_exported:
```
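One way to narrow this down (a sketch, assuming the conversion ran to completion): check whether flax_model.msgpack is actually in that directory, and if it is, convert it once to PyTorch and re-save so a plain pytorch_model.bin exists for later loads.

```python
import os

from transformers import T5ForConditionalGeneration

export_dir = "/content/flan_t5x_xl_exported"
print(os.listdir(export_dir))  # expect at least config.json and flax_model.msgpack here

# If flax_model.msgpack is present, this converts it and writes pytorch_model.bin,
# so subsequent loads no longer need from_flax=True.
model = T5ForConditionalGeneration.from_pretrained(export_dir, from_flax=True)
model.save_pretrained(export_dir)
```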
@stefan-it
@joytianya Try opening the files here: https://huggingface.co/north/t5_xl_NCC. All of these are converted using the script written by @stefan-it. Note that the large PyTorch files are split into multiple smaller files.
@peregilk
@joytianya, I do not think this splitting is really related to the conversion script that @stefan-it wrote. Transformers does this automatically with large files.
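For context on the splitting: recent transformers versions shard large checkpoints automatically when saving, and from_pretrained reassembles the shards transparently; the shard size can be tuned if needed (a sketch with a placeholder path):

```python
from transformers import T5ForConditionalGeneration

model = T5ForConditionalGeneration.from_pretrained("./t5_xl_hf")  # placeholder local path
# Writes pytorch_model-00001-of-0000N.bin shards plus an index file instead of one huge file
model.save_pretrained("./t5_xl_hf_sharded", max_shard_size="5GB")
```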
ok, thank you
🚀 Feature request
Google's new Flax implementation of T5, called T5X, creates models/checkpoints in a custom format.
The config is stored in .gin files, and the current T5 conversion scripts, like this ByT5 conversion script, do not work.
Would it be possible to create a script for converting the T5x checkpoints/models?
@patrickvonplaten
@anton-l