
Converting a composer seq2seq t5 model throws an exception #754

Open
timsteuer opened this issue Nov 21, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@timsteuer

Environment

  • llm-foundry: latest

To reproduce

Steps to reproduce the behavior:

  1. Train an hf_t5 model.
  2. Download the composer checkpoint.
  3. Try to convert it back to Hugging Face format via scripts/inference/convert_composer_to_hf.py.
  4. The script crashes when trying to load the saved model as AutoModelForCausalLM.
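A minimal sketch of the failure mode in step 4, assuming the script attempts an AutoModelForCausalLM load on a T5 config: T5 is an encoder-decoder model and its config class is not registered with the causal-LM auto class, so the load raises a ValueError. The tiny config below is only so the example runs without downloading weights.

```python
from transformers import AutoModelForCausalLM, T5Config

# Tiny config so this runs locally without downloading any checkpoint.
config = T5Config(d_model=32, d_kv=16, d_ff=64, num_layers=2,
                  num_heads=2, vocab_size=100)

try:
    AutoModelForCausalLM.from_config(config)
except ValueError as err:
    # T5Config is not in the causal-LM mapping, so we land here.
    print(f"load failed: {type(err).__name__}")
```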

Expected behavior

The model is saved as a Hugging Face snapshot without any issue.

Additional context

Locally, I fixed this by simply loading with AutoModel instead of AutoModelForCausalLM.
I assume this is fine.

@timsteuer timsteuer added the bug Something isn't working label Nov 21, 2023
@dakinggg
Collaborator

Ah yes, that script only supports causal LMs right now. A note on your solution: I'm not certain, but AutoModel here may give you a T5Model rather than the T5ForConditionalGeneration you probably want. Worth double-checking that.

@timsteuer
Author

That was an interesting hint.

Just double-checked, and the model was indeed marked as a T5Model and not as a T5ForConditionalGeneration.

So I changed that in the conversion script so that it yields the right config. However, loading the final model via AutoModel still results in a T5Model, even though the config now explicitly states the correct model type.

On the other hand, if I load via AutoModelForSeq2SeqLM, it loads the lm_head. So I guess that is an HF-specific thing and not related to the conversion script per se.
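To make that behavior concrete, here is a small sketch (using a tiny local config and from_config instead of from_pretrained, so nothing is downloaded): AutoModel picks its class from the config's model_type and ignores the architectures field, returning the bare backbone, while AutoModelForSeq2SeqLM returns the model with the lm_head.

```python
from transformers import AutoModel, AutoModelForSeq2SeqLM, T5Config

# Tiny config; architectures explicitly names the seq2seq class.
config = T5Config(d_model=32, d_kv=16, d_ff=64, num_layers=2,
                  num_heads=2, vocab_size=100,
                  architectures=["T5ForConditionalGeneration"])

backbone = AutoModel.from_config(config)           # class chosen by model_type, not architectures
seq2seq = AutoModelForSeq2SeqLM.from_config(config)

print(type(backbone).__name__)       # T5Model (no lm_head)
print(type(seq2seq).__name__)        # T5ForConditionalGeneration
print(hasattr(seq2seq, "lm_head"))   # True
```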

@dakinggg
Collaborator

Yeah, AutoModel generally gives you the backbone model, while AutoModelForXYZ gives you the model with the adaptation/head for XYZ.
