AttributeError: 'GenerationConfig' object has no attribute 'task_to_id' #25084
Comments
Also cc @sanchit-gandhi since it comes from the audio course. |
Can you link the full stacktrace if possible ? This might help us narrow it down faster. |
+1 on the full stack-trace. It might require an update to your generation config since this is a fine-tuned checkpoint and the API was updated to take the |
Here's a colab notebook to reproduce the error https://colab.research.google.com/drive/1kLjKWZSKmvPwBqnaN-NJxy6Hv4gG5oDJ?usp=sharing |
Thanks for the notebook @AmgadHasan! The generation config for this model is indeed missing, meaning it is created automatically from the config in the call to
from transformers import pipeline
asr = pipeline("automatic-speech-recognition", model='arbml/whisper-largev2-ar')
print(asr.model.generation_config)
Print Output:
If we compare this to the most recent generation config, i.e. the one for Whisper large-v2, we see that this model's generation config is missing both the language and task token ID mappings:
These language/task token mappings are used in the call to
Since using the language/task arguments as input to the
Probably what we can do here @ArthurZucker is throw an error when the user tries to call
A quick fix for this issue @AmgadHasan is updating the generation config for the model checkpoint (as per my previous comment) |
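A minimal sketch of the kind of check suggested above: fail early with a descriptive message when a task or language is requested but the generation config lacks the mappings. The helper names `missing_mappings` and `check_generation_config` are hypothetical and not part of transformers. `task_to_id` is the attribute named in the error; `lang_to_id` is assumed to be its language counterpart in the Whisper large-v2 generation config. The stand-in objects and token IDs are illustrative only.

```python
from types import SimpleNamespace

# "task_to_id" comes from the error message in this issue; "lang_to_id" is
# assumed to be the matching language mapping in the large-v2 config.
REQUIRED_MAPPINGS = ("lang_to_id", "task_to_id")


def missing_mappings(generation_config):
    """Return the names of the language/task mappings the config lacks."""
    return [name for name in REQUIRED_MAPPINGS if not hasattr(generation_config, name)]


def check_generation_config(generation_config, task=None, language=None):
    """Raise a descriptive error if task/language is requested without mappings."""
    if task is None and language is None:
        return  # nothing requested, so nothing to validate
    missing = missing_mappings(generation_config)
    if missing:
        raise ValueError(
            f"The generation config is missing {missing}, so the `task` and "
            "`language` arguments cannot be used. Update the generation "
            "config of the checkpoint first."
        )


# Stand-in objects for illustration; with transformers the object to check
# would be `asr.model.generation_config`. Token IDs are illustrative values.
outdated = SimpleNamespace(max_length=448)
updated = SimpleNamespace(
    lang_to_id={"<|en|>": 50259},
    task_to_id={"translate": 50358, "transcribe": 50359},
)
```

Calling `check_generation_config(outdated, task="translate")` raises a `ValueError` that names the missing attributes, while the same call on `updated` passes silently.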
Thanks @sanchit-gandhi ! This solved the issue. |
The simplest way of updating the generation config is as follows:
from transformers import GenerationConfig

MODEL_ID = "arbml/whisper-largev2-ar"  # set to your model id on the Hub
MULTILINGUAL = True  # set True for multilingual models, False for English-only

# Load the up-to-date generation config from the matching official checkpoint
if MULTILINGUAL:
    generation_config = GenerationConfig.from_pretrained("openai/whisper-large-v2")
else:
    generation_config = GenerationConfig.from_pretrained("openai/whisper-medium.en")

# Push it to your model repository (requires write access to MODEL_ID)
generation_config.push_to_hub(MODEL_ID)
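If you do not have write access to the checkpoint on the Hub, one possible variation is to swap the generation config on the already loaded model in memory instead of pushing it. This is only a sketch, not something confirmed in this thread: `swap_generation_config` is a hypothetical helper, and it assumes the pipeline exposes the model as `asr.model` (as in the snippet above) and that the model reads its settings from `model.generation_config`. Stand-in objects are used so the sketch runs without downloading anything.

```python
from types import SimpleNamespace


def swap_generation_config(model, reference_config):
    """Replace model.generation_config in memory and return the previous one."""
    previous = getattr(model, "generation_config", None)
    model.generation_config = reference_config
    return previous


# Hypothetical usage with transformers (not run here, needs network access):
#   from transformers import GenerationConfig, pipeline
#   asr = pipeline("automatic-speech-recognition", model="arbml/whisper-largev2-ar")
#   reference = GenerationConfig.from_pretrained("openai/whisper-large-v2")
#   swap_generation_config(asr.model, reference)

# Stand-in objects for illustration only; the token ID is an illustrative value.
model = SimpleNamespace(generation_config=SimpleNamespace(max_length=448))
reference = SimpleNamespace(task_to_id={"translate": 50358})
old_config = swap_generation_config(model, reference)
```

The change only lasts for the current session, so the Hub update above remains the durable fix.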
System Info
I am following the Audio course and tried to perform translation using the automatic speech recognition pipeline, but got an unexpected error.
Code:
Error:
AttributeError: 'GenerationConfig' object has no attribute 'task_to_id'
This was using Colab free tier on T4
transformers version:
This error arises when using generate_kwargs={"task": "translate"} or generate_kwargs={"task": "transcribe"}.
Tagging @Narsil to help with pipeline issues.
Who can help?
No response
Reproduction
from transformers import pipeline

asr = pipeline("automatic-speech-recognition", model='arbml/whisper-largev2-ar', device=0)
res = asr(
    audio_file_path,  # path to the audio file to translate
    max_new_tokens=256,
    generate_kwargs={"task": "translate"},
    chunk_length_s=30,
    batch_size=8,
)
Expected behavior
Should return a Python dict with a key named "text" that holds the English text.