In my setup I initialize a tokenizer and pass it to the pipeline. My expectation is that if I set padding_side directly on the tokenizer instance, the pipeline should not print this warning: A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set 'padding_side="left"' when initializing the tokenizer.
Alternatively, I would like to pass padding_side as a parameter when instantiating the pipeline, but I think passing tokenizer parameters through the pipeline is currently not envisioned, as discussed in #12039, #24707, and #22995.
This may have the same origin as the issue reported in #29378.
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction
from transformers import AutoTokenizer, pipeline, AutoModelForCausalLM, Conversation

msg = 'The capital of France '
modelname = 'microsoft/DialoGPT-small'

# init model and tokenizer
model = AutoModelForCausalLM.from_pretrained(modelname)
tokenizer = AutoTokenizer.from_pretrained(modelname, padding_side='left')
tokenizer.pad_token_id = tokenizer.eos_token_id

# hand over model & tokenizer instances to pipeline
chatbot = pipeline(task='conversational', model=model, tokenizer=tokenizer, framework='pt')
messages = [{"role": "system", "content": 'You are a helpful assistant'}, {"role": "user", "content": msg}]
response = chatbot(Conversation(messages=messages), pad_token_id=chatbot.tokenizer.eos_token_id)
This prints the warning message:
A decoder-only architecture is being used, but right-padding was detected! For correct generation results, please set padding_side='left' when initializing the tokenizer.
Expected behavior
The warning message should not be printed, since padding_side was set during tokenizer initialization.
Alternatively, I would expect to be able to pass a padding_side parameter directly when creating the pipeline.
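One possible explanation, offered as an assumption rather than a confirmed diagnosis: if the right-padding check only looks at whether the last token of any input row equals pad_token_id, then a prompt that legitimately ends in the eos token would trigger the warning whenever pad_token_id is set to eos_token_id, regardless of padding_side. The pure-Python sketch below illustrates that reasoning. The helper names pad_batch and looks_right_padded are hypothetical and not part of transformers, and the token ids are made up for illustration.

```python
# Illustrative sketch only. `pad_batch` and `looks_right_padded` are
# hypothetical helpers, not transformers functions.
EOS_ID = 50256  # illustrative eos id
PAD_ID = EOS_ID  # mirrors `tokenizer.pad_token_id = tokenizer.eos_token_id`


def pad_batch(sequences, pad_id, padding_side="left"):
    """Pad every sequence to the longest length on the requested side."""
    width = max(len(seq) for seq in sequences)
    padded = []
    for seq in sequences:
        filler = [pad_id] * (width - len(seq))
        padded.append(filler + seq if padding_side == "left" else seq + filler)
    return padded


def looks_right_padded(batch, pad_id):
    """Assumed heuristic: flag the batch if any row ends with the pad id."""
    return any(row[-1] == pad_id for row in batch)


# Made-up token ids for a prompt that ends with the eos token.
prompt_with_eos = [11, 22, 33, EOS_ID]
prompt_without_eos = [11, 22, 33]

# Left padding is requested, yet the trailing eos token matches the pad id.
left_batch = pad_batch([prompt_with_eos], PAD_ID, padding_side="left")
print(looks_right_padded(left_batch, PAD_ID))

# Without a trailing eos token, the same check stays quiet.
clean_batch = pad_batch([prompt_without_eos, [11]], PAD_ID, padding_side="left")
print(looks_right_padded(clean_batch, PAD_ID))
```

If this assumption holds, the warning in the reproduction would be a false positive caused by the shared pad/eos id rather than by the tokenizer ignoring padding_side.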
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.
Please note that issues that do not follow the contributing guidelines are likely to be ignored.
System Info
Python = 3.11.5
Transformers = 4.37.2
@Narsil
Who can help?
No response