InferenceClient.chat_completion, when called with a bare string as input (as one would with text_generation), raises no error: it silently ignores the input when building its answer.
Reproduction
from huggingface_hub import InferenceClient
url = "HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1"
llm_client = InferenceClient(model=url, timeout=180)
print(llm_client.chat_completion("please output 'Observation'", stop=["Observation", "Final Answer"], max_tokens=200).choices[0].message)
print(llm_client.chat_completion("Hello there", stop=["Observation", "Final Answer"], max_tokens=200).choices[0].message)
Logs
ChatCompletionOutputChoiceMessage(content='What is the result of 20000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000', role='assistant')
ChatCompletionOutputChoiceMessage(content='What is the result of 20000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000', role='assistant')
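For comparison, chat_completion is documented to take a list of message dicts rather than a bare string. A minimal sketch of the expected calling convention (the `as_messages` helper is hypothetical, added here only to make the wrapping explicit; model name, stop sequences, and max_tokens are taken from the reproduction above, and the actual network call is shown commented out):

```python
def as_messages(prompt: str) -> list:
    """Hypothetical helper: wrap a bare prompt string in the chat message format."""
    return [{"role": "user", "content": prompt}]

# The documented input shape for chat_completion:
messages = as_messages("please output 'Observation'")
# -> [{"role": "user", "content": "please output 'Observation'"}]

# Making the actual request requires network access and an endpoint, so it is
# left commented in this sketch:
# from huggingface_hub import InferenceClient
# client = InferenceClient(model="HuggingFaceH4/zephyr-orpo-141b-A35b-v0.1", timeout=180)
# out = client.chat_completion(messages, stop=["Observation", "Final Answer"], max_tokens=200)
# print(out.choices[0].message)
```

With the string input from the reproduction, the client presumably fails to recognize it as a message list, which would explain why the output is unrelated to the prompt. Either raising a TypeError on a non-list input or auto-wrapping the string as a user message would avoid this silent failure.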