The inference API truncates the response #487

Open
NormXU opened this issue Feb 17, 2024 · 1 comment

Comments


NormXU commented Feb 17, 2024

Issue Description

I have been hosting an Image-to-Text pipeline with this model for a while. The Inference API widget worked quite well until recently, when a developer reported that the inference widget always cuts the response short. However, when he ran the model locally with the same example, the response was complete. I tried previously successful examples myself and found that they all returned the same truncated results.
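
For context, running the same example locally looks roughly like this (a minimal sketch; the model id and image path are placeholders, and max_new_tokens=800 is only an illustrative limit):

from transformers import pipeline

# Placeholder model id; stands in for the actual image-to-text repo discussed here.
pipe = pipeline("image-to-text", model="your-username/your-image-to-text-model")

# Locally, the full caption comes back when a generous generation limit is set.
result = pipe("example.png", max_new_tokens=800)
print(result[0]["generated_text"])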

I hadn't updated the model weights or configs before trying to fix this issue. Please check the issue for more details.

These are all the commits I made after reading the issue:

  1. Following the advice mentioned in the issue, I first tried adding the following to the model card:

inference:
  parameters:
    max_length: 800

But it didn't work.

  2. Then, I guessed that the encoder config might be confusing the API, so I tried editing encoder.max_length=800. That still failed to fix the problem, so I edited it back (see the config-inspection sketch after this list).
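
For anyone reproducing this, here is a minimal sketch of how one might inspect the length-related settings the hosted pipeline could be picking up (the model id is a placeholder, and I am assuming the usual transformers config layout):

from transformers import AutoConfig, GenerationConfig

model_id = "your-username/your-image-to-text-model"  # placeholder for the actual repo

# Length setting on the top-level model config (sub-configs exist for encoder-decoder models).
config = AutoConfig.from_pretrained(model_id)
print("config.max_length:", getattr(config, "max_length", None))

# Generation defaults, if the repo ships a generation_config.json.
try:
    gen_config = GenerationConfig.from_pretrained(model_id)
    print("generation max_length:", gen_config.max_length)
    print("generation max_new_tokens:", gen_config.max_new_tokens)
except OSError:
    print("no generation_config.json in the repo")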

I suspect this is an Inference API bug that causes the truncated responses.
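
To check whether the truncation happens in the API itself rather than only in the widget, one can query the hosted endpoint directly; a minimal sketch, assuming the standard Inference API call with raw image bytes (the model id, token, and file name are placeholders):

import requests

API_URL = "https://api-inference.huggingface.co/models/your-username/your-image-to-text-model"  # placeholder
headers = {"Authorization": "Bearer hf_xxx"}  # placeholder token

with open("example.png", "rb") as f:
    image_bytes = f.read()

# If the text returned here is also cut short, the truncation is in the API,
# not only in the widget.
response = requests.post(API_URL, headers=headers, data=image_bytes)
print(response.json())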
