
Fix: RateLimit requests were not released when a streaming generation exception occurred #11540

Merged
merged 2 commits into langgenius:main on Dec 11, 2024

Conversation

liuzhenghua
Contributor

Summary

When the APP_MAX_ACTIVE_REQUESTS environment variable is set and the streaming interface is called, an exception raised during generation can leave the active-request count unreleased, eventually causing the assistant to be rate-limited because of an excessive number of supposedly active requests.
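
The underlying pattern is easiest to see in a minimal sketch. The RateLimit class and the enter/exit names below are illustrative stand-ins for Dify's actual rate limiter, not its exact API: before the fix, an exception raised while producing stream chunks skipped the release call, leaking one slot per failed request; releasing the slot in a finally block guarantees it is returned however the generator ends.

    # Minimal sketch of the failure mode and fix; RateLimit, enter(),
    # and exit() are illustrative names, not Dify's exact API.
    import uuid
    from typing import Generator, Iterable


    class RateLimit:
        """Tracks the number of in-flight requests against a cap."""

        def __init__(self, max_active_requests: int):
            self.max_active_requests = max_active_requests
            self._active: set[str] = set()

        def enter(self) -> str:
            if len(self._active) >= self.max_active_requests:
                raise RuntimeError("Too many active requests")
            request_id = str(uuid.uuid4())
            self._active.add(request_id)
            return request_id

        def exit(self, request_id: str) -> None:
            self._active.discard(request_id)


    def generate_stream(rate_limit: RateLimit, chunks: Iterable[str]) -> Generator[str, None, None]:
        request_id = rate_limit.enter()
        # Buggy version: if yielding a chunk raises (e.g. the generation
        # rejects an empty query mid-stream), exit() is never reached and
        # the slot leaks. The fix is to release it unconditionally.
        try:
            for chunk in chunks:
                yield chunk
        finally:
            rate_limit.exit(request_id)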

For example, calling the chat-messages interface with the query parameter set to an empty string and response_mode set to streaming reproduces the issue, as in the script below.
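
A hypothetical reproduction script, assuming a local Dify API at localhost:5001 and a placeholder app key; the request body follows the chat-messages call described above:

    # Hypothetical repro; the base URL and API key are placeholders.
    import requests

    resp = requests.post(
        "http://localhost:5001/v1/chat-messages",
        headers={"Authorization": "Bearer app-xxxxxxxx"},
        json={
            "inputs": {},
            "query": "",  # empty query triggers the error path
            "response_mode": "streaming",
            "user": "repro-user",
        },
        stream=True,
    )
    for line in resp.iter_lines():
        print(line)

Before the fix, each failed call like this left one active-request slot occupied, so repeating it APP_MAX_ACTIVE_REQUESTS times would rate-limit the app.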


Checklist

Important

Please review the checklist below before submitting your pull request.

  • This change requires a documentation update, included in: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran dev/reformat (backend) and cd web && npx lint-staged (frontend) to appease the lint gods

@dosubot dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. 🐞 bug Something isn't working labels Dec 11, 2024
@crazywoola crazywoola requested a review from laipz8200 December 11, 2024 02:55
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Dec 11, 2024
@crazywoola crazywoola merged commit d05f189 into langgenius:main Dec 11, 2024
5 checks passed
iamjoel pushed a commit that referenced this pull request Dec 16, 2024
AlwaysBluer pushed a commit to AlwaysBluer/dify that referenced this pull request Dec 18, 2024
…m-vdb

* 'lindorm-vdb' of github.com:AlwaysBluer/dify:
  Fix/pdf preview in build (langgenius#11621)
  feat(devcontainer): add alias to stop Docker containers (langgenius#11616)
  ci: better print version for ruff to check the change (langgenius#11587)
  feat(model): add vertex_ai Gemini 2.0 Flash Exp (langgenius#11604)
  fix: name of llama-3.3-70b-specdec (langgenius#11596)
  Added new models and Removed the deleted ones for Groq langgenius#11455 (langgenius#11456)
  [ref] use one method to get boto client for aws bedrock (langgenius#11506)
  chore: translate i18n files (langgenius#11577)
  fix: support mdx files close langgenius#11557 (langgenius#11565)
  fix: change workflow trace id (langgenius#11585)
  Feat: dark mode for logs and annotations (langgenius#11575)
  Lindorm vdb (langgenius#11574)
  feat: add gemini-2.0-flash-exp (langgenius#11570)
  fix: better opendal tests (langgenius#11569)
  Fix: RateLimit requests were not released when a streaming generation exception occurred (langgenius#11540)
  chore: translate i18n files (langgenius#11545)
  fix: workflow continue on error doc link (langgenius#11554)