Fix: RateLimit requests were not released when a streaming generation exception occurred #11540

liuzhenghua · 2024-12-11T02:53:22Z

Summary

When the APP_MAX_ACTIVE_REQUESTS environment variable is set and the streaming interface is called, an exception raised during generation can leave the request's slot in the active-request count unreleased. As leaked slots accumulate, the assistant eventually gets rate-limited for having too many active requests, even though the failed requests are no longer running.

To reproduce the issue, call the chat-messages interface with the query parameter set to an empty string and response_mode set to streaming.
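
The leak described above can be illustrated with a minimal, self-contained sketch. It is not the code changed in this PR: `ActiveRequestLimiter`, `stream_leaky`, `stream_safe`, and `failing_chunks` are hypothetical names invented for illustration, and the sketch only assumes that the limiter tracks active request IDs and that the streaming response is a Python generator.

```python
class ActiveRequestLimiter:
    """Hypothetical stand-in for a limiter on concurrently active requests."""

    def __init__(self, max_active):
        self.max_active = max_active
        self.active = set()

    def enter(self, request_id):
        # Reject the request when every slot is already taken.
        if len(self.active) >= self.max_active:
            raise RuntimeError("too many active requests")
        self.active.add(request_id)

    def exit(self, request_id):
        # Release the slot; discard() is a no-op if it was already released.
        self.active.discard(request_id)


def failing_chunks(query):
    """Hypothetical generation step that rejects an empty query."""
    if not query:
        raise ValueError("query must not be empty")
    yield from query.split()


def stream_leaky(limiter, request_id, chunks):
    limiter.enter(request_id)
    for chunk in chunks:
        yield chunk
    # Never reached if iterating over `chunks` raises: the slot leaks.
    limiter.exit(request_id)


def stream_safe(limiter, request_id, chunks):
    limiter.enter(request_id)
    try:
        yield from chunks
    finally:
        # Runs on normal completion, on an exception, and on generator close.
        limiter.exit(request_id)
```

With `max_active=1`, one failed call through `stream_leaky` is enough to make every later `enter` raise, which mirrors the rate-limiting symptom in the summary. `stream_safe` releases the slot in a `finally` block, so the same failure leaves the limiter empty.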

Checklist

Important

Please review the checklist below before submitting your pull request.

This change requires a documentation update, included: Dify Document
I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
I've updated the documentation accordingly.
I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods

liuzhenghua added 2 commits December 11, 2024 10:42

Update app_generate_service.py

1c28e13

Update rate_limit.py

d84d413

dosubot bot added size:XS This PR changes 0-9 lines, ignoring generated files. 🐞 bug Something isn't working labels Dec 11, 2024

crazywoola requested a review from laipz8200 December 11, 2024 02:55

laipz8200 approved these changes Dec 11, 2024

View reviewed changes

dosubot bot added the lgtm This PR has been approved by a maintainer label Dec 11, 2024

crazywoola merged commit d05f189 into langgenius:main Dec 11, 2024
5 checks passed

iamjoel pushed a commit that referenced this pull request Dec 16, 2024

Fix: RateLimit requests were not released when a streaming generation exception occurred (#11540)

ec1efe5

laipz8200 mentioned this pull request Dec 16, 2024

chore: bump version to 0.14.0 #11679

Merged
