
Concurrent API requests to Gemini vision model cause non-responsive behavior and block all tasks forever #9273

Closed
Allamss opened this issue Oct 12, 2024 · 7 comments · Fixed by #11497
Labels
🐞 bug Something isn't working 🙏 help wanted Extra attention is needed

Comments


Allamss commented Oct 12, 2024

Self Checks

  • This is only for bug report, if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please submit issues in English, otherwise they will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.9.1 / 0.9.2 / 0.11.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

  1. Configure Gemini 1.5 Pro 001

  2. Create a new workflow that uses an LLM with vision capability to briefly analyze images

  3. Upload an image via the API, then make 50 concurrent API calls to execute the image-analysis workflow (I'm using Java; 20 concurrent calls works fine). The API calls hang indefinitely without returning, and all LLM operations on the site become blocked. A minimal Java sketch of the concurrent calls is shown after these steps.
    (screenshot attached)

  4. According to the workflow logs, the LLM requests are successfully returned.
    (screenshot of the workflow logs attached)

  5. This issue does not occur with GPT vision model
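
For reference, here is a minimal Java sketch of the concurrent calls described in step 3. The endpoint path (/v1/workflows/run), the app API key, and the request payload are assumptions based on a typical self-hosted Dify setup, not taken from the original report; adjust them to your instance.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ConcurrentWorkflowRepro {
    // Hypothetical values: adjust to your self-hosted Dify instance and app key.
    private static final String ENDPOINT = "http://localhost/v1/workflows/run";
    private static final String API_KEY = "app-xxxxxxxx";

    public static void main(String[] args) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(10))
                .build();
        ExecutorService pool = Executors.newFixedThreadPool(50);

        for (int i = 0; i < 50; i++) {
            pool.submit(() -> {
                // Minimal blocking-mode body; a real repro would reference the
                // uploaded file id in "inputs".
                String body = "{\"inputs\": {}, \"response_mode\": \"blocking\", \"user\": \"load-test\"}";
                HttpRequest request = HttpRequest.newBuilder()
                        .uri(URI.create(ENDPOINT))
                        .timeout(Duration.ofSeconds(120))
                        .header("Authorization", "Bearer " + API_KEY)
                        .header("Content-Type", "application/json")
                        .POST(HttpRequest.BodyPublishers.ofString(body))
                        .build();
                try {
                    HttpResponse<String> response =
                            client.send(request, HttpResponse.BodyHandlers.ofString());
                    System.out.println("status=" + response.statusCode());
                } catch (Exception e) {
                    // With the bug present, requests time out or hang instead of returning.
                    System.out.println("request failed: " + e);
                }
            });
        }
        pool.shutdown();
    }
}
```

With the bug present, most of these calls never complete; on a healthy deployment they should all return an HTTP status well within the request timeout.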

✔️ Expected Behavior

Results should be returned normally

❌ Actual Behavior

Results are not returned as expected, and the entire site becomes blocked for all users. The site can only be restored by restarting it.

@dosubot dosubot bot added the 🐞 bug Something isn't working label Oct 12, 2024

dosubot bot commented Oct 12, 2024

The issue you're experiencing with the Gemini vision model might be related to how it handles concurrency. Here are a few steps you can take to troubleshoot and potentially resolve the problem:

  1. Concurrency Limits: Check if there are any concurrency limits configured for the Gemini model. If not documented, you might need to experiment with reducing the number of concurrent requests to see if it alleviates the issue.

  2. Error Handling: Ensure that your application is correctly handling any errors returned by the API. The Gemini model has specific error handling mechanisms that map various errors to unified types [1].

  3. Resource Allocation: Compare the resource allocation settings between the Gemini and GPT models. Differences in resource handling might be causing the blockage.

  4. Logs and Monitoring: Utilize Dify's logging and monitoring features to diagnose the issue. The logs can provide insights into what happens during the API calls and why they might be hanging [2].

  5. Timeouts and Retries: Implement timeouts and retries in your API requests to prevent indefinite hanging (see the sketch below). This can help manage requests that take too long to process.

  6. Version and Configuration: Ensure that your Dify version and configuration are up to date and correctly set up for high-concurrency scenarios.

If these steps don't resolve the issue, consider isolating the problem by testing with fewer concurrent requests or different configurations to pinpoint the cause.
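
As an illustration of point 5, the client can set a per-attempt timeout and retry a limited number of times instead of blocking forever on an unresponsive server. This is only a client-side mitigation sketch; the endpoint, key, and payload are placeholders, not values confirmed by this issue.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

public class WorkflowCallWithRetry {

    private static final HttpClient CLIENT = HttpClient.newBuilder()
            .connectTimeout(Duration.ofSeconds(10))
            .build();

    // Sends the request with a per-attempt timeout and retries a few times
    // so the caller fails fast instead of hanging indefinitely.
    static HttpResponse<String> sendWithRetry(HttpRequest request, int maxAttempts)
            throws Exception {
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return CLIENT.send(request, HttpResponse.BodyHandlers.ofString());
            } catch (Exception e) {
                last = e;
                // Simple linear backoff between attempts.
                Thread.sleep(1000L * attempt);
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint and key; replace with your own.
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("http://localhost/v1/workflows/run"))
                .timeout(Duration.ofSeconds(60)) // fail the attempt rather than block forever
                .header("Authorization", "Bearer app-xxxxxxxx")
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        "{\"inputs\": {}, \"response_mode\": \"blocking\", \"user\": \"demo\"}"))
                .build();
        HttpResponse<String> response = sendWithRetry(request, 3);
        System.out.println(response.statusCode());
    }
}
```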

To continue talking to Dosu, mention @dosu.


Allamss commented Oct 12, 2024


Responding to the points above:

  1. The Gemini model has no extra configuration. Crucially, the logs show that Dify's requests to Gemini are returning (with execution time and token usage), indicating that the issue is not caused by provider-side concurrency limits.

  2. Firstly, the API requests don't return at all but remain blocked indefinitely. I've checked the logs for the web, API, and sandbox containers, and there are no error messages.

  3. As per point 1.

  4. As per point 1.

  5. After the blockage occurs, any API or site LLM requests become blocked as well.

  6. I'll try this suggestion later. @dosu


dosubot bot commented Oct 12, 2024

Given the situation, here are some additional steps you might consider:

  1. Server Configuration: Ensure that your server settings are optimized for high concurrency. Adjust settings like SERVER_WORKER_AMOUNT and SQLALCHEMY_POOL_SIZE to handle more simultaneous connections [1].

  2. Gevent Worker Class: Since you're using Docker, confirm that SERVER_WORKER_CLASS is set to gevent to efficiently manage multiple connections (a sample .env sketch follows below) [1].

  3. Database Bottlenecks: Check if there are any database bottlenecks. Increasing the database connection pool size might help if the database is a limiting factor.

  4. Network and Firewall: Verify that there are no network or firewall settings that might be limiting the number of concurrent connections.

  5. Resource Monitoring: Use monitoring tools to check CPU, memory, and network usage during the concurrent requests to identify any resource exhaustion.

  6. Isolation Testing: Try running the Gemini model in isolation with a reduced number of requests to see if the issue persists, which might help identify if the problem is with the model or the infrastructure.

These steps should help you further diagnose and potentially resolve the issue.
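
As a sketch of points 1–3, the relevant settings live in Dify's docker .env file. The values below are illustrative assumptions rather than recommendations; tune them to your hardware and expected load, and recreate the api/worker containers (e.g., docker compose up -d) so the changes take effect.

```
# API server workers (illustrative values)
SERVER_WORKER_AMOUNT=4
SERVER_WORKER_CLASS=gevent

# Database connection pool
SQLALCHEMY_POOL_SIZE=30
```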

To continue talking to Dosu, mention @dosu.

@crazywoola crazywoola added the 🙏 help wanted Extra attention is needed label Oct 12, 2024

Allamss commented Oct 16, 2024

I’ve tried the latest version 0.9.2, and the issue is still reproducible @crazywoola @dosu


hw872715125 commented Oct 18, 2024

I also encountered this problem @crazywoola


dosubot bot commented Nov 21, 2024

Hi, @Allamss. I'm Dosu, and I'm helping the Dify team manage their backlog. I'm marking this issue as stale.

Issue Summary:

  • The Gemini vision model becomes non-responsive with 50 concurrent API requests.
  • Although the workflow logs show the LLM requests returning successfully, tasks remain blocked, unlike with the GPT vision model.
  • Troubleshooting steps included checking concurrency limits, error handling, and resource allocation.
  • You confirmed the issue persists without error messages and affects all API requests post-blockage.
  • The issue remains unresolved in version 0.9.2, as confirmed by you and another user.

Next Steps:

  • Please let us know if this issue is still relevant to the latest version of the Dify repository. If so, you can keep the discussion open by commenting on this issue.
  • Otherwise, this issue will be automatically closed in 15 days.

Thank you for your understanding and contribution!

@dosubot dosubot bot added the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Nov 21, 2024

Allamss commented Nov 22, 2024

@dosu, in the latest version 0.11.2, the problem can still be reproduced.

@dosubot dosubot bot removed the stale Issue has not had recent activity or appears to be solved. Stale issues will be automatically closed label Nov 22, 2024