The Q&A mode always fails. #12142

Open
5 tasks done
gongshaojie12 opened this issue Dec 27, 2024 · 12 comments
Labels
🐞 bug Something isn't working

Comments

@gongshaojie12

Self Checks

  • This is only for bug report, if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues search for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agreed to the Language Policy).
  • [FOR CHINESE USERS] Please submit issues in English; otherwise they will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.14.2

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

The configuration after I upload files to the knowledge base is as follows.
(Screenshots of the knowledge base configuration settings were attached here.)

When I click "Save and Process," the following issue occurs.

2024-12-27 02:26:50,448.448 ERROR [Thread-1123 (_format_qa_document)] [http_request.py:181] - Request: https://dashscope.aliyuncs.com/api/v1/services/aigc/multimodal-generation/generation failed, status: 429, message: Requests rate limit exceeded, please try again later.
2024-12-27 02:26:50,450.450 ERROR [Thread-1123 (_format_qa_document)] [qa_index_processor.py:159] - Failed to format qa document
Traceback (most recent call last):
  File "/app/api/core/model_runtime/model_providers/__base/large_language_model.py", line 110, in invoke
    result = self._invoke(
             ^^^^^^^^^^^^^
  File "/app/api/core/model_runtime/model_providers/tongyi/llm/llm.py", line 84, in _invoke
    return self._generate(model, credentials, prompt_messages, model_parameters, tools, stop, stream, user)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/core/model_runtime/model_providers/tongyi/llm/llm.py", line 205, in _generate
    return self._handle_generate_response(model, credentials, response, prompt_messages)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/core/model_runtime/model_providers/tongyi/llm/llm.py", line 220, in _handle_generate_response
    raise ServiceUnavailableError(response.message)
dashscope.common.error.ServiceUnavailableError: Requests rate limit exceeded, please try again later.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/api/core/rag/index_processor/processor/qa_index_processor.py", line 146, in _format_qa_document
    response = LLMGenerator.generate_qa_document(tenant_id, document_node.page_content, document_language)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/core/llm_generator/llm_generator.py", line 306, in generate_qa_document
    response = model_instance.invoke_llm(
               ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/core/model_manager.py", line 127, in invoke_llm
    return self._round_robin_invoke(
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/core/model_manager.py", line 310, in _round_robin_invoke
    return function(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/app/api/core/model_runtime/model_providers/__base/large_language_model.py", line 134, in invoke
    raise self._transform_invoke_error(e)
core.model_runtime.errors.invoke.InvokeServerUnavailableError: [tongyi] Server Unavailable Error, Requests rate limit exceeded, please try again later.

How can I resolve this issue?

✔️ Expected Behavior

No response

❌ Actual Behavior

No response

@dosubot dosubot bot added the 🐞 bug Something isn't working label Dec 27, 2024

dosubot bot commented Dec 27, 2024

To resolve the "Requests rate limit exceeded" error in your self-hosted Dify setup, you can try the following solutions:

  1. Reduce Concurrent Processing: Limit the number of files processed simultaneously or break them into smaller batches to reduce the load on the embedding service.

  2. Implement Retry Logic: Use retry logic with exponential backoff when rate-limit errors occur to manage the request flow more effectively. Dify implements exponential backoff by using a backoff factor and retrying the request with increasing delays [1]; a minimal sketch of this pattern is shown after this list.

  3. Use Multiple API Keys: Consider using multiple API keys to distribute the load across different keys, which can help in managing the rate limits better.

  4. Upgrade Machine Resources: Increase your machine's memory capacity to handle more processes without hitting rate limits.

  5. Use a Different Embedding Model Provider: Consider switching to an embedding model provider that offers higher rate limits to accommodate your processing needs.

  6. Check Rate Limits: Ensure that the rate limit for the Embedding model API key used has not been reached. Refer to the official documentation for details on rate limits.

These steps are based on handling similar rate limit issues in Dify version 0.13.2 and the Tongyi embedding service [2].
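For point 2, a minimal sketch of what such a retry wrapper could look like around the per-chunk call shown in the traceback above (the wrapper name, retry count, and delay values are illustrative assumptions, not Dify's built-in implementation):

import random
import time

# Module path and call signature are taken from the traceback above;
# everything else in this wrapper is an illustrative assumption.
from core.llm_generator.llm_generator import LLMGenerator


def generate_qa_with_backoff(tenant_id, page_content, document_language,
                             max_retries=5, base_delay=1.0):
    """Retry the per-chunk Q&A generation when the provider reports a
    rate limit, waiting base_delay * 2**attempt seconds (plus jitter)
    before each new attempt."""
    for attempt in range(max_retries):
        try:
            return LLMGenerator.generate_qa_document(
                tenant_id, page_content, document_language
            )
        except Exception as exc:  # ideally, catch the provider's specific rate-limit error
            last_attempt = attempt == max_retries - 1
            if "rate limit" not in str(exc).lower() or last_attempt:
                raise
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.5))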


@yihong0618
Contributor

Will check today, thanks.

@gongshaojie12
Author

  1. I only uploaded one file, and the number of files is not large.
  2. How exactly should the retry logic be implemented?
  3. How should the "Use Multiple API Keys" option be configured?
  4. Currently, the machine resources are quite sufficient. It's processing just one file, and 16GB of memory should be enough, right?
  5. I tried OpenAI's text-embedding-3-large, but the same issue occurred.
  6. Regarding rate limits, I only uploaded one file, which is 27MB in size. This shouldn't trigger the limit, right?


@gongshaojie12
Author

Will check today, thanks.

Alright, looking forward to the results. Thank you.

@AugNSo

AugNSo commented Dec 27, 2024

According to the log, the issue is not on the embedding side but on the LLM side. Q&A mode passes your chunked document content to the LLM to generate Q&A pairs. My guess is that your document has more chunks than the rate limit allows, which triggered your LLM provider's error.
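
For context, a rough sketch of the request pattern the log points to: one LLM call per chunk, fired from worker threads, which is why even a small file can trip a strict requests-per-minute quota. The call signature and module path come from the traceback earlier in this issue; the fan-out below is a simplified assumption, and result handling is omitted.

import threading

from core.llm_generator.llm_generator import LLMGenerator


def format_qa_documents(tenant_id, document_nodes, document_language):
    # One request per chunk: a two-page PDF can still yield enough chunks
    # for these near-simultaneous calls to exceed the provider's quota.
    threads = []
    for node in document_nodes:
        t = threading.Thread(
            target=LLMGenerator.generate_qa_document,
            args=(tenant_id, node.page_content, document_language),
        )
        t.start()
        threads.append(t)
    for t in threads:
        t.join()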

@gongshaojie12
Author

According to the log, the issue is not on the embedding side but on the LLM side. Q&A mode passes your chunked document content to the LLM to generate Q&A pairs. My guess is that your document has more chunks than the rate limit allows, which triggered your LLM provider's error.

I uploaded a PDF with only two pages, but the same issue occurred. The PDF is as follows.
mobile-aloha.pdf

@yihong0618
Contributor

According to the log, the issue is not on the embedding side but on the LLM side. Q&A mode passes your chunked document content to the LLM to generate Q&A pairs. My guess is that your document has more chunks than the rate limit allows, which triggered your LLM provider's error.

I uploaded a PDF with only two pages, but the same issue occurred. The PDF is as follows. mobile-aloha.pdf

This PDF works fine on my side.

@gongshaojie12
Author

gongshaojie12 commented Dec 27, 2024

That's strange. I only modified a few parameters in the .env and docker-compose.yaml files, as shown below.

.env

UPLOAD_FILE_SIZE_LIMIT=1024
UPLOAD_FILE_BATCH_LIMIT=500
ETL_TYPE=Unstructured
UNSTRUCTURED_API_URL=http://unstructured:8000/general/v0/general

docker-compose.yaml

UPLOAD_FILE_SIZE_LIMIT: ${UPLOAD_FILE_SIZE_LIMIT:-1024}
UPLOAD_FILE_BATCH_LIMIT: ${UPLOAD_FILE_BATCH_LIMIT:-500}
environment:
  NGINX_CLIENT_MAX_BODY_SIZE: 1024M

I haven't made any other changes. @yihong0618

@AugNSo

AugNSo commented Dec 27, 2024

I made the same changes to .env and docker-compose.yaml, but I still cannot replicate your problem with the PDF file you provided.

@gongshaojie12
Author

That's so strange.

@AugNSo

AugNSo commented Dec 27, 2024

What is your current Dify system reasoning model? Can you try changing it to a model with a higher rate limit, or to a locally deployed one, and see whether the problem persists?

@gongshaojie12
Author

Dify's system model is qwen-vl-max. After switching to qwen-max, it was able to process normally, but very long PDF files still fail to process. This is probably related to the model's rate limit. Deploying a model locally is not an option for me at the moment. Can Dify control the indexing speed during the indexing process?
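
No built-in indexing-speed setting comes up in this thread; one possible local workaround is to throttle the per-chunk Q&A calls before they reach the provider. A minimal sketch, assuming you are willing to patch the Q&A formatting path yourself (the concurrency limit and the delay are arbitrary placeholders, not existing Dify options):

import threading
import time

# Call signature taken from the traceback; the throttle itself is a
# hypothetical local patch, not an existing Dify feature.
from core.llm_generator.llm_generator import LLMGenerator

MAX_CONCURRENT_QA_CALLS = 2      # placeholder: tune to your provider's quota
MIN_SECONDS_BETWEEN_CALLS = 1.0  # placeholder: spacing between requests

_qa_semaphore = threading.Semaphore(MAX_CONCURRENT_QA_CALLS)


def throttled_generate_qa(tenant_id, page_content, document_language):
    """Limit how many Q&A-generation requests run at once and space them
    out, so a strict requests-per-minute quota is less likely to be hit."""
    with _qa_semaphore:
        result = LLMGenerator.generate_qa_document(
            tenant_id, page_content, document_language
        )
        time.sleep(MIN_SECONDS_BETWEEN_CALLS)
        return result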
