A huge of fail request indexing-status when upload knowledge file #3207

Closed
4 tasks done
trongnichi opened this issue Apr 9, 2024 · 11 comments
Labels: 🐞 bug (Something isn't working)

Comments

@trongnichi

Self Checks

  • This is only for bug reports; if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.6.0-fix1

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

Go to Knowledge -> Upload file

✔️ Expected Behavior

The knowledge file uploads successfully and a success message is shown.

❌ Actual Behavior

image

  • It sends many indexing-status requests in parallel at once, and they then fail.

image

  • The knowledge upload was stopped, but indexing continued and error messages kept appearing on the screen.

dosubot bot commented Apr 9, 2024

Hello @trongnichietsu! Nice to see you here again. I hope all is well with you!

Based on the information you've provided and the context from similar issues, it seems like the problem might be related to the handling of parallel requests and the management of long-running operations in the Dify platform. Here are some suggestions on how to potentially resolve this issue:

  1. AbortController for Uploads: Ensure that any ongoing upload or related requests are properly aborted if a new upload starts or if the user navigates away. This can prevent unnecessary parallel requests.

  2. Error Handling in Stream Processing: Make sure that error handling within the function that processes server-sent events (SSE) is robust, including catching network errors and handling incomplete JSON messages more gracefully.

  3. Timeout Management: Consider increasing the timeout or implementing a more dynamic way of handling long-running operations, especially for large file uploads or slow network conditions.

  4. Content-Type Handling in Uploads: Ensure that the Content-Type is set to multipart/form-data, which is necessary for file uploads. If there's any manipulation of headers that could interfere with this, it could cause issues with how the server processes the request.

  5. Server-Side Handling: Check that the server-side implementation correctly handles multipart/form-data requests and does not have issues that could lead to failed uploads or multiple indexing-status requests.

  6. Network and Infrastructure Check: Since this is a self-hosted Docker environment, check for any network issues or Docker-specific configurations that might affect long-running connections or large uploads.

  7. Client-Side Fetch Error Handling: Ensure that the client-side code is prepared to handle fetch errors gracefully and retry or alert the user as appropriate.

  8. Debugging and Logging: Add more detailed logging on both the client and server sides to capture the sequence of events leading up to the failure.

These suggestions are based on common issues related to file uploads, SSE, and fetch API usage in web applications. Implementing more robust error handling, ensuring proper use of headers, and reviewing server-side processing are key steps in resolving the issue. You can refer to the web/service/base.ts file for more details.
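
As a rough illustration of suggestion 1, the sketch below keeps at most one live request per purpose by aborting the previous `AbortController` whenever a new request starts. The `LatestOnly` class and the URL in the usage comment are hypothetical names chosen for illustration; they are not taken from `web/service/base.ts`.

```typescript
// Hypothetical helper (not Dify code): hands out an AbortSignal for each new
// request and aborts the signal that was handed out before it.
class LatestOnly {
  private controller: AbortController | null = null

  // Abort the previous request, if any, and return a signal for the new one.
  next(): AbortSignal {
    if (this.controller) this.controller.abort()
    this.controller = new AbortController()
    return this.controller.signal
  }

  // Abort whatever is in flight, e.g. when the user navigates away.
  cancel(): void {
    if (this.controller) this.controller.abort()
    this.controller = null
  }
}

// Usage sketch (the URL is a placeholder):
//   const latest = new LatestOnly()
//   fetch('/example/indexing-status', { signal: latest.next() })
```

Passing the returned signal to `fetch` means a superseded request rejects with an abort error, which the caller would need to catch and ignore rather than report to the user.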

Regarding your question about the 'indexing-status' endpoint, it is designed to provide the indexing status of documents, either individually or in batches. It handles multiple parallel requests by relying on the Flask framework and its underlying Werkzeug server, which can handle multiple requests concurrently. Each request is processed in isolation, ensuring that the retrieval and calculation of document statuses do not interfere with one another. You can find more details in the api/controllers/console/datasets/datasets_document.py file.

Please try these suggestions and let me know if they help resolve the issue or if you need further assistance.


@dosubot added the 🐞 bug (Something isn't working) label on Apr 9, 2024
@trongnichi
Author

image

Also happens in Cloud!!

@trongnichi changed the title from "Multiple request indexing-status when upload knowledge file" to "A huge of fail request indexing-status when upload knowledge file" on Apr 9, 2024
@JohnJyong
Contributor

Are you using the economy index type? @trongnichietsu

@trongnichi
Author

> Are you using the economy index type? @trongnichietsu

I used the High Quality index type so that I could use a re-rank model.

@trongnichi
Author

image

I tested again with the Economical index type, but it produces the same error.

@coswind

coswind commented Apr 9, 2024

I have had the same problem since upgrading Dify from 0.5.11 to 0.6.0-fix1.

@trongnichi
Author

> I have had the same problem since upgrading Dify from 0.5.11 to 0.6.0-fix1.

I don't know how the new version 0.6.1 could be released without checking this. I tested again on the cloud today, and the issue is still there.

@JohnJyong
Contributor

Could you please provide more details, such as your files and embedding model? @trongnichi

@trongnichi
Author

> Could you please provide more details, such as your files and embedding model? @trongnichi

My files are all Excel files of 300-700 KB. I use the default OpenAI embedding model, text-embedding-3-small.
The error happens when I click "Go to document" after uploading a file and then click a file that is still indexing: after 1-2 s, a huge number of requests are sent.
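
One common way to keep a status poller from piling up parallel requests like this is a "single-flight" guard: while one request is pending, further calls reuse it instead of starting another. The sketch below is a minimal illustration under that assumption. `singleFlight` is a hypothetical helper, not Dify's code, and nothing here is taken from the eventual fix.

```typescript
// Hypothetical guard (not Dify code): wraps an async task so that calls made
// while one is still pending share that pending promise.
function singleFlight<T>(task: () => Promise<T>): () => Promise<T> {
  let inFlight: Promise<T> | null = null
  return () => {
    if (inFlight) return inFlight
    const current = task()
    inFlight = current
    // Clear the slot once the task settles, whether it resolved or rejected.
    const clear = () => {
      if (inFlight === current) inFlight = null
    }
    current.then(clear, clear)
    return current
  }
}
```

A timer could then call the guarded function on every tick; ticks that fire while a request is still pending reuse it rather than sending a new one.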

@JohnJyong
Contributor

@JzoNgKVO

@iamjoel
Collaborator

iamjoel commented Apr 12, 2024

Fixed: #3411
