Issue with DIFY When Orchestrating Workflows Involving Visual VL Models and Standard Text Models #8824

Closed
5 tasks done
svcvit opened this issue Sep 27, 2024 · 5 comments · Fixed by #9790
Labels: 🐞 bug, 💪 enhancement

@svcvit
Contributor

svcvit commented Sep 27, 2024

Self Checks

  • This is only for bug report, if you would like to ask a question, please head to Discussions.
  • I have searched for existing issues, including closed ones.
  • I confirm that I am using English to submit this report (I have read and agree to the Language Policy).
  • [FOR CHINESE USERS] Please be sure to submit issues in English, otherwise they will be closed. Thank you! :)
  • Please do not modify this template :) and fill in all the required fields.

Dify version

0.8.3

Cloud or Self Hosted

Self Hosted (Docker)

Steps to reproduce

I'm trying to implement branching logic in a workflow: regular text content should be processed by a standard large language model (LLM), and image content by a vision-language (VL) model. However, the system parameter sys.files cannot be referenced in the condition of the IF/ELSE node.

To work around this, I added a keyword "image" to sys.query. If this keyword is present, the workflow will use the visual VL model.

The workflow functions correctly during the first round of conversation. However, in the second round, which only involves text, an error occurs: Run failed: 'TextPromptMessageContent' object has no attribute 'text'.

Reference image for the first round of conversation:

[Image: Xnip2024-09-27_10-29-21]

Reference image for the second round of conversation:

[Image: Xnip2024-09-27_10-29-38]

✔️ Expected Behavior

Enhancement Request: I would like the if condition to support sys.files so that image-related tasks can be processed using different models.
Bug Fix: I'm also seeking a fix for the current bug.

❌ Actual Behavior

No response

dosubot bot commented Sep 27, 2024

I found a similar unsolved discussion that might be relevant to your issue: Error in item: Invalid variable, which was last updated on September 23, 2024 [1].

Addressing Your Issues:

  1. Enhancement Request: Support for sys.files in IF Conditions

    • Currently, the sys.files parameter is used to store images uploaded by users and requires enabling the image upload function in the 'Features' section of the application orchestration page [2]. However, it seems that sys.files cannot be directly referenced within the IF condition. This enhancement would indeed be useful for workflows involving different content types.
  2. Bug Fix: 'TextPromptMessageContent' object has no attribute 'text'

    • The error 'TextPromptMessageContent' object has no attribute 'text' occurs because the TextPromptMessageContent class uses the data attribute to store text data, not text. Ensure that you are using the data attribute to access the text content.

Here is the correct way to handle TextPromptMessageContent:

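# `your_module` is a placeholder; replace it with the module that defines these classes in your codebase
from your_module import TextPromptMessageContent, PromptMessageContentType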

# Correct initialization
text_content = TextPromptMessageContent(type=PromptMessageContentType.TEXT, data="Your text here")

# Accessing the text data
print(text_content.data)  # This will print "Your text here"

In your workflow, make sure to use content.data instead of content.text when dealing with TextPromptMessageContent objects [3][4].
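
To make the difference between the two attributes concrete, here is a minimal, self-contained sketch. The two classes are simplified stand-ins written for this illustration (they reuse the names above but are not imported from Dify), and `content_text` is a hypothetical helper, not part of any library.

```python
from dataclasses import dataclass
from enum import Enum


class PromptMessageContentType(Enum):
    # Stand-in for the enum named above; only the TEXT member is needed here.
    TEXT = "text"


@dataclass
class TextPromptMessageContent:
    # Stand-in class: the text lives in `data`, and there is no `text` field.
    data: str
    type: PromptMessageContentType = PromptMessageContentType.TEXT


def content_text(content) -> str:
    """Hypothetical helper: return the text of a content part or of a plain string."""
    if isinstance(content, str):
        return content
    return content.data


part = TextPromptMessageContent(data="Describe this picture")

try:
    part.text  # the attribute access that fails in the reported error
    error_message = ""
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
print(content_text(part))
```

Reading `data` (or going through a small accessor like the one above) avoids the AttributeError, whichever code path ends up handling the message.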

Implementing Branching Logic Based on Content Type:

To implement branching logic based on content type, you can use the IF/ELSE node to check sys.query for a specific keyword, since sys.files cannot currently be referenced in the condition. Here is an example of how to set up such a workflow:

  1. IF/ELSE Node: Use this node to check if sys.query contains the keyword "image".
  2. Branch for Image Content: If the condition is met, process the content using the visual VL model.
  3. Branch for Text Content: If the condition is not met, process the content using the standard LLM.

This approach allows you to handle different types of content within the same workflow [5][6].
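
The routing decision described in the steps above can be sketched in plain Python. This is only an illustration of the logic, not Dify configuration: `choose_branch`, the branch names, and the `files` argument are all hypothetical, and the `files` check shows what a sys.files condition could look like if the IF/ELSE node supported it.

```python
from typing import Sequence

VISION_BRANCH = "vision_model"
TEXT_BRANCH = "text_model"


def choose_branch(query: str, files: Sequence[object] = (), keyword: str = "image") -> str:
    """Pick a branch for one conversation turn.

    `files` stands in for a sys.files check (hypothetical here);
    the keyword test mirrors the workaround described in this issue.
    """
    if files:
        return VISION_BRANCH
    if keyword.lower() in query.lower():
        return VISION_BRANCH
    return TEXT_BRANCH


print(choose_branch("What is in this image?"))
print(choose_branch("Summarize our chat so far"))
print(choose_branch("What is this?", files=["photo.png"]))
```

Note that the keyword test is fragile: any text-only turn that happens to mention "image" is also sent to the vision branch, which is why a condition on the uploaded files themselves would be the more reliable signal.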

By addressing these points, you should be able to enhance your workflow and resolve the current bug.

dosubot bot added the 🐞 bug and 💪 enhancement labels on Sep 27, 2024
@crazywoola
Member

Linking #8777; not sure whether it is resolved.

@svcvit
Contributor Author

svcvit commented Sep 27, 2024

I tried pulling the latest merged code and switched to developer mode, but the error still persists.

If I first send an image with text, it follows the visual processing branch and responds correctly. However, when I then send a text-only message, it throws this error.

Additionally, there's another issue: sys.files can't be used as a condition in the IF/ELSE node. Is there any way to optimize this?

[Image: Xnip2024-09-27_12-33-14]

@free-cx
Contributor

free-cx commented Sep 27, 2024

This bug is related to enabling memory in the node.
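
The thread does not spell out the mechanism. One plausible reading, given that the failure only appears on the second, text-only turn, is that with memory enabled the first turn is replayed as a list of content parts (text plus image) rather than as a plain string, and code that expects a string or a `text` attribute trips over it. The sketch below illustrates that reading only; every class and function name is a hypothetical stand-in, not Dify's implementation.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class TextPart:
    # Stand-in for a text content part; the text is stored in `data`.
    data: str


@dataclass
class ImagePart:
    # Stand-in for an image content part; `data` would hold a URL or base64 payload.
    data: str


@dataclass
class HistoryMessage:
    role: str
    # A stored turn is either a plain string or a list of content parts.
    content: Union[str, List[Union[TextPart, ImagePart]]]


def flatten_for_text_model(message: HistoryMessage) -> str:
    """Reduce a stored message to plain text, dropping image parts."""
    if isinstance(message.content, str):
        return message.content
    return "\n".join(p.data for p in message.content if isinstance(p, TextPart))


history = [
    HistoryMessage("user", [TextPart("image: what is shown here?"),
                            ImagePart("https://example.com/cat.png")]),
    HistoryMessage("assistant", "A cat sitting on a sofa."),
    HistoryMessage("user", "What colour is the sofa?"),
]

flattened = [flatten_for_text_model(m) for m in history]
print(flattened)
```

Normalizing history like this before it reaches a text-only model is one way such a mismatch could be avoided, under the assumption stated above.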

@laipz8200
Member

We'll provide full file-type support in future versions.
