chore: the consistency of MultiModalPromptMessageContent #11721

Merged
merged 6 commits into from
Dec 17, 2024


hjlarry
Contributor

@hjlarry hjlarry commented Dec 17, 2024

Summary

Tip

Close issue syntax: Fixes #<issue number> or Resolves #<issue number>, see documentation for more details.

Currently, when files are sent to different LLMs, each model may require different parameters. For instance, some models expect a single field that holds either a URL or base64 data with its MIME type:
{"image_url": url or base64 with mime_type}
Others require the MIME type and the base64 data as separate fields:
{"mime_type": "", "data": ""}
And some take a parameter that specifies the format:
{"format": "..."}
To improve code readability and keep all MultiModalPromptMessageContent types consistent, this PR proposes standardizing these parameters.
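
As a rough illustration of the idea, the sketch below shows how one standardized content object could feed all three payload shapes listed above. It is not the code from this PR: the class name `MultiModalContent`, its field names, and the three helper functions are hypothetical and chosen only for this example.

```python
import base64
from dataclasses import dataclass


@dataclass
class MultiModalContent:
    """Hypothetical unified container; all names here are illustrative assumptions."""

    format: str            # file format, e.g. "png" or "mp4"
    mime_type: str         # MIME type, e.g. "image/png"
    base64_data: str = ""  # base64-encoded payload, empty if a URL is used
    url: str = ""          # remote URL, empty if base64 data is used

    @property
    def data(self) -> str:
        """Return the URL if set, otherwise a data URI built from the base64 payload."""
        return self.url or f"data:{self.mime_type};base64,{self.base64_data}"


def to_image_url_payload(content: MultiModalContent) -> dict:
    """Shape for models that expect a URL or a base64 data URI in one field."""
    return {"image_url": content.data}


def to_inline_payload(content: MultiModalContent) -> dict:
    """Shape for models that expect the MIME type and base64 data separately."""
    return {"mime_type": content.mime_type, "data": content.base64_data}


def to_format_payload(content: MultiModalContent) -> dict:
    """Shape for models that take an explicit format parameter."""
    return {"format": content.format}


if __name__ == "__main__":
    encoded = base64.b64encode(b"hi").decode()
    content = MultiModalContent(format="png", mime_type="image/png", base64_data=encoded)
    print(to_image_url_payload(content))
    print(to_inline_payload(content))
    print(to_format_payload(content))
```

With one object carrying the format, the MIME type, and either a URL or base64 data, each model adapter only has to pick the fields it needs instead of parsing a provider-specific string.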

Additionally, this PR suggests replacing MULTIMODAL_SEND_IMAGE_FORMAT and MULTIMODAL_SEND_VIDEO_FORMAT with a single MULTIMODAL_SEND_FORMAT, since one environment variable should be enough for all cases.
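
A minimal sketch of reading the unified variable might look like the following. The helper name `resolve_send_format`, the allowed values `"base64"` and `"url"`, and the `"base64"` default are assumptions made for this example, not details confirmed by the PR description.

```python
import os
from typing import Mapping

# Assumed set of accepted values; the PR description does not list them.
ALLOWED_SEND_FORMATS = ("base64", "url")


def resolve_send_format(env: Mapping[str, str], default: str = "base64") -> str:
    """Read MULTIMODAL_SEND_FORMAT from a mapping and validate it (hypothetical helper)."""
    value = env.get("MULTIMODAL_SEND_FORMAT", default).strip().lower()
    if value not in ALLOWED_SEND_FORMATS:
        raise ValueError(
            f"MULTIMODAL_SEND_FORMAT must be one of {ALLOWED_SEND_FORMATS}, got {value!r}"
        )
    return value


if __name__ == "__main__":
    # Uses the real process environment; falls back to the assumed default.
    print(resolve_send_format(os.environ))
```

Passing the environment in as a mapping keeps the helper easy to test, and the same setting then applies to images, video, and any other file type.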

Screenshots

Before After
... ...

Checklist

Important

Please review the checklist below before submitting your pull request.

  • This change requires a documentation update, included: Dify Document
  • I understand that this PR may be closed in case there was no previous discussion or issues. (This doesn't apply to typos!)
  • I've added a test for each change that was introduced, and I tried as much as possible to make a single atomic change.
  • I've updated the documentation accordingly.
  • I ran dev/reformat(backend) and cd web && npx lint-staged(frontend) to appease the lint gods

@hjlarry hjlarry marked this pull request as ready for review December 17, 2024 03:59
@dosubot dosubot bot added size:L This PR changes 100-499 lines, ignoring generated files. 📚 documentation Improvements or additions to documentation labels Dec 17, 2024
@hjlarry hjlarry mentioned this pull request Dec 17, 2024
5 tasks
Member

@laipz8200 laipz8200 left a comment

LGTM

@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Dec 17, 2024
@laipz8200 laipz8200 merged commit c9b4029 into langgenius:main Dec 17, 2024
7 checks passed
Scorpion1221 added a commit to yybht155/dify that referenced this pull request Dec 19, 2024
* commit '926546b153a701dfbfc71eb8157f9a41320444f8': (486 commits)
  chore: bump version to 0.14.1 (langgenius#11784)
  feat:add hunyuan model(hunyuan-role, hunyuan-large, hunyuan-large-rol… (langgenius#11766)
  chore(opendal_storage): remove unused comment (langgenius#11783)
  feat: Disable the "Forgot your password?" button when the mail server setup is incomplete (langgenius#11653)
  chore(.env.example): add comments for opendal (langgenius#11778)
  Lindorm vdb bug-fix (langgenius#11790)
  fix: imperfect service-api introduction text (langgenius#11782)
  feat: add openai o1 & update pricing and max_token of other models (langgenius#11780)
  fix: file upload auth (langgenius#11774)
  feat: add parameters for JinaReaderTool (langgenius#11613)
  feat: full support for opendal and sync configurations between .env and docker-compose (langgenius#11754)
  feat(app_factory): speed up api startup (langgenius#11762)
  fix: Prevent redirection to /overview when accessing /workflow. (langgenius#11733)
  (doc) fix: update cURL examples to include Authorization header (langgenius#11750)
  Fix explore app icon (langgenius#11742)
  chore: improve gemini models (langgenius#11745)
  feat: use Gemini response metadata for token counting (langgenius#11743)
  chore: update comments in docker env file (langgenius#11705)
  feat(ark): support doubao vision series models (langgenius#11740)
  chore: the consistency of MultiModalPromptMessageContent (langgenius#11721)
  ...

# Conflicts:
#	api/configs/app_config.py
#	api/core/helper/code_executor/code_executor.py
#	web/yarn.lock
jiangbo721 pushed a commit to jiangbo721/dify that referenced this pull request Dec 20, 2024