Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Vision Support + New UI #1203

Merged
merged 60 commits into from
Nov 22, 2023
Merged

feat: Vision Support + New UI #1203

merged 60 commits into from
Nov 22, 2023

Conversation

danny-avila
Copy link
Owner

@danny-avila danny-avila commented Nov 20, 2023

Summary

Notes:

  • This is not considered stable until I cut a release (v0.6.2)
  • This is a breaking change: the main route has changed to /c/new from /chat/new
  • As of now, only image attachments are supported. URLs will be supported within a few days
  • Images are resized according to the dimensions the gpt-4-vision expects
    • Since the only storage strategy is local right now, numerous safeguards have been implemented, including optimizing image size
  • If you would like to toggle this feature off, this is not ready yet, but will be in the next few days
    • It might be worth disabling until SafeSearch or remote upload services are integrated to protect from unwanted material
    • Also if you are mindful of your server's hard disk space (though the resizing is really optimal for image quality/space)
  • OpenAI Endpoint only. Vision support for Plugins may come after the holiday.
  • All images are handled locally--remote services like Firebase and S3 are planned, and I would like contributions for this.
  • Investigating: minor but docker may or may not load images when the image request is first sent (after first attaching them to the message).
    • This is due to how docker handles internal networking and will find a solution
  • Configurable file limits

Limits:

  • Only 10 images per request and 20 MB per file max (enforced by API)
  • 25 MB total per request

The next big priority for LibreChat will be custom GPTs/Assistants support

Change Type

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Testing

Tests will fail until they are updated

Checklist

  • My code adheres to this project's style guidelines
  • I have performed a self-review of my own code
  • I have commented in any complex areas of my code
  • I have made pertinent documentation changes
  • My changes do not introduce new warnings
  • I have written tests demonstrating that my changes are effective or that my feature works
  • Local unit tests pass with my changes
  • Any changes dependent on mine have been merged and published in downstream modules.

…ategize new plan to separate react dependent packages
…acks in a way that keeps them during component unmount, initial delete handling
…ildMessages method, count tokens for nested objects/arrays
TODO: file size, type, amount validations, making sure they are styled right, and making sure you can add images from the clipboard/dragging
… Presentation component so the useDragHelpers hook has ChatContext
@danny-avila danny-avila marked this pull request as draft November 20, 2023 15:21
@danny-avila danny-avila marked this pull request as ready for review November 21, 2023 23:54
@danny-avila danny-avila changed the title feat: Vision Support feat: Vision Support + New UI Nov 22, 2023
@danny-avila danny-avila merged commit 317cdd3 into main Nov 22, 2023
3 checks passed
@danny-avila danny-avila deleted the vision branch November 22, 2023 01:12
cnkang pushed a commit to cnkang/LibreChat that referenced this pull request Feb 6, 2024
* feat: add timer duration to showToast, show toast for preset selection

* refactor: replace old /chat/ route with /c/. e2e tests will fail here

* refactor: move typedefs to root of /api/ and add a few to assistant types in TS

* refactor: reorganize data-provider imports, fix dependency cycle, strategize new plan to separate react dependent packages

* feat: add dataService for uploading images

* feat(data-provider): add mutation keys

* feat: file resizing and upload

* WIP: initial API image handling

* fix: catch JSON.parse of localStorage tools

* chore: experimental: use module-alias for absolute imports

* refactor: change temp_file_id strategy

* fix: updating files state by using Map and defining react query callbacks in a way that keeps them during component unmount, initial delete handling

* feat: properly handle file deletion

* refactor: unexpose complete filepath and resize from server for higher fidelity

* fix: make sure resized height, width is saved, catch bad requests

* refactor: use absolute imports

* fix: prevent setOptions from being called more than once for OpenAIClient, made note to fix for PluginsClient

* refactor: import supportsFiles and models vars from schemas

* fix: correctly replace temp file id

* refactor(BaseClient): use absolute imports, pass message 'opts' to buildMessages method, count tokens for nested objects/arrays

* feat: add validateVisionModel to determine if model has vision capabilities

* chore(checkBalance): update jsdoc

* feat: formatVisionMessage: change message content format dependent on role and image_urls passed

* refactor: add usage to File schema, make create and updateFile, correctly set and remove TTL

* feat: working vision support
TODO: file size, type, amount validations, making sure they are styled right, and making sure you can add images from the clipboard/dragging

* feat: clipboard support for uploading images

* feat: handle files on drop to screen, refactor top level view code to Presentation component so the useDragHelpers hook  has ChatContext

* fix(Images): replace uploaded images in place

* feat: add filepath validation to protect sensitive files

* fix: ensure correct file_ids are push and not the Map key values

* fix(ToastContext): type issue

* feat: add basic file validation

* fix(useDragHelpers): correct context issue with `files` dependency

* refactor: consolidate setErrors logic to setError

* feat: add dialog Image overlay on image click

* fix: close endpoints menu on click

* chore: set detail to auto, make note for configuration

* fix: react warning (button desc. of button)

* refactor: optimize filepath handling, pass file_ids to images for easier re-use

* refactor: optimize image file handling, allow re-using files in regen, pass more file metadata in messages

* feat: lazy loading images including use of upload preview

* fix: SetKeyDialog closing, stopPropagation on Dialog content click

* style(EndpointMenuItem): tighten up the style, fix dark theme showing in lightmode, make menu more ux friendly

* style: change maxheight of all settings textareas to 138px from 300px

* style: better styling for textarea and enclosing buttons

* refactor(PresetItems): swap back edit and delete icons

* feat: make textarea placeholder dynamic to endpoint

* style: show user hover buttons only on hover when message is streaming

* fix: ordered list not going past 9, fix css

* feat: add User/AI labels; style: hide loading spinner

* feat: add back custom footer, change original footer text

* feat: dynamic landing icons based on endpoint

* chore: comment out assistants route

* fix: autoScroll to newest on /c/ view

* fix: Export Conversation on new UI

* style: match message style of official more closely

* ci: fix api jest unit tests, comment out e2e tests for now as they will fail until addressed

* feat: more file validation and use blob in preview field, not filepath, to fix temp deletion

* feat: filefilter for multer

* feat: better AI labels based on custom name, model, and endpoint instead of  `ChatGPT`
jinzishuai pushed a commit to aitok-ai/LibreChat that referenced this pull request May 20, 2024
* feat: add timer duration to showToast, show toast for preset selection

* refactor: replace old /chat/ route with /c/. e2e tests will fail here

* refactor: move typedefs to root of /api/ and add a few to assistant types in TS

* refactor: reorganize data-provider imports, fix dependency cycle, strategize new plan to separate react dependent packages

* feat: add dataService for uploading images

* feat(data-provider): add mutation keys

* feat: file resizing and upload

* WIP: initial API image handling

* fix: catch JSON.parse of localStorage tools

* chore: experimental: use module-alias for absolute imports

* refactor: change temp_file_id strategy

* fix: updating files state by using Map and defining react query callbacks in a way that keeps them during component unmount, initial delete handling

* feat: properly handle file deletion

* refactor: unexpose complete filepath and resize from server for higher fidelity

* fix: make sure resized height, width is saved, catch bad requests

* refactor: use absolute imports

* fix: prevent setOptions from being called more than once for OpenAIClient, made note to fix for PluginsClient

* refactor: import supportsFiles and models vars from schemas

* fix: correctly replace temp file id

* refactor(BaseClient): use absolute imports, pass message 'opts' to buildMessages method, count tokens for nested objects/arrays

* feat: add validateVisionModel to determine if model has vision capabilities

* chore(checkBalance): update jsdoc

* feat: formatVisionMessage: change message content format dependent on role and image_urls passed

* refactor: add usage to File schema, make create and updateFile, correctly set and remove TTL

* feat: working vision support
TODO: file size, type, amount validations, making sure they are styled right, and making sure you can add images from the clipboard/dragging

* feat: clipboard support for uploading images

* feat: handle files on drop to screen, refactor top level view code to Presentation component so the useDragHelpers hook  has ChatContext

* fix(Images): replace uploaded images in place

* feat: add filepath validation to protect sensitive files

* fix: ensure correct file_ids are push and not the Map key values

* fix(ToastContext): type issue

* feat: add basic file validation

* fix(useDragHelpers): correct context issue with `files` dependency

* refactor: consolidate setErrors logic to setError

* feat: add dialog Image overlay on image click

* fix: close endpoints menu on click

* chore: set detail to auto, make note for configuration

* fix: react warning (button desc. of button)

* refactor: optimize filepath handling, pass file_ids to images for easier re-use

* refactor: optimize image file handling, allow re-using files in regen, pass more file metadata in messages

* feat: lazy loading images including use of upload preview

* fix: SetKeyDialog closing, stopPropagation on Dialog content click

* style(EndpointMenuItem): tighten up the style, fix dark theme showing in lightmode, make menu more ux friendly

* style: change maxheight of all settings textareas to 138px from 300px

* style: better styling for textarea and enclosing buttons

* refactor(PresetItems): swap back edit and delete icons

* feat: make textarea placeholder dynamic to endpoint

* style: show user hover buttons only on hover when message is streaming

* fix: ordered list not going past 9, fix css

* feat: add User/AI labels; style: hide loading spinner

* feat: add back custom footer, change original footer text

* feat: dynamic landing icons based on endpoint

* chore: comment out assistants route

* fix: autoScroll to newest on /c/ view

* fix: Export Conversation on new UI

* style: match message style of official more closely

* ci: fix api jest unit tests, comment out e2e tests for now as they will fail until addressed

* feat: more file validation and use blob in preview field, not filepath, to fix temp deletion

* feat: filefilter for multer

* feat: better AI labels based on custom name, model, and endpoint instead of  `ChatGPT`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant