-
-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Chat with Multiple Images. Support Vision with Gemini #942
Conversation
Previously Khoj could respond to a single shared image at a time. This changes updates the chat API to accept multiple images shared by the user and send it to the appropriate chat actors including the openai response generation chat actor for getting an image aware response
Previously the web app only expected a single image to be shared by the user as part of their query. This change allows sharing multiple images from the web app. Closes #921
- Put the attached images display div inside the same parent div as the text area - Keep the attachment, microphone/send message buttons aligned with the text area. So the attached images just show up at the top of the text area but everything else stays at the same horizontal height as before. - This improves the UX by - Ensuring that the attached images do not obscure the agents pane above the chat input area - The attached images visually look like they are inside the actual input area, rather than floating above it. So the visual aligns with the semantics
7e5ed8a
to
3cc1426
Compare
Currently experiencing difficulty instruction following when an image is shared. It's more likely to try and output an image. Update to make a clearer distinction.
…e on the homage page One limitation of this methodology is that localStorage has a limit in how much data it can take. Should add more graceful error handling here as well.
5510f9a
to
7646ac6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🚀 awesome being able to chat w/ multiple images & having gemini support.
src/interface/web/app/components/chatInputArea/chatInputArea.module.css
Outdated
Show resolved
Hide resolved
@@ -1102,9 +1089,9 @@ | |||
|
|||
## Stream Text Response | |||
if stream: | |||
return StreamingResponse(event_generator(q, image=image), media_type="text/plain") | |||
return StreamingResponse(event_generator(q, images=raw_images), media_type="text/plain") |
Check warning
Code scanning / CodeQL
Information exposure through an exception Medium
Stack trace information
Stack trace information
Show autofix suggestion
Hide autofix suggestion
Copilot Autofix AI 4 months ago
To fix the problem, we need to ensure that detailed error messages and stack traces are not exposed to the end user. Instead, we should log the detailed error information on the server and return a generic error message to the user. This can be achieved by modifying the exception handling in src/khoj/processor/image/generate.py
to yield a generic error message and status code, and ensuring that the event_generator
function in src/khoj/routers/api_chat.py
handles these generic messages appropriately.
-
Copy modified line R1086 -
Copy modified line R1088
@@ -1085,5 +1085,5 @@ | ||
yield result | ||
except Exception as e: | ||
except Exception: | ||
continue_stream = False | ||
logger.info(f"User {user} disconnected. Emitting rest of responses to clear thread: {e}") | ||
logger.info(f"User {user} disconnected. Emitting rest of responses to clear thread.") | ||
|
-
Copy modified line R86 -
Copy modified line R90 -
Copy modified line R94 -
Copy modified line R98
@@ -85,3 +85,3 @@ | ||
webp_image_bytes = generate_image_with_replicate(image_prompt, text_to_image_config, text2image_model) | ||
except openai.OpenAIError or openai.BadRequestError or openai.APIConnectionError as e: | ||
except (openai.OpenAIError, openai.BadRequestError, openai.APIConnectionError) as e: | ||
if "content_policy_violation" in e.message: | ||
@@ -89,4 +89,3 @@ | ||
status_code = e.status_code # type: ignore | ||
message = f"Image generation blocked by OpenAI: {e.message}" # type: ignore | ||
yield image_url or image, status_code, message, intent_type.value | ||
yield image_url or image, status_code, "Image generation blocked due to policy violation.", intent_type.value | ||
return | ||
@@ -94,5 +93,3 @@ | ||
logger.error(f"Image Generation failed with {e}", exc_info=True) | ||
message = f"Image generation failed with OpenAI error: {e.message}" # type: ignore | ||
status_code = e.status_code # type: ignore | ||
yield image_url or image, status_code, message, intent_type.value | ||
yield image_url or image, 500, "Image generation failed due to an internal error.", intent_type.value | ||
return | ||
@@ -100,5 +97,3 @@ | ||
logger.error(f"Image Generation failed with {e}", exc_info=True) | ||
message = f"Image generation using {text2image_model} via {text_to_image_config.model_type} failed with error: {e}" | ||
status_code = 502 | ||
yield image_url or image, status_code, message, intent_type.value | ||
yield image_url or image, 502, "Image generation failed due to a network error.", intent_type.value | ||
return |
Set max combined images size to 20mb to allow multiple photos to be shared
99d06fc
to
b3fff43
Compare
Overview
Screenshots
Closes #716
Closes #921