-
Notifications
You must be signed in to change notification settings - Fork 349
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Extension LLava with in memory images #653
Conversation
@zsogitbe, thank you very much for your work. I would like to share my thoughts:
And to change the property ImagePaths for something like:
Anyway that's only my opinion. |
SignalRT, it is a good idea. I did not do it because it means much more code (I have updated the PR, you can see the changes needed). I usually prefer simplicity, but in this case the standardization worth it. |
Thank you @zsogitbe, I will review the changes as soon as possible. |
LLama/LLamaInteractExecutor.cs
Outdated
throw new NotImplementedException(); | ||
using var httpClient = new HttpClient(); | ||
var uri = new Uri((string)image.Data); | ||
var imageBytes = httpClient.GetByteArrayAsync(uri).Result; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't use Result
here (or as a rule, anywhere). use await
instead.
LLama/LLamaInteractExecutor.cs
Outdated
@@ -154,15 +155,21 @@ private Task PreprocessLlava(string text, InferStateArgs args, bool addBos = tru | |||
{ | |||
if (image.Type == ImageData.DataType.ImagePath && image.Data != null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can all of this logic be moved out of the executor? Maybe a base interface with separate classes for the different variations.
For example:
e.g.
interface IClipImage { Task<SafeLlavaImageEmbedHandle> GetEmbed(); }
class ImageDataFromUrl(string url)
{
public async Task<SafeLlavaImageEmbedHandle> GetEmbed()
{
return SafeLlavaImageEmbedHandle.CreateFromMemory(await GetImageBytes());
}
private async Task<byte[]> GetImageBytes()
{
return await DownloadThatUrl(url);
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking the same after adding more and more to the code. The reading/downloading of the image should be the task of the user of the library.
This is ready for merging. |
Looks good to me. I'll leave the final review to @SignalRT since he knows more about llava than me. |
I would have clearly preferred to keep the option to allow paths to files and images in memory, but it is blocking another PR with some change in key management, so I think that can be merge as is. |
SignalRT, , Please think about this once more. I think that the most optimal way to support images in the library is to have in memory byte array as the core. The users can easily convert any image from anywhere (HD, internet, DB, ...) to this byte array. The in memory image (byte array) works with all possible image locations! This is the reason for having this is the best standardized way of working. |
No description provided.