-
Notifications
You must be signed in to change notification settings - Fork 27
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix dynamic content types like audio and images #34
base: develop
Are you sure you want to change the base?
Conversation
@@ -64,7 +67,7 @@ def duckdb_schema(self): | |||
id VARCHAR PRIMARY KEY, | |||
model VARCHAR, | |||
timestamp TIMESTAMP, | |||
messages STRUCT(role VARCHAR, content VARCHAR)[], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cfahlgren1, currently other modalities, like the openai_vision_example.py
, do not work with the current message schema.
I am not too familiar with DuckDB but can you have a look at this schema definiton? The input is a bit dynaminc and I am not sure how to best tackle this besides converting to JSON (losing the nice array queryability etc).
tests/integration/observers/test_observers_examples.py hold some of the potential types we could be expecting and it should work for testing them.
[
{
role: "user",
content: [
{
type: "text",
text: "What's in this image?",
},
{
type: "image_url",
image_url: {
url: imageUrl,
},
},
],
},
]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah very good point. Ollama has this format for images for their client library for vision
"messages": [
{
"role": "user",
"content": "what is in this image?",
"images": ["<base64-encoded image data>"]
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So for it to be a DuckDB Struct each row would have to have the same amount of keys.
The key difference is that DuckDB STRUCTs require the same keys in each row of a STRUCT column
We may have to make it a JSON type
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@cfahlgren1 Or we can rewrite the message structure to follow the following structure? I think they can be used interchangeably according to the typing. Then we avoid losing the queryability of the arrays. WDYT?
{"type": "text", "text": "hello", "image_url": None}
{"type": "text", "text": None, "image_url": "datauri"}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the message formats. e.g.
ChatCompletionSystemMessageParam(content=""),
ChatCompletionUserMessageParam(
content=ChatCompletionContentPartTextParam(text="test")
),
ChatCompletionAssistantMessageParam(content=""),
ChatCompletionUserMessageParam(
content=ChatCompletionContentPartImageParam(image_url="image")
),
ChatCompletionAssistantMessageParam(content=""),
ChatCompletionUserMessageParam(
content=ChatCompletionContentPartInputAudioParam(
input_audio=InputAudio(data="audio", format="wav")
)
),
ChatCompletionFunctionMessageParam(content=""),
ChatCompletionToolMessageParam(content="")
# the following are equal, more or less
ChatCompletionUserMessageParam(
content=ChatCompletionContentPartTextParam(text="test")
)
ChatCompletionUserMessageParam(
content="test"
)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that would be the down side, but querying etc would be easier I think.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I think that is fine and gets most of the benefits of having it in a sql store like that. Obviously when being send to HuggingFace we won't need to send image url etc if there are none.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The DuckDB union could be valuable here, but not sure if it would make anything more complicated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One the schema changes topic, how do we want to make things backwards compatible or for upgrading existing duckdb instances to work with new version?
One idea is some type of migrations to update columns etc? But don't want to add to much complexity 🤔
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We will have the same problem with SQLite store. It may be better to just store as JSON. It adds a step (CASTING to STRUCT) for querying locally but I think it's a worthy trade off in favor of not introducing a lot of complexity or maintenence
Should be merged after #33