feat (provider/mistral): image support (#3048)
lgrammel authored Sep 18, 2024
1 parent b22b7ea commit 518c276
Showing 15 changed files with 167 additions and 57 deletions.
5 changes: 5 additions & 0 deletions .changeset/grumpy-students-remember.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
---
'@ai-sdk/mistral': patch
---

feat (provider/mistral): image support
1 change: 1 addition & 0 deletions content/docs/02-foundations/02-providers-and-models.mdx
@@ -63,6 +63,7 @@ Here are the capabilities of popular models:
 | [Anthropic](/providers/ai-sdk-providers/anthropic) | `claude-3-5-sonnet-20240620` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
 | [Mistral](/providers/ai-sdk-providers/mistral) | `mistral-large-latest` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
 | [Mistral](/providers/ai-sdk-providers/mistral) | `mistral-small-latest` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
+| [Mistral](/providers/ai-sdk-providers/mistral) | `pixtral-12b-2409` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
 | [Google Generative AI](/providers/ai-sdk-providers/google-generative-ai) | `gemini-1.5-flash` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
 | [Google Generative AI](/providers/ai-sdk-providers/google-generative-ai) | `gemini-1.5-pro` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
 | [Google Vertex](/providers/ai-sdk-providers/google-vertex) | `gemini-1.5-flash` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
18 changes: 9 additions & 9 deletions content/providers/01-ai-sdk-providers/20-mistral.mdx
@@ -111,15 +111,15 @@ Mistral language models can also be used in the `streamText`, `generateObject`,

 ### Model Capabilities

-| Model | Image Input | Object Generation | Tool Usage | Tool Streaming |
-| ----------------------- | ------------------- | ------------------- | ------------------- | ------------------- |
-| `mistral-large-latest` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
-| `mistral-medium-latest` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
-| `mistral-small-latest` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
-| `open-mistral-nemo` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
-| `open-mixtral-8x22b` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
-| `open-mixtral-8x7b` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
-| `open-mistral-7b` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
+| Model | Image Input | Object Generation | Tool Usage | Tool Streaming |
+| ---------------------- | ------------------- | ------------------- | ------------------- | ------------------- |
+| `mistral-large-latest` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
+| `mistral-small-latest` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
+| `pixtral-12b-2409` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
+| `open-mistral-nemo` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
+| `open-mixtral-8x22b` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
+| `open-mixtral-8x7b` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
+| `open-mistral-7b` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |

 <Note>
   The table above lists popular models. You can also pass any available provider
1 change: 1 addition & 0 deletions content/providers/01-ai-sdk-providers/index.mdx
@@ -26,6 +26,7 @@ Not all providers support all AI SDK features. Here's a quick comparison of the
 | [Anthropic](/providers/ai-sdk-providers/anthropic) | `claude-3-5-sonnet-20240620` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
 | [Mistral](/providers/ai-sdk-providers/mistral) | `mistral-large-latest` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
 | [Mistral](/providers/ai-sdk-providers/mistral) | `mistral-small-latest` | <Cross size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
+| [Mistral](/providers/ai-sdk-providers/mistral) | `pixtral-12b-2409` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
 | [Google Generative AI](/providers/ai-sdk-providers/google-generative-ai) | `gemini-1.5-flash` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
 | [Google Generative AI](/providers/ai-sdk-providers/google-generative-ai) | `gemini-1.5-pro` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
 | [Google Vertex](/providers/ai-sdk-providers/google-vertex) | `gemini-1.5-flash` | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> | <Check size={18} /> |
26 changes: 26 additions & 0 deletions examples/ai-core/src/generate-text/mistral-multimodal-base64.ts
@@ -0,0 +1,26 @@
import { mistral } from '@ai-sdk/mistral';
import { generateText } from 'ai';
import 'dotenv/config';
import fs from 'node:fs';

async function main() {
  const result = await generateText({
    model: mistral('pixtral-12b-2409'),
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: 'Describe the image in detail.' },
          {
            type: 'image',
            image: fs.readFileSync('./data/comic-cat.png').toString('base64'),
          },
        ],
      },
    ],
  });

  console.log(result.text);
}

main().catch(console.error);
26 changes: 26 additions & 0 deletions examples/ai-core/src/generate-text/mistral-multimodal-url.ts
@@ -0,0 +1,26 @@
import { mistral } from '@ai-sdk/mistral';
import { generateText } from 'ai';
import 'dotenv/config';

async function main() {
  const result = await generateText({
    model: mistral('pixtral-12b-2409'),
    messages: [
      {
        role: 'user',
        content: [
          { type: 'text', text: 'Describe the image in detail.' },
          {
            type: 'image',
            image:
              'https://github.com/vercel/ai/blob/main/examples/ai-core/data/comic-cat.png?raw=true',
          },
        ],
      },
    ],
  });

  console.log(result.text);
}

main().catch(console.error);
Original file line number Diff line number Diff line change
@@ -0,0 +1,26 @@
// Vitest Snapshot v1, https://vitest.dev/guide/snapshot.html

exports[`tool calls > should stringify arguments to tool calls 1`] = `
[
  {
    "content": "",
    "role": "assistant",
    "tool_calls": [
      {
        "function": {
          "arguments": "{"key":"arg-value"}",
          "name": "tool-1",
        },
        "id": "tool-call-id-1",
        "type": "function",
      },
    ],
  },
  {
    "content": "{"key":"result-value"}",
    "name": "tool-1",
    "role": "tool",
    "tool_call_id": "tool-call-id-1",
  },
]
`;
54 changes: 32 additions & 22 deletions packages/mistral/src/convert-to-mistral-chat-messages.test.ts
@@ -1,5 +1,36 @@
 import { convertToMistralChatMessages } from './convert-to-mistral-chat-messages';

+describe('user messages', () => {
+  it('should convert messages with image parts', async () => {
+    const result = convertToMistralChatMessages([
+      {
+        role: 'user',
+        content: [
+          { type: 'text', text: 'Hello' },
+          {
+            type: 'image',
+            image: new Uint8Array([0, 1, 2, 3]),
+            mimeType: 'image/png',
+          },
+        ],
+      },
+    ]);
+
+    expect(result).toEqual([
+      {
+        role: 'user',
+        content: [
+          { type: 'text', text: 'Hello' },
+          {
+            type: 'image_url',
+            image_url: 'data:image/png;base64,AAECAw==',
+          },
+        ],
+      },
+    ]);
+  });
+});
+
 describe('tool calls', () => {
   it('should stringify arguments to tool calls', () => {
     const result = convertToMistralChatMessages([
@@ -27,27 +58,6 @@ describe('tool calls', () => {
       },
     ]);

-    expect(result).toEqual([
-      {
-        role: 'assistant',
-        content: '',
-        tool_calls: [
-          {
-            type: 'function',
-            id: 'tool-call-id-1',
-            function: {
-              name: 'tool-1',
-              arguments: JSON.stringify({ key: 'arg-value' }),
-            },
-          },
-        ],
-      },
-      {
-        role: 'tool',
-        content: JSON.stringify({ key: 'result-value' }),
-        name: 'tool-1',
-        tool_call_id: 'tool-call-id-1',
-      },
-    ]);
+    expect(result).toMatchSnapshot();
   });
 });
36 changes: 19 additions & 17 deletions packages/mistral/src/convert-to-mistral-chat-messages.ts
@@ -1,7 +1,5 @@
-import {
-  LanguageModelV1Prompt,
-  UnsupportedFunctionalityError,
-} from '@ai-sdk/provider';
+import { LanguageModelV1Prompt } from '@ai-sdk/provider';
+import { convertUint8ArrayToBase64 } from '@ai-sdk/provider-utils';
 import { MistralChatPrompt } from './mistral-chat-prompt';

 export function convertToMistralChatMessages(
@@ -19,20 +17,24 @@ export function convertToMistralChatMessages(
       case 'user': {
         messages.push({
           role: 'user',
-          content: content
-            .map(part => {
-              switch (part.type) {
-                case 'text': {
-                  return part.text;
-                }
-                case 'image': {
-                  throw new UnsupportedFunctionalityError({
-                    functionality: 'image-part',
-                  });
-                }
+          content: content.map(part => {
+            switch (part.type) {
+              case 'text': {
+                return { type: 'text', text: part.text };
               }
-            })
-            .join(''),
+              case 'image': {
+                return {
+                  type: 'image_url',
+                  image_url:
+                    part.image instanceof URL
+                      ? part.image.toString()
+                      : `data:${
+                          part.mimeType ?? 'image/jpeg'
+                        };base64,${convertUint8ArrayToBase64(part.image)}`,
+                };
+              }
+            }
+          }),
         });
         break;
       }
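The `image` branch in the hunk above can be read in isolation. The following is a minimal sketch, assuming Node.js: `Buffer` stands in for the commit's `convertUint8ArrayToBase64`, and both the `ImagePart` type and the `toMistralImageUrl` helper are hypothetical names that do not exist in the package.

```typescript
// Hypothetical stand-in for the SDK's image part (not the real @ai-sdk/provider type).
type ImagePart = {
  type: 'image';
  image: Uint8Array | URL;
  mimeType?: string;
};

// Same decision as the diff: URLs pass through unchanged,
// raw bytes become a base64 data URL with a JPEG fallback MIME type.
function toMistralImageUrl(part: ImagePart): string {
  if (part.image instanceof URL) {
    return part.image.toString();
  }
  // Assumption: Node.js Buffer replaces convertUint8ArrayToBase64 here.
  const base64 = Buffer.from(part.image).toString('base64');
  return `data:${part.mimeType ?? 'image/jpeg'};base64,${base64}`;
}

const dataUrl = toMistralImageUrl({
  type: 'image',
  image: new Uint8Array([0, 1, 2, 3]),
  mimeType: 'image/png',
});
console.log(dataUrl); // data:image/png;base64,AAECAw==
```

The input bytes and expected string are the ones used in the commit's own `should convert messages with image parts` test.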
6 changes: 3 additions & 3 deletions packages/mistral/src/mistral-chat-language-model.test.ts
@@ -191,7 +191,7 @@ describe('doGenerate', () => {

     expect(await server.getRequestBodyJson()).toStrictEqual({
       model: 'mistral-small-latest',
-      messages: [{ role: 'user', content: 'Hello' }],
+      messages: [{ role: 'user', content: [{ type: 'text', text: 'Hello' }] }],
     });
   });

@@ -225,7 +225,7 @@ describe('doGenerate', () => {

     expect(await server.getRequestBodyJson()).toStrictEqual({
       model: 'mistral-small-latest',
-      messages: [{ role: 'user', content: 'Hello' }],
+      messages: [{ role: 'user', content: [{ type: 'text', text: 'Hello' }] }],
       tools: [
         {
           type: 'function',
@@ -433,7 +433,7 @@ describe('doStream', () => {
     expect(await server.getRequestBodyJson()).toStrictEqual({
       stream: true,
       model: 'mistral-small-latest',
-      messages: [{ role: 'user', content: 'Hello' }],
+      messages: [{ role: 'user', content: [{ type: 'text', text: 'Hello' }] }],
     });
   });
1 change: 1 addition & 0 deletions packages/mistral/src/mistral-chat-language-model.ts
@@ -32,6 +32,7 @@ type MistralChatConfig = {
 export class MistralChatLanguageModel implements LanguageModelV1 {
   readonly specificationVersion = 'v1';
   readonly defaultObjectGenerationMode = 'json';
+  readonly supportsImageUrls = false;

   readonly modelId: MistralChatModelId;
   readonly settings: MistralChatSettings;
16 changes: 15 additions & 1 deletion packages/mistral/src/mistral-chat-prompt.ts
@@ -13,7 +13,21 @@ export interface MistralSystemMessage {

 export interface MistralUserMessage {
   role: 'user';
-  content: string;
+  content: Array<MistralUserMessageContent>;
+}
+
+export type MistralUserMessageContent =
+  | MistralUserMessageTextContent
+  | MistralUserMessageImageContent;
+
+export interface MistralUserMessageImageContent {
+  type: 'image_url';
+  image_url: string;
+}
+
+export interface MistralUserMessageTextContent {
+  type: 'text';
+  text: string;
 }

 export interface MistralAssistantMessage {
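With `content` now an array of typed parts rather than a string, code that consumes a `MistralUserMessage` has to narrow on `type`. A rough sketch of that narrowing, with the interfaces copied from the hunk above and a hypothetical `textOf` helper that is not part of the package:

```typescript
// Types copied from the mistral-chat-prompt.ts hunk (exports dropped for brevity).
interface MistralUserMessageTextContent {
  type: 'text';
  text: string;
}

interface MistralUserMessageImageContent {
  type: 'image_url';
  image_url: string;
}

type MistralUserMessageContent =
  | MistralUserMessageTextContent
  | MistralUserMessageImageContent;

interface MistralUserMessage {
  role: 'user';
  content: Array<MistralUserMessageContent>;
}

// Hypothetical helper: joins the text parts and skips image parts.
function textOf(message: MistralUserMessage): string {
  return message.content
    .filter(
      (part): part is MistralUserMessageTextContent => part.type === 'text',
    )
    .map(part => part.text)
    .join('');
}

const message: MistralUserMessage = {
  role: 'user',
  content: [
    { type: 'text', text: 'Hello' },
    { type: 'image_url', image_url: 'data:image/png;base64,AAECAw==' },
  ],
};

console.log(textOf(message)); // Hello
```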
2 changes: 1 addition & 1 deletion packages/mistral/src/mistral-chat-settings.ts
@@ -4,8 +4,8 @@ export type MistralChatModelId =
   | 'open-mixtral-8x7b'
   | 'open-mixtral-8x22b'
   | 'open-mistral-nemo'
+  | 'pixtral-12b-2409'
   | 'mistral-small-latest'
-  | 'mistral-medium-latest'
   | 'mistral-large-latest'
   | (string & {});
Original file line number Diff line number Diff line change
@@ -1,7 +1,7 @@
 import { convertToOpenAIChatMessages } from './convert-to-openai-chat-messages';

 describe('user messages', () => {
-  it('should convert messages with image parts to multiple parts', async () => {
+  it('should convert messages with image parts', async () => {
     const result = convertToOpenAIChatMessages({
       prompt: [
         {
4 changes: 1 addition & 3 deletions packages/openai/src/openai-chat-prompt.ts
@@ -23,9 +23,7 @@ export type ChatCompletionContentPart =

 export interface ChatCompletionContentPartImage {
   type: 'image_url';
-  image_url: {
-    url: string;
-  };
+  image_url: { url: string };
 }

 export interface ChatCompletionContentPartText {
