✨ feat: add o3-mini support for OpenAI & GitHub Models #5657

Merged
merged 16 commits into from
Feb 3, 2025
16 changes: 16 additions & 0 deletions docs/usage/agents/model.mdx
@@ -77,3 +77,19 @@ It is a mechanism that penalizes frequently occurring new vocabulary in the text
- `0.0` When the morning sun poured into the small diner, a tired postman appeared at the door, carrying a bag of letters in his hands. The owner warmly prepared a breakfast for him, and he started sorting the mail while enjoying his breakfast. **(The highest frequency word is "of", accounting for 8.45%)**
- `1.0` A girl in deep sleep was woken by a warm ray of sunshine; she saw the first light of morning, surrounded by birdsong and flowers, and everything was full of vitality. *(The highest frequency word is "of", accounting for 5.45%)*
- `2.0` Every morning, he would sit on the balcony to have breakfast. Under the soft setting sun, everything looked very peaceful. However, one day, when he was about to pick up his breakfast, an optimistic little bird flew by, bringing him a good mood for the day. *(The highest frequency word is "of", accounting for 4.94%)*

<br />

### `reasoning_effort`

The `reasoning_effort` parameter controls the intensity of the reasoning process, which determines how deeply the model reasons when generating a response. The available values are **`low`**, **`medium`**, and **`high`**, with the following meanings:

- **low**: Lower reasoning effort, resulting in faster response times. Suitable for scenarios where quick responses are needed, but it may sacrifice some reasoning accuracy.
- **medium** (default): Balances reasoning accuracy and response speed, suitable for most scenarios.
- **high**: Higher reasoning effort, producing more detailed and complex responses, but slower response times and greater token consumption.

By adjusting the `reasoning_effort` parameter, you can find an appropriate balance between response speed and reasoning depth based on your needs. For example, in conversational scenarios, if fast responses are a priority, you can choose low reasoning effort; if more complex analysis or reasoning is needed, you can opt for high reasoning effort.
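As a rough sketch of where the parameter fits, the snippet below builds an OpenAI-style chat-completion request payload by hand (the API shape and model name are assumptions for illustration; no request is actually sent):

```python
# Sketch: where `reasoning_effort` sits in a chat-completion request payload.
# Assumes the OpenAI-style Chat Completions API; the model name is illustrative.

def build_request(prompt: str, effort: str = "medium") -> dict:
    """Build a chat-completion payload for a reasoning model.

    `effort` must be one of "low", "medium", or "high".
    """
    if effort not in ("low", "medium", "high"):
        raise ValueError(f"unsupported reasoning_effort: {effort!r}")
    return {
        "model": "o3-mini",
        "messages": [{"role": "user", "content": prompt}],
        # Higher effort -> deeper reasoning, slower responses, more tokens.
        "reasoning_effort": effort,
    }

# Fast, shallow reasoning for a quick conversational reply:
payload = build_request("Summarize this in one line.", effort="low")
print(payload["reasoning_effort"])  # -> low
```

With a real SDK client, the same fields would be passed to the chat-completion call unchanged; only reasoning models accept `reasoning_effort`.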

<Callout>
This parameter is only applicable to reasoning models, such as OpenAI's `o1`, `o1-mini`, `o3-mini`, etc.
</Callout>
16 changes: 16 additions & 0 deletions docs/usage/agents/model.zh-CN.mdx
@@ -72,3 +72,19 @@ Presence Penalty 参数可以看作是对生成文本中重复内容的一种惩罚
- `0.0` 当清晨的阳光洒进小餐馆时,一名疲倦的邮递员出现在门口,他的手中提着一袋信件。店主热情地为他准备了一份早餐,他在享用早餐的同时开始整理邮件。**(频率最高的词是 “的”,占比 8.45%)**
- `1.0` 一个深度睡眠的女孩被一阵温暖的阳光唤醒,她看到了早晨的第一缕阳光,周围是鸟语花香,一切都充满了生机。*(频率最高的词是 “的”,占比 5.45%)*
- `2.0` 每天早上,他都会在阳台上坐着吃早餐。在柔和的夕阳照耀下,一切看起来都非常宁静。然而有一天,当他准备端起早餐的时候,一只乐观的小鸟飞过,给他带来了一天的好心情。 *(频率最高的词是 “的”,占比 4.94%)*

<br />

### `reasoning_effort`

`reasoning_effort` 参数用于控制推理过程的强度。此参数的设置会影响模型在生成回答时的推理深度。可选值包括 **`low`**、**`medium`**、**`high`**,具体含义如下:

- **low(低)**:推理强度较低,生成速度较快,适用于需要快速响应的场景,但可能牺牲一定的推理精度。
- **medium(中,默认值)**:平衡推理精度与响应速度,适用于大多数场景。
- **high(高)**:推理强度较高,生成更为详细和复杂的回答,但响应时间较长,且消耗更多的 Token。

通过调整 `reasoning_effort` 参数,可以根据需求在生成速度与推理深度之间找到适合的平衡。例如,在对话场景中,如果更关注快速响应,可以选择低推理强度;如果需要更复杂的分析或推理,可以选择高推理强度。

<Callout>
该参数仅适用于推理模型,如 OpenAI 的 `o1`、`o1-mini`、`o3-mini` 等。
</Callout>
4 changes: 4 additions & 0 deletions locales/ar/discover.json
@@ -126,6 +126,10 @@
"title": "جدة الموضوع"
},
"range": "نطاق",
"reasoning_effort": {
"desc": "تُستخدم هذه الإعدادات للتحكم في شدة التفكير التي يقوم بها النموذج قبل توليد الإجابات. الشدة المنخفضة تعطي الأولوية لسرعة الاستجابة وتوفر الرموز، بينما الشدة العالية توفر تفكيرًا أكثر اكتمالًا ولكنها تستهلك المزيد من الرموز وتقلل من سرعة الاستجابة. القيمة الافتراضية هي متوسطة، مما يوازن بين دقة التفكير وسرعة الاستجابة.",
"title": "شدة التفكير"
},
"temperature": {
"desc": "تؤثر هذه الإعدادات على تنوع استجابة النموذج. القيم المنخفضة تؤدي إلى استجابات أكثر توقعًا ونمطية، بينما القيم الأعلى تشجع على استجابات أكثر تنوعًا وغير شائعة. عندما تكون القيمة 0، يعطي النموذج نفس الاستجابة دائمًا لنفس المدخل.",
"title": "عشوائية"
3 changes: 3 additions & 0 deletions locales/ar/models.json
@@ -1184,6 +1184,9 @@
"o1-preview": {
"description": "o1 هو نموذج استدلال جديد من OpenAI، مناسب للمهام المعقدة التي تتطلب معرفة عامة واسعة. يحتوي هذا النموذج على 128K من السياق وتاريخ انتهاء المعرفة في أكتوبر 2023."
},
"o3-mini": {
"description": "o3-mini هو أحدث نموذج استدلال صغير لدينا، يقدم ذكاءً عالياً تحت نفس تكاليف التأخير والأداء مثل o1-mini."
},
"open-codestral-mamba": {
"description": "Codestral Mamba هو نموذج لغة Mamba 2 يركز على توليد الشيفرة، ويوفر دعمًا قويًا لمهام الشيفرة المتقدمة والاستدلال."
},
12 changes: 12 additions & 0 deletions locales/ar/setting.json
@@ -200,6 +200,9 @@
"enableMaxTokens": {
"title": "تمكين الحد الأقصى للردود"
},
"enableReasoningEffort": {
"title": "تفعيل ضبط قوة الاستدلال"
},
"frequencyPenalty": {
"desc": "كلما زادت القيمة، زاد احتمال تقليل تكرار الكلمات",
"title": "عقوبة التكرار"
@@ -216,6 +219,15 @@
"desc": "كلما زادت القيمة، زاد احتمال التوسع في مواضيع جديدة",
"title": "جديد الحديث"
},
"reasoningEffort": {
"desc": "كلما زادت القيمة، زادت قدرة الاستدلال، ولكن قد يؤدي ذلك إلى زيادة وقت الاستجابة واستهلاك التوكنات",
"options": {
"high": "عالي",
"low": "منخفض",
"medium": "متوسط"
},
"title": "قوة الاستدلال"
},
"temperature": {
"desc": "كلما زادت القيمة، زادت الردود عشوائية أكثر",
"title": "التباين",
4 changes: 4 additions & 0 deletions locales/bg-BG/discover.json
@@ -126,6 +126,10 @@
"title": "Свежест на темата"
},
"range": "Обхват",
"reasoning_effort": {
"desc": "Тази настройка контролира интензивността на разсъжденията на модела преди генерирането на отговор. Ниска интензивност приоритизира скоростта на отговор и спестява токени, докато висока интензивност предоставя по-пълни разсъждения, но изразходва повече токени и намалява скоростта на отговор. Стойността по подразбиране е средна, което балансира точността на разсъжденията и скоростта на отговор.",
"title": "Интензивност на разсъжденията"
},
"temperature": {
"desc": "Тази настройка влияе на разнообразието на отговорите на модела. По-ниски стойности водят до по-предсказуеми и типични отговори, докато по-високи стойности насърчават по-разнообразни и необичайни отговори. Когато стойността е 0, моделът винаги дава един и същ отговор на даден вход.",
"title": "Случайност"
3 changes: 3 additions & 0 deletions locales/bg-BG/models.json
@@ -1184,6 +1184,9 @@
"o1-preview": {
"description": "o1 е новият модел за изводи на OpenAI, подходящ за сложни задачи, изискващи обширни общи знания. Моделът разполага с контекст от 128K и дата на знание до октомври 2023."
},
"o3-mini": {
"description": "o3-mini е нашият най-нов малък модел за инференция, който предлага висока интелигентност при същите разходи и цели за закъснение като o1-mini."
},
"open-codestral-mamba": {
"description": "Codestral Mamba е модел на езика Mamba 2, специализиран в генерирането на код, предоставящ мощна поддръжка за напреднали кодови и разсъждателни задачи."
},
12 changes: 12 additions & 0 deletions locales/bg-BG/setting.json
@@ -200,6 +200,9 @@
"enableMaxTokens": {
"title": "Активиране на ограничението за максимален брой токени"
},
"enableReasoningEffort": {
"title": "Активиране на настройките за интензивност на разсъжденията"
},
"frequencyPenalty": {
"desc": "Колкото по-висока е стойността, толкова по-вероятно е да се намалят повтарящите се думи",
"title": "Наказание за честота"
@@ -216,6 +219,15 @@
"desc": "Колкото по-висока е стойността, толкова по-вероятно е да се разшири до нови теми",
"title": "Свежест на темата"
},
"reasoningEffort": {
"desc": "Колкото по-висока е стойността, толкова по-силна е способността за разсъждение, но може да увеличи времето за отговор и консумацията на токени",
"options": {
"high": "Висока",
"low": "Ниска",
"medium": "Средна"
},
"title": "Интензивност на разсъжденията"
},
"temperature": {
"desc": "Колкото по-висока е стойността, толкова по-случаен е отговорът",
"title": "Случайност",
4 changes: 4 additions & 0 deletions locales/de-DE/discover.json
@@ -126,6 +126,10 @@
"title": "Themenfrische"
},
"range": "Bereich",
"reasoning_effort": {
"desc": "Diese Einstellung steuert die Intensität des Denkprozesses des Modells, bevor es eine Antwort generiert. Eine niedrige Intensität priorisiert die Geschwindigkeit der Antwort und spart Token, während eine hohe Intensität eine umfassendere Argumentation bietet, jedoch mehr Token verbraucht und die Antwortgeschwindigkeit verringert. Der Standardwert ist mittel, um eine Balance zwischen Genauigkeit des Denkens und Antwortgeschwindigkeit zu gewährleisten.",
"title": "Denkintensität"
},
"temperature": {
"desc": "Diese Einstellung beeinflusst die Vielfalt der Antworten des Modells. Niedrigere Werte führen zu vorhersehbareren und typischen Antworten, während höhere Werte zu vielfältigeren und weniger häufigen Antworten anregen. Wenn der Wert auf 0 gesetzt wird, gibt das Modell für einen bestimmten Input immer die gleiche Antwort.",
"title": "Zufälligkeit"
3 changes: 3 additions & 0 deletions locales/de-DE/models.json
@@ -1184,6 +1184,9 @@
"o1-preview": {
"description": "o1 ist OpenAIs neues Inferenzmodell, das für komplexe Aufgaben geeignet ist, die umfangreiches Allgemeinwissen erfordern. Das Modell hat einen Kontext von 128K und einen Wissensstand bis Oktober 2023."
},
"o3-mini": {
"description": "o3-mini ist unser neuestes kompaktes Inferenzmodell, das bei den gleichen Kosten- und Verzögerungszielen wie o1-mini hohe Intelligenz bietet."
},
"open-codestral-mamba": {
"description": "Codestral Mamba ist ein auf die Codegenerierung spezialisiertes Mamba 2-Sprachmodell, das starke Unterstützung für fortschrittliche Code- und Schlussfolgerungsaufgaben bietet."
},
12 changes: 12 additions & 0 deletions locales/de-DE/setting.json
@@ -200,6 +200,9 @@
"enableMaxTokens": {
"title": "Maximale Token pro Antwort aktivieren"
},
"enableReasoningEffort": {
"title": "Aktivieren Sie die Anpassung der Schlussfolgerungsintensität"
},
"frequencyPenalty": {
"desc": "Je höher der Wert, desto wahrscheinlicher ist es, dass sich wiederholende Wörter reduziert werden",
"title": "Frequenzstrafe"
@@ -216,6 +219,15 @@
"desc": "Je höher der Wert, desto wahrscheinlicher ist es, dass sich das Gespräch auf neue Themen ausweitet",
"title": "Themenfrische"
},
"reasoningEffort": {
"desc": "Je höher der Wert, desto stärker die Schlussfolgerungsfähigkeit, aber dies kann die Antwortzeit und den Tokenverbrauch erhöhen.",
"options": {
"high": "Hoch",
"low": "Niedrig",
"medium": "Mittel"
},
"title": "Schlussfolgerungsintensität"
},
"temperature": {
"desc": "Je höher der Wert, desto zufälliger die Antwort",
"title": "Zufälligkeit",
4 changes: 4 additions & 0 deletions locales/en-US/discover.json
@@ -126,6 +126,10 @@
"title": "Topic Freshness"
},
"range": "Range",
"reasoning_effort": {
"desc": "This setting controls the intensity of reasoning the model applies before generating a response. Low intensity prioritizes response speed and saves tokens, while high intensity provides more comprehensive reasoning but consumes more tokens and slows down response time. The default value is medium, balancing reasoning accuracy with response speed.",
"title": "Reasoning Intensity"
},
"temperature": {
"desc": "This setting affects the diversity of the model's responses. Lower values lead to more predictable and typical responses, while higher values encourage more diverse and less common responses. When set to 0, the model always gives the same response to a given input.",
"title": "Randomness"
3 changes: 3 additions & 0 deletions locales/en-US/models.json
@@ -1184,6 +1184,9 @@
"o1-preview": {
"description": "o1 is OpenAI's new reasoning model, suitable for complex tasks that require extensive general knowledge. This model features a 128K context and has a knowledge cutoff date of October 2023."
},
"o3-mini": {
"description": "o3-mini is our latest small inference model that delivers high intelligence while maintaining the same cost and latency targets as o1-mini."
},
"open-codestral-mamba": {
"description": "Codestral Mamba is a language model focused on code generation, providing strong support for advanced coding and reasoning tasks."
},
12 changes: 12 additions & 0 deletions locales/en-US/setting.json
@@ -200,6 +200,9 @@
"enableMaxTokens": {
"title": "Enable Max Tokens Limit"
},
"enableReasoningEffort": {
"title": "Enable Reasoning Effort Adjustment"
},
"frequencyPenalty": {
"desc": "The higher the value, the more likely it is to reduce repeated words",
"title": "Frequency Penalty"
@@ -216,6 +219,15 @@
"desc": "The higher the value, the more likely it is to expand to new topics",
"title": "Topic Freshness"
},
"reasoningEffort": {
"desc": "The higher the value, the stronger the reasoning ability, but it may increase response time and token consumption.",
"options": {
"high": "High",
"low": "Low",
"medium": "Medium"
},
"title": "Reasoning Effort"
},
"temperature": {
"desc": "The higher the value, the more random the response",
"title": "Randomness",
4 changes: 4 additions & 0 deletions locales/es-ES/discover.json
@@ -126,6 +126,10 @@
"title": "Novedad del tema"
},
"range": "Rango",
"reasoning_effort": {
"desc": "Esta configuración se utiliza para controlar la intensidad de razonamiento del modelo antes de generar una respuesta. Una baja intensidad prioriza la velocidad de respuesta y ahorra tokens, mientras que una alta intensidad proporciona un razonamiento más completo, pero consume más tokens y reduce la velocidad de respuesta. El valor predeterminado es medio, equilibrando la precisión del razonamiento con la velocidad de respuesta.",
"title": "Intensidad de razonamiento"
},
"temperature": {
"desc": "Esta configuración afecta la diversidad de las respuestas del modelo. Un valor más bajo resultará en respuestas más predecibles y típicas, mientras que un valor más alto alentará respuestas más diversas y menos comunes. Cuando el valor se establece en 0, el modelo siempre dará la misma respuesta para una entrada dada.",
"title": "Aleatoriedad"
3 changes: 3 additions & 0 deletions locales/es-ES/models.json
@@ -1184,6 +1184,9 @@
"o1-preview": {
"description": "o1 es el nuevo modelo de inferencia de OpenAI, adecuado para tareas complejas que requieren un amplio conocimiento general. Este modelo tiene un contexto de 128K y una fecha de corte de conocimiento en octubre de 2023."
},
"o3-mini": {
"description": "o3-mini es nuestro último modelo de inferencia de tamaño pequeño, que ofrece alta inteligencia con los mismos objetivos de costo y latencia que o1-mini."
},
"open-codestral-mamba": {
"description": "Codestral Mamba es un modelo de lenguaje Mamba 2 enfocado en la generación de código, que proporciona un fuerte apoyo para tareas avanzadas de codificación y razonamiento."
},
12 changes: 12 additions & 0 deletions locales/es-ES/setting.json
@@ -200,6 +200,9 @@
"enableMaxTokens": {
"title": "Activar límite de tokens por respuesta"
},
"enableReasoningEffort": {
"title": "Activar ajuste de intensidad de razonamiento"
},
"frequencyPenalty": {
"desc": "Cuanto mayor sea el valor, más probable es que se reduzcan las repeticiones de palabras",
"title": "Penalización de frecuencia"
@@ -216,6 +219,15 @@
"desc": "Cuanto mayor sea el valor, más probable es que se amplíe a nuevos temas",
"title": "Penalización de novedad del tema"
},
"reasoningEffort": {
"desc": "Cuanto mayor sea el valor, más fuerte será la capacidad de razonamiento, pero puede aumentar el tiempo de respuesta y el consumo de tokens.",
"options": {
"high": "Alto",
"low": "Bajo",
"medium": "Medio"
},
"title": "Intensidad de razonamiento"
},
"temperature": {
"desc": "Cuanto mayor sea el valor, más aleatoria será la respuesta",
"title": "Temperatura",