
Refine system message settings #14877

Merged · 7 commits merged into master on Feb 19, 2025

Conversation

Contributor

@JonasHelming commented Feb 9, 2025

fixed #14867

What it does

  • replaces supportsDeveloperMessage with developerMessageSettings (enum: 'user', 'system', 'developer', 'mergeWithFollowingUserMessage', 'skip')

This setting controls how the first system message is sent: 'user', 'system', and 'developer' are used directly as the message role; 'mergeWithFollowingUserMessage' prefixes the following user message with the system message, or converts the system message to a user message if the next message is not a user message; 'skip' removes the system message entirely. The default is 'developer' (see the sketch after this list).

  • Extracts message processing into a rebindable OpenAiModelUtils so that adopters can handle future special cases
  • Adds a test case for the message handling
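
For orientation, here is a minimal sketch of how the first system message could be mapped for each setting. This is illustrative only, not the actual OpenAiModelUtils implementation: the query field, the toOpenAiMessage helper, and the returned wire format are assumptions.

// Illustrative message shape; 'actor' matches the field used in processMessages below,
// 'query' is an assumption for the text content.
interface LanguageModelRequestMessage {
    actor: 'system' | 'user' | 'ai';
    query: string;
}

type DeveloperMessageSettings = 'user' | 'system' | 'developer' | 'mergeWithFollowingUserMessage' | 'skip';

// Hypothetical conversion of a request message into the OpenAI wire format.
function toOpenAiMessage(m: LanguageModelRequestMessage): { role: string; content: string } {
    return { role: m.actor === 'ai' ? 'assistant' : m.actor, content: m.query };
}

function applyDeveloperMessageSettings(
    messages: LanguageModelRequestMessage[],
    settings: DeveloperMessageSettings
): { role: string; content: string }[] {
    // Only a leading system message is affected; everything else is passed through.
    if (messages.length === 0 || messages[0].actor !== 'system') {
        return messages.map(toOpenAiMessage);
    }
    const [system, ...rest] = messages;
    switch (settings) {
        case 'user':
        case 'system':
        case 'developer':
            // Use the configured value directly as the role of the first system message.
            return [{ role: settings, content: system.query }, ...rest.map(toOpenAiMessage)];
        case 'skip':
            // Drop the system message entirely.
            return rest.map(toOpenAiMessage);
        case 'mergeWithFollowingUserMessage':
            // Prefix the following user message with the system text; if the next
            // message is not a user message, send the system text as a user message.
            if (rest[0]?.actor === 'user') {
                const merged = { ...rest[0], query: `${system.query}\n${rest[0].query}` };
                return [merged, ...rest.slice(1)].map(toOpenAiMessage);
            }
            return [{ role: 'user', content: system.query }, ...rest.map(toOpenAiMessage)];
    }
}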

How to test

Test official and custom OpenAI models. Specifically, check that for the official models 'o1-preview' and 'o1-mini' the system message is converted to a user message. All other official OpenAI models should use 'developer'.

To test custom models, you can use:

{
  "model": "gpt-4o",
  "url": "https://api.openai.com/v1",
  "id": "openaicustom",
  "apiKey": true,
  "enableStreaming": true,
  "developerMessageSettings": "valueToTest"
},

DeepSeek reasoner now works with these settings:
{
  "model": "deepseek-reasoner",
  "url": "https://api.deepseek.com",
  "id": "deepseek-reasoner",
  "apiKey": "YourKey",
  "enableStreaming": true,
  "developerMessageSettings": "mergeWithFollowingUserMessage"
},

To test, set a breakpoint where the messages are converted (i.e., where the requests are sent).

Follow-ups

Breaking changes

  • This PR introduces breaking changes and requires careful review. If yes, the breaking changes section in the changelog has been updated.

Attribution

Review checklist

Reminder for reviewers

@JonasHelming requested a review from planger on February 9, 2025 at 22:51
Contributor

@planger left a comment

Thank you, looks very good!

Just one comment inline that might be worth discussing, as we currently make pretty strong assumptions about the order of the messages we get. Do you think we should be more flexible about the order and improve the algorithm to support an arbitrary order of system, user, and assistant messages? In theory, it is not specified that the system message is the first one.

* @param model the OpenAI model identifier. Currently not used, but allows subclasses to implement model-specific behavior.
* @returns an array of messages formatted for the OpenAI API.
*/
public processMessages(
Contributor

Suggested change
public processMessages(
processMessages(

messages: LanguageModelRequestMessage[],
developerMessageSettings: DeveloperMessageSettings
): LanguageModelRequestMessage[] {
if (messages.length > 0 && messages[0].actor === 'system') {
Contributor

I'm wondering whether we should really limit this behavior for system messages that are on the first position. Shouldn't we just iterate through the list and apply this logic to all system messages wherever they are?

We could then maybe rename mergeWithFirstUserMessage to mergeWithNextUserMessage.

Contributor Author

Agreed. The interesting cases are then:

  • system:1
  • system:2
  • assistant:1

I would merge this to:

  • user:1\n2
  • assistant:1

and

  • system:1
  • system:2
  • user:A

I would merge this to

  • user: 1\n2\nA

The reasoning is that there are APIs which forbid two consecutive user messages.
We can easily achieve this behavior by running from last to first while merging (see the sketch below).
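
A rough sketch of that last-to-first merge, reusing the message shape from the sketch above (illustrative only, not the code that was merged):

// Walk from the end of the list so that consecutive system messages collapse into the
// following user message, or become a user message themselves if no user message follows.
function mergeSystemMessages(messages: LanguageModelRequestMessage[]): LanguageModelRequestMessage[] {
    const result: LanguageModelRequestMessage[] = [];
    for (let i = messages.length - 1; i >= 0; i--) {
        const current = messages[i];
        if (current.actor !== 'system') {
            result.unshift(current);
        } else if (result[0]?.actor === 'user') {
            // Prefix the user message we already emitted with the system text.
            result[0] = { ...result[0], query: `${current.query}\n${result[0].query}` };
        } else {
            // No user message follows: turn the system message into a user message.
            result.unshift({ ...current, actor: 'user' });
        }
    }
    return result;
}

With this, [system:1, system:2, assistant:1] becomes [user:"1\n2", assistant:1], and [system:1, system:2, user:A] becomes [user:"1\n2\nA"], matching the two cases above.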

@planger
Contributor

Yes, this looks good and makes sense! Thank you!

Contributor Author

I changed the value to "mergeWithFollowingUserMessage". It only merges if the next message is a user message; otherwise, I fear we risk very strange changes in the message order. Of course, you can still produce message orders in which this might break specific models, e.g.

  • user
  • system
  • assistant
  • user

will become

  • user
  • user
  • assistant
  • user

But if you have these super special cases, you can now still override the processMessages.

} else if (developerMessageSettings === 'mergeWithFirstUserMessage') {
    const systemMsg = messages[0];
    const updatedMessages = messages.slice();
    const userIndex = updatedMessages.findIndex((m, index) => index > 0 && m.actor === 'user');
Contributor

This seems a bit dangerous in a multi-turn conversation where the messages have the following order: user, system, assistant, user, assistant. We may essentially skip ahead to the next user request.

I think it'd be safer to find the next user and the next assistant message, and if the next assistant message is closer than the next user message (or we get -1 for the user index), insert a user message as we do below.

This way, we prevent merging the system message into the next user/assistant response pair.
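
For illustration, the suggested safeguard could look roughly like this, using the same message shape as in the sketches above (a sketch of the idea, not the merged code):

// Assumes messages[0] is the leading system message. Only merge it into the next
// user message if no assistant message appears before it; otherwise keep the
// system text as a separate user message instead of merging across a turn boundary.
function mergeSystemIntoNextUserMessage(messages: LanguageModelRequestMessage[]): LanguageModelRequestMessage[] {
    const system = messages[0];
    const rest = messages.slice(1);
    const nextUserIndex = rest.findIndex(m => m.actor === 'user');
    const nextAssistantIndex = rest.findIndex(m => m.actor === 'ai');
    const canMerge = nextUserIndex !== -1 && (nextAssistantIndex === -1 || nextUserIndex < nextAssistantIndex);
    if (!canMerge) {
        // Insert the system text as a standalone user message.
        return [{ ...system, actor: 'user' }, ...rest];
    }
    return rest.map((m, index) =>
        index === nextUserIndex ? { ...m, query: `${system.query}\n${m.query}` } : m
    );
}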

Contributor

@planger left a comment

Thank you! Looks good to me.

@fipro78 Could you please check if the use case you've enabled in #14722 is still functional after this PR? As you'll see in this PR, we had to make the system message handling even more flexible now, as some models (DeepSeek, etc.) have diverging capabilities/requirements when it comes to the system message. Thank you!

Contributor

@fipro78 commented Feb 13, 2025

> Thank you! Looks good to me.
>
> @fipro78 Could you please check if the use case you've enabled in #14722 is still functional after this PR? As you'll see in this PR, we had to make the system message handling even more flexible now, as some models (DeepSeek, etc.) have diverging capabilities/requirements when it comes to the system message. Thank you!

Sure. How can I test this?

Contributor

@planger commented Feb 14, 2025

@fipro78 Thank you! It'd be great if you could build this PR locally and test if you can still correctly configure and connect to an Azure OpenAI instance, in particular regarding the system message setting. We'd like to ensure that we don't break your use case with this PR, but don't have an Azure instance to test this, unfortunately.

Thanks a lot!

Contributor

@fipro78 commented Feb 19, 2025

@JonasHelming @planger
I checked out this PR, built it locally, and tested it with our Azure OpenAI.
If I use "developerMessageSettings": "developer", I get the expected error that the role developer is not supported by this model. After switching to "developerMessageSettings": "system", I get an answer from the model.

So from my point of view, the PR does not break the configuration and connection to an Azure OpenAI instance.

@JonasHelming merged commit 0d40de4 into master on Feb 19, 2025
10 of 11 checks passed
The github-actions bot added this to the 1.59.0 milestone on Feb 19, 2025

Successfully merging this pull request may close these issues.

[Theia AI] Support DeepSeek reasoner (and future models with special "system" prompt requirements)
3 participants