
Conversation

Member

@lramos15 lramos15 commented Nov 20, 2025

I'm not sure I love this flow, but I wanted each tool to be able to handle its own messaging while data streams in because it would likely know best. If we want to simplify the API we could just have one centralized streaming tool call processor in Copilot that handles everything before any tools have prepare called.

The current flow is

  1. Stream gets a prepareToolInvocation pushed to it with partial stream data
  2. Core calls handleToolStream with that data
  3. The returned invocation message is then rendered as a progress message that updates in place rather than stacking new messages
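
The three steps above could be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: the interface shapes, field names, and message format here are assumptions.

```typescript
// Assumed shapes for illustration; the real interfaces live in
// languageModelToolsService.ts and may differ.
interface IToolInvocationStreamContext {
	toolCallId: string;
	// Step 1: partial, possibly-incomplete input streamed so far.
	streamData: string;
}

interface IStreamedToolInvocation {
	// Step 3: rendered as a progress message that updates in place.
	invocationMessage: string;
}

// Step 2: core hands the partial data to the tool's stream handler,
// which knows best how to describe what is happening.
function handleToolStream(ctx: IToolInvocationStreamContext): IStreamedToolInvocation {
	return {
		invocationMessage: `Preparing tool call ${ctx.toolCallId} (${ctx.streamData.length} bytes received)`,
	};
}

const update = handleToolStream({ toolCallId: 'call_1', streamData: '{"file":"foo' });
console.log(update.invocationMessage);
```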

Copilot AI review requested due to automatic review settings November 20, 2025 19:03
@lramos15 lramos15 enabled auto-merge (squash) November 20, 2025 19:03
@lramos15 lramos15 self-assigned this Nov 20, 2025
@vs-code-engineering

📬 CODENOTIFY

The following users are being notified based on files changed in this PR:

@roblourens

Matched files:

  • src/vs/workbench/contrib/chat/browser/chatListRenderer.ts

Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds support for streaming partial tool data during tool invocations. The implementation allows tools to provide real-time progress updates while their input data is being streamed, improving user experience by showing tool preparation status rather than a generic "working" indicator.

Key Changes:

  • Added handleToolStream method to the tool system for processing partial streaming data
  • Introduced IToolInvocationStreamContext and IStreamedToolInvocation interfaces for streaming context and results
  • Modified progress message handling to support replacing previous messages via replacesPreviousMessage flag
  • Updated ChatPrepareToolInvocationPart to include optional streamData field
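
The replacesPreviousMessage behavior from the list above could look something like this sketch (assumed shapes, not the chatModel.ts implementation): a progress message carrying the flag overwrites the previous progress message instead of being appended, so streaming updates do not stack.

```typescript
// Illustrative shape only; the real IChatProgressMessage has more fields.
interface IChatProgressMessage {
	kind: 'progressMessage';
	content: string;
	replacesPreviousMessage?: boolean;
}

function pushProgress(parts: IChatProgressMessage[], next: IChatProgressMessage): void {
	const last = parts[parts.length - 1];
	if (next.replacesPreviousMessage && last?.kind === 'progressMessage') {
		parts[parts.length - 1] = next; // update in place
	} else {
		parts.push(next); // normal append
	}
}

const parts: IChatProgressMessage[] = [];
pushProgress(parts, { kind: 'progressMessage', content: 'Preparing…' });
pushProgress(parts, { kind: 'progressMessage', content: 'Editing foo.ts', replacesPreviousMessage: true });
console.log(parts.length, parts[0].content); // prints: 1 Editing foo.ts
```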

Reviewed Changes

Copilot reviewed 13 out of 14 changed files in this pull request and generated 4 comments.

Summary per file:

  • src/vs/workbench/contrib/chat/common/languageModelToolsService.ts: Added streaming interfaces and method signature for handleToolStream
  • src/vs/workbench/contrib/chat/browser/languageModelToolsService.ts: Implemented handleToolStream method with tool activation and error handling
  • src/vs/workbench/contrib/chat/common/chatService.ts: Added replacesPreviousMessage flag to IChatProgressMessage and streamData to IChatPrepareToolInvocationPart
  • src/vs/workbench/contrib/chat/common/chatModel.ts: Implemented logic to replace previous progress messages instead of appending
  • src/vs/workbench/contrib/chat/browser/chatListRenderer.ts: Updated "working" indicator logic to hide when prepareToolInvocation is present
  • src/vs/workbench/api/common/extHostTypes.ts: Updated ChatPrepareToolInvocationPart constructor to accept optional streamData parameter
  • src/vs/workbench/api/common/extHostTypeConverters.ts: Added conversion logic for streamData field
  • src/vs/workbench/api/common/extHostLanguageModelTools.ts: Implemented $handleToolStream method in extension host
  • src/vs/workbench/api/common/extHostChatAgents2.ts: Updated prepareToolInvocation to accept streamData parameter
  • src/vs/workbench/api/common/extHost.protocol.ts: Added $handleToolStream to protocol interface
  • src/vs/workbench/api/browser/mainThreadLanguageModelTools.ts: Added handleToolStream implementation in main thread
  • src/vs/workbench/api/browser/mainThreadChatAgents2.ts: Implemented progress chunk handling for tool stream invocations
  • src/vs/workbench/contrib/chat/test/common/mockLanguageModelToolsService.ts: Added stub implementation and updated signature for test mock

  ((lastPart.kind === 'textEditGroup' || lastPart.kind === 'notebookEditGroup') && lastPart.done && !partsToRender.some(part => part.kind === 'toolInvocation' && !IChatToolInvocation.isComplete(part))) ||
  (lastPart.kind === 'progressTask' && lastPart.deferred.isSettled) ||
- lastPart.kind === 'prepareToolInvocation' || lastPart.kind === 'mcpServersStarting'
+ lastPart.kind === 'mcpServersStarting'
Copilot AI Nov 20, 2025

The removal of lastPart.kind === 'prepareToolInvocation' from this condition creates inconsistency. The earlier check at line 774 prevents showing "working" when prepareToolInvocation is present, but this line determines when to show "working" based on the last part. If prepareToolInvocation is the last part, the earlier check would prevent showing "working", making this change unnecessary. Consider documenting why prepareToolInvocation should not be treated as a completion state here.

Comment on lines +688 to +691
/**
* Raw argument payload, such as the streamed JSON fragment from the language model.
*/
readonly rawInput?: unknown;
Member

@connor4312 connor4312 Nov 20, 2025

I question whether this is useful/something we'd want to expose in the API like this. Tools are generic and not tied to any specific model, so the tool doesn't know how the input is encoded. JSON is the default right now, but OAI has `raw` tool calls, and maybe in the future something like Toon gains traction.

We can also do some transforms on the call object after validation -- for example, Sonnet has a propensity to double JSON-encode nested objects and arrays and we can correct those.

That said I really like the idea of being able to show e.g. the files that are about to be edited while the call is streaming in... maybe we just

  1. spec this is a fragment of a JSON-encoded string (and it's up to the implementation to convert that if needed)
  2. explicitly say the input might be transformed before the actual tool call

(of course this is all proposed API right now, but for the general shape of things)

Member Author

@lramos15 lramos15 Nov 20, 2025

I think it's safe to say handleToolStream just receives a string of partial data from whatever the stream returns. Then when prepareInvocation is called we guarantee it's in some format the tool expects. My guess is the tool will have to know how to handle various formats.

The benefit of exposing this is that if you have a durable JSON parser that can parse incomplete data, you could likely render that foo.js is being created and currently has x lines.
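
As a toy illustration of that idea (this is a crude regex probe, not a real durable JSON parser, and the "filePath" property name is an assumed example of a tool's schema):

```typescript
// Pull a likely file path out of an incomplete tool-call argument stream
// so the UI can say which file is being created before the call finishes.
function guessFilePath(partialJson: string): string | undefined {
	// Accept an unterminated string value, since the data may be mid-stream.
	const m = /"filePath"\s*:\s*"([^"]*)/.exec(partialJson);
	return m?.[1] || undefined;
}

console.log(guessFilePath('{"filePath": "src/foo.js", "content": "const x'));
console.log(guessFilePath('{"filePa'));
```

A real implementation would want a tolerant streaming JSON parser rather than a regex, but the UI affordance is the same.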

Member

@connor4312 connor4312 Nov 20, 2025

I agree, but I'm suggesting we have a way to identify that. Maybe there's a mimeType on the stream so the producer/consumer can handle the data correctly? OAI "raw" tools would emit text/plain as their mimetype. (We don't have any of these yet, but they have said that raw works a lot better for apply_patch and it's on our todo to adopt that)
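
A sketch of what that suggestion could look like (the field names here are assumptions, not existing API):

```typescript
// Hypothetical: the stream context carries a mimeType so the
// producer/consumer can handle the partial data correctly.
interface IToolStreamContext {
	mimeType: 'application/json' | 'text/plain';
	streamData: string;
}

function describeStream(ctx: IToolStreamContext): string {
	return ctx.mimeType === 'text/plain'
		? 'raw tool call text' // e.g. a future OAI-style "raw" apply_patch tool
		: 'JSON fragment';     // the current default encoding
}
```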

@lramos15 lramos15 requested review from justschen and removed request for justschen November 20, 2025 19:54
Member

@roblourens roblourens left a comment

Can I see the extension side of it too?

@lramos15
Member Author

Can I see the extension side of it too?

Please see microsoft/vscode-copilot-chat#2165

@lramos15 lramos15 disabled auto-merge November 24, 2025 21:41
@lramos15 lramos15 requested a review from roblourens November 24, 2025 22:00
}
},
prepareToolInvocation: (context, token) => this._proxy.$prepareToolInvocation(id, context, token),
handleToolStream: (context, token) => this._proxy.$handleToolStream(id, context, token),
Member

Should this only be defined if the tool in the ext host side has this method defined?

Member Author

How would we know if the tool in the ext host has this defined? Isn't this the same as prepareToolInvocation?
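
One hypothetical answer to the question above (not something the PR does): have the extension host include a capability flag when registering the tool, and only wire up the main-thread method when the flag is set. All names here are made up for illustration.

```typescript
// Assumed registration DTO; the ext host would set supportsStreaming from
// typeof tool.handleToolStream === 'function' at registration time.
interface IToolRegistrationDto {
	id: string;
	supportsStreaming: boolean;
}

function wireTool(dto: IToolRegistrationDto, proxyCall: (id: string) => string) {
	return {
		prepareToolInvocation: () => proxyCall(dto.id),
		// Only defined when the ext-host tool actually implements it.
		handleToolStream: dto.supportsStreaming ? () => proxyCall(dto.id) : undefined,
	};
}
```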

// Hide when following content contains parts that are not prepareToolInvocation or progressMessage
// This is similar to how progressMessage parts hide when other content arrives
const followingContent = context.content.slice(context.contentIndex + 1);
this.isHiddenByFollowingContent = followingContent.some(part => part.kind !== 'prepareToolInvocation' && part.kind !== 'progressMessage');
Member

I'm not a huge fan of this approach of rendering a new part for each piece of the stream. If we imagine a pathological case where a tool call emits a word at a time while generating an enormous file, this could get expensive even for relatively small files of a few hundred KB.

How I would imagine it should work is the stream gets passed to invokeTool() in the LanguageModelToolsService, which internally could create a content part that includes an IObservable with whatever information should get shown in the UI
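
That alternative could look roughly like the sketch below, with a toy observable standing in for VS Code's IObservable: the service owns one mutable content part whose message the UI observes, so each stream chunk updates state rather than appending a new part.

```typescript
// Toy stand-in for IObservable; the real utilities live in
// src/vs/base/common/observable.ts and have a different API.
class ToyObservable<T> {
	private listeners: Array<(v: T) => void> = [];
	constructor(private value: T) {}
	get(): T { return this.value; }
	set(v: T): void { this.value = v; this.listeners.forEach(l => l(v)); }
	onChange(l: (v: T) => void): void { this.listeners.push(l); }
}

// Hypothetically created once per tool call inside invokeTool(); each
// stream chunk updates the observable instead of emitting a new part.
const streamMessage = new ToyObservable('Preparing…');
const rendered: string[] = [];
streamMessage.onChange(v => (rendered[0] = v)); // UI re-renders the one part
streamMessage.set('Writing foo.js (120 lines so far)');
console.log(streamMessage.get());
```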

Member Author

We already do this approach for progress, which is why I followed it:

const followingContent = context.content.slice(context.contentIndex + 1);
this.showSpinner = forceShowSpinner ?? shouldShowSpinner(followingContent, context.element);
this.isHidden = forceShowMessage !== true && followingContent.some(part => part.kind !== 'progressMessage');

since this is effectively how we want this to behave as well. But I guess progress is a part that gets emitted a lot less often.

};

try {
const streamResult = await this.languageModelToolsService.handleToolStream(toolData.id, streamContext, CancellationToken.None);
Member

We cannot do side-effects like this from the chat widgets, as this will not work for background sessions.

One could say that partial results don't matter if the user isn't actively looking at the window, but we then might get into weird states if the user opens or closes the session mid-stream. Better imo to keep the view and the data model separate.


// If following content contains parts that are not prepareToolInvocation or progressMessage,
// we need to re-render to hide this part (similar to progressMessage behavior)
const shouldBeHidden = followingContent.some(part => part.kind !== 'prepareToolInvocation' && part.kind !== 'progressMessage');
Member

I don't think we should follow this pattern of hiding things. Now that we have a toolCallId we can associate the incomplete call with the real call, so it seems to me the incomplete call should just resolve into the real call widget.
