Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support multiple tool calls in render function #1210

Closed
wants to merge 8 commits into from

Conversation

unstubbable
Copy link
Contributor

@unstubbable unstubbable commented Mar 22, 2024

Firstly, I would like to apologize for proposing such a significant change without first discussing its API in an issue, and inquiring if contributions of this nature would be welcome. My initial intention was to make a minor adjustment to enable support for multiple tool calls. However, one thing led to another, and here we are.

The motivation behind this change was an issue in the current implementation: when the LLM decides to send more than one tool call, the latter would overwrite the UI of the previous tool calls. Additionally, the text UI would be overwritten by the tool UI unless specifically handled by storing the last text content and inserting it into the tool UI.

With the proposed compose renderer (defaulting to a React Fragment), we aim to support composing all three types of LLM output:

  • Text
  • A function call
  • Multiple tool calls

I am fully open to completely revising the API based on your feedback, or even discarding this approach if you think it is unnecessary or if you have other plans for the SDK. After all, it should be considered that advanced use cases like this might be better off utilizing the low-level APIs, as demonstrated in examples/next-ai-rsc/app/action.tsx, rather than using render.

PS: The diff may appear larger than it actually is because I also added a basic example app that replicates the example from the Generative UI docs, which can also be used to test multiple tool calls with render:

multiple tool calls

Disclaimer: This text was edited by an LLM.

compose?: Renderer<{
text: React.ReactNode | undefined;
functionCall: { name: keyof FS; node: React.ReactNode } | undefined;
toolCalls: { name: keyof TS; id: string; node: React.ReactNode }[];
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what the author of #1166 exactly had in mind, but it's possible that this might fix the issue.

@unstubbable unstubbable marked this pull request as ready for review March 22, 2024 10:32
@unstubbable unstubbable force-pushed the multiple-tool-calls branch from aecde2e to a92f435 Compare March 23, 2024 07:48
@unstubbable
Copy link
Contributor Author

When the LLM decides to send more than one tool call, the latter would overwrite the UI of the previous tool calls. Additionally, the text UI would be overwritten by the tool UI unless specifically handled by storing the last text content and inserting it into the tool UI.
[...]
It should be considered that advanced use cases like this might be better off utilizing the low-level APIs, as demonstrated in examples/next-ai-rsc/app/action.tsx, rather than using render.

I changed my mind on this being an "advanced use case", because any user that follows the render example from the Generative UI docs is likely to stumble over these two issues. So I would argue this should just work out of the box, which it would with the provided default compose renderer.

@unstubbable unstubbable force-pushed the multiple-tool-calls branch from a92f435 to 6af84af Compare March 26, 2024 10:12
@shuding
Copy link
Member

shuding commented Mar 26, 2024

Love this, thank you! I think we want this PR to be in, but I'm currently a bit concerned about the current text() function being caused multiple times which causes a Suspense-related reconciliation issue (still looking into it). Hence I had #1195 (but I don't want to actually land it if we can get rid of that issue).

@unstubbable
Copy link
Contributor Author

I'm currently a bit concerned about the current text() function being caused multiple times which causes a Suspense-related reconciliation issue (still looking into it). Hence I had #1195 (but I don't want to actually land it if we can get rid of that issue).

@shuding Just out of curiosity, how does this issue manifest itself?

@unstubbable
Copy link
Contributor Author

I'm currently a bit concerned about the current text() function being caused multiple times which causes a Suspense-related reconciliation issue (still looking into it). Hence I had #1195 (but I don't want to actually land it if we can get rid of that issue).

@shuding Just out of curiosity, how does this issue manifest itself?

I think now I understand what you're trying to solve with #1195. It's basically the same issue that's reported in #1257, where update(node)done() leads to the node being rendered in the suspense fallback and children. And with Flight currently not being able to deduplicate these elements (only detriplicate), this leads to a remount in the client. Hm, tricky...

On a related note, with facebook/react#28847 having landed, and facebook/react#28849 landing soon, I wonder how the two different async generator concepts (one representing a list, the other representing a value over time) won't confuse users of the AI SDK. 🤔

@shuding
Copy link
Member

shuding commented Apr 17, 2024

Yes @unstubbable that's totally what I've been thinking about (sorry I missed your previous message). React will reconcile the node even the same node is passed to both fallback and the children of Suspense (<Suspense fallback={node}>{node}</Suspense>). This makes render() unable to stream any client component without remounting it.

In terms of UX, it's fine for tool calls but for text() it becomes tricky. Imagine that if you do:

render({
  ...
  text({ content }) {
    return <Message>{content}</Message>
  }
})

Then there's no way to keep any internal state of <Message>. And with compose() that issue becomes unavoidable as well.

With low level APIs like createStreamableValue, the issue can be avoided:

const streamableValue = createStreamableValue('')

render({
  ...
  text({ content }) {
    streamableValue.update(content)
  }
})

return <Message content={streamableValue.value} />

Hence I still think that the API or maybe abstraction of render() isn't super great.

The short-term move is to make these "render" functions not that like React's "render". Like renaming the render method of tools as execute or maybe generate.

@shuding
Copy link
Member

shuding commented Apr 17, 2024

the other representing a value over time

Could you point me to the place of this? I briefly check the React PRs and they all seem to represent a list. Thanks!

@unstubbable
Copy link
Contributor Author

the other representing a value over time

Could you point me to the place of this? I briefly check the React PRs and they all seem to represent a list. Thanks!

@shuding Sorry, I wasn't clear about that. With "two different async generator concepts" I meant one being the React (Async) Generator Server Components, and the other being the tools render (async) generator function in ai/rsc.

In React, this would render a list of items:

async function* ItemsServerComponent() {
  yield <p>A</p>;
  await Promise.resolve();
  yield <p>B</p>;
  await Promise.resolve();
  yield <p>C</p>;
}
function ListClientComponent({ children }) {
  // Unwrap async iterables here with React.use(), until natively supported in React.
}
<ListClientComponent>
  <ItemsServerComponent />
</ListClientComponent>

Result:

<p>A</p>
<p>B</p>
<p>C</p>

Whereas in ai/rsc, this would render first a spinner, and afterwards the final component, and not both in a list:

{
  render: async function* ({ city }) {
    yield <div>🌀</div>;

    const { temperature } = await getWeatherInfo(city);
    // update AI state ...

    return <div>{temperature}</div>
  }
}

Intermediate result:

<div>🌀</div>

Final result:

<div>24°C</div>

So at first glance, the two look very similar, but they behave totally different.

The short-term move is to make these "render" functions not that like React's "render". Like renaming the render method of tools as execute or maybe generate.

With this apprach though, you might be able solve the aforementioned conflict of concepts.

@shuding
Copy link
Member

shuding commented Apr 19, 2024

@unstubbable Yes I think we're on the same page :)

@lgrammel
Copy link
Collaborator

lgrammel commented May 8, 2024

We now recommend using streamUI instead of render. Would this apply to streamUI as well?

@unstubbable
Copy link
Contributor Author

Yes, it would, but I was holding off on re-implementing it in streamUI because of the aforementioned reconciliation issue. I'm fine with closing this PR for now. We can create a new one if/when it makes sense.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants