@ehhuang ehhuang commented Jan 27, 2025

What does this PR do?

When client tools are used, create_turn currently terminates right after executing the client tool requested by the model, so the model never sees the tool's result.
This PR proposes to mirror the behavior of turns that use only server-side tools, where the tool response is passed back to the model for a final response.
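
The proposed behavior can be sketched as a small loop. This is a minimal illustration only, not the SDK's code: `Message`, `ToolCall`, `run_turn`, `fake_model`, and the `max_iters` guard are all hypothetical names made up for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToolCall:
    call_id: str
    tool_name: str
    arguments: Dict[str, str]


@dataclass
class Message:
    role: str  # "user", "assistant", or "tool"
    content: str
    tool_calls: List[ToolCall] = field(default_factory=list)


def run_turn(
    model: Callable[[List[Message]], Message],
    client_tools: Dict[str, Callable[..., str]],
    messages: List[Message],
    max_iters: int = 5,
) -> Message:
    """Call the model, run any requested client tools, and feed results back."""
    history = list(messages)
    for _ in range(max_iters):
        reply = model(history)
        history.append(reply)
        if not reply.tool_calls:
            # No tool requested: this is the final response.
            return reply
        for call in reply.tool_calls:
            result = client_tools[call.tool_name](**call.arguments)
            # Pass the tool response back instead of ending the turn here.
            history.append(Message(role="tool", content=result))
    raise RuntimeError("model kept requesting tools")


def fake_model(history: List[Message]) -> Message:
    # Stand-in for real inference: request a tool once, then summarize.
    if history[-1].role == "tool":
        return Message("assistant", f"Summary: {history[-1].content}")
    call = ToolCall("call-1", "load_url", {"url": "https://example.com"})
    return Message("assistant", "", [call])


final = run_turn(
    fake_model,
    {"load_url": lambda url: f"contents of {url}"},
    [Message("user", "load https://example.com and summarize it")],
)
print(final.content)
```

Before this change, the equivalent loop would have returned right after the tool ran, leaving the caller with a tool result but no final assistant message.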

Test Plan

Before:
image

After:
image

Sources

Please link relevant resources if necessary.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Ran pre-commit to handle lint / formatting issues.
  • Read the contributor guideline, Pull Request section?
  • Updated relevant documentation.
  • Wrote necessary unit or integration tests.

@ehhuang ehhuang marked this pull request as ready for review January 27, 2025 23:53
@ehhuang ehhuang marked this pull request as draft January 27, 2025 23:54
@ehhuang ehhuang marked this pull request as ready for review January 28, 2025 00:41
@ashwinb ashwinb left a comment

looks good thanks!

@ehhuang ehhuang merged commit 2cc1782 into main Jan 28, 2025
2 checks passed
@ehhuang ehhuang deleted the agent-resume-client-tool branch January 28, 2025 19:11
ehhuang added a commit that referenced this pull request Feb 13, 2025
Summary:

In #102, we made a turn's behavior more complete by automatically passing back the tool response and creating another turn when a client tool is used.

However, this creates a problem with the non-streaming API: the response object only contains information from after the last tool call.

This PR is a hacky attempt to address this by combining the Turn responses into one. Ideally, I think all the loop logic should live only on the server side: a turn would pause, and the client SDK would pass tool responses back to resume it.
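
As a rough sketch of what combining the Turn responses could look like, the snippet below keeps the first turn's identity and input, takes the final message from the last turn, and concatenates the steps in order. The `Step` and `Turn` dataclasses and the `combine_turns` helper are simplified stand-ins invented for illustration, not the SDK's types.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Step:
    step_type: str
    turn_id: str  # each step remembers the turn that produced it


@dataclass(frozen=True)
class Turn:
    turn_id: str
    input_messages: List[str]
    output_message: str
    steps: List[Step]


def combine_turns(turns: List[Turn]) -> Turn:
    """Merge the turns created for one user request into a single Turn."""
    if not turns:
        raise ValueError("need at least one turn")
    first, last = turns[0], turns[-1]
    all_steps = [step for turn in turns for step in turn.steps]
    # Keep the first turn's id and input; use the last turn's final message.
    return replace(first, output_message=last.output_message, steps=all_steps)


tool_turn = Turn(
    turn_id="t1",
    input_messages=["load the page and summarize it"],
    output_message="",
    steps=[Step("inference", "t1"), Step("tool_execution", "t1")],
)
final_turn = Turn(
    turn_id="t2",
    input_messages=["<tool response>"],
    output_message="final answer",
    steps=[Step("inference", "t2")],
)

combined = combine_turns([tool_turn, final_turn])
print([(s.step_type, s.turn_id) for s in combined.steps])
```

In this sketch the merged object carries steps with different `turn_id` values, which is one reason a merge done on the client is only a stopgap.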

Test Plan:

Run a simple script with Agent and client tool. Observe the returned response has steps from both created turns.

```
Turn(
│   input_messages=[
│   │   UserMessage(
│   │   │   content='load https://llama-stack.readthedocs.io/en/latest/introduction/index.html and summarize it',
│   │   │   role='user',
│   │   │   context=None
│   │   )
│   ],
│   output_message=CompletionMessage(
│   │   content="The document from the given URL is about Google releasing the source code to PebbleOS, which is a significant development for Rebble. This allows Rebble to accelerate its efforts to produce new hardware. Rebble had been working on its own replacement firmware, RebbleOS, but the release of PebbleOS's source code will help Rebble to build a production-ready real-time OS for the Pebble.",
│   │   role='assistant',
│   │   stop_reason='end_of_turn',
│   │   tool_calls=[]
│   ),
│   session_id='dec1c6c0-ed9b-42c1-97d7-906871acd5ba',
│   started_at=datetime.datetime(2025, 2, 12, 16, 38, 14, 643186),
│   steps=[
│   │   InferenceStep(
│   │   │   api_model_response=CompletionMessage(
│   │   │   │   content='',
│   │   │   │   role='assistant',
│   │   │   │   stop_reason='end_of_turn',
│   │   │   │   tool_calls=[
│   │   │   │   │   ToolCall(
│   │   │   │   │   │   arguments={'url': 'https://llama-stack.readthedocs.io/en/latest/introduction/index.html'},
│   │   │   │   │   │   call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│   │   │   │   │   │   tool_name='load_url'
│   │   │   │   │   )
│   │   │   │   ]
│   │   │   ),
│   │   │   step_id='d724a238-d02b-4d77-a4bc-a978a54979c6',
│   │   │   step_type='inference',
│   │   │   turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│   │   │   completed_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 523310),
│   │   │   started_at=datetime.datetime(2025, 2, 12, 16, 38, 14, 654535)
│   │   ),
│   │   ToolExecutionStep(
│   │   │   step_id='49f19a5e-6a1e-4b1c-9232-fbafb82f2f89',
│   │   │   step_type='tool_execution',
│   │   │   tool_calls=[
│   │   │   │   ToolCall(
│   │   │   │   │   arguments={'url': 'https://llama-stack.readthedocs.io/en/latest/introduction/index.html'},
│   │   │   │   │   call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│   │   │   │   │   tool_name='load_url'
│   │   │   │   )
│   │   │   ],
│   │   │   tool_responses=[
│   │   │   │   ToolResponse(
│   │   │   │   │   call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│   │   │   │   │   content='{"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our efforts to produce new hardware.\n\nPreviously, we have been working on our own replacement firmware: RebbleOS. As you can see by the commit history though, progress was slow. Building a production-ready realtime OS for the Pebble is no small feat, and although we were confident we’d get there given enough time, it was never our ideal path. Thanks to the hard work of many people both within Google and not, we finally have our hands on the original source code for PebbleOS. You can read Google’s blog post on this for even more information.\n\nThis does not mean we instantly have the ability to start developing updates for PebbleOS though, we first will need to spend some concentrated time getting it to build. But before we talk about that, let’s talk about Rebble itself.\n"}',
│   │   │   │   │   tool_name='load_url'
│   │   │   │   )
│   │   │   ],
│   │   │   turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│   │   │   completed_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 534830),
│   │   │   started_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 534756)
│   │   ),
│   │   InferenceStep(
│   │   │   api_model_response=CompletionMessage(
│   │   │   │   content="The document from the given URL is about Google releasing the source code to PebbleOS, which is a significant development for Rebble. This allows Rebble to accelerate its efforts to produce new hardware. Rebble had been working on its own replacement firmware, RebbleOS, but the release of PebbleOS's source code will help Rebble to build a production-ready real-time OS for the Pebble.",
│   │   │   │   role='assistant',
│   │   │   │   stop_reason='end_of_turn',
│   │   │   │   tool_calls=[]
│   │   │   ),
│   │   │   step_id='5e6daa91-e689-4d7a-a7f9-d7c3da2eca5a',
│   │   │   step_type='inference',
│   │   │   turn_id='8f65d88d-7643-4dd7-acc7-48cd9e8aa449',
│   │   │   completed_at=datetime.datetime(2025, 2, 12, 16, 38, 16, 179107),
│   │   │   started_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 561449)
│   │   )
│   ],
│   turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│   completed_at=datetime.datetime(2025, 2, 12, 16, 38, 16, 191199),
│   output_attachments=[]
)
```
ehhuang added a commit that referenced this pull request Feb 13, 2025
Summary:

In #102, we made a turn's behavior more complete by automatically passing back the tool response and creating another turn when a client tool is used.

However, this creates a problem with the non-streaming API: the response object only contains information from after the last tool call.

This PR is a hacky attempt to address this by combining the Turn responses into one. Ideally, I think all the loop logic should live only on the server side: a turn would pause, and the client SDK would pass tool responses back to resume it.

I also changed it to yield a proper ToolExecutionStep event instead of a ToolResponseMessage, so that logging treats it the same as server-side tool execution. I.e., it now outputs:
tool_execution> Tool:load_url Response:{"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our
instead of:
CustomTool> {"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our efforts to
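
To illustrate why a step-shaped event logs the same way for client and server tools, here is a minimal sketch of a formatter that produces lines in the `tool_execution> Tool:... Response:...` shape quoted above. The dataclasses, the `format_step` helper, and the `max_chars` truncation limit are assumptions made for this example and are not taken from the SDK's logger.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class ToolResponse:
    call_id: str
    tool_name: str
    content: str


@dataclass
class ToolExecutionStep:
    tool_responses: List[ToolResponse]
    step_type: str = "tool_execution"


def format_step(step: ToolExecutionStep, max_chars: int = 80) -> Iterator[str]:
    """Yield one log line per tool response, prefixed by the step type."""
    for response in step.tool_responses:
        # Truncate long payloads so each log line stays readable.
        shown = response.content[:max_chars]
        yield f"{step.step_type}> Tool:{response.tool_name} Response:{shown}"


step = ToolExecutionStep(
    tool_responses=[ToolResponse("call-1", "load_url", '{"content": "hi"}')]
)
lines = list(format_step(step))
print("\n".join(lines))
```

Because the formatter only looks at the step's type and responses, it does not need to know whether the tool ran on the client or the server.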

Test Plan:

Added test in llamastack/llama-stack#1078

Run a simple script with Agent and client tool. Observe the returned response has steps from both created turns.

```
Turn(
│   input_messages=[
│   │   UserMessage(
│   │   │   content='load https://llama-stack.readthedocs.io/en/latest/introduction/index.html and summarize it',
│   │   │   role='user',
│   │   │   context=None
│   │   )
│   ],
│   output_message=CompletionMessage(
│   │   content="The document from the given URL is about Google releasing the source code to PebbleOS, which is a significant development for Rebble. This allows Rebble to accelerate its efforts to produce new hardware. Rebble had been working on its own replacement firmware, RebbleOS, but the release of PebbleOS's source code will help Rebble to build a production-ready real-time OS for the Pebble.",
│   │   role='assistant',
│   │   stop_reason='end_of_turn',
│   │   tool_calls=[]
│   ),
│   session_id='dec1c6c0-ed9b-42c1-97d7-906871acd5ba',
│   started_at=datetime.datetime(2025, 2, 12, 16, 38, 14, 643186),
│   steps=[
│   │   InferenceStep(
│   │   │   api_model_response=CompletionMessage(
│   │   │   │   content='',
│   │   │   │   role='assistant',
│   │   │   │   stop_reason='end_of_turn',
│   │   │   │   tool_calls=[
│   │   │   │   │   ToolCall(
│   │   │   │   │   │   arguments={'url': 'https://llama-stack.readthedocs.io/en/latest/introduction/index.html'},
│   │   │   │   │   │   call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│   │   │   │   │   │   tool_name='load_url'
│   │   │   │   │   )
│   │   │   │   ]
│   │   │   ),
│   │   │   step_id='d724a238-d02b-4d77-a4bc-a978a54979c6',
│   │   │   step_type='inference',
│   │   │   turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│   │   │   completed_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 523310),
│   │   │   started_at=datetime.datetime(2025, 2, 12, 16, 38, 14, 654535)
│   │   ),
│   │   ToolExecutionStep(
│   │   │   step_id='49f19a5e-6a1e-4b1c-9232-fbafb82f2f89',
│   │   │   step_type='tool_execution',
│   │   │   tool_calls=[
│   │   │   │   ToolCall(
│   │   │   │   │   arguments={'url': 'https://llama-stack.readthedocs.io/en/latest/introduction/index.html'},
│   │   │   │   │   call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│   │   │   │   │   tool_name='load_url'
│   │   │   │   )
│   │   │   ],
│   │   │   tool_responses=[
│   │   │   │   ToolResponse(
│   │   │   │   │   call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│   │   │   │   │   content='{"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our efforts to produce new hardware.\n\nPreviously, we have been working on our own replacement firmware: RebbleOS. As you can see by the commit history though, progress was slow. Building a production-ready realtime OS for the Pebble is no small feat, and although we were confident we’d get there given enough time, it was never our ideal path. Thanks to the hard work of many people both within Google and not, we finally have our hands on the original source code for PebbleOS. You can read Google’s blog post on this for even more information.\n\nThis does not mean we instantly have the ability to start developing updates for PebbleOS though, we first will need to spend some concentrated time getting it to build. But before we talk about that, let’s talk about Rebble itself.\n"}',
│   │   │   │   │   tool_name='load_url'
│   │   │   │   )
│   │   │   ],
│   │   │   turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│   │   │   completed_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 534830),
│   │   │   started_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 534756)
│   │   ),
│   │   InferenceStep(
│   │   │   api_model_response=CompletionMessage(
│   │   │   │   content="The document from the given URL is about Google releasing the source code to PebbleOS, which is a significant development for Rebble. This allows Rebble to accelerate its efforts to produce new hardware. Rebble had been working on its own replacement firmware, RebbleOS, but the release of PebbleOS's source code will help Rebble to build a production-ready real-time OS for the Pebble.",
│   │   │   │   role='assistant',
│   │   │   │   stop_reason='end_of_turn',
│   │   │   │   tool_calls=[]
│   │   │   ),
│   │   │   step_id='5e6daa91-e689-4d7a-a7f9-d7c3da2eca5a',
│   │   │   step_type='inference',
│   │   │   turn_id='8f65d88d-7643-4dd7-acc7-48cd9e8aa449',
│   │   │   completed_at=datetime.datetime(2025, 2, 12, 16, 38, 16, 179107),
│   │   │   started_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 561449)
│   │   )
│   ],
│   turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│   completed_at=datetime.datetime(2025, 2, 12, 16, 38, 16, 191199),
│   output_attachments=[]
)
```
ehhuang added a commit that referenced this pull request Feb 13, 2025
Summary:

In #102, we made a turn's behavior more complete by automatically passing back the tool response and creating another turn when a client tool is used.

However, this creates a problem with the non-streaming API: the response object only contains information from after the last tool call.

This PR is a hacky attempt to address this by combining the Turn responses into one. Ideally, I think all the loop logic should live only on the server side: a turn would pause, and the client SDK would pass tool responses back to resume it.

I also changed it to yield a proper ToolExecutionStep event instead of a ToolResponseMessage, so that logging treats it the same as server-side tool execution. I.e., it now outputs:
tool_execution> Tool:load_url Response:{"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our
instead of:
CustomTool> {"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our efforts to

Test Plan:

Added test in llamastack/llama-stack#1078

Run a simple script with Agent and client tool. Observe the returned response has steps from both created turns.

Turn(
│   input_messages=[
│   │   UserMessage(
│   │   │   content='load https://llama-stack.readthedocs.io/en/latest/introduction/index.html and summarize it',
│   │   │   role='user',
│   │   │   context=None
│   │   )
│   ],
│   output_message=CompletionMessage(
│   │   content="The document from the given URL is about Google releasing the source code to PebbleOS, which is a significant development for Rebble. This allows Rebble to accelerate its efforts to produce new hardware. Rebble had been working on its own replacement firmware, RebbleOS, but the release of PebbleOS's source code will help Rebble to build a production-ready real-time OS for the Pebble.",
│   │   role='assistant',
│   │   stop_reason='end_of_turn',
│   │   tool_calls=[]
│   ),
│   session_id='dec1c6c0-ed9b-42c1-97d7-906871acd5ba',
│   started_at=datetime.datetime(2025, 2, 12, 16, 38, 14, 643186),
│   steps=[
│   │   InferenceStep(
│   │   │   api_model_response=CompletionMessage(
│   │   │   │   content='',
│   │   │   │   role='assistant',
│   │   │   │   stop_reason='end_of_turn',
│   │   │   │   tool_calls=[
│   │   │   │   │   ToolCall(
│   │   │   │   │   │   arguments={'url': 'https://llama-stack.readthedocs.io/en/latest/introduction/index.html'},
│   │   │   │   │   │   call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│   │   │   │   │   │   tool_name='load_url'
│   │   │   │   │   )
│   │   │   │   ]
│   │   │   ),
│   │   │   step_id='d724a238-d02b-4d77-a4bc-a978a54979c6',
│   │   │   step_type='inference',
│   │   │   turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│   │   │   completed_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 523310),
│   │   │   started_at=datetime.datetime(2025, 2, 12, 16, 38, 14, 654535)
│   │   ),
│   │   ToolExecutionStep(
│   │   │   step_id='49f19a5e-6a1e-4b1c-9232-fbafb82f2f89',
│   │   │   step_type='tool_execution',
│   │   │   tool_calls=[
│   │   │   │   ToolCall(
│   │   │   │   │   arguments={'url': 'https://llama-stack.readthedocs.io/en/latest/introduction/index.html'},
│   │   │   │   │   call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│   │   │   │   │   tool_name='load_url'
│   │   │   │   )
│   │   │   ],
│   │   │   tool_responses=[
│   │   │   │   ToolResponse(
│   │   │   │   │   call_id='5d09151b-8a53-4292-be8d-f21e134d5142',
│   │   │   │   │   content='{"content": "\nToday Google announced that they have released the source code to PebbleOS. This is massive for Rebble, and will accelerate our efforts to produce new hardware.\n\nPreviously, we have been working on our own replacement firmware: RebbleOS. As you can see by the commit history though, progress was slow. Building a production-ready realtime OS for the Pebble is no small feat, and although we were confident we’d get there given enough time, it was never our ideal path. Thanks to the hard work of many people both within Google and not, we finally have our hands on the original source code for PebbleOS. You can read Google’s blog post on this for even more information.\n\nThis does not mean we instantly have the ability to start developing updates for PebbleOS though, we first will need to spend some concentrated time getting it to build. But before we talk about that, let’s talk about Rebble itself.\n"}',
│   │   │   │   │   tool_name='load_url'
│   │   │   │   )
│   │   │   ],
│   │   │   turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│   │   │   completed_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 534830),
│   │   │   started_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 534756)
│   │   ),
│   │   InferenceStep(
│   │   │   api_model_response=CompletionMessage(
│   │   │   │   content="The document from the given URL is about Google releasing the source code to PebbleOS, which is a significant development for Rebble. This allows Rebble to accelerate its efforts to produce new hardware. Rebble had been working on its own replacement firmware, RebbleOS, but the release of PebbleOS's source code will help Rebble to build a production-ready real-time OS for the Pebble.",
│   │   │   │   role='assistant',
│   │   │   │   stop_reason='end_of_turn',
│   │   │   │   tool_calls=[]
│   │   │   ),
│   │   │   step_id='5e6daa91-e689-4d7a-a7f9-d7c3da2eca5a',
│   │   │   step_type='inference',
│   │   │   turn_id='8f65d88d-7643-4dd7-acc7-48cd9e8aa449',
│   │   │   completed_at=datetime.datetime(2025, 2, 12, 16, 38, 16, 179107),
│   │   │   started_at=datetime.datetime(2025, 2, 12, 16, 38, 15, 561449)
│   │   )
│   ],
│   turn_id='0496c654-cd02-48bb-a2ab-d1a0a5e91aba',
│   completed_at=datetime.datetime(2025, 2, 12, 16, 38, 16, 191199),
│   output_attachments=[]
)
```
hardikjshah pushed a commit that referenced this pull request Feb 14, 2025