
Conversation

@cdoern cdoern (Contributor) commented Feb 26, 2025

What does this PR do?

Today, to chat with a model, a user has to run one command per completion. This PR adds `--session`, which enables a user to have an interactive chat session with the inference model. `--session` can be passed with or without `--message`; if no `--message` is passed, the user is prompted for the first message.

This is also useful because it preserves context between completions, which is not possible today.

```
llama-stack-client inference chat-completion --session
>>> hi whats up!
Assistant> Not much! How's your day going so far? Is there something I can help you with or would you like to chat?
>>> what color is the sky?
Assistant> The color of the sky can vary depending on the time of day and atmospheric conditions. Here are some common colors you might see:

* During the daytime, when the sun is overhead, the sky typically appears blue.
* At sunrise and sunset, the sky can take on hues of red, orange, pink, and purple due to the scattering of light by atmospheric particles.
* On a clear day with no clouds, the sky can appear a bright blue, often referred to as "cerulean."
* In areas with high levels of pollution or dust, the sky can appear more hazy or grayish.
* At night, the sky can be dark and black, although some stars and moonlight can make it visible.

So, what's your favorite color of the sky?
>>>
```
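
For readers curious how a session loop like this hangs together, here is a minimal sketch. It assumes the Python SDK's `client.inference.chat_completion` method, a locally running server, and an illustrative model id; none of these details are taken from this PR. The point is the growing `messages` list, which is what carries context between completions.

```
# Minimal sketch of a --session-style loop (assumed API names and model id,
# not verbatim from this PR). A growing message list carries the context
# across completions.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # assumed local server
messages = []

while True:
    user_input = input(">>> ")
    if not user_input:  # illustrative exit condition; the real CLI may differ
        break
    messages.append({"role": "user", "content": user_input})
    response = client.inference.chat_completion(
        model_id="meta-llama/Llama-3.2-3B-Instruct",  # hypothetical model id
        messages=messages,
    )
    reply = response.completion_message.content
    print(f"Assistant> {reply}")
    # Append the assistant reply so the next completion sees the full history.
    messages.append({"role": "assistant", "content": reply})
```

The actual CLI may stream tokens and handle exit differently; this only illustrates the context-carrying loop.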

Test Plan

Tested locally with and without `--message`.

@cdoern cdoern changed the title from "add support for chat sessions" to "feat: add support for chat sessions" on Feb 26, 2025
@jwm4 jwm4 left a comment

This seems like a nice addition to me.

@jaideepr97

neat!

@jaideepr97

maybe a follow up PR could add the ability to supply chat templates to the model

@ashwinb ashwinb (Contributor) left a comment

nice

@terrytangyuan terrytangyuan left a comment

This is great!

@cdoern cdoern (Contributor, Author) commented Feb 26, 2025

@ashwinb added your recommendations! Thanks for the feedback.

@ashwinb ashwinb merged commit ce3b30f into llamastack:main Feb 26, 2025
2 checks passed
@stainless-app stainless-app bot mentioned this pull request Aug 14, 2025
ashwinb pushed a commit that referenced this pull request Aug 14, 2025
Automated Release PR
---


## 0.2.18-alpha.3 (2025-08-14)

Full Changelog:
[v0.2.18-alpha.2...v0.2.18-alpha.3](v0.2.18-alpha.2...v0.2.18-alpha.3)

### Features

* `llama-stack-client providers inspect PROVIDER_ID`
([#181](#181))
([6d18aae](6d18aae))
* add client-side utility for getting OAuth tokens simply
([#230](#230))
([91156dc](91156dc))
* add client.chat.completions.create() and client.completions.create()
([#226](#226))
([ee0e65e](ee0e65e))
* Add llama-stack-client datasets unregister command
([#222](#222))
([38cd91c](38cd91c))
* add support for chat sessions
([#167](#167))
([ce3b30f](ce3b30f))
* add type hints to event logger util
([#140](#140))
([26f3c33](26f3c33))
* add updated batch inference types
([#220](#220))
([ddb93ca](ddb93ca))
* add weighted_average aggregation function support
([#208](#208))
([b62ac6c](b62ac6c))
* **agent:** support multiple tool calls
([#192](#192))
([43ea2f6](43ea2f6))
* **agent:** support plain function as client_tool
([#187](#187))
([2ec8044](2ec8044))
* **api:** update via SDK Studio
([48fd19c](48fd19c))
* async agent wrapper
([#169](#169))
([fc9907c](fc9907c))
* autogen llama-stack-client CLI reference doc
([#190](#190))
([e7b19a5](e7b19a5))
* client.responses.create() and client.responses.retrieve()
([#227](#227))
([fba5102](fba5102))
* datasets api updates
([#203](#203))
([b664564](b664564))
* enable_persist: sync updates from stainless branch: yanxi0830/dev
([#145](#145))
([59a02f0](59a02f0))
* new Agent API
([#178](#178))
([c2f73b1](c2f73b1))
* support client tool output metadata
([#180](#180))
([8e4fd56](8e4fd56))
* Sync updates from stainless branch: ehhuang/dev
([#149](#149))
([367da69](367da69))
* unify max infer iters with server/client tools
([#173](#173))
([548f2de](548f2de))
* update react with new agent api
([#189](#189))
([ac9d1e2](ac9d1e2))


### Bug Fixes

* `llama-stack-client provider inspect` should use retrieve
([#202](#202))
([e33b5bf](e33b5bf))
* accept extra_headers in agent.create_turn and pass them faithfully
([#228](#228))
([e72d9e8](e72d9e8))
* added uv.lock
([546e0df](546e0df))
* **agent:** better error handling
([#207](#207))
([5746f91](5746f91))
* **agent:** initialize toolgroups/client_tools
([#186](#186))
([458e207](458e207))
* broken .retrieve call using `identifier=`
([#135](#135))
([626805a](626805a))
* bump to 0.2.1
([edb6173](edb6173))
* bump version
([b6d45b8](b6d45b8))
* bump version in another place
([7253433](7253433))
* **cli:** align cli toolgroups register to the new arguments
([#231](#231))
([a87b6f7](a87b6f7))
* correct toolgroups_id parameter name on unregister call
([#235](#235))
([1be7904](1be7904))
* fix duplicate model get help text
([#188](#188))
([4bab07a](4bab07a))
* llama-stack-client providers list
([#134](#134))
([930138a](930138a))
* react agent
([#200](#200))
([b779979](b779979))
* React Agent for non-llama models
([#174](#174))
([ee5dd2b](ee5dd2b))
* React agent should be able to work with provided config
([#146](#146))
([08ab5df](08ab5df))
* react agent with custom tool parser n_iters
([#184](#184))
([aaff961](aaff961))
* remove the alpha suffix in run_benchmark.py
([#179](#179))
([638f7f2](638f7f2))
* update CONTRIBUTING.md to point to uv instead of rye
([3fbe0cd](3fbe0cd))
* update uv lock
([cc072c8](cc072c8))
* validate endpoint url
([#196](#196))
([6fa8095](6fa8095))


### Chores

* api sync, deprecate allow_resume_turn + rename
task_config->benchmark_config (Sync updates from stainless branch:
yanxi0830/dev)
([#176](#176))
([96749af](96749af))
* AsyncAgent should use ToolResponse instead of ToolResponseMessage
([#197](#197))
([6191aa5](6191aa5))
* **copy:** Copy changes over from llamastack/ org repository
([#255](#255))
([7ade969](7ade969))
* deprecate eval task (Sync updates from stainless branch: main)
([#150](#150))
([39b1248](39b1248))
* remove litellm type conversion
([#193](#193))
([ab3f844](ab3f844))
* sync repo
([099bfc6](099bfc6))
* Sync updates from stainless branch: ehhuang/dev
([#182](#182))
([e33aa4a](e33aa4a))
* Sync updates from stainless branch: ehhuang/dev
([#199](#199))
([fa73d7d](fa73d7d))
* Sync updates from stainless branch: main
([#201](#201))
([f063f2d](f063f2d))
* use rich to format logs
([#177](#177))
([303054b](303054b))


### Refactors

* update react_agent to use tool_config
([#139](#139))
([b5dce10](b5dce10))


### Build System

* Bump version to 0.1.19
([ccd52f8](ccd52f8))
* Bump version to 0.1.8
([0144e85](0144e85))
* Bump version to 0.1.9
([7e00b78](7e00b78))
* Bump version to 0.2.10
([05e41a6](05e41a6))
* Bump version to 0.2.11
([d2e7537](d2e7537))
* Bump version to 0.2.12
([e3d812e](e3d812e))
* Bump version to 0.2.13
([b6c6c5e](b6c6c5e))
* Bump version to 0.2.2
([47f8fd5](47f8fd5))
* Bump version to 0.2.4
([7e6f5fc](7e6f5fc))
* Bump version to 0.2.5
([62bd127](62bd127))
* Bump version to 0.2.6
([3dd707f](3dd707f))
* Bump version to 0.2.7
([e39ba88](e39ba88))
* Bump version to 0.2.8
([645d219](645d219))
* Bump version to 0.2.9
([d360557](d360557))

---
This pull request is managed by Stainless's [GitHub
App](https://github.com/apps/stainless-app).

The [semver version
number](https://semver.org/#semantic-versioning-specification-semver) is
based on included [commit
messages](https://www.conventionalcommits.org/en/v1.0.0/).
Alternatively, you can manually set the version number in the title of
this pull request.
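
(For reference, under Conventional Commits, `fix:` commits typically produce a patch bump, `feat:` commits a minor bump, and commits marked with `!` or carrying a `BREAKING CHANGE:` footer a major bump.)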

For a better experience, it is recommended to use either rebase-merge or
squash-merge when merging this pull request.

🔗 Stainless [website](https://www.stainlessapi.com)
📚 Read the [docs](https://app.stainlessapi.com/docs)
🙋 [Reach out](mailto:support@stainlessapi.com) for help or questions

---------

Co-authored-by: stainless-app[bot] <142633134+stainless-app[bot]@users.noreply.github.com>
