
Conversation

@cdoern cdoern (Contributor) commented Feb 26, 2025

What does this PR do?

Today, to chat with a model, a user has to run one command per completion. This PR adds `--session`, which enables a user to have an interactive chat session with the inference model. `--session` can be passed with or without `--message`; if no `--message` is passed, the user is prompted for the first message.

This is also useful because it preserves context between completions, which is not possible today.

```
llama-stack-client inference chat-completion --session
>>> hi whats up!
Assistant> Not much! How's your day going so far? Is there something I can help you with or would you like to chat?
>>> what color is the sky?
Assistant> The color of the sky can vary depending on the time of day and atmospheric conditions. Here are some common colors you might see:

* During the daytime, when the sun is overhead, the sky typically appears blue.
* At sunrise and sunset, the sky can take on hues of red, orange, pink, and purple due to the scattering of light by atmospheric particles.
* On a clear day with no clouds, the sky can appear a bright blue, often referred to as "cerulean."
* In areas with high levels of pollution or dust, the sky can appear more hazy or grayish.
* At night, the sky can be dark and black, although some stars and moonlight can make it visible.

So, what's your favorite color of the sky?
>>>
```
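
For readers curious how a session loop like this hangs together, here is a minimal sketch. It assumes the Python SDK's `client.inference.chat_completion` method, a locally running server, and an illustrative model id; none of these details are taken from this PR. The point is the growing `messages` list, which is what carries context between completions.

```
# Minimal sketch of a --session-style loop (assumed API names and model id,
# not verbatim from this PR). A growing message list carries the context
# across completions.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # assumed local server
messages = []

while True:
    user_input = input(">>> ")
    if not user_input:  # illustrative exit condition; the real CLI may differ
        break
    messages.append({"role": "user", "content": user_input})
    response = client.inference.chat_completion(
        model_id="meta-llama/Llama-3.2-3B-Instruct",  # hypothetical model id
        messages=messages,
    )
    reply = response.completion_message.content
    print(f"Assistant> {reply}")
    # Append the assistant reply so the next completion sees the full history.
    messages.append({"role": "assistant", "content": reply})
```

The actual CLI may stream tokens and handle exit differently; this only illustrates the context-carrying loop.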

Test Plan

Tested locally with and without `--message`.

@cdoern cdoern changed the title from "add support for chat sessions" to "feat: add support for chat sessions" on Feb 26, 2025
@jwm4 jwm4 left a comment

This seems like a nice addition to me.

@jaideepr97

neat!

@jaideepr97

maybe a follow up PR could add the ability to supply chat templates to the model

@ashwinb ashwinb (Contributor) left a comment

nice

@terrytangyuan terrytangyuan left a comment

This is great!

@cdoern cdoern (Contributor, Author) commented Feb 26, 2025

@ashwinb added your recommendations! Thanks for the feedback.

@ashwinb ashwinb merged commit ce3b30f into llamastack:main Feb 26, 2025
2 checks passed
@stainless-app stainless-app bot mentioned this pull request Aug 14, 2025
ashwinb pushed a commit that referenced this pull request Aug 14, 2025
Automated Release PR
---


## 0.2.18-alpha.3 (2025-08-14)

Full Changelog:
[v0.2.18-alpha.2...v0.2.18-alpha.3](v0.2.18-alpha.2...v0.2.18-alpha.3)

### Features

* `llama-stack-client providers inspect PROVIDER_ID`
([#181](#181))
([6d18aae](6d18aae))
* add client-side utility for getting OAuth tokens simply
([#230](#230))
([91156dc](91156dc))
* add client.chat.completions.create() and client.completions.create()
([#226](#226))
([ee0e65e](ee0e65e))
* Add llama-stack-client datasets unregister command
([#222](#222))
([38cd91c](38cd91c))
* add support for chat sessions
([#167](#167))
([ce3b30f](ce3b30f))
* add type hints to event logger util
([#140](#140))
([26f3c33](26f3c33))
* add updated batch inference types
([#220](#220))
([ddb93ca](ddb93ca))
* add weighted_average aggregation function support
([#208](#208))
([b62ac6c](b62ac6c))
* **agent:** support multiple tool calls
([#192](#192))
([43ea2f6](43ea2f6))
* **agent:** support plain function as client_tool
([#187](#187))
([2ec8044](2ec8044))
* **api:** update via SDK Studio
([48fd19c](48fd19c))
* async agent wrapper
([#169](#169))
([fc9907c](fc9907c))
* autogen llama-stack-client CLI reference doc
([#190](#190))
([e7b19a5](e7b19a5))
* client.responses.create() and client.responses.retrieve()
([#227](#227))
([fba5102](fba5102))
* datasets api updates
([#203](#203))
([b664564](b664564))
* enable_persist: sync updates from stainless branch: yanxi0830/dev
([#145](#145))
([59a02f0](59a02f0))
* new Agent API
([#178](#178))
([c2f73b1](c2f73b1))
* support client tool output metadata
([#180](#180))
([8e4fd56](8e4fd56))
* Sync updates from stainless branch: ehhuang/dev
([#149](#149))
([367da69](367da69))
* unify max infer iters with server/client tools
([#173](#173))
([548f2de](548f2de))
* update react with new agent api
([#189](#189))
([ac9d1e2](ac9d1e2))


### Bug Fixes

* `llama-stack-client provider inspect` should use retrieve
([#202](#202))
([e33b5bf](e33b5bf))
* accept extra_headers in agent.create_turn and pass them faithfully
([#228](#228))
([e72d9e8](e72d9e8))
* added uv.lock
([546e0df](546e0df))
* **agent:** better error handling
([#207](#207))
([5746f91](5746f91))
* **agent:** initialize toolgroups/client_tools
([#186](#186))
([458e207](458e207))
* broken .retrieve call using `identifier=`
([#135](#135))
([626805a](626805a))
* bump to 0.2.1
([edb6173](edb6173))
* bump version
([b6d45b8](b6d45b8))
* bump version in another place
([7253433](7253433))
* **cli:** align cli toolgroups register to the new arguments
([#231](#231))
([a87b6f7](a87b6f7))
* correct toolgroups_id parameter name on unregister call
([#235](#235))
([1be7904](1be7904))
* fix duplicate model get help text
([#188](#188))
([4bab07a](4bab07a))
* llama-stack-client providers list
([#134](#134))
([930138a](930138a))
* react agent
([#200](#200))
([b779979](b779979))
* React Agent for non-llama models
([#174](#174))
([ee5dd2b](ee5dd2b))
* React agent should be able to work with provided config
([#146](#146))
([08ab5df](08ab5df))
* react agent with custom tool parser n_iters
([#184](#184))
([aaff961](aaff961))
* remove the alpha suffix in run_benchmark.py
([#179](#179))
([638f7f2](638f7f2))
* update CONTRIBUTING.md to point to uv instead of rye
([3fbe0cd](3fbe0cd))
* update uv lock
([cc072c8](cc072c8))
* validate endpoint url
([#196](#196))
([6fa8095](6fa8095))


### Chores

* api sync, deprecate allow_resume_turn + rename
task_config->benchmark_config (Sync updates from stainless branch:
yanxi0830/dev)
([#176](#176))
([96749af](96749af))
* AsyncAgent should use ToolResponse instead of ToolResponseMessage
([#197](#197))
([6191aa5](6191aa5))
* **copy:** Copy changes over from llamastack/ org repository
([#255](#255))
([7ade969](7ade969))
* deprecate eval task (Sync updates from stainless branch: main)
([#150](#150))
([39b1248](39b1248))
* remove litellm type conversion
([#193](#193))
([ab3f844](ab3f844))
* sync repo
([099bfc6](099bfc6))
* Sync updates from stainless branch: ehhuang/dev
([#182](#182))
([e33aa4a](e33aa4a))
* Sync updates from stainless branch: ehhuang/dev
([#199](#199))
([fa73d7d](fa73d7d))
* Sync updates from stainless branch: main
([#201](#201))
([f063f2d](f063f2d))
* use rich to format logs
([#177](#177))
([303054b](303054b))


### Refactors

* update react_agent to use tool_config
([#139](#139))
([b5dce10](b5dce10))


### Build System

* Bump version to 0.1.19
([ccd52f8](ccd52f8))
* Bump version to 0.1.8
([0144e85](0144e85))
* Bump version to 0.1.9
([7e00b78](7e00b78))
* Bump version to 0.2.10
([05e41a6](05e41a6))
* Bump version to 0.2.11
([d2e7537](d2e7537))
* Bump version to 0.2.12
([e3d812e](e3d812e))
* Bump version to 0.2.13
([b6c6c5e](b6c6c5e))
* Bump version to 0.2.2
([47f8fd5](47f8fd5))
* Bump version to 0.2.4
([7e6f5fc](7e6f5fc))
* Bump version to 0.2.5
([62bd127](62bd127))
* Bump version to 0.2.6
([3dd707f](3dd707f))
* Bump version to 0.2.7
([e39ba88](e39ba88))
* Bump version to 0.2.8
([645d219](645d219))
* Bump version to 0.2.9
([d360557](d360557))

---
This pull request is managed by Stainless's [GitHub
App](https://github.com/apps/stainless-app).

The [semver version
number](https://semver.org/#semantic-versioning-specification-semver) is
based on included [commit
messages](https://www.conventionalcommits.org/en/v1.0.0/).
Alternatively, you can manually set the version number in the title of
this pull request.
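
(For reference, under Conventional Commits, `fix:` commits typically produce a patch bump, `feat:` commits a minor bump, and commits marked with `!` or carrying a `BREAKING CHANGE:` footer a major bump.)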

For a better experience, it is recommended to use either rebase-merge or
squash-merge when merging this pull request.

🔗 Stainless [website](https://www.stainlessapi.com)
📚 Read the [docs](https://app.stainlessapi.com/docs)
🙋 [Reach out](mailto:support@stainlessapi.com) for help or questions

---------

Co-authored-by: stainless-app[bot] <142633134+stainless-app[bot]@users.noreply.github.com>
