feat: add support for chat sessions #167
Conversation
This seems like a nice addition to me.
neat!
maybe a follow-up PR could add the ability to supply chat templates to the model
nice
This is great!
today, to chat with a model, a user has to run one command per completion. add --session, which enables a user to have an interactive chat session with the inference model. --session can be passed with or without --message. If no --message is passed, the user is prompted to give the first message.

```
llama-stack-client inference chat-completion --session
>>> hi whats up!
Assistant> Not much! How's your day going so far? Is there something I can help you with or would you like to chat?
>>> what color is the sky?
Assistant> The color of the sky can vary depending on the time of day and atmospheric conditions. Here are some common colors you might see:

* During the daytime, when the sun is overhead, the sky typically appears blue.
* At sunrise and sunset, the sky can take on hues of red, orange, pink, and purple due to the scattering of light by atmospheric particles.
* On a clear day with no clouds, the sky can appear a bright blue, often referred to as "cerulean."
* In areas with high levels of pollution or dust, the sky can appear more hazy or grayish.
* At night, the sky can be dark and black, although some stars and moonlight can make it visible.

So, what's your favorite color of the sky?
>>>
```

Signed-off-by: Charlie Doern <cdoern@redhat.com>
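For anyone curious how such a session loop fits together, here is a minimal client-side sketch in Python. It assumes a locally running Llama Stack server on the default port and uses a placeholder model ID; the `inference.chat_completion` call mirrors the SDK's existing API, but treat this as an illustration of the idea, not the PR's actual implementation:

```python
# Minimal sketch of an interactive chat session (not the PR's actual code).
# The key idea: accumulate the message history so every completion request
# carries the full conversation context.
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # assumed local server
MODEL_ID = "meta-llama/Llama-3.1-8B-Instruct"  # placeholder model ID

messages = []
while True:
    try:
        user_input = input(">>> ")
    except (EOFError, KeyboardInterrupt):
        break  # exit the session on Ctrl-D / Ctrl-C
    messages.append({"role": "user", "content": user_input})
    response = client.inference.chat_completion(model_id=MODEL_ID, messages=messages)
    print(f"Assistant> {response.completion_message.content}")
    # Feed the assistant's reply back into the history for the next turn.
    messages.append(response.completion_message)
```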
@ashwinb added your recommendations! thanks for the feedback
Automated Release PR

---

## 0.2.18-alpha.3 (2025-08-14)

Full Changelog: [v0.2.18-alpha.2...v0.2.18-alpha.3](v0.2.18-alpha.2...v0.2.18-alpha.3)

### Features

* `llama-stack-client providers inspect PROVIDER_ID` ([#181](#181)) ([6d18aae](6d18aae))
* add client-side utility for getting OAuth tokens simply ([#230](#230)) ([91156dc](91156dc))
* add client.chat.completions.create() and client.completions.create() ([#226](#226)) ([ee0e65e](ee0e65e))
* Add llama-stack-client datasets unregister command ([#222](#222)) ([38cd91c](38cd91c))
* add support for chat sessions ([#167](#167)) ([ce3b30f](ce3b30f))
* add type hints to event logger util ([#140](#140)) ([26f3c33](26f3c33))
* add updated batch inference types ([#220](#220)) ([ddb93ca](ddb93ca))
* add weighted_average aggregation function support ([#208](#208)) ([b62ac6c](b62ac6c))
* **agent:** support multiple tool calls ([#192](#192)) ([43ea2f6](43ea2f6))
* **agent:** support plain function as client_tool ([#187](#187)) ([2ec8044](2ec8044))
* **api:** update via SDK Studio ([48fd19c](48fd19c))
* async agent wrapper ([#169](#169)) ([fc9907c](fc9907c))
* autogen llama-stack-client CLI reference doc ([#190](#190)) ([e7b19a5](e7b19a5))
* client.responses.create() and client.responses.retrieve() ([#227](#227)) ([fba5102](fba5102))
* datasets api updates ([#203](#203)) ([b664564](b664564))
* enable_persist: sync updates from stainless branch: yanxi0830/dev ([#145](#145)) ([59a02f0](59a02f0))
* new Agent API ([#178](#178)) ([c2f73b1](c2f73b1))
* support client tool output metadata ([#180](#180)) ([8e4fd56](8e4fd56))
* Sync updates from stainless branch: ehhuang/dev ([#149](#149)) ([367da69](367da69))
* unify max infer iters with server/client tools ([#173](#173)) ([548f2de](548f2de))
* update react with new agent api ([#189](#189)) ([ac9d1e2](ac9d1e2))

### Bug Fixes

* `llama-stack-client provider inspect` should use retrieve ([#202](#202)) ([e33b5bf](e33b5bf))
* accept extra_headers in agent.create_turn and pass them faithfully ([#228](#228)) ([e72d9e8](e72d9e8))
* added uv.lock ([546e0df](546e0df))
* **agent:** better error handling ([#207](#207)) ([5746f91](5746f91))
* **agent:** initialize toolgroups/client_tools ([#186](#186)) ([458e207](458e207))
* broken .retrieve call using `identifier=` ([#135](#135)) ([626805a](626805a))
* bump to 0.2.1 ([edb6173](edb6173))
* bump version ([b6d45b8](b6d45b8))
* bump version in another place ([7253433](7253433))
* **cli:** align cli toolgroups register to the new arguments ([#231](#231)) ([a87b6f7](a87b6f7))
* correct toolgroups_id parameter name on unregister call ([#235](#235)) ([1be7904](1be7904))
* fix duplicate model get help text ([#188](#188)) ([4bab07a](4bab07a))
* llama-stack-client providers list ([#134](#134)) ([930138a](930138a))
* react agent ([#200](#200)) ([b779979](b779979))
* React Agent for non-llama models ([#174](#174)) ([ee5dd2b](ee5dd2b))
* React agent should be able to work with provided config ([#146](#146)) ([08ab5df](08ab5df))
* react agent with custom tool parser n_iters ([#184](#184)) ([aaff961](aaff961))
* remove the alpha suffix in run_benchmark.py ([#179](#179)) ([638f7f2](638f7f2))
* update CONTRIBUTING.md to point to uv instead of rye ([3fbe0cd](3fbe0cd))
* update uv lock ([cc072c8](cc072c8))
* validate endpoint url ([#196](#196)) ([6fa8095](6fa8095))

### Chores

* api sync, deprecate allow_resume_turn + rename task_config->benchmark_config (Sync updates from stainless branch: yanxi0830/dev) ([#176](#176)) ([96749af](96749af))
* AsyncAgent should use ToolResponse instead of ToolResponseMessage ([#197](#197)) ([6191aa5](6191aa5))
* **copy:** Copy changes over from llamastack/ org repository ([#255](#255)) ([7ade969](7ade969))
* deprecate eval task (Sync updates from stainless branch: main) ([#150](#150)) ([39b1248](39b1248))
* remove litellm type conversion ([#193](#193)) ([ab3f844](ab3f844))
* sync repo ([099bfc6](099bfc6))
* Sync updates from stainless branch: ehhuang/dev ([#182](#182)) ([e33aa4a](e33aa4a))
* Sync updates from stainless branch: ehhuang/dev ([#199](#199)) ([fa73d7d](fa73d7d))
* Sync updates from stainless branch: main ([#201](#201)) ([f063f2d](f063f2d))
* use rich to format logs ([#177](#177)) ([303054b](303054b))

### Refactors

* update react_agent to use tool_config ([#139](#139)) ([b5dce10](b5dce10))

### Build System

* Bump version to 0.1.19 ([ccd52f8](ccd52f8))
* Bump version to 0.1.8 ([0144e85](0144e85))
* Bump version to 0.1.9 ([7e00b78](7e00b78))
* Bump version to 0.2.10 ([05e41a6](05e41a6))
* Bump version to 0.2.11 ([d2e7537](d2e7537))
* Bump version to 0.2.12 ([e3d812e](e3d812e))
* Bump version to 0.2.13 ([b6c6c5e](b6c6c5e))
* Bump version to 0.2.2 ([47f8fd5](47f8fd5))
* Bump version to 0.2.4 ([7e6f5fc](7e6f5fc))
* Bump version to 0.2.5 ([62bd127](62bd127))
* Bump version to 0.2.6 ([3dd707f](3dd707f))
* Bump version to 0.2.7 ([e39ba88](e39ba88))
* Bump version to 0.2.8 ([645d219](645d219))
* Bump version to 0.2.9 ([d360557](d360557))

---

This pull request is managed by Stainless's [GitHub App](https://github.com/apps/stainless-app). The [semver version number](https://semver.org/#semantic-versioning-specification-semver) is based on included [commit messages](https://www.conventionalcommits.org/en/v1.0.0/). Alternatively, you can manually set the version number in the title of this pull request.

For a better experience, it is recommended to use either rebase-merge or squash-merge when merging this pull request.

🔗 Stainless [website](https://www.stainlessapi.com)
📚 Read the [docs](https://app.stainlessapi.com/docs)
🙋 [Reach out](mailto:support@stainlessapi.com) for help or questions

---

Co-authored-by: stainless-app[bot] <142633134+stainless-app[bot]@users.noreply.github.com>
What does this PR do?
Today, to chat with a model, a user has to run one command per completion. This PR adds --session, which enables a user to have an interactive chat session with the inference model. --session can be passed with or without --message. If no --message is passed, the user is prompted to give the first message.
This is also useful because, unlike today, context is preserved across completions within the session.
Test Plan
Tested locally with and without --message.
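For reference, the two modes exercised would look like this (the message text is illustrative, and any server/model configuration is left to the CLI's defaults, as in the example above):

```
llama-stack-client inference chat-completion --session
llama-stack-client inference chat-completion --session --message "hi whats up!"
```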