2 changes: 1 addition & 1 deletion .release-please-manifest.json
@@ -1,3 +1,3 @@
{
".": "0.2.23-alpha.1"
".": "0.3.0-alpha.1"
}
8 changes: 4 additions & 4 deletions .stats.yml
@@ -1,4 +1,4 @@
configured_endpoints: 111
openapi_spec_url: https://storage.googleapis.com/stainless-sdk-openapi-specs/llamastack%2Fllama-stack-client-f252873ea1e1f38fd207331ef2621c511154d5be3f4076e59cc15754fc58eee4.yml
openapi_spec_hash: 10cbb4337a06a9fdd7d08612dd6044c3
config_hash: 0358112cc0f3d880b4d55debdbe1cfa3
configured_endpoints: 105
openapi_spec_url: https://storage.googleapis.com/stainless-sdk-openapi-specs/llamastack%2Fllama-stack-client-d7bea816190382a93511491e33d1f37f707620926ab133ae8ce0883d763df741.yml
openapi_spec_hash: f73b3af77108625edae3f25972b9e665
config_hash: 548f336ac1b68ab1dfe385b79df764dd
31 changes: 31 additions & 0 deletions CHANGELOG.md
@@ -1,5 +1,36 @@
# Changelog

## 0.3.0-alpha.1 (2025-09-30)

Full Changelog: [v0.2.23-alpha.1...v0.3.0-alpha.1](https://github.com/llamastack/llama-stack-client-python/compare/v0.2.23-alpha.1...v0.3.0-alpha.1)

### ⚠ BREAKING CHANGES

* **api:** fixes to remove deprecated inference resources

### Features

* **api:** expires_after changes for /files ([7f24c43](https://github.com/llamastack/llama-stack-client-python/commit/7f24c432dc1859312710a4a1ff4a80f6f861bee8))
* **api:** fixes to remove deprecated inference resources ([04834d2](https://github.com/llamastack/llama-stack-client-python/commit/04834d2189ae4e4b8cd2c9370d1d39857bc6e9ec))
* **api:** removing openai/v1 ([a918b43](https://github.com/llamastack/llama-stack-client-python/commit/a918b4323118c18f77c2abe7e1a3054c1eebeaac))
* **api:** updating post /v1/files to have correct multipart/form-data ([433a996](https://github.com/llamastack/llama-stack-client-python/commit/433a996527bcca131ada4730376d8993f34ad6f5))


### Bug Fixes

* clean up deprecated code ([f10ead0](https://github.com/llamastack/llama-stack-client-python/commit/f10ead00522b7ca803cd7dc3617da0d451efa7da))
* Don't retry for non-recoverable server http errors ([#212](https://github.com/llamastack/llama-stack-client-python/issues/212)) ([6782e8f](https://github.com/llamastack/llama-stack-client-python/commit/6782e8fc5931369223ed4446f8e7732f62712eff))


### Documentation

* update examples ([f896747](https://github.com/llamastack/llama-stack-client-python/commit/f89674726f55915a8cda0e2b4284be3c92978121))


### Build System

* Bump version to 0.2.23 ([0d4dc64](https://github.com/llamastack/llama-stack-client-python/commit/0d4dc6449224fa2a0f6d20f6229dd9d1a5427861))

## 0.2.23-alpha.1 (2025-09-26)

Full Changelog: [v0.2.19-alpha.1...v0.2.23-alpha.1](https://github.com/llamastack/llama-stack-client-python/compare/v0.2.19-alpha.1...v0.2.23-alpha.1)
135 changes: 118 additions & 17 deletions README.md
@@ -109,6 +109,50 @@ asyncio.run(main())

Functionality between the synchronous and asynchronous clients is otherwise identical.

## Streaming responses

We provide support for streaming responses using Server-Sent Events (SSE).

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient()

stream = client.chat.completions.create(
    messages=[
        {
            "content": "string",
            "role": "user",
        }
    ],
    model="model",
    stream=True,
)
for completion in stream:
    print(completion)
```

The async client uses the exact same interface.

```python
from llama_stack_client import AsyncLlamaStackClient

client = AsyncLlamaStackClient()

stream = await client.chat.completions.create(
    messages=[
        {
            "content": "string",
            "role": "user",
        }
    ],
    model="model",
    stream=True,
)
async for completion in stream:
    print(completion)
```

## Using types

Nested request parameters are [TypedDicts](https://docs.python.org/3/library/typing.html#typing.TypedDict). Responses are [Pydantic models](https://docs.pydantic.dev) which also provide helper methods for things like:
@@ -118,6 +162,40 @@ Nested request parameters are [TypedDicts](https://docs.python.org/3/library/typ

Typed requests and responses provide autocomplete and documentation within your editor. If you would like to see type errors in VS Code to help catch bugs earlier, set `python.analysis.typeCheckingMode` to `basic`.

## Nested params

Nested parameters are dictionaries, typed using `TypedDict`. For example:

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient()

client.toolgroups.register(
    provider_id="provider_id",
    toolgroup_id="toolgroup_id",
    mcp_endpoint={"uri": "uri"},
)
```

## File uploads

Request parameters that correspond to file uploads can be passed as `bytes`, a [`PathLike`](https://docs.python.org/3/library/os.html#os.PathLike) instance, or a tuple of `(filename, contents, media type)`.

```python
from pathlib import Path
from llama_stack_client import LlamaStackClient

client = LlamaStackClient()

client.files.create(
    file=Path("/path/to/file"),
    purpose="assistants",
)
```
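
For instance, the tuple form can be used when the contents are already in memory. A minimal sketch, reusing the same `purpose` value as above; the filename, payload, and media type are illustrative placeholders:

```python
from llama_stack_client import LlamaStackClient

client = LlamaStackClient()

# Hypothetical in-memory payload; the filename and media type are placeholders
client.files.create(
    file=("example.jsonl", b'{"prompt": "hello"}\n', "application/jsonl"),
    purpose="assistants",
)
```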

The async client uses the exact same interface. If you pass a [`PathLike`](https://docs.python.org/3/library/os.html#os.PathLike) instance, the file contents will automatically be read asynchronously.
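
A minimal sketch of the async variant, assuming the same `files.create` parameters as the synchronous example above:

```python
import asyncio
from pathlib import Path

from llama_stack_client import AsyncLlamaStackClient

client = AsyncLlamaStackClient()


async def main() -> None:
    # With the async client, the PathLike contents are read asynchronously before upload
    uploaded_file = await client.files.create(
        file=Path("/path/to/file"),
        purpose="assistants",
    )
    print(uploaded_file)


asyncio.run(main())
```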

## Handling errors

When the library is unable to connect to the API (for example, due to network connection problems or a timeout), a subclass of `llama_stack_client.APIConnectionError` is raised.
@@ -134,9 +212,14 @@ from llama_stack_client import LlamaStackClient
client = LlamaStackClient()

try:
    client.agents.sessions.create(
        agent_id="agent_id",
        session_name="session_name",
    client.chat.completions.create(
        messages=[
            {
                "content": "string",
                "role": "user",
            }
        ],
        model="model",
    )
except llama_stack_client.APIConnectionError as e:
    print("The server could not be reached")
@@ -180,9 +263,14 @@ client = LlamaStackClient(
)

# Or, configure per-request:
client.with_options(max_retries=5).agents.sessions.create(
    agent_id="agent_id",
    session_name="session_name",
client.with_options(max_retries=5).chat.completions.create(
    messages=[
        {
            "content": "string",
            "role": "user",
        }
    ],
    model="model",
)
```

Expand All @@ -206,9 +294,14 @@ client = LlamaStackClient(
)

# Override per-request:
client.with_options(timeout=5.0).agents.sessions.create(
    agent_id="agent_id",
    session_name="session_name",
client.with_options(timeout=5.0).chat.completions.create(
    messages=[
        {
            "content": "string",
            "role": "user",
        }
    ],
    model="model",
)
```

@@ -248,14 +341,17 @@ The "raw" Response object can be accessed by prefixing `.with_raw_response.` to
from llama_stack_client import LlamaStackClient

client = LlamaStackClient()
response = client.agents.sessions.with_raw_response.create(
    agent_id="agent_id",
    session_name="session_name",
response = client.chat.completions.with_raw_response.create(
    messages=[{
        "content": "string",
        "role": "user",
    }],
    model="model",
)
print(response.headers.get('X-My-Header'))

session = response.parse() # get the object that `agents.sessions.create()` would have returned
print(session.session_id)
completion = response.parse() # get the object that `chat.completions.create()` would have returned
print(completion)
```

These methods return an [`APIResponse`](https://github.com/meta-llama/llama-stack-python/tree/main/src/llama_stack_client/_response.py) object.
@@ -269,9 +365,14 @@ The above interface eagerly reads the full response body when you make the reque
To stream the response body, use `.with_streaming_response` instead, which requires a context manager and only reads the response body once you call `.read()`, `.text()`, `.json()`, `.iter_bytes()`, `.iter_text()`, `.iter_lines()` or `.parse()`. In the async client, these are async methods.

```python
with client.agents.sessions.with_streaming_response.create(
    agent_id="agent_id",
    session_name="session_name",
with client.chat.completions.with_streaming_response.create(
    messages=[
        {
            "content": "string",
            "role": "user",
        }
    ],
    model="model",
) as response:
    print(response.headers.get("X-My-Header"))
