Conversation

@dltn (Contributor) commented Nov 7, 2024:

Llama Stack primarily operates on a client/server model. However, there are scenarios where hosting a distribution is cumbersome (e.g., testing, Jupyter notebooks), making it desirable to use Llama Stack as a library instead.

This PR introduces a clever hack that extends the Stainless Python client: it intercepts GET/POST requests intended for HTTP transmission and uses reflection to deserialize them and route them directly to their in-process implementations.

Is this roundabout serialization the most efficient approach? Certainly not. But the convenience of a drop-in solution is significant, and the overhead is negligible compared to GPU latency.
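For illustration, here is a minimal sketch of the interception idea. Only the class name `LlamaStackDirectClient` comes from this PR; the `_route` helper, the path-to-method mapping, and the constructor shape are assumptions, not the PR's actual code. The real version subclasses the Stainless `LlamaStackClient` and overrides its transport methods; this standalone sketch just shows the routing trick.

```python
import asyncio
import inspect


class LlamaStackDirectClient:
    """Sketch of a client that serves requests in-process instead of over HTTP.

    The real PR extends the Stainless LlamaStackClient and overrides its
    GET/POST plumbing; this standalone version only shows the routing idea.
    """

    def __init__(self, impls):
        # impls: mapping of API name -> resolved implementation object,
        # e.g. {"inference": <InferenceImpl>, "safety": <SafetyImpl>, ...}
        self.impls = impls

    def post(self, path: str, body: dict | None = None):
        # Instead of serializing `body` onto the wire, find the in-process
        # implementation that would have served this route.
        api, method_name = self._route(path)
        func = getattr(self.impls[api], method_name)

        # Use reflection on the implementation's signature to "deserialize"
        # the request body into the keyword arguments it expects.
        params = inspect.signature(func).parameters
        kwargs = {k: v for k, v in (body or {}).items() if k in params}

        result = func(**kwargs)
        # Implementations are typically async; run them to completion here.
        return asyncio.run(result) if inspect.iscoroutine(result) else result

    @staticmethod
    def _route(path: str) -> tuple[str, str]:
        # Hypothetical mapping: "/inference/chat_completion" -> ("inference", "chat_completion").
        api, _, method_name = path.strip("/").partition("/")
        return api, method_name
```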

@ashwinb (Contributor) left a comment:

oh man this is beautiful <3

@dltn merged commit 0901251 into main on Nov 7, 2024 (3 checks passed).
@dltn deleted the add-direct-client branch on Nov 7, 2024 at 21:27.

from llama_stack.distribution.datatypes import StackRunConfig  # schema for a distribution's run configuration
from llama_stack.distribution.distribution import get_provider_registry  # registry of available providers
from llama_stack.distribution.resolver import resolve_impls  # instantiates API implementations from the config
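Roughly, these imports would be wired together like this to stand up the in-process implementations. This is a sketch under assumptions: the exact signatures of `get_provider_registry` and `resolve_impls` have changed across llama-stack versions, so the argument lists below are illustrative only.

```python
import asyncio

import yaml

from llama_stack.distribution.datatypes import StackRunConfig
from llama_stack.distribution.distribution import get_provider_registry
from llama_stack.distribution.resolver import resolve_impls


async def build_impls(config_path: str):
    # Parse the same run.yaml a server deployment would use.
    with open(config_path) as f:
        run_config = StackRunConfig(**yaml.safe_load(f))

    # Resolve each configured API to a concrete provider implementation.
    # NOTE: argument lists are assumptions; check the installed version.
    registry = get_provider_registry()
    return await resolve_impls(run_config, registry)


# impls = asyncio.run(build_impls("run.yaml"))  # api name -> implementation
```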
A Contributor commented:
Should we add llama-stack as a dependency for the llama-stack-client package?

A Contributor replied:
Nope, it should be the reverse, as we talked about: this code should always be exercised when the person already has llama-stack in their environment (as a library or via pip).

@yanxi0830 (Contributor) commented Nov 8, 2024:
Hmm, should this class LlamaStackDirectClient be inside the llama-stack repo instead of the llama-stack-client-python repo?

1. Users who want to use llama-stack as a library install the llama-stack package (which depends on llama-stack-client) and can use LlamaStackDirectClient.

2. Users who install only the llama-stack-client package cannot use LlamaStackDirectClient without also installing llama-stack (one way to surface this cleanly is sketched below).
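For scenario 2, a guarded import would make the failure mode explicit regardless of which repo the class lands in. This is purely a sketch of the pattern, not code from either package; the module path is hypothetical.

```python
# Hypothetical guard: give a clear error when the direct client is requested
# but the llama-stack package (which provides the implementations) is absent.
try:
    import llama_stack  # noqa: F401  server-side package, optional at install time
    _HAS_LLAMA_STACK = True
except ImportError:
    _HAS_LLAMA_STACK = False


def get_direct_client(*args, **kwargs):
    if not _HAS_LLAMA_STACK:
        raise ImportError(
            "LlamaStackDirectClient requires the llama-stack package; "
            "install it with `pip install llama-stack`."
        )
    # Hypothetical module path for the direct client.
    from llama_stack.distribution.direct_client import LlamaStackDirectClient
    return LlamaStackDirectClient(*args, **kwargs)
```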

A Contributor replied:
@yanxi0830 yeah I think that makes sense to me actually.

skamenan7 pushed a commit to skamenan7/llama-stack that referenced this pull request on Aug 13, 2025 (…mastack#2930):

# What does this PR do?
`AgentEventLogger` only supports streaming responses, so I suggest adding a comment near the bottom of `demo_script.py` letting the user know this: if they change the `stream` value to `False` in the call to `create_turn`, they need to comment out the logging lines.

See llamastack/llama-stack-client-python#15
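A sketch of the suggested note in context, assuming the usual agent streaming pattern from the client examples (`agent`, `session_id`, and `prompt` are assumed to be defined earlier in the script; the exact contents of `demo_script.py` may differ):

```python
from llama_stack_client import AgentEventLogger  # import path in recent client versions

response = agent.create_turn(
    messages=[{"role": "user", "content": prompt}],
    session_id=session_id,
    stream=True,  # AgentEventLogger only supports streaming responses.
)

# NOTE: if you change stream=True to stream=False above, comment out the two
# logging lines below, since the logger cannot iterate a non-streaming turn.
for log in AgentEventLogger().log(response):
    log.print()
```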


Signed-off-by: Dean Wampler <dean.wampler@ibm.com>