
Conversation

Contributor

@tisnik tisnik commented May 13, 2025

Description

Use llama-stack to retrieve LLM output

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change

@tisnik tisnik merged commit 1f626e5 into lightspeed-core:main May 13, 2025
2 checks passed

logger.info("Model: %s", model_id)

response = client.inference.chat_completion(
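
For context, the call above is truncated at GitHub's comment anchor. A hedged sketch of a typical inference-only call with the llama-stack-client Python API (the endpoint URL, system_prompt, and user_question are placeholders, not taken from the PR; logger and model_id are assumed to come from the surrounding module, as in the diff above):

from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # placeholder endpoint

# Single inference call: no shields, no tool calling, just the model response.
response = client.inference.chat_completion(
    model_id=model_id,
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_question},
    ],
)
logger.info("LLM answer: %s", response.completion_message.content)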
Contributor
You should really be using a llama-stack Agent to handle this.

Agents in llama-stack can handle all of the steps in an LLM call:

  • Safety Shields on incoming messages ("Question validity")
  • Inference
  • MCP (etc.) tool calling
  • Safety Shields on outgoing responses ("Answer redaction")

You are currently only using llama-stack's inference provider.

Happy to help guide you more; this is why it's important we sync up.
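
For illustration, a rough sketch of what the Agent-based flow could look like, assuming the llama-stack-client Python Agent API (the constructor has changed between versions, and the shield and toolgroup names here are placeholders, not a recommendation of specific providers):

from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent

client = LlamaStackClient(base_url="http://localhost:8321")  # placeholder endpoint

# 'model_id', 'system_prompt', and 'user_question' are the same placeholders
# used in the inference sketch above; shield/toolgroup IDs must match whatever
# is registered in the deployed llama-stack distribution.
agent = Agent(
    client,
    model=model_id,
    instructions=system_prompt,
    input_shields=["llama_guard"],       # "Question validity" on incoming messages
    output_shields=["llama_guard"],      # "Answer redaction" on outgoing responses
    tools=["mcp::example-toolgroup"],    # MCP (etc.) tool calling
)

session_id = agent.create_session("example-session")
turn = agent.create_turn(
    session_id=session_id,
    messages=[{"role": "user", "content": user_question}],
    stream=False,
)
logger.info("LLM answer: %s", turn.output_message.content)

With stream=False, a single create_turn call runs the input shields, inference, any tool calls, and the output shields, which is the consolidation the comment above suggests.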
