
Conversation

Contributor

@tisnik tisnik commented May 13, 2025

Description

Use llama-stack to retrieve LLM output

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change

@tisnik tisnik merged commit 1f626e5 into lightspeed-core:main May 13, 2025
2 checks passed

logger.info("Model: %s", model_id)

response = client.inference.chat_completion(
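
For context, the call above is truncated at GitHub's comment anchor. A hedged sketch of a typical inference-only call with the llama-stack-client Python API (the endpoint URL, system_prompt, and user_question are placeholders, not taken from the PR; logger and model_id are assumed to come from the surrounding module, as in the diff above):

from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:8321")  # placeholder endpoint

# Single inference call: no shields, no tool calling, just the model response.
response = client.inference.chat_completion(
    model_id=model_id,
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_question},
    ],
)
logger.info("LLM answer: %s", response.completion_message.content)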
Contributor
You should really be using a llama-stack Agent to handle this.

Agents in llama-stack can handle all of the steps in an LLM call:

  • Safety Shields on incoming messages ("Question validity")
  • Inference
  • MCP (etc.) tool calling
  • Safety Shields on outgoing responses ("Answer redaction")

You are currently only using llama-stack's inference provider.

Happy to help guide you more; this is why it's important we sync up.
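
For illustration, a rough sketch of what the Agent-based flow could look like, assuming the llama-stack-client Python Agent API (the constructor has changed between versions, and the shield and toolgroup names here are placeholders, not a recommendation of specific providers):

from llama_stack_client import LlamaStackClient
from llama_stack_client.lib.agents.agent import Agent

client = LlamaStackClient(base_url="http://localhost:8321")  # placeholder endpoint

# 'model_id', 'system_prompt', and 'user_question' are the same placeholders
# used in the inference sketch above; shield/toolgroup IDs must match whatever
# is registered in the deployed llama-stack distribution.
agent = Agent(
    client,
    model=model_id,
    instructions=system_prompt,
    input_shields=["llama_guard"],       # "Question validity" on incoming messages
    output_shields=["llama_guard"],      # "Answer redaction" on outgoing responses
    tools=["mcp::example-toolgroup"],    # MCP (etc.) tool calling
)

session_id = agent.create_session("example-session")
turn = agent.create_turn(
    session_id=session_id,
    messages=[{"role": "user", "content": user_question}],
    stream=False,
)
logger.info("LLM answer: %s", turn.output_message.content)

With stream=False, a single create_turn call runs the input shields, inference, any tool calls, and the output shields, which is the consolidation the comment above suggests.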
