@franciscojavierarceo franciscojavierarceo commented Oct 14, 2025

What does this PR do?

Enables automatic embedding model detection for vector stores by using a default_configured boolean that can be defined in run.yaml.

Test Plan

  • Unit tests
  • Integration tests
  • Simple example below:

Spin up the stack:

uv run llama stack build --distro starter --image-type venv --run

Then test with OpenAI's client:

from openai import OpenAI
client = OpenAI(base_url="http://localhost:8321/v1/", api_key="none")
vs = client.vector_stores.create()

Previously you needed:

vs = client.vector_stores.create(
    extra_body={
        "embedding_model": "sentence-transformers/all-MiniLM-L6-v2",
        "embedding_dimension": 384,
    }
)

The extra_body is now unnecessary.

@meta-cla meta-cla bot added the CLA Signed label Oct 14, 2025
@franciscojavierarceo franciscojavierarceo force-pushed the default-embedding-model branch 6 times, most recently from b8168ff to f8cb3c4 on October 14, 2025 20:20
Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Comment on lines 514 to 517
raise ValueError(
    f"Multiple embedding models marked as default_configured=True: {model_ids}. "
    "Only one embedding model can be marked as default."
)
Contributor

This should be checked when the Stack is initialized instead.

Collaborator Author

I added additional validation when the stack is initialized (as well as a test to confirm). I think we can keep this check here as well, in case we allow models to be dynamically registered and a second default were to slip in. Let me know if you'd like me to remove it though. 👍
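
The startup check described in this reply might look roughly like the sketch below. It assumes each registered model is a plain dict with `identifier`, `model_type`, and `metadata` keys; those key names and the function name `validate_single_default_embedding_model` are illustrative and not taken from the PR. Only the error message mirrors the snippet under review.

```python
from typing import Any, Dict, List, Optional


def validate_single_default_embedding_model(
    models: List[Dict[str, Any]],
) -> Optional[str]:
    """Return the id of the default embedding model, or None if none is marked.

    Raises ValueError when more than one embedding model is marked as default.
    The dict layout (identifier / model_type / metadata) is an assumption.
    """
    model_ids = [
        m["identifier"]
        for m in models
        if m.get("model_type") == "embedding"
        and m.get("metadata", {}).get("default_configured") is True
    ]
    if len(model_ids) > 1:
        raise ValueError(
            f"Multiple embedding models marked as default_configured=True: {model_ids}. "
            "Only one embedding model can be marked as default."
        )
    return model_ids[0] if model_ids else None
```

Running the same function at initialization and again whenever a model is registered would cover both cases raised in this thread.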

Comment on lines +373 to +374
# Embedding model was provided but dimension wasn't, look it up
embedding_dimension = await self._get_embedding_dimension_for_model(embedding_model)
Contributor

why not just call this in _get_default_embedding_model_and_dimension?

Collaborator Author

Updated. 👍
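
A minimal sketch of the resolution order discussed in this thread: an explicit model without a dimension triggers a lookup, and no model at all falls back to the default. The registry dict, the `example/other-embedder` entry, and the function name `resolve_embedding_config` are hypothetical. The PR's actual helpers are async methods (`_get_embedding_dimension_for_model`, `_get_default_embedding_model_and_dimension`) whose bodies are not shown in this conversation.

```python
from typing import Any, Dict, Optional, Tuple

# Hypothetical registry: model id -> metadata. Entries are for illustration only.
REGISTRY: Dict[str, Dict[str, Any]] = {
    "sentence-transformers/all-MiniLM-L6-v2": {
        "embedding_dimension": 384,
        "default_configured": True,
    },
    "example/other-embedder": {"embedding_dimension": 768},
}


def resolve_embedding_config(
    registry: Dict[str, Dict[str, Any]],
    embedding_model: Optional[str] = None,
    embedding_dimension: Optional[int] = None,
) -> Tuple[str, int]:
    """Fill in whichever of (model, dimension) the caller left out."""
    if embedding_model is None:
        # Nothing provided: fall back to the model flagged as default.
        defaults = [
            model_id
            for model_id, meta in registry.items()
            if meta.get("default_configured")
        ]
        if not defaults:
            raise ValueError("No embedding model provided and no default configured.")
        embedding_model = defaults[0]

    if embedding_dimension is None:
        # Model is known but the dimension is not: look it up.
        meta = registry.get(embedding_model)
        if meta is None or "embedding_dimension" not in meta:
            raise ValueError(f"Unknown embedding dimension for {embedding_model!r}.")
        embedding_dimension = meta["embedding_dimension"]

    return embedding_model, embedding_dimension
```

With this shape, both the "no arguments" path and the "model only" path go through the same dimension lookup, which is the consolidation the reviewer asked about.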

if param_name in body:
    value = body.get(param_name)
    if param_name in exclude_params:
        converted_body[param_name] = value
Contributor

maybe unrelated to this PR, but reading due to the change below: do we only allow one such parameter? if so, assert?

Collaborator Author

there's a break above that silently skips the others. i can add some validation to the above if we want.

Comment on lines +980 to +981
"embedding_dimension": 768,
"default_configured": True,
Contributor

non-blocking: we should formalize these parameters in an EmbeddingModel class.

Collaborator Author

yeah i can do that as a follow up
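
One possible shape for such a class, offered only as a sketch: a small dataclass that carries the two fields from the snippet above and validates the dimension. The class layout, the `identifier` field, and the `from_metadata` constructor are assumptions; the actual follow-up may look quite different.

```python
from dataclasses import dataclass
from typing import Any, Mapping


@dataclass(frozen=True)
class EmbeddingModel:
    """Hypothetical typed wrapper around embedding-model metadata."""

    identifier: str
    embedding_dimension: int
    default_configured: bool = False

    def __post_init__(self) -> None:
        # Reject obviously invalid dimensions early.
        if self.embedding_dimension <= 0:
            raise ValueError("embedding_dimension must be a positive integer")

    @classmethod
    def from_metadata(
        cls, identifier: str, metadata: Mapping[str, Any]
    ) -> "EmbeddingModel":
        """Build an instance from a loose metadata mapping."""
        return cls(
            identifier=identifier,
            embedding_dimension=int(metadata["embedding_dimension"]),
            default_configured=bool(metadata.get("default_configured", False)),
        )
```

A typed class like this would let a missing or malformed `embedding_dimension` fail at construction time instead of at vector store creation.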

Signed-off-by: Francisco Javier Arceo <farceo@redhat.com>
Contributor

@ehhuang ehhuang left a comment

LG!

@ehhuang ehhuang merged commit ef4bc70 into llamastack:main Oct 15, 2025
21 checks passed