🚀 Describe the new functionality needed
Configuration API
Adding providers outside of the current scope will likely necessitate the following:
- bespoke configuration based on hardware (GPU, CPU, etc.) that should apply to multiple providers in order for them to work properly.
- hyperparameters for specific providers that should be both auto-detected and selectable via a CLI.
- a way to check available configurations, currently assigned configurations, etc.
I imagine this functionality working similarly to Models or Inspect, in that it would be a high-level API. Additionally, these objects should be something other providers can "register". Configurations, similarly to models, should operate as an "overarching" API through which one can register, list, get, and unregister a configuration.
Usage pattern: the administrator starts the stack with `llama stack build && llama stack run`. A user could then run:
```
llama-stack-client configurations inspect
```

```yaml
providers:
  agents:
  - config: {}
    provider_id: meta-reference
    provider_type: inline::meta-reference
  datasetio: []
  eval: []
  inference:
  - config:
      url: http://localhost:12345
    provider_id: ollama
    provider_type: remote::ollama
  safety: []
  scoring:
  - config: {}
    provider_id: braintrust
    provider_type: inline::braintrust
  telemetry:
  - config: {}
    provider_id: meta-reference
    provider_type: inline::meta-reference
  tool_runtime:
  - config: {}
    provider_id: brave-search
    provider_type: remote::brave-search
  - config: {}
    provider_id: tavily-search
    provider_type: remote::tavily-search
  vector_io:
  - config: {}
    provider_id: faiss
    provider_type: inline::faiss
  - config: {}
    provider_id: sqlite_vec
    provider_type: inline::sqlite_vec
```
```
llama-stack-client configurations register --config <file_path>
```
or using the SDK:

```python
import json

# `client` is an already-constructed llama-stack-client instance

# inspect the currently applied, user-visible configuration
current_config = client.configurations.inspect()
print(current_config)

# register an updated configuration for the inference provider
config = {
    "inference": [
        {
            "provider_id": "ollama",
            "provider_type": "remote::ollama",
            "config": {"url": "http://localhost:12345"},
        }
    ]
}
config = json.dumps(config)
config = client.configurations.register(config=config)
print(config)
```
The configuration API would look something like:

```python
@json_schema_type
class Configuration(BaseModel):
    type: Literal[ResourceType.configuration.value] = ResourceType.configuration.value

    config: StackRunConfig


class ConfigListResponse(BaseModel):
    data: List[dict[str, Any]]


@runtime_checkable
@trace_protocol
class Configurations(Protocol):
    """Llama Stack Configuration API for storing and applying hyperparameters for given tasks."""

    @webmethod(route="/configurations/register", method="POST")
    async def register_config(
        self,
        config,
    ) -> dict[str, Any]: ...
```
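Only the register endpoint is sketched above; following the register/list/get/unregister pattern described earlier, the protocol would presumably also grow the remaining methods. A rough sketch, where the routes, method names, and the config_id identifier are my own assumptions (mirroring how the Models API is laid out):

```python
    # Additional methods on the Configurations protocol (sketch only; routes,
    # names, and the config_id identifier are assumptions).
    @webmethod(route="/configurations", method="GET")
    async def list_configs(self) -> ConfigListResponse: ...

    @webmethod(route="/configurations/{config_id}", method="GET")
    async def get_config(self, config_id: str) -> Configuration: ...

    @webmethod(route="/configurations/{config_id}", method="DELETE")
    async def unregister_config(self, config_id: str) -> None: ...
```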
With the inspect API expanded to have a /configurations endpoint:

```python
@runtime_checkable
class Inspect(Protocol):
    @webmethod(route="/inspect/configurations", method="GET")
    async def inspect_config(
        self,
    ) -> InspectConfigResponse: ...
```
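InspectConfigResponse is not defined above; a minimal sketch, assuming it simply wraps the user-visible view of the running stack's provider configuration:

```python
# Minimal sketch: the field name and shape are assumptions.
@json_schema_type
class InspectConfigResponse(BaseModel):
    data: dict[str, Any]
```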
UserConfig vs StackRunConfig
A key part of this API is the set of fields exposed in both inspection and registration. A Configuration object contains a StackRunConfig within it. However, the data within this config is a UserConfig. A UserConfig is a StackRunConfig, but with only specific fields displayed to the user. Since each provider has its own config class that feeds into the StackRunConfig, the following can be used to label certain fields as "user configurable":

```python
url: str = Field(DEFAULT_OLLAMA_URL, json_schema_extra={"user_field": True})
```

The pydantic json_schema_extra field can then be used when creating a Configuration object to derive an intermediary UserConfig. The UserConfig will only contain fields labeled as user_field, meaning that if a user tries to register a configuration with non-user fields specified, those fields will be dropped, and an inspected configuration will likewise only expose user fields for viewing. In the example above, url is the only field given the user_field schema extra, which is why it is one of the few things showing up in the inspect output.
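To make the mechanism concrete, here is a minimal sketch of deriving a user-visible view from a provider config class via json_schema_extra; the config class shape and the user_fields helper are illustrative, not existing llama-stack code:

```python
from typing import Any

from pydantic import BaseModel, Field

DEFAULT_OLLAMA_URL = "http://localhost:11434"


class OllamaImplConfig(BaseModel):
    # only `url` is marked as user-configurable
    url: str = Field(DEFAULT_OLLAMA_URL, json_schema_extra={"user_field": True})
    # hypothetical non-user field for illustration
    request_timeout: int = 60


def user_fields(config: BaseModel) -> dict[str, Any]:
    """Keep only the fields whose json_schema_extra marks them as user_field."""
    visible: dict[str, Any] = {}
    for name, field in type(config).model_fields.items():
        extra = field.json_schema_extra or {}
        if isinstance(extra, dict) and extra.get("user_field"):
            visible[name] = getattr(config, name)
    return visible


print(user_fields(OllamaImplConfig(url="http://localhost:12345")))
# -> {'url': 'http://localhost:12345'}
```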
Server Side Device Discovery for Initial Configuration
Before a user can inspect or register a config of their own, it would make sense to allow providers to utilize a centralized hardware discovery service built into llama-stack. Providers could then act on this information inside their configuration initialization methods to apply defaults suited to the discovered hardware, rather than a blanket set of defaults.
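Purely to illustrate the idea (no such service exists in llama-stack today; the helper names and defaults below are invented), a discovery hook that a provider's config initialization could consult might look like:

```python
import shutil
import subprocess


def discover_gpus() -> list[str]:
    """Return GPU model names via nvidia-smi, or an empty list if none are visible."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
        capture_output=True, text=True, check=False,
    )
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


def default_batch_size() -> int:
    """Example provider default keyed off discovered hardware (values are illustrative)."""
    gpus = discover_gpus()
    if any("H100" in gpu for gpu in gpus):
        return 32
    if any("A100" in gpu for gpu in gpus):
        return 16
    return 4
```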
💡 Why is this needed? What if we don't build it?
Without a system like the above, it will be difficult to orchestrate a sequence of providers intended to "work together", or even to make a single complex provider easily accessible to users. Additionally, the more complex the APIs and providers that are introduced, the greater the odds that runtime manipulation of key configuration fields will be necessary.
Say someone provides a data generation, training, and evaluation methodology as separate providers, and each of these depends on specific hardware requirements, hyperparameters, etc. to interact with the others, with those parameters changing per hardware type (H100 vs A100 vs L40).
Exposing the current provider configuration to a user will help them understand what they will be running for various providers as functionality gets more complex (SDG, evals, training, etc.). Additionally, allowing a user to apply parts of a config on top of a running stack, as opposed to taking the stack down and having the admin apply a full run config again, seems like a more sustainable workflow.
Other thoughts
I would like to work on this in collaboration with anyone if possible!