planning: API for Active models #1688

Open
2 tasks
gabrielle-ong opened this issue Nov 15, 2024 · 0 comments
Labels
category: model management Model pull, yaml, model state type: planning Opening up a discussion

Goal

  • We currently have the CLI command cortex ps, which shows active models and the resources they consume (RAM/VRAM usage and how long each model has been running).
  • We have an unofficial API: GET http://localhost:39281/inferences/server/models
  • We should make this an official /v1 API.

Discussion / Success Criteria: @louis-jan

  1. Should there be a single API that shows all running models and their consumed resources, i.e. the equivalent of the CLI cortex ps?
  2. Or should there be separate APIs: one under /models to show active status (per model, or for all models?)
  3. And another under /system to show resources consumed?
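To make the trade-off between the options more concrete, here is a minimal sketch of how the combined payload (the shape shown under Current) could be split into a per-model status view and an aggregate resource view. The function name split_views and the output shapes are hypothetical, and summing per-model ram/vram into a total is an assumption; a real /system endpoint might report host-wide usage instead.

```python
def split_views(payload):
    """Split a combined active-models payload into two hypothetical views.

    Returns (models_view, system_view). Both shapes are illustrative only
    and are not part of any existing Cortex API.
    """
    data = payload.get("data", [])

    # Option 2: per-model active status, without resource numbers.
    models_view = [
        {"id": m["id"], "engine": m["engine"], "start_time": m["start_time"]}
        for m in data
    ]

    # Option 3: resources consumed, aggregated over active models (assumption).
    system_view = {
        "active_models": len(data),
        "ram": sum(m["ram"] for m in data),
        "vram": sum(m["vram"] for m in data),
    }
    return models_view, system_view
```

With a single endpoint (option 1), clients would receive both pieces of information in one call, as the current unofficial API already does.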

Tasklist

  • add unofficial API to docs (interim) - @gabrielle-ong
  • implement official API endpoint(s)

Current

GET http://localhost:39281/inferences/server/models

Response:

{
    "data": [
        {
            "engine": "cortex.llamacpp",
            "id": "llama3.2-1b-instruct",
            "model_size": 123,
            "object": "model",
            "ram": 123,
            "start_time": 123,
            "vram": 123
        }
    ],
    "object": "list"
}
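For the interim docs, a client could consume the unofficial endpoint roughly as follows. This is a sketch using only the Python standard library: the URL and field names come from the sample response above, while the helper names fetch_active_models and summarize_active_models are hypothetical. The units of model_size, ram, vram and start_time are not stated in this issue, so the values are passed through unchanged.

```python
import json
from urllib.request import urlopen

# URL as given in this issue; the endpoint is unofficial and may change.
ACTIVE_MODELS_URL = "http://localhost:39281/inferences/server/models"

# Fields present in the sample response (units are not specified here).
FIELDS = ("id", "engine", "model_size", "ram", "vram", "start_time")


def fetch_active_models(url=ACTIVE_MODELS_URL, timeout=5.0):
    """Fetch the raw JSON payload from a locally running server."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def summarize_active_models(payload):
    """Return one dict per active model, keeping only the known fields."""
    return [
        {field: model.get(field) for field in FIELDS}
        for model in payload.get("data", [])
    ]
```

Calling summarize_active_models(fetch_active_models()) requires a running server; the parsing step can be exercised on its own with a payload like the sample above.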

Future: API / CLI

API

1. Feature

GET /v1/endpoint

Body:

{
    "key": "value"
}

Response

200
{
}
Error
{
}

User request

image
@gabrielle-ong gabrielle-ong added the type: epic A major feature or initiative label Nov 15, 2024
@github-project-automation github-project-automation bot moved this to Investigating in Menlo Nov 15, 2024
@gabrielle-ong gabrielle-ong added type: planning Opening up a discussion category: model management Model pull, yaml, model state and removed type: epic A major feature or initiative labels Nov 15, 2024
@gabrielle-ong gabrielle-ong modified the milestones: v1.0.3, v1.0.4 Nov 15, 2024
@gabrielle-ong gabrielle-ong moved this from Investigating to Icebox in Menlo Nov 27, 2024
Projects
Status: Icebox
Development

No branches or pull requests

2 participants