vLLM backend for Cortex #1890

Open
@ramonpzg

Description

At the moment, the process by which a user builds a custom Python engine to deploy a model via Cortex is neither straightforward in code nor clearly documented. The plan is:

  • improve the process for building a custom python engine
  • remove unnecessary parameters in the model.yml config
  • improve the documentation
  • add examples of different custom engines to the docs
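As a sketch of what a simplified custom-engine contract could look like after the cleanup above (the class and method names here are illustrative assumptions, not Cortex's actual API):

```python
from abc import ABC, abstractmethod

# Hypothetical sketch of a minimal custom-engine contract; the names
# (Engine, load, generate) are illustrative, not Cortex's real interface.
class Engine(ABC):
    @abstractmethod
    def load(self, model_id: str) -> None:
        """Load model weights and prepare the backend."""

    @abstractmethod
    def generate(self, prompt: str, max_tokens: int = 64) -> str:
        """Return a completion for the prompt."""

# Dummy backend showing how little an engine author would need to implement.
class EchoEngine(Engine):
    def load(self, model_id: str) -> None:
        self.model_id = model_id

    def generate(self, prompt: str, max_tokens: int = 64) -> str:
        return f"[{self.model_id}] {prompt}"

engine = EchoEngine()
engine.load("demo-model")
print(engine.generate("hello"))  # [demo-model] hello
```

Keeping the required surface this small is what would make the model.yml config trimmable: anything the engine does not need disappears from the schema.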

Goals

  • vLLM is a Python library that runs on Linux machines
  • vLLM can run LLMs
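The two goals above amount to wrapping vLLM's offline inference API. A minimal sketch (assumes vLLM is installed on a Linux machine with a supported GPU; the model id and parameter values are only illustrative):

```python
def run_vllm(prompts, model_id="facebook/opt-125m", max_tokens=64):
    # Deferred import so the sketch parses even where vLLM is not installed.
    from vllm import LLM, SamplingParams

    llm = LLM(model=model_id)  # loads the model onto the accelerator
    params = SamplingParams(temperature=0.8, max_tokens=max_tokens)
    outputs = llm.generate(prompts, params)  # batched generation
    return [out.outputs[0].text for out in outputs]
```

A vLLM-backed engine for Cortex would essentially hold the `LLM` instance across requests instead of rebuilding it per call.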

Tasks

Obstacle

  • maintaining this backend inside Cortex would be a huge ongoing effort

Out of scope

  • Audio

Metadata

Status

In Progress
