
kubernetes-sigs/gateway-api-inference-extension

Kubernetes LLM Instance Gateway

The LLM Instance Gateway came out of wg-serving and is sponsored by SIG Apps. This repo contains: the load balancing algorithm, ext-proc code, CRDs, and controllers to support the LLM Instance Gateway.

This Gateway is intended to provide value to multiplexed LLM services on a shared pool of compute. See the proposal for more info.

Status

This project is currently in development.

For more rapid testing, our PoC is in the ./examples/ directory.

Getting Started

Install the CRDs into the cluster:

make install

Delete the APIs(CRDs) from the cluster:

make uninstall
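After running make install, you can sanity-check that the CRDs registered with the cluster. The sketch below assumes kubectl is configured against the target cluster; the grep pattern is an assumption based on the repo name, and the actual CRD names depend on the version you installed.

```shell
# List the CRDs currently registered with the cluster and filter for
# ones likely installed by this project (pattern is an assumption):
kubectl get crds | grep -i inference

# Once you know a CRD's full name, inspect its schema (placeholder shown):
# kubectl explain <crd-name>
```

Running make uninstall and repeating the first command should show the CRDs are gone.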

Deploying the ext-proc image

Refer to this README on how to deploy the ext-proc image used to support the Instance Gateway.

Contributing

Our community meeting is held weekly on Thursdays at 10 AM PDT; the Zoom link is here.

We currently use the #wg-serving Slack channel for communications.

Contributions are readily welcomed; thanks for joining us!

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.
