The LeaderWorkerSet API (LWS)

LeaderWorkerSet is an API for deploying a group of pods as a unit of replication. It addresses common deployment patterns of AI/ML inference workloads, especially multi-host inference workloads where the LLM is sharded and run across multiple devices on multiple nodes. The initial design and proposal can be found at: http://bit.ly/k8s-LWS.

Conceptual view

Feature overview

Group of Pods as a unit: Supports a tightly managed group of pods that represents a “super pod”.
- Unique pod identity: Each pod in the group has a unique index from 0 to n-1.
- Parallel creation: Pods in the group will have the same lifecycle and be created in parallel.
Dual-template, one for the leader and one for the workers: A replica is a group of a single leader and a set of workers. You specify a template for the workers and can optionally specify a second one for the leader pod.
Multiple groups with identical specifications: Supports creating multiple “replicas” of the above-mentioned group. Each group is a single unit for rolling update and scaling, and maps to a single exclusive topology for placement.
A scale subresource: A scale endpoint is exposed for HPA to dynamically scale the number of replicas (that is, the number of groups).
Rollout and Rolling update: Supports performing rollout and rolling update at the group level, which means the groups are upgraded one by one as a unit (i.e. the pods within a group are updated together).
Topology-aware placement: Opt-in support for pods in the same group to be co-located in the same topology.
All-or-nothing restart for failure handling: Opt-in support for all pods in the group to be recreated if one pod in the group fails or one container in its pods is restarted.
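
The group semantics described above can be illustrated with a small Go sketch. This is a toy model, not the controller's implementation: the `Pod` type and the `Layout` and `PodsToRecreate` functions are hypothetical names introduced here, and treating index 0 as the leader is an assumption made for the example. It shows two of the listed behaviors: every pod gets a unique index from 0 to n-1 within its group, and an all-or-nothing restart recreates the whole group of the failed pod while leaving other groups alone.

```go
package main

import "fmt"

// Pod identifies one pod by its group (replica) and its index inside the group.
// Hypothetical type for illustration; index 0 is assumed to be the leader.
type Pod struct {
	Group int
	Index int
}

// Layout enumerates the pods for `replicas` groups of `size` pods each.
// Each pod gets a unique index from 0 to size-1 within its group.
func Layout(replicas, size int) [][]Pod {
	groups := make([][]Pod, replicas)
	for g := range groups {
		groups[g] = make([]Pod, size)
		for i := range groups[g] {
			groups[g][i] = Pod{Group: g, Index: i}
		}
	}
	return groups
}

// PodsToRecreate models the opt-in all-or-nothing restart: when one pod
// fails, every pod in the same group is recreated. Pods in other groups
// are not returned. An unknown group yields nil.
func PodsToRecreate(groups [][]Pod, failed Pod) []Pod {
	if failed.Group < 0 || failed.Group >= len(groups) {
		return nil
	}
	out := make([]Pod, len(groups[failed.Group]))
	copy(out, groups[failed.Group])
	return out
}

func main() {
	// Two groups ("replicas") of three pods each: one leader and two workers.
	groups := Layout(2, 3)

	// A failure of the pod with index 2 in group 1 affects all of group 1 only.
	for _, p := range PodsToRecreate(groups, Pod{Group: 1, Index: 2}) {
		fmt.Printf("recreate group=%d index=%d\n", p.Group, p.Index)
	}
}
```

In this model, scaling corresponds to changing `replicas` (the number of groups), while `size` stays fixed for every group, which mirrors the "multiple groups with identical specifications" feature.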

Installation

Read the installation guide to learn more.

Examples

Read the examples to learn more.

Community, discussion, contribution, and support

Learn how to engage with the Kubernetes community on the community page.

You can reach the maintainers of this project at:

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.
