docs: add initial egress gateway proposal #11
# Relevant Links

* [Istio's implementation of Egress Gateways](https://istio.io/latest/docs/tasks/traffic-management/egress/egress-gateway/)
We're working on getting this into Istio's docs (ETA: Istio 1.28, this month), but the ambient egress gateway approach described here already works with open source: https://www.solo.io/blog/egress-gateways-made-easy
Just approving for now so others have time to review. Big +1 from me; I think this is the foundational use case for this WG, and this doc covers the AI-oriented aspects of the "What" and "Why" very well.
/approve
services. All of this points to a need to provide standards for how Kubernetes
workloads reach these external inference sources, and provide the same AI
Gateway security, control and management capabilities that are required for the
ingress use case.
These lines read as 2 related things, with a dependency between them.
- standards for how Kubernetes workloads reach external inference services
- using the same AI security capabilities with egress as you would with ingress (which would build on those standards)
Is there a more general standard in Kubernetes for either of these things, i.e. for egress and egress security that is not AI related?
There are a few things scattered across the ecosystem: tricks with ExternalName Services, ServiceImport in MCS (which technically has some play here), and several implementations of egress gateways in CNIs and service meshes.
Agree that it could be a reasonable approach to use existing solutions as a sort of de-facto standard to build on.
Taking a look at gateways from:
Common features => static egress IP, routing policies, identity aware auth and mTLS for workload level security, rate limiting, application layer telemetry and workload attribution (super important for agents).
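
Two of the common features listed above, rate limiting and workload attribution, can be illustrated with a minimal sketch. Everything here (`EgressLimiter`, the fixed per-workload budget, the namespace and workload names) is hypothetical and not taken from any of the gateways discussed; a real gateway would derive the workload identity from mTLS or similar and enforce limits in the data plane.

```python
from collections import Counter


class EgressLimiter:
    """Hypothetical per-workload request budget for one accounting window."""

    def __init__(self, limit_per_workload: int) -> None:
        self.limit = limit_per_workload
        # (namespace, workload) -> requests admitted in the current window
        self.admitted: Counter[tuple[str, str]] = Counter()

    def allow(self, namespace: str, workload: str) -> bool:
        """Admit the request only if the workload is still under its budget."""
        key = (namespace, workload)
        if self.admitted[key] >= self.limit:
            return False
        self.admitted[key] += 1
        return True

    def usage_by_namespace(self) -> dict[str, int]:
        """Attribute admitted outbound requests to the originating namespace."""
        totals: Counter[str] = Counter()
        for (namespace, _), count in self.admitted.items():
            totals[namespace] += count
        return dict(totals)
```

Keying the counter on the (namespace, workload) pair is what makes attribution possible: the same data that enforces the limit can be rolled up per namespace for reporting.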
There are some NetworkPolicy egress capabilities too, but egress functionality in meshes is definitely more mature and comprehensive than anything that has made its way into core Kubernetes or spec CRDs thus far.
Signed-off-by: Shane Utt <shaneutt@linux.com>
* As a developer of an application that requires inference as part of its
  function, I need fail-over to 3rd party providers if local AI workloads are
  overwhelmed or in a failure state.
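
As a rough illustration of the fail-over story above, the following sketch prefers a local inference backend and falls back to a third-party provider when the local one is unhealthy or saturated. All names (`Backend`, `pick_backend`, the `max_in_flight` threshold) are hypothetical and not part of the proposal; an egress gateway would more likely express this as routing policy than as application code.

```python
from dataclasses import dataclass


@dataclass
class Backend:
    """Hypothetical description of an inference backend."""
    name: str
    healthy: bool
    in_flight: int      # requests currently being served
    max_in_flight: int  # assumed saturation threshold

    def available(self) -> bool:
        # A backend is usable only if it is healthy and not saturated.
        return self.healthy and self.in_flight < self.max_in_flight


def pick_backend(local: Backend, external: Backend) -> Backend:
    """Prefer the local backend; fail over to the external provider."""
    if local.available():
        return local
    if external.available():
        return external
    raise RuntimeError("no inference backend available")
```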
Is the intent to scope the egress gateway to inference only, or should egress also consider user stories for agents invoking external tools or other agents, as described in Agentic Networking for Kubernetes?
This will need to provide:
- General L7 support
- AI Inference Support
- AI Agentic Support
So yes, it's a combination of many things that need this, and you'll see in the comment history of that doc that we agreed this is something where the sub-project and the WG will need to work together.
Gateway security, control and management capabilities that are required for the
ingress use case.

## User Stories
I realize this doesn't need to be comprehensive, but calling back to @david-martin's question about typical egress stories, I can think of some egress use cases which might also apply to AI workloads (but in a way that would benefit from protocol-level understanding):
"As a platform operator I need to attribute outbound traffic per namespace or workload to enforce rate or API utilization limits."
"As a compliance engineer I need to guarantee that outbound traffic to third-party AI resources obeys regulatory restrictions such as region locks."
* As a cluster admin I need to provide inference to workloads on my cluster,
  but I provide a dedicated cluster for this so that I can manage it
  separately.
This definitely overlaps the most with existing Gateway API Inference Extension scope, specifically kubernetes-sigs/gateway-api-inference-extension#1374 (worth reading linked design doc too for alternatives considered and potential future alternative approaches)
/approve the main aspects of the egress gateway are well covered and this looks good to me as a first iteration.
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: keithmattix, nirrozenbaum, shaneutt
First pass at #10, focusing on building consensus on the "What?" and "Why?" before worrying about how we're going to implement it, and which SIGs and sub-projects we will propose to (those will come in follow-up PRs).