KEP-3866: kube-proxy nftables mode #3824

Merged 1 commit on Oct 6, 2023

Conversation

@danwinship (Contributor) commented Feb 2, 2023

The default kube-proxy implementation on Linux is currently based on
iptables. IPTables was the preferred packet filtering and processing
system in the Linux kernel for many years (starting with the 2.4
kernel in 2001). However, problems with iptables led to the
development of a successor, nftables, first made available in the 3.13
kernel in 2014, and growing increasingly featureful and usable as a
replacement for iptables since then. Development on iptables has
mostly stopped, with new features and performance improvements
primarily going into nftables instead.

This KEP proposes the creation of a new official/supported nftables
backend for kube-proxy. While it is hoped that this backend will
eventually replace both the `iptables` and `ipvs` backends and become
the default kube-proxy mode on Linux, that replacement/deprecation
would be handled in a separate future KEP.

@k8s-ci-robot (Contributor)

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot added labels do-not-merge/work-in-progress, cncf-cla: yes, and kind/kep on Feb 2, 2023
@k8s-ci-robot added labels sig/network and size/XXL on Feb 2, 2023
@astoycos (Contributor) commented Feb 2, 2023

/cc

@k8s-ci-robot k8s-ci-robot requested a review from astoycos February 2, 2023 17:41
backends of kube-proxy, and does not feel like it can ditch `ipvs`
because of performance issues with `iptables`. If we create a new
backend which is as functional and non-buggy as `iptables` but as
performant as `ipvs`, then we could (eventually) deprecate both of the
Member

I still think we should validate this assumption

Contributor Author

Which assumption? That we could ditch ipvs if we had something better?

Also as mentioned elsewhere, "ditch" may mean "push out of tree a la cri-dockerd"

Member

That the ipvs proxy is more performant than the iptables one.

Member

I want to be careful what we promise - removing iptables and IPVS modes may take a VERY long time

If you set up a service with 10,000 endpoints, each SYN will have to be compared with ~5,000 probability rules. IPVS uses a hash, so for TCP connect there is a difference. Once the connection is set up, conntrack takes over.
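To make the contrast concrete: the iptables backend emits one `-m statistic --mode random --probability ...` rule per endpoint, which the kernel walks sequentially, whereas nftables can pick an endpoint with a single verdict-map lookup keyed on a random number. The sketch below shows the rough rule shape only; the table, chain, and addresses are illustrative placeholders, not what an implementation would necessarily use.

```
table ip example-proxy {
    chain prerouting {
        type nat hook prerouting priority dstnat; policy accept;
        ip daddr 10.96.0.100 tcp dport 80 jump svc-example
    }

    # One O(1) verdict-map lookup replaces a linear walk over
    # per-endpoint probability rules.
    chain svc-example {
        numgen random mod 3 vmap {
            0 : jump endpoint-1,
            1 : jump endpoint-2,
            2 : jump endpoint-3 }
    }

    chain endpoint-1 {
        dnat to 10.244.1.10:8080
    }
    chain endpoint-2 {
        dnat to 10.244.2.10:8080
    }
    chain endpoint-3 {
        dnat to 10.244.3.10:8080
    }
}
```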

@aojea (Member) Feb 15, 2023

That is undeniable; let me explain myself better. There are multiple dimensions to scalability. The weaknesses of iptables are clear and acknowledged, but all the reports and blogs comparing the iptables proxy with proxy-foo were focused on stressing those weaknesses, while kube-proxy iptables has been running, and tested daily, in very large clusters (5000 nodes) for many years.

http://perf-dash.k8s.io/#/?jobname=gce-5000Nodes&metriccategoryname=Network&metricname=Load_NetworkLatency&Metric=InClusterNetworkLatency

What I mean here is that we should do this exercise: I'd like to see tests similar to the ones we are already running so we can compare apples to apples, along with analysis of performance vs. scale.
It doesn't matter if it can send 1 Mpps on a single node if, for example, it cannot run reliably on 1000 nodes and keep the same performance ... size matters (and I've seen reports that iptables beats other implementations at scale, yes, eBPF ones too) ;)

@thockin (Member) left a comment

Reluctantly, ACK. I didn't get into all the details of nft (I need to learn some stuff), but overall the idea seems sound.

Can we fix some other long-lingering issues with iptables mode, too?

  • when do we drop packets vs when do we ICMP-REJECT them?
  • where should 3rd parties (e.g. sysadmins, debug docs, etc) drop custom rules that need to not get nuked upon reconcile?
  • default-deny for protocols/ports on service IPs that are not explicitly accepted (e.g. no nodeports on service VIPs)

Also...should we aim to make this the vehicle for ClusterIPs as Gateways? (and NodePorts?) It's an expansion but I really want us to do it eventually...

@aojea (Member) commented Feb 5, 2023

Also...should we aim to make this the vehicle for ClusterIPs as Gateways? (and NodePorts?) It's an expansion but I really want us to do it eventually...

I was going to suggest that, and I think this KEP is perfect for it: instead of doing a s/iptables/nftables/, we fix the known problems by leveraging the Gateway API ... and we can stop overloading Services.

@danwinship (Contributor, Author) commented Feb 5, 2023

Can we fix some other long-lingering issues with iptables mode, too?

image

@danwinship (Contributor, Author) commented Feb 5, 2023

I didn't get into all the details of nft

(I had been feeling like "nft" was kind of ruined as an acronym so I've mostly been saying "nftables" when talking about this... Although NFTs are less in the news now than they were a few months ago so maybe we'll get that acronym back? At any rate, although the command-line client is nft, the API is generally referred to as "nftables".)

Can we fix some other long-lingering issues with iptables mode, too?

  • when do we drop packets vs when do we ICMP-REJECT them?

I had thought the current rules for this were intentional: if a service has no endpoints at all, we reject connections to it, because we want the client to get "connection refused". If the service has Local traffic policy, and no local endpoints, we drop connections, because we assume the LB is just transiently out-of-date, and if the client tries again, hopefully it will get routed to a correct node the next time (and the client is more likely to try again if it looks like its first connection attempt got lost in the internet, whereas it is not likely to retry if it gets back "connection refused").

(I say "connection refused", but in most cases the client actually gets "no route to host", which isn't really what we want, but IIRC we can't actually fix that if we're using ICMP to reject; we'd have to arrange to redirect the packet to a port that actually isn't listening so the kernel will send back a TCP RST or whatever.)

  • where should 3rd parties (e.g. sysadmins, debug docs, etc) drop custom rules that need to not get nuked upon reconcile?

Yeah... so as I was explaining to Antonio, the normal way to do things in nftables is that everyone creates their own tables and doesn't interfere with anyone else's. In your table you create your own chains that connect to the standard hooks (forward, prerouting, etc) with particular priority values. So, if we document the hooks and priority levels where we do particular things, then people ought to be able to create their own rules in their own tables that run either before or after our rules as needed, just by using higher or lower priority values.

If people want to stick rules into particular service chains or whatever, then that will likely run into the same problems you get now with the rules getting overwritten by resyncs... But if that's a use case we want to support, we can definitely think about how to make it work.
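To make that concrete, here is a minimal sketch of what an administrator-owned table could look like, assuming kube-proxy documents that its DNAT runs from a prerouting chain at the standard `dstnat` (-100) priority; the table name, priority offset, and service IP are illustrative only.

```
# Admin-owned table; kube-proxy would never touch a table it does not own.
table ip admin-debug {
    chain pre {
        # Priority -110 runs just before dstnat (-100), i.e. before
        # kube-proxy rewrites the destination to a pod IP.
        type filter hook prerouting priority -110; policy accept;

        # Example: log connections to a particular service IP.
        ip daddr 10.96.0.10 tcp dport 53 log prefix "dns-svc: "
    }
}
```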

  • default-deny for protocols/ports on service IPs that are not explicitly accepted (e.g. no nodeports on service VIPs)

Right. I was mentally including that along with localhost nodeports: we should be a lot more precise about exactly where nodeports work (and maybe try to default to "they only work on one interface" rather than "they work everywhere".)

Also...should we aim to make this the vehicle for ClusterIPs as Gateways? (and NodePorts?) It's an expansion but I really want us to do it eventually...

I don't have a good sense of what that would mean...

and write rules, much like how we use command-line tools in the
`iptables` and `ipvs` backends. However, the `nft` tool is mostly just
a thin wrapper around `libnftables`, and it would be possible to use
that directly instead in the future, given a cgo wrapper.
@aojea (Member) Feb 5, 2023

I think I read somewhere about some problems with using cgo ... we should check whether that is a problem here.

Contributor Author

No, that's just a bad meme. Someone wrote a blog post about how cgo is bad, and now everyone knows that, but the original author was just pointing out that a cgo function call has like 10x-100x the overhead of a go function call, so trying to optimize your performance-critical code by calling into C is often a lose. But the overhead of go→C is much less than the overhead of userspace→kernel so if you're calling into a C function that makes a syscall then the cgo overhead is just noise anyway. (And the cgo overhead is way less than the overhead of spawning a binary to make the syscall for you.)

@thockin (Member) commented Feb 5, 2023

I didn't get into all the details of nft

(I had been feeling like "nft" was kind of ruined as an acronym so I've mostly been saying "nftables" when talking about this... Although NFTs are less in the news now than they were a few months ago so maybe we'll get that acronym back? At any rate, although the command-line client is nft, the API is generally referred to as "nftables".)

ACK :)

  • when do we drop packets vs when do we ICMP-REJECT them?

I had thought the current rules for this were intentional: if a service has no endpoints at all, we reject connections to it, because we want the client to get "connection refused". If the service has Local traffic policy, and no local endpoints, we drop connections, because we assume the LB is just transiently out-of-date, and if the client tries again, hopefully it will get routed to a correct node the next time (and the client is more likely to try again if it looks like its first connection attempt got lost in the internet, whereas it is not likely to retry if it gets back "connection refused").

(I say "connection refused", but in most cases the client actually gets "no route to host", which isn't really what we want, but IIRC we can't actually fix that if we're using ICMP to reject; we'd have to arrange to redirect the packet to a port that actually isn't listening so the kernel will send back a TCP RST or whatever.)

We also have the case of addressing a service IP on a port that isn't declared - I think we DROP that, but I think it should be a fast error.

  • where should 3rd parties (e.g. sysadmins, debug docs, etc) drop custom rules that need to not get nuked upon reconcile?

Yeah... so as I was explaining to Antonio, the normal way to do things in nftables is that everyone creates their own tables and doesn't interfere with anyone else's. In your table you create your own chains that connect to the standard hooks (forward, prerouting, etc) with particular priority values. So, if we document the hooks and priority levels where we do particular things, then people ought to be able to create their own rules in their own tables that run either before or after our rules as needed, just by using higher or lower priority values.

If people want to stick rules into particular service chains or whatever, then that will likely run into the same problems you get now with the rules getting overwritten by resyncs... But if that's a use case we want to support, we can definitely think about how to make it work.

Yeah, I guess what I want is docs and some sort of contract - we expect X and Y, and as long as you don't break that, you can attach Here, Here, and Here to your heart's content.

Also...should we aim to make this the vehicle for ClusterIPs as Gateways? (and NodePorts?) It's an expansion but I really want us to do it eventually...

I don't have a good sense of what that would mean...

If we have a "ClusterIP" gateway class, this would require a per-node agent to watch gateways (or something else of lower cardinality, derived from gateways of this class) and implement the moral equivalent of a ClusterIP service. It needs some design work, but nobody is doing that work because we don't have any impl to prove it out on...

@danwinship (Contributor, Author)

We also have the case of addressing a service IP on a port that isn't declared - I think we DROP that, but I think it should be a fast error.

ah, yeah, we don't actively -j DROP those packets, we just don't do anything with them. (So they get dropped.)

This is one of several things that would be easier to deal with if kube-proxy knew the service CIDR(s). Then we could just add a rule at the end saying "all other traffic to the service CIDR gets rejected". (But even without that, we could implement this now in iptables kube-proxy at the cost of an extra iptables rule per clusterIP, and we could implement it in nftables with just one rule and a set containing all clusterIPs.)
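A minimal sketch of that "one rule plus a set" idea, with placeholder table/set names and example addresses. Valid service traffic would already have been DNATted to a pod IP during prerouting, so anything still addressed to a cluster IP by the time it hits the filter hook is using an unserved port.

```
table ip example-proxy {
    # All clusterIPs currently assigned to Services.
    set cluster-ips {
        type ipv4_addr
        elements = { 10.96.0.1, 10.96.0.10 }
    }

    chain filter-forward {
        type filter hook forward priority filter; policy accept;

        # Unserved ports on service IPs: reject instead of silently dropping.
        # (The input and output hooks would need the same rule.)
        ip daddr @cluster-ips reject
    }
}
```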

Also...should we aim to make this the vehicle for ClusterIPs as Gateways? (and NodePorts?) It's an expansion but I really want us to do it eventually...

I don't have a good sense of what that would mean...

If we have a "ClusterIP" gateway class, this would require a per-node agent to watch gateways (or something else of lower cardinality, derived from gateways of this class) and implement the moral equivalent of a ClusterIP service. But it needs some design work. But nobody is doing the design work because we don't have any impl to prove it out on...

That doesn't seem that backend dependent though... does implementing Gateway require any backend-level functionality beyond what implementing Service requires? It seems like the changes would mostly be in the pkg/proxy-level code shared by all the backends...

@danwinship mentioned this pull request on Feb 9, 2023
@danwinship changed the title from "WIP: kube-proxy nftables mode" to "KEP-3866: kube-proxy nftables mode" on Feb 10, 2023
with ipvs). Eventually, if nftables becomes the default, then this
would be flipped around to having a legacy "iptables" job.

The handful of e2e tests that specifically examine iptables rules will
Member

I don't know if these still exist, but it is not great for e2e tests to have this kind of implementation dependency. I'd rather rewrite them to focus on the expected behavior, or drop them entirely; I think you already did that for some of them.

Contributor Author

Interesting... I had thought there were more of these. I guess they mostly got fixed. (You're right, I did just remove one of them.) There is only one remaining test that parses iptables rules, "It should recreate its iptables rules if they are deleted [Disruptive]", which tests 2 things:

  1. kubelet recreates KUBELET-IPTABLES-HINT if it is deleted (which is iptables-specific and should stick around regardless of kube-proxy, for as long as we expect third-party components to use iptables).
  2. Services eventually start working again even if you delete all KUBE-* iptables chains. It doesn't actually test that they stop working after you delete those chains, so it would still pass in nftables mode, it just wouldn't test anything interesting. (And as discussed in the KEP text, we probably don't need to do monitoring for nftables.)

There are also a few tests that manually invoke iptables to break the network via TestUnderTemporaryNetworkFailure (plus one that doesn't use that function but should). This will eventually need to be updated to have an nftables-based version, but that's independent of kube-proxy.

I think that's it? Yay us.

Contributor Author

filed kubernetes/kubernetes#119047 about TestUnderTemporaryNetworkFailure


The actual feature gate enablement/disablement itself is not
interesting, since it only controls whether `--proxy-mode nftables`
can be selected.
Member

In an ideal world I would expect the cleanup to happen out of the box. But here we have a tradeoff with enabling some logic even when the feature is off by default.

My above thinking was a bit of a mental shortcut, so let me clarify that.

I think that we should split our thinking into two separate flows:
(1) allowing users to choose nftables as a backend
(2) making nftables as a default backend for their clusters

If we were talking about (2), then I'm definitely with you - that path should have cleanup enabled by default in the version prior to the one that enables nftables as the default backend.

That said, this KEP is purely about (1), so an operator has to take an explicit action to enable the feature.
In that case, I expect that the operator is actually aware of the consequences and knows that some additional actions may need to be taken on rollback.
You don't need these additional actions by default - you only need them if you explicitly opted in to using this feature for your cluster. That would especially be the case for enabling a Beta feature.

I expect that together with graduating this KEP to Beta, we will enable the cleanup logic by default, so once the feature reaches GA and someone opts in at that point, they will get cleanup for free on rollback [because they will be rolling back to a state with the feature already enabled].

Does that make sense to you?

performance as good as iptables and ipvs.

- Documentation describes any changes in behavior between the
`iptables` and `ipvs` modes and the `nftables` mode. Any warnings
Member

Yes - this is a perfect example of something that we should have (for Beta).

implementations, etc, may need updates in order to work with it. (This
may further limit the amount of testing the new mode can get during
the Alpha phase, if it is not yet compatible with popular network
plugins at that point.)
Member

Thanks for adding this - do we have some mitigation here that we want to mention?

Contributor Author

Not really; we can try to avoid making it gratuitously different, but we can't avoid the fact that it's going to be different.

(This is a separate "don't be too different" concern from the user-level behavioral differences.)

presented: it allows you to run a docker registry in a pod and then
have nodes use a NodePort service via `127.0.0.1` to access that
registry. Docker treats `127.0.0.1` as an "insecure registry" by
default (though containerd and cri-o do not) and so does not require
Member

I guess you're all better positioned to make this decision. My only request for Beta will be that whatever will not work the same way in nftables mode is easy for operators of existing clusters to discover (metrics).

@jayunit100 (Member)

Anyway, can SIG Network decide whether the code will go in-tree or not? It seemed like there was openness to considering different options (out- vs. in-tree)....


[KEP-3786]: https://github.com/kubernetes/enhancements/issues/3786

### Goals
Member

Can we add a goal like "Design and document where and how external actors can plug into the packet processing logic safely"? We never made this easy in iptables mode, e.g. if I wanted to -j LOG for a particular service.

Member

I see this below, but maybe a nod here?

Contributor Author

It's there in Goals:

    - Document specific details of the nftables implementation that we
      want to consider as "API". In particular, document the
      high-level behavior that authors of network plugins can rely
      on. We may also document ways that third parties or
      administrators can integrate with kube-proxy's rules at a lower
      level.

@thockin (Member) left a comment

First, wonderful KEP. Very comprehensive and clear. Thanks.

Second, I did not get to the PRR part yet, but here are my comments.

Third, this is getting ahead of ourselves, but... in the future I hope to see Service LBs, and maybe even Service ClusterIPs, be replaced by Gateway. This would enable things like all-ports and all-protocols. As we go about building this, it would be great to keep an eye on "if we were watching Gateways, that logic would go here"?

presented: it allows you to run a docker registry in a pod and then
have nodes use a NodePort service via `127.0.0.1` to access that
registry. Docker treats `127.0.0.1` as an "insecure registry" by
default (though containerd and cri-o do not) and so does not require
Member

I agree that we can, cautiously, kill this. We should do some outreach through usual channels and see if anyone screams.

```
<<[UNRESOLVED dnat-but-no-route_localnet ]>>

As a possible compromise, we could make the `nftables` backend create
Member

Let's keep this in our pocket in case a real use case emerges, but not implement it until then?

Comment on lines +1101 to +1175
The nftables proxy should default to only opening NodePorts on a
single interface, probably the interface with the default route by
default. (Ideally, you really want it to accept NodePorts on the
Member

If we are revisiting this from scratch, let's dial it ALL the way back. I like starting with the IP with the default route. If that is not sufficient, IPs matching a flag is pretty powerful. If that is not sufficient, we can look at other options (same as detect-local-mode).
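A rough sketch of the "only on the IP with the default route" option, with a placeholder node IP, NodePort, and endpoint address; none of these names or values are a proposed implementation.

```
table ip example-proxy {
    chain nodeports {
        type nat hook prerouting priority dstnat; policy accept;

        # Only accept NodePort traffic addressed to the node's primary IP
        # (the address on the default-route interface).
        ip daddr 192.168.1.10 tcp dport 30080 dnat to 10.244.1.5:8080
    }
}
```

An "IPs matching a flag" variant would just swap the single address for a named set populated from the flag.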

```
<<[UNRESOLVED unused service IP ports ]>>

@thockin has suggested that service IPs should reject connections on
Member

We should reject or drop, but almost certainly not ignore such connections. I lean towards reject, but I can see arguments in both directions.

Member

Reject improves supportability; the error is more meaningful. "Connection refused" indicates someone does not want you to go through, while drop means "connection timeout".
I've seen thousands of those red herrings in customer tickets at SUSE, at Red Hat, and now at Google. It is easy to write a playbook like "if you connect to a service and get connection refused, check if there are endpoints"; it is much harder with "if you get a connection timeout, check if there are endpoints and whether anything in the path is filtering the packet".


@thockin has suggested that service IPs should reject connections on
ports they aren't using. (This would most easily be implemented by
adding a `--service-cidr` flag to kube-proxy so we could just "reject
Member

The fewer things we teach about things like the service CIDR, the better.

Contributor Author

With KEP-1880, kube-proxy could just watch the ServiceCIDR objects, and no one would need to explicitly tell it anything.

<<[UNRESOLVED service IP pings ]>>

Users sometimes get confused by the fact that service IPs do not
respond to ICMP pings, and perhaps this is something we could change.
Member

Supporting ping is nice and friendly, and I don't see any reason NOT to really, but I also don't see a strong reason in favor. One could argue that it's an info-leak but DNS is already in play :)


<<[/UNRESOLVED]>>
```

#### Defining an API for integration with admin/debug/third-party rules
Member

+1 to this!

@danwinship (Contributor, Author)

Hopefully everyone will be happy with the latest updates:

  • Small note about third-party interactions
  • Note that (unlike in the examples here) we can probably use unhashed chain names
  • Add an UNRESOLVED about session affinity rather than just mentioning it in passing
  • Clarified how the kube-proxy "repair loop" works and that we will still have it in nftables; it just won't have an explicit "your firewall ate my rules" check.
  • Added a Beta requirement for more unit tests

@wojtek-t (Member) left a comment

I have just one minor (non-blocking) comment at this point - it looks good enough for Alpha to me.

@danwinship - I'm waiting for SIG-level approval and then ping me on slack for PRR approval.


The actual feature gate enablement/disablement itself is not
interesting, since it only controls whether `--proxy-mode nftables`
can be selected.
Member

Do you think it would make sense to summarize this comment in the KEP for the future (or maybe just link this comment thread)?

@sftim (Contributor) left a comment

2 options here IMO.

  • mark it provisional, merge it as is. Iterate.
  • leave it implementable, do the tidy up to be ready for alpha and trackable as beta, merge that

Honestly I recommend the provisional and iterate approach. Not my call though.

@wojtek-t (Member) commented Oct 2, 2023

this LGTM for alpha from PRR pov

/approve PRR

@wojtek-t (Member) commented Oct 2, 2023

Honestly I recommend the provisional and iterate approach. Not my call though.

This is pretty much ready for alpha - it's fine to have unresolved items; these are blockers for beta, not for alpha.

@vrabbi commented Oct 3, 2023

Just my 2 cents: is there a reason we are continuing to bloat in-tree code vs. advancing the out-of-tree options, as is being done extensively throughout the k8s codebase? I get the need for nftables, but I would think pushing forward an agreed-upon out-of-tree model would make more sense than bloating up in-tree logic.

@sftim (Contributor) commented Oct 3, 2023

Just my 2 cents: is there a reason we are continuing to bloat in-tree code vs. advancing the out-of-tree options, as is being done extensively throughout the k8s codebase? I get the need for nftables, but I would think pushing forward an agreed-upon out-of-tree model would make more sense than bloating up in-tree logic.

The move out of tree might be a different enhancement. This one is a time-sensitive piece specifically about nftables.

@aojea (Member) commented Oct 3, 2023

/lgtm

@k8s-ci-robot added the lgtm label on Oct 3, 2023
@npolshakova

Hi @kubernetes/sig-network-leads, 👋 1.29 Enhancement Lead here. Just wanted to confirm: is this waiting for an approval label, or is more discussion required?

@thockin (Member) commented Oct 6, 2023

Sorry, I thought it was approved!

/approve

@k8s-ci-robot (Contributor)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: danwinship, soltysh, thockin, wojtek-t

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot added the approved label on Oct 6, 2023
@thockin (Member) commented Oct 6, 2023

I would like to give this a go-ahead for 1.29 please

Labels: approved, cncf-cla: yes, kind/kep, lgtm, sig/network, size/XXL