-
Notifications
You must be signed in to change notification settings - Fork 910
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Unnumbered BGP Peering design document #2508
Conversation
d3ec2bd
to
f0befef
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks! great work, did a first pass
design/unnumbered-bgp.md
Outdated
- Modify the testing infra to include point-to-point links which means | ||
connect containers using VETH, and not through a bridge. We need NET-ADMIN cap (or sudo) when creating the VETH | ||
- Create/Delete peer per specific test not to take resources from runners outside the runtime of the test | ||
- This peer needs different configuration (not IPs, interface names), keep a raw config |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iiuc this will all be covered in the initial implementation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you elaborate what you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
asking if these points will be covered or there's still a question about them, while the last point (about primary interfaces) still needs to be thought of
design/unnumbered-bgp.md
Outdated
connect containers using VETH, and not through a bridge. We need NET-ADMIN cap (or sudo) when creating the VETH | ||
- Create/Delete peer per specific test not to take resources from runners outside the runtime of the test | ||
- This peer needs different configuration (not IPs, interface names), keep a raw config | ||
- There is no straightforward way to test unnumbered in the primary node interface. The testing setup (kind creation) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if there's some value in adding another worker+container that are peered using unnumbered (no p2p, but there will be no conflicts as they're the only routers in the network using it) to get some feedback on peering with the primary node interface? wdyt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mean to have a single worker as speaker, and have single external container frr connected on the same docker network bridge?
My concern is that locally the linux bridge does not forwarding RA when LLA address is the source (at least in my fedora laptop).
For sure we need to verifiy how that works on primary interface, and especially when the interface has also a non-LLA address configured.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern is that locally the linux bridge does not forwarding RA when LLA address is the source (at least in my fedora laptop).
oh, didn't you run into the "random" peering we talked about?
Do you mean to have a single worker as speaker, and have single external container frr connected on the same docker network bridge?
right (adding another worker / designating one), then the "randomness" will be okay - since there are only 2 peers on the same network using unnumbered they'll always peer with each other
411fb01
to
ead4263
Compare
54f2204
to
d077dc3
Compare
design/unnumbered-bgp.md
Outdated
exit | ||
``` | ||
|
||
An example ToR configuration when based on Junos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure we need this :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imo yes, maybe not our task, but unless I see that real config in real setup, I am not sure our use cases are valid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let's move it to the bottom in an addendum. If we can make this work with frr, it's likely to work with other routers as its part of the spec. Here it's distracting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am removing it as I was not able to find a real world example
design/unnumbered-bgp.md
Outdated
implementation will shadow all the existing errors and will fail in the | ||
runtime. | ||
|
||
#### Option 2. Add a new Interface field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is the most valid one, and one that matches FRR the most in a sense (since it seems there's a distinction between interfaces and addresses) without locking us. Also, the docs don't need to be as complicated (stating all of the cases) as it's simply following the what frr is doing, that is we can just shortly say, if address is not specified the peering happens over the interface (unnumbered), if address is specified it is assumed the it is lla.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 on this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK then, will put that as but we need to decide if to further #2423
and have this
// 1. if address and no interface then as normal
// 2. if non valid address and no interface then fail
// 3. if address is LLA and interface not empty, then LLA support
// == extra FRR directive `neighbor fe80::a876:4dff:fe77:408 interface eth0`
// 4. If address is LLA and empty interface, then fail
// 5. If address is empty and interface has value then the BGP unnumber peering config will take place
// == modify FRR directive `neighbor PEER interface remote-as <ASN>`
// && replace any address directive to use interface e.g. `neighbor net0 timers 180 540`
// && enable RAs on this interface
// 6. If address has value (non-LLA) and interface non empty, then fail
```
as the end API
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure I understand - implementation wise the issue will be solved for free(?) and documentation wise it seems sufficient to document the Interface
field something like "if specified without address then peering happens over the interface (unnumbered), if together with address then it is assumed to be LLA"?
design/unnumbered-bgp.md
Outdated
// 6. If address has value (non-LLA) and interface non empty, then fail | ||
``` | ||
That option might be not ideal (subjective opinion) because documentation looks complicated, | ||
and user might have to do trial-and-error approach. Plus user needs to understand corner cases. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the user will need to understand the usecases as it does for frr, is there extra complexity beyond that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
imo yes :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so I think it should be more detailed, because it is not really clear atm
design/unnumbered-bgp.md
Outdated
- Modify the testing infra to include point-to-point links which means connect | ||
containers using VETH, and not through a bridge. We need NET-ADMIN cap (or sudo) | ||
when creating the VETH. | ||
|
||
- Create/Delete peer per that specific test not to take resources from runners | ||
outside the runtime of the test. This peer needs different configuration (not | ||
IPs and interface names), keep separate raw configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure I understand, you're proposing a separate test for the unnumbered case or expanding the existing containers setup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I propose a separate isolated test for that use case.
(but no strong argument against expanding container setup).
Do you have a preference here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say if it's possible to have all the tests ran on unnumbered in a clean way it's preferable
8ec949e
to
1cc83ee
Compare
design/unnumbered-bgp.md
Outdated
Using BGP unnumbered peering, which dynamically discovers IPV6 neighbors, | ||
reduces the burden of configuring all interfaces (on the network fabric | ||
or on the cluster nodes) to have IPv4 addressing just for the BGP peering. | ||
Unnumbered BGP utilizes IPv6 link local address to automatically decide who to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnumbered BGP utilizes IPv6 link local address to automatically decide who to | |
Unnumbered BGP utilizes IPv6 link local address to automatically decide which peer to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
design/unnumbered-bgp.md
Outdated
reduces the burden of configuring all interfaces (on the network fabric | ||
or on the cluster nodes) to have IPv4 addressing just for the BGP peering. | ||
Unnumbered BGP utilizes IPv6 link local address to automatically decide who to | ||
connect. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
connect. | |
connect to. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
design/unnumbered-bgp.md
Outdated
BGP Unnumbered allows routers to peer with each other when direct connected | ||
without the need for IPs (config uses interface names). Once peering takes | ||
places, exchange IPv4 or IPv6 prefixes can take place. That feature can be used | ||
by FRR router instance configured by MetalLB to simplify specific setups. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd avoid the last sentence, as we are talking about MetalLB in general and not about frr. We can talk about frr when we go in the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed FRR reference.
design/unnumbered-bgp.md
Outdated
## Limitation/Requirements | ||
|
||
- To use unnumbered BGP feature, there must be a link (point-to-point, directly | ||
connected) between the interfaces two BGP daemons are using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
connected) between the interfaces two BGP daemons are using | |
connected) between the interfaces two BGP speakers are using |
Note that the other peer is normally a router
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the other peer is a ToR which, which acts both as L2 switch, and L3 router. Do you see any confusion though here?
design/unnumbered-bgp.md
Outdated
unnumber: true | ||
``` | ||
|
||
### Testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
### Testing | |
### Test plan |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
design/unnumbered-bgp.md
Outdated
There is no straightforward way to test unnumbered in the primary node | ||
interface. The testing setup (kind creation) creates bridge and connect | ||
containers. Ideally we need to setup a kind cluster, and replace the bridge | ||
with a containers that runs FRR on each other side of the veth, and at the same |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with a containers that runs FRR on each other side of the veth, and at the same | |
with a container that runs FRR on each other side of the veth, and at the same |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
design/unnumbered-bgp.md
Outdated
containers. Ideally we need to setup a kind cluster, and replace the bridge | ||
with a containers that runs FRR on each other side of the veth, and at the same | ||
time to provide switch/upstream. Until we find a way to automate that infra setup | ||
we test that scenario manual. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we test that scenario manual. | |
we will test that scenario manually. |
Manually how? What we can do manually that we can't automate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As discussed is not worth the effort to refactor how we set up the infra.
design/unnumbered-bgp.md
Outdated
time to provide switch/upstream. Until we find a way to automate that infra setup | ||
we test that scenario manual. | ||
|
||
#### E2E test for [Unnumber on secondary interface](design/unnumbered-bgp.md#story-1-unnumbered-iebgp-on-secondary-node-interface) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#### E2E test for [Unnumber on secondary interface](design/unnumbered-bgp.md#story-1-unnumbered-iebgp-on-secondary-node-interface) | |
#### E2E test for [Unnumbered on secondary interface](design/unnumbered-bgp.md#story-1-unnumbered-iebgp-on-secondary-node-interface) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
design/unnumbered-bgp.md
Outdated
connect containers using VETH, and not through a bridge. We need NET-ADMIN cap | ||
(or sudo) when creating the VETH. | ||
|
||
##### Option 1 - have a separate test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd start with this, the majority of the scenarios are functional and should (!!!) be orthogonal to how we establish the session
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, i keep the text as is, one we have option 1 I would say we can merge, but I am in favor to then try to implement also option
1cc83ee
to
0173e68
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the feedback
design/unnumbered-bgp.md
Outdated
BGP Unnumbered allows routers to peer with each other when direct connected | ||
without the need for IPs (config uses interface names). Once peering takes | ||
places, exchange IPv4 or IPv6 prefixes can take place. That feature can be used | ||
by FRR router instance configured by MetalLB to simplify specific setups. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed FRR reference.
design/unnumbered-bgp.md
Outdated
Using BGP unnumbered peering, which dynamically discovers IPV6 neighbors, | ||
reduces the burden of configuring all interfaces (on the network fabric | ||
or on the cluster nodes) to have IPv4 addressing just for the BGP peering. | ||
Unnumbered BGP utilizes IPv6 link local address to automatically decide who to |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
design/unnumbered-bgp.md
Outdated
reduces the burden of configuring all interfaces (on the network fabric | ||
or on the cluster nodes) to have IPv4 addressing just for the BGP peering. | ||
Unnumbered BGP utilizes IPv6 link local address to automatically decide who to | ||
connect. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
design/unnumbered-bgp.md
Outdated
## Limitation/Requirements | ||
|
||
- To use unnumbered BGP feature, there must be a link (point-to-point, directly | ||
connected) between the interfaces two BGP daemons are using |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the other peer is a ToR which, which acts both as L2 switch, and L3 router. Do you see any confusion though here?
design/unnumbered-bgp.md
Outdated
|
||
- To use unnumbered BGP feature, there must be a link (point-to-point, directly | ||
connected) between the interfaces two BGP daemons are using | ||
- IPv6 must be enabled on that interfaces, and in specific LLA (link Local IPv6 address) enabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
design/unnumbered-bgp.md
Outdated
connect containers using VETH, and not through a bridge. We need NET-ADMIN cap | ||
(or sudo) when creating the VETH. | ||
|
||
##### Option 1 - have a separate test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, i keep the text as is, one we have option 1 I would say we can merge, but I am in favor to then try to implement also option
design/unnumbered-bgp.md
Outdated
nexthop via inet6 fe80::dcad:beff:feff:1162 dev eth02 weight 1 | ||
``` | ||
|
||
### The API Extension |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
design/unnumbered-bgp.md
Outdated
|
||
### The API Extension | ||
|
||
We have three option: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Three options how the API could look like, but that was removed.
design/unnumbered-bgp.md
Outdated
|
||
#### BGP RouterID | ||
|
||
//TODO is there a requirement for that? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was expected if you could explain me how the RouterID is important or not. Example when we have single IPv6 stack. But I have that for discussion, removed
design/unnumbered-bgp.md
Outdated
|
||
#### Loopback IPv4 address | ||
|
||
Linux sends and receives IPv4 packets over an interface that does not have an IPv4 address. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO, explain better
0173e68
to
68481d6
Compare
074ad7f
to
d3c9b53
Compare
design/unnumbered-bgp.md
Outdated
- Make it work over no point-to-point connection | ||
- Define/Discuss dual ToR architecture, on top of bond interfaces or other HA strategies | ||
- [Dynamic BGP Peering](https://ipwithease.com/dynamic-bgp-peering/) is a | ||
feature different than unnumbered BGP and not supported by Metallb. Both |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove the "not supported by MetalLB" (it might be in the future, and unrelated to this)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
design/unnumbered-bgp.md
Outdated
feature different than unnumbered BGP and not supported by Metallb. Both | ||
feature reduce the configuration needed for the BGP peering. The main difference is | ||
that Dynamic BGP peering still needs IP addresses and does not require point-to-point links. | ||
- Support `remote-as` to accept `<external|internal|auto>` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also remove this (seems unrelated)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
design/unnumbered-bgp.md
Outdated
We can introduce a new field `Interface string` which if defined then Unnumber | ||
BGP takes place. In that case, the address must NOT have value. For | ||
completeness, we could use field might be used to support `LLA%zone` [user case](https://github.com/metallb/metallb/issues/2423). | ||
So at the end, the API documentation will look like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as the other comment, I think this could be a lot shorter (no need to specify everything and its consequence)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, but I prefer to remove once @fedepaol take another feedback and he is ok with the consequence or not (we could decide no to support the LLA)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's focus on unnumbered and maybe implement LLA afterwards if we need / want to.
What's important here is that the API is compatible with that. If we want to mention it, let's add a subsection in the bottom
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, removed the reference to LLA
d3c9b53
to
a162717
Compare
design/unnumbered-bgp.md
Outdated
We can introduce a new field `Interface string` which if defined then Unnumber | ||
BGP takes place. In that case, the address must NOT have value. For | ||
completeness, we could use field might be used to support `LLA%zone` [user case](https://github.com/metallb/metallb/issues/2423). | ||
So at the end, the API documentation will look like |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's focus on unnumbered and maybe implement LLA afterwards if we need / want to.
What's important here is that the API is compatible with that. If we want to mention it, let's add a subsection in the bottom
design/unnumbered-bgp.md
Outdated
|
||
### Option 2. Add a new Interface field | ||
|
||
We can introduce a new field `Interface string` which if defined then Unnumber |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is the proposed change, let's reword with A new interface field will be introduced
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
design/unnumbered-bgp.md
Outdated
|
||
## The API Changes | ||
|
||
### Option 2. Add a new Interface field |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not an option anymore, but the proposed change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Signed-off-by: karampok <karampok@gmail.com>
a162717
to
b5d7cf3
Compare
LGTM, thanks for this! |
/kind design
What this PR does / why we need it:
Special notes for your reviewer:
In part of this PR there is
My plan would be that first we discuss how the raw frr config would look like, then the API in frr-k8s, then in metallb.
Thanks
Release note: