-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[libteam] Support multiple member ports for LACP fallback #10823
base: master
Are you sure you want to change the base?
Conversation
0e667bf
to
2d5d46c
Compare
2d5d46c
to
c027eb4
Compare
c027eb4
to
8daab64
Compare
@saiarcot895 @vaibhavhd based on recent PRs, you might be the right people to ask for a review here? Any help would be appreciated. |
@rodnymolina @liat-grozovik Could you please review this PR? |
@saiarcot895 Ping |
+ } | ||
+ if (lacp_port->state == PORT_STATE_CURRENT || lacp_port->state == PORT_STATE_EXPIRED) { | ||
+ /* If at least one port is currently not defaulted, don't do any fallback. */ | ||
+ have_selected_port = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you rename this to something like have_active_port
or similar? have_selected_port
sounds like a port was selected in this loop.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@saiarcot895 Acutally "selected" is a terminology used in IEEE 802.3ad. When an aggregator port received LACPDU, the RX state machine will transit to "CURRENT" state and the selection logic update the state machine variable "selected" to "SELECTED", "UNSELECTED" or "STANDBY". See below excerpted from IEEE 802.3ad:
"
A value of SELECTED indicates that the Selection Logic has selected an appropriate
Aggregator. A value of UNSELECTED indicates that no Aggregator is currently selected. A
value of STANDBY indicates that although the Selection Logic has selected an appropriate
Aggregator, aggregation restrictions currently prevent the Aggregation Port from being enabled
as part of the aggregation, and so the Aggregation Port is being held in a standby condition.
This variable can only be set to SELECTED or STANDBY by the operation of the Aggregation
Port’s Selection Logic. It can be set to UNSELECTED by the operation of the Aggregation
Port’s Receive machine, or by the operation of the Selection Logic associated with another
Aggregation Port.
"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not certain that selected
is perfectly accurate here, so I've renamed this to do_fallback
, which is a better description of the purpose anyways - it's supposed to track whether we want to use fallback logic or not.
Note that you'll need to resolve the merge conflicts as well. |
+ } | ||
+ if (lacp_port->state == PORT_STATE_CURRENT || lacp_port->state == PORT_STATE_EXPIRED) { | ||
+ /* If at least one port is currently not defaulted, don't do any fallback. */ | ||
+ have_selected_port = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@saiarcot895 Acutally "selected" is a terminology used in IEEE 802.3ad. When an aggregator port received LACPDU, the RX state machine will transit to "CURRENT" state and the selection logic update the state machine variable "selected" to "SELECTED", "UNSELECTED" or "STANDBY". See below excerpted from IEEE 802.3ad:
"
A value of SELECTED indicates that the Selection Logic has selected an appropriate
Aggregator. A value of UNSELECTED indicates that no Aggregator is currently selected. A
value of STANDBY indicates that although the Selection Logic has selected an appropriate
Aggregator, aggregation restrictions currently prevent the Aggregation Port from being enabled
as part of the aggregation, and so the Aggregation Port is being held in a standby condition.
This variable can only be set to SELECTED or STANDBY by the operation of the Aggregation
Port’s Selection Logic. It can be set to UNSELECTED by the operation of the Aggregation
Port’s Receive machine, or by the operation of the Selection Logic associated with another
Aggregation Port.
"
src/libteam/patch/series
Outdated
@@ -12,3 +12,4 @@ | |||
0012-Increase-min_ports-upper-limit-to-1024.patch | |||
0013-set-port-to-disabled-state-during-removal.patch | |||
0014-dont-move-the-port-state-from-disabled-when-admin-state-is-down.patch | |||
0015-Extend-LACP-fallback-support-to-multiple-ports.patch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lukasstockner Need your help to modify this line because 0015 had already been used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lukasstockner Ping
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rebased now, sorry for the delay.
Fixes sonic-net#7555 and sonic-net#8866. While the LACP fallback HLD correctly describes the behavior for LAGs with multiple ports, the actual implementation only worked for single-port LAGs. When multiple ports are present, the fallback mode would select all ports, therefore making it useless for the purpose described in the HLD (PXE-booting). Therefore, this commit adds a patch to implement the desired behavior: If all non-disabled ports are defaulted, *one* port goes into fallback mode and gets selected for the LAG. As soon as any port receives a LACPDU, the fallback port goes back to defaulted. The fallback port is selected by higher port priority or, if both have equal priority (the default), by lower port number (as described in the HLD). Signed-off-by: Lukas Stockner <lstockner@genesiscloud.com>
8daab64
to
d2d2816
Compare
@saiarcot895 Please review this PR again. Thanks. |
/azpw run Azure.sonic-buildimage |
Hi @yxieca and @zhangyanzhao this is the PR I mention it today meeting. If it can be backported to release 202305, that would be great. If not, the earliest release available is fine. Thank you. |
Hi @saiarcot895, is there anything needed to move forward with this PR. I can confirm this PR works and fixes the issue. Thanks. |
Why I did it
Fixes #7555 and #8866.
While the LACP fallback HLD correctly describes the behavior for LAGs with multiple ports, the actual implementation only worked for single-port LAGs.
When multiple ports are present, the fallback mode would select all ports, therefore making it useless for the purpose described in the HLD (PXE-booting).
How I did it
This commit adds a patch to implement the desired behavior:
If all non-disabled ports are defaulted, one port goes into fallback mode and gets selected for the LAG. As soon as any port receives a LACPDU, the fallback port goes back to defaulted.
The fallback port is selected by higher port priority or, if both have equal priority (the default), by lower port number (as described in the HLD).
How to verify it
See #7555 - start a DVS container, create a Portchannel with two members and fallback enabled, create a bond on the non-DVS end of the interfaces, bring it up and down and observe the state of the Portchannel.
Which release branch to backport (provide reason below if selected)
Not sure if this counts as a fix or feature, but I'd argue fix since this behavior was described in the HLD all along and the fallback is not particularly useful without it.
Description for the changelog
Support multiple member ports for LACP fallback.
Link to config_db schema for YANG module changes
A picture of a cute animal (not mandatory but encouraged)