Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libteam] Add fallback support for single-member-port LAG #1118

Merged
merged 6 commits into from
Dec 16, 2017
Merged
Show file tree
Hide file tree
Changes from 4 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions dockers/docker-teamd/teamd.j2
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,12 @@
"runner": {
"name": "lacp",
"active": true,
{% if PORTCHANNEL[pc]['fallback'] %}
"fallback": {{ PORTCHANNEL[pc]['fallback'] }},
{% else %}
{# Use 75% links upperbound as min-links #}
"min_ports": {{ (PORTCHANNEL[pc]['members'] | length * 0.75) | round(0, 'ceil') | int }},
{% endif %}
"tx_hash": ["eth", "ipv4", "ipv6"]
},
"link_watch": {
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,151 @@
From 08f4c83a6960fd922d1baac2293a95250f36fa96 Mon Sep 17 00:00:00 2001
From: Haiyang Zheng <haiyang.z@alibaba-inc.com>
Date: Mon, 6 Nov 2017 00:24:00 -0800
Subject: [patch libteam] libteam: Add lacp fallback support for
single-member-port LAG

Signed-off-by: Haiyang Zheng <haiyang.z@alibaba-inc.com>
---
teamd/teamd_runner_lacp.c | 67 ++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 61 insertions(+), 6 deletions(-)

diff --git a/teamd/teamd_runner_lacp.c b/teamd/teamd_runner_lacp.c
index 9c77fae..872af8a 100644
--- a/teamd/teamd_runner_lacp.c
+++ b/teamd/teamd_runner_lacp.c
@@ -138,6 +138,8 @@ struct lacp {
#define LACP_CFG_DFLT_SYS_PRIO 0xffff
bool fast_rate;
#define LACP_CFG_DFLT_FAST_RATE false
+ bool fallback;
+#define LACP_CFG_DFLT_FALLBACK false
int min_ports;
#define LACP_CFG_DFLT_MIN_PORTS 1
enum lacp_agg_select_policy agg_select_policy;
@@ -272,6 +274,11 @@ static int lacp_load_config(struct teamd_context *ctx, struct lacp *lacp)
lacp->cfg.fast_rate = LACP_CFG_DFLT_FAST_RATE;
teamd_log_dbg("Using fast_rate \"%d\".", lacp->cfg.fast_rate);

+ err = teamd_config_bool_get(ctx, &lacp->cfg.fallback, "$.runner.fallback");
+ if (err)
+ lacp->cfg.fallback = LACP_CFG_DFLT_FALLBACK;
+ teamd_log_dbg("Using fallback \"%d\".", lacp->cfg.fallback);
+
err = teamd_config_int_get(ctx, &tmp, "$.runner.min_ports");
if (err) {
lacp->cfg.min_ports = LACP_CFG_DFLT_MIN_PORTS;
@@ -308,9 +315,24 @@ static bool lacp_port_loopback_free(struct lacp_port *lacp_port)
return true;
}

+/*
+ * is_lacp_fallback_eligible - is lacp_port eligible to go into lacp fallback mode
+ *
+ * Return true if it is, false otherwise
+ */
+static bool is_lacp_fallback_eligible(struct lacp_port *lacp_port)
+{
+ teamd_log_dbg("%s fallback eligible state \"%d \" cfg \"%d\".",
+ lacp_port->tdport->ifname, lacp_port->state,
+ lacp_port->lacp->cfg.fallback);
+ return lacp_port->state == PORT_STATE_DEFAULTED &&
+ lacp_port->lacp->cfg.fallback;
+}
+
static bool lacp_port_selectable_state(struct lacp_port *lacp_port)
{
- if (lacp_port->state == PORT_STATE_CURRENT)
+ if (lacp_port->state == PORT_STATE_CURRENT ||
+ is_lacp_fallback_eligible(lacp_port))
return true;
return false;
}
@@ -318,7 +340,8 @@ static bool lacp_port_selectable_state(struct lacp_port *lacp_port)
static bool lacp_port_unselectable_state(struct lacp_port *lacp_port)
{
if (lacp_port->state == PORT_STATE_CURRENT ||
- lacp_port->state == PORT_STATE_EXPIRED)
+ lacp_port->state == PORT_STATE_EXPIRED ||
+ is_lacp_fallback_eligible(lacp_port))
return false;
return true;
}
@@ -1002,6 +1025,12 @@ static int lacp_port_link_update(struct lacp_port *lacp_port)
uint32_t speed = team_get_port_speed(team_port);
uint8_t duplex = team_get_port_duplex(team_port);
int err;
+ bool admin_state;
+
+ admin_state = team_get_ifinfo_admin_state(lacp_port->ctx->ifinfo);
+ teamd_log_dbg("%s admin_state \"%d\" link_up \"%d\".",
+ lacp_port->tdport->ifname,
+ admin_state, linkup);

if (linkup != lacp_port->__link_last.up ||
duplex != lacp_port->__link_last.duplex) {
@@ -1011,8 +1040,9 @@ static int lacp_port_link_update(struct lacp_port *lacp_port)
* will provide speed == 0 and duplex == 0. If that is the
* case now, do not set disabled state and allow such devices
* to work properly.
+ * Only enable port if the it is admin_up and link_up.
*/
- if (linkup && (!duplex == !speed))
+ if (linkup && (!duplex == !speed) && admin_state)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it required to check admin state? If link is up, then we definitely know that admin state is also up.

Copy link
Contributor Author

@hzheng5 hzheng5 Nov 15, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Marian @marian-pritsak for the code review.

Yes, I removed the lacp fallback timer, which is the transition time from defaulted state to fallback state. Since we reached an agreement to not modify existing LACP receive state machine, I removed the fallback state and the fallback timer is no longer needed. Also I have updated the design document

https://github.com/Azure/SONiC/blob/gh-pages/doc/LACP%20Fallback%20Feature%20for%20SONiC_v0.5.md

We need to check the admin state because if we create a Lag PortChannel01 with member ports Ethernet1. The PortChannel will be by default admin Shut, but the member port will be in admin UP state. If we don't check the admin status of the LAG, the member will be moved into defaulted state since it's link up, and then be moved to fallback state if lacp fallback is enabled.

Please let me know if you still have questions.

Thanks,
Haiyang

Copy link
Contributor Author

@hzheng5 hzheng5 Nov 27, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marian-pritsak
Hi Marian,

"If link is up, then we definitely know that admin state is also up."
This is not always true in libteam, here the admin state is actually the admin state of the LAG . So it's possible that the PortChannel is admin down, but the member port is admin up and link up. In this case, the member port will be moved into defaulted state (or fallback state if enabled), which is not desired as the PortChannel is admin shutdown.

Please let me know if you still have concerns regarding this admin check. If so, we can setup a skype call to discuss further.

FYI. Some log I captured after shutting down the PortChannel11, the admin_state is "0" (down), but the member port Ethernet11 is still link up

admin@ASW3.ET2:~$ teamshow
Flags: A - active, I - inactive, Up - up, Dw - Down, N/A - not available, S - selected, D - deselected
  No.  Team Dev       Protocol     Ports
-----  -------------  -----------  -------------
   11  PortChannel11  LACP(A)(Up)  Ethernet11(S)

admin@ASW3.ET2:~$ sudo ifconfig PortChannel11 down

admin@ASW3.ET2:~$ teamshow
Flags: A - active, I - inactive, Up - up, Dw - Down, N/A - not available, S - selected, D - deselected
  No.  Team Dev       Protocol     Ports
-----  -------------  -----------  -------------
   11  PortChannel11  LACP(A)(Dw)  Ethernet11(D)

admin@ASW3.ET2:~$ sudo grep admin_state /var/log/teamd.log  | grep PortChannel11
Nov 27 16:08:49.718118 ASW3.et2 DEBUG teamd_PortChannel11[63]: Ethernet11 **admin_state  "0" link_up.**
admin@ASW3.ET2:~$
admin@ASW3.ET2:~$ teamdctl PortChannel11 state
setup:
  runner: lacp
ports:
  Ethernet11
    link watches:
      link summary: up
      instance[link_watch_0]:
        name: ethtool
        link: up
        down count: 0
    runner:
      aggregator ID: 0
      selected: no
      state: disabled
runner:
  active: yes
  fast rate: no

admin@ASW3.ET2:~$ ip link show Ethernet11
22: Ethernet11: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast master PortChannel11 state UP mode DEFAULT group default qlen 1000
    link/ether 00:05:64:2f:2b:d5 brd ff:ff:ff:ff:ff:ff

admin@ASW3.ET2:~$ ip link show PortChannel11
5: PortChannel11: <BROADCAST,MULTICAST> mtu 1500 qdisc noqueue state DOWN mode DEFAULT group default
    link/ether 00:05:64:2f:2b:d5 brd ff:ff:ff:ff:ff:ff

admin@ASW3.ET2:~$ bcmcmd "ps xe11"
ps xe11
                 ena/        speed/ link auto    STP                  lrn  inter   max   cut   loop
           port  link  Lns   duplex scan neg?   state   pause  discrd ops   face frame  thru?  back
      xe11( 12)  up     1   25G  FD   SW  No   Forward  TX RX   None   FA     KR  1522    No

drivshell>
admin@ASW3.ET2:~$

Thanks,
Haiyang

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@hzheng5 , this is actually separate issue. whether or not member port admin status depends on the lag admin status is a separate issue. i tested on arista switch, when the lag is down, the member port is errdisabled. is this issue related to lacp fallback mechanism? can you explain?

err = lacp_port_set_state(lacp_port, PORT_STATE_EXPIRED);
else
err = lacp_port_set_state(lacp_port, PORT_STATE_DISABLED);
@@ -1344,9 +1374,19 @@ static int lacp_event_watch_admin_state_changed(struct teamd_context *ctx,

teamd_for_each_tdport(tdport, ctx) {
struct lacp_port *lacp_port = lacp_port_get(lacp, tdport);
-
- err = lacp_port_set_state(lacp_port,
- admin_state?PORT_STATE_EXPIRED:PORT_STATE_DISABLED);
+ /* Only enable port if it is admin_up and link_up.*/
+ if (lacp_port->__link_last.up) {
+ teamd_log_dbg("%s admin_state \"%d\" link_up.",
+ lacp_port->tdport->ifname,
+ admin_state);
+ err = lacp_port_set_state(lacp_port,
+ admin_state?PORT_STATE_EXPIRED:PORT_STATE_DISABLED);
+ } else {
+ teamd_log_dbg("%s admin_state \"%d\" link_down.",
+ lacp_port->tdport->ifname,
+ admin_state);
+ err = lacp_port_set_state(lacp_port, PORT_STATE_DISABLED);
+ }
if (err)
return err;
}
@@ -1452,6 +1492,16 @@ static int lacp_state_fast_rate_get(struct teamd_context *ctx,
return 0;
}

+static int lacp_state_fallback_get(struct teamd_context *ctx,
+ struct team_state_gsc *gsc,
+ void *priv)
+{
+ struct lacp *lacp = priv;
+
+ gsc->data.bool_val = lacp->cfg.fallback;
+ return 0;
+}
+
static int lacp_state_select_policy_get(struct teamd_context *ctx,
struct team_state_gsc *gsc,
void *priv)
@@ -1479,6 +1529,11 @@ static const struct teamd_state_val lacp_state_vals[] = {
.getter = lacp_state_fast_rate_get,
},
{
+ .subpath = "fallback",
+ .type = TEAMD_STATE_ITEM_TYPE_BOOL,
+ .getter = lacp_state_fallback_get,
+ },
+ {
.subpath = "select_policy",
.type = TEAMD_STATE_ITEM_TYPE_STRING,
.getter = lacp_state_select_policy_get,
--
2.7.4

1 change: 1 addition & 0 deletions src/libteam/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ $(addprefix $(DEST)/, $(MAIN_TARGET)): $(DEST)/% :
git apply ../0001-libteam-Add-team_get_port_enabled-function.patch
git apply ../0002-libteam-Temporarily-remove-redundant-debug-mes.patch
git apply ../0003-teamd-lacp-runner-will-send-lacp-update-right-after-.patch
git apply ../0004-libteam-Add-lacp-fallback-support-for-single-member-.patch
popd

# Obtain debian packaging
Expand Down
5 changes: 4 additions & 1 deletion src/sonic-config-engine/minigraph.py
Original file line number Diff line number Diff line change
Expand Up @@ -148,7 +148,10 @@ def parse_dpg(dpg, hname):
pcmbr_list = pcintfmbr.split(';')
for i, member in enumerate(pcmbr_list):
pcmbr_list[i] = port_alias_map.get(member, member)
pcs[pcintfname] = {'members': pcmbr_list}
if pcintf.find(str(QName(ns, "Fallback"))) != None:
pcs[pcintfname] = {'members': pcmbr_list, 'fallback': pcintf.find(str(QName(ns, "Fallback"))).text}
else:
pcs[pcintfname] = {'members': pcmbr_list}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that "fallback" feature is currently not supporting multi-member portchannels, but what would be the behavior if a user (deliberately/accidentally) enables "fallback" for more than one member? Shouldn't we write some logic in minigraph/config_db parsers to prevent those scenarios? If the current "fallback" logic were to allow more than one port to come up, I'm concerned about the implications that may have as we would be keeping two L2 "doors" wide-opened to the same server (traffic-loops, broadcast-storms, etc).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rodnymolina for the code review.

Ideally, we should have cfgmgr to check such misconfig and reject if fallback is configured on multiple member port. However, the infra is not ready yet. If we add logic in minigraph/config_db parsers to reject such case, there will be discrepancy between the config_db and the config we applied on the LAG unless the parser is able to update the config DB.

The fallback is disabled by default on all LAGs, and we will make it clear in the release note. Anyone using such feature should avoid such misconfig.

Please let me know if you think such check logic is needed. I can add it to the teamd.j2 template to disable the member port is more than 1. However the user may still manually add more member port after the LAG is created using teamd cli.
There is no way to complete block fallback on a multiple member LAG>

Thanks,
Haiyang

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand all that. In fact, you can indeed prevent users from enabling fallback on multiple members (that's what vendors do), but that requires extra logic on teamd to always determine a single "winner" port (based on mac, or port name, etc). But I'm not suggesting to do all that. What i'm saying is that given the potential consequences a bogus configuration may bring, i would suggest you to enforce some prevention logic somewhere (perhaps teamd.j2 is the right location while we wait for a proper configMgr). Personally, i don't know how a server may behave in early bootup-stages. For example, it could see both NICs as UP, and it may be tempted to send traffic through both of them. Now, what if it uses the same source IP? This may trigger mac-move/learning events across your entire broadcast-domain. In essence, this is not a problem i would like to troubleshoot in my network. I will leave the final decision to you, so i'll approve this review now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rodnymolina

As you suggested, I added some logic to teamd.j2 to only enable fallback if it is a single-member-port LAG.

+{% if PORTCHANNEL[pc]['fallback'] and ((PORTCHANNEL[pc]['members'] | length) == 1) %}
+{# Set fallback only if it has single member port #}

Thanks,
Haiyang


vlanintfs = child.find(str(QName(ns, "VlanInterfaces")))
vlan_intfs = []
Expand Down