Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[libteam] Add fallback support for single-member-port LAG #1118

Merged
merged 6 commits into from
Dec 16, 2017
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 4 additions & 0 deletions dockers/docker-teamd/teamd.j2
Original file line number Diff line number Diff line change
Expand Up @@ -4,8 +4,12 @@
"runner": {
"name": "lacp",
"active": true,
{% if PORTCHANNEL[pc]['fallback'] and ((PORTCHANNEL[pc]['members'] | length) == 1) %}
"fallback": {{ PORTCHANNEL[pc]['fallback'] }},
{% else %}
{# Use 75% links upperbound as min-links #}
"min_ports": {{ (PORTCHANNEL[pc]['members'] | length * 0.75) | round(0, 'ceil') | int }},
{% endif %}
"tx_hash": ["eth", "ipv4", "ipv6"]
},
"link_watch": {
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,104 @@
From ec966f9a0229bd7226e3abe15b56659b36af9d66 Mon Sep 17 00:00:00 2001
From: Haiyang Zheng <haiyang.z@alibaba-inc.com>
Date: Fri, 15 Dec 2017 21:07:53 -0800
Subject: [patch libteam] [libteam] Add fallback support for single-member-port
LAG

Signed-off-by: Haiyang Zheng <haiyang.z@alibaba-inc.com>
---
teamd/teamd_runner_lacp.c | 42 ++++++++++++++++++++++++++++++++++++++++--
1 file changed, 40 insertions(+), 2 deletions(-)

diff --git a/teamd/teamd_runner_lacp.c b/teamd/teamd_runner_lacp.c
index 9c77fae..a3646a6 100644
--- a/teamd/teamd_runner_lacp.c
+++ b/teamd/teamd_runner_lacp.c
@@ -138,6 +138,8 @@ struct lacp {
#define LACP_CFG_DFLT_SYS_PRIO 0xffff
bool fast_rate;
#define LACP_CFG_DFLT_FAST_RATE false
+ bool fallback;
+#define LACP_CFG_DFLT_FALLBACK false
int min_ports;
#define LACP_CFG_DFLT_MIN_PORTS 1
enum lacp_agg_select_policy agg_select_policy;
@@ -272,6 +274,11 @@ static int lacp_load_config(struct teamd_context *ctx, struct lacp *lacp)
lacp->cfg.fast_rate = LACP_CFG_DFLT_FAST_RATE;
teamd_log_dbg("Using fast_rate \"%d\".", lacp->cfg.fast_rate);

+ err = teamd_config_bool_get(ctx, &lacp->cfg.fallback, "$.runner.fallback");
+ if (err)
+ lacp->cfg.fallback = LACP_CFG_DFLT_FALLBACK;
+ teamd_log_dbg("Using fallback \"%d\".", lacp->cfg.fallback);
+
err = teamd_config_int_get(ctx, &tmp, "$.runner.min_ports");
if (err) {
lacp->cfg.min_ports = LACP_CFG_DFLT_MIN_PORTS;
@@ -308,9 +315,24 @@ static bool lacp_port_loopback_free(struct lacp_port *lacp_port)
return true;
}

+/*
+ * is_lacp_fallback_eligible - is lacp_port eligible to go into lacp fallback mode
+ *
+ * Return true if it is, false otherwise
+ */
+static bool is_lacp_fallback_eligible(struct lacp_port *lacp_port)
+{
+ teamd_log_dbg("%s fallback eligible state \"%d \" cfg \"%d\".",
+ lacp_port->tdport->ifname, lacp_port->state,
+ lacp_port->lacp->cfg.fallback);
+ return lacp_port->state == PORT_STATE_DEFAULTED &&
+ lacp_port->lacp->cfg.fallback;
+}
+
static bool lacp_port_selectable_state(struct lacp_port *lacp_port)
{
- if (lacp_port->state == PORT_STATE_CURRENT)
+ if (lacp_port->state == PORT_STATE_CURRENT ||
+ is_lacp_fallback_eligible(lacp_port))
return true;
return false;
}
@@ -318,7 +340,8 @@ static bool lacp_port_selectable_state(struct lacp_port *lacp_port)
static bool lacp_port_unselectable_state(struct lacp_port *lacp_port)
{
if (lacp_port->state == PORT_STATE_CURRENT ||
- lacp_port->state == PORT_STATE_EXPIRED)
+ lacp_port->state == PORT_STATE_EXPIRED ||
+ is_lacp_fallback_eligible(lacp_port))
return false;
return true;
}
@@ -1452,6 +1475,16 @@ static int lacp_state_fast_rate_get(struct teamd_context *ctx,
return 0;
}

+static int lacp_state_fallback_get(struct teamd_context *ctx,
+ struct team_state_gsc *gsc,
+ void *priv)
+{
+ struct lacp *lacp = priv;
+
+ gsc->data.bool_val = lacp->cfg.fallback;
+ return 0;
+}
+
static int lacp_state_select_policy_get(struct teamd_context *ctx,
struct team_state_gsc *gsc,
void *priv)
@@ -1479,6 +1512,11 @@ static const struct teamd_state_val lacp_state_vals[] = {
.getter = lacp_state_fast_rate_get,
},
{
+ .subpath = "fallback",
+ .type = TEAMD_STATE_ITEM_TYPE_BOOL,
+ .getter = lacp_state_fallback_get,
+ },
+ {
.subpath = "select_policy",
.type = TEAMD_STATE_ITEM_TYPE_STRING,
.getter = lacp_state_select_policy_get,
--
2.7.4

1 change: 1 addition & 0 deletions src/libteam/Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,7 @@ $(addprefix $(DEST)/, $(MAIN_TARGET)): $(DEST)/% :
git apply ../0001-libteam-Add-team_get_port_enabled-function.patch
git apply ../0002-libteam-Temporarily-remove-redundant-debug-mes.patch
git apply ../0003-teamd-lacp-runner-will-send-lacp-update-right-after-.patch
git apply ../0004-libteam-Add-lacp-fallback-support-for-single-member-.patch
popd

# Obtain debian packaging
Expand Down
5 changes: 4 additions & 1 deletion src/sonic-config-engine/minigraph.py
Original file line number Diff line number Diff line change
Expand Up @@ -148,7 +148,10 @@ def parse_dpg(dpg, hname):
pcmbr_list = pcintfmbr.split(';')
for i, member in enumerate(pcmbr_list):
pcmbr_list[i] = port_alias_map.get(member, member)
pcs[pcintfname] = {'members': pcmbr_list}
if pcintf.find(str(QName(ns, "Fallback"))) != None:
pcs[pcintfname] = {'members': pcmbr_list, 'fallback': pcintf.find(str(QName(ns, "Fallback"))).text}
else:
pcs[pcintfname] = {'members': pcmbr_list}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand that "fallback" feature is currently not supporting multi-member portchannels, but what would be the behavior if a user (deliberately/accidentally) enables "fallback" for more than one member? Shouldn't we write some logic in minigraph/config_db parsers to prevent those scenarios? If the current "fallback" logic were to allow more than one port to come up, I'm concerned about the implications that may have as we would be keeping two L2 "doors" wide-opened to the same server (traffic-loops, broadcast-storms, etc).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rodnymolina for the code review.

Ideally, we should have cfgmgr to check such misconfig and reject if fallback is configured on multiple member port. However, the infra is not ready yet. If we add logic in minigraph/config_db parsers to reject such case, there will be discrepancy between the config_db and the config we applied on the LAG unless the parser is able to update the config DB.

The fallback is disabled by default on all LAGs, and we will make it clear in the release note. Anyone using such feature should avoid such misconfig.

Please let me know if you think such check logic is needed. I can add it to the teamd.j2 template to disable the member port is more than 1. However the user may still manually add more member port after the LAG is created using teamd cli.
There is no way to complete block fallback on a multiple member LAG>

Thanks,
Haiyang

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand all that. In fact, you can indeed prevent users from enabling fallback on multiple members (that's what vendors do), but that requires extra logic on teamd to always determine a single "winner" port (based on mac, or port name, etc). But I'm not suggesting to do all that. What i'm saying is that given the potential consequences a bogus configuration may bring, i would suggest you to enforce some prevention logic somewhere (perhaps teamd.j2 is the right location while we wait for a proper configMgr). Personally, i don't know how a server may behave in early bootup-stages. For example, it could see both NICs as UP, and it may be tempted to send traffic through both of them. Now, what if it uses the same source IP? This may trigger mac-move/learning events across your entire broadcast-domain. In essence, this is not a problem i would like to troubleshoot in my network. I will leave the final decision to you, so i'll approve this review now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @rodnymolina

As you suggested, I added some logic to teamd.j2 to only enable fallback if it is a single-member-port LAG.

+{% if PORTCHANNEL[pc]['fallback'] and ((PORTCHANNEL[pc]['members'] | length) == 1) %}
+{# Set fallback only if it has single member port #}

Thanks,
Haiyang


vlanintfs = child.find(str(QName(ns, "VlanInterfaces")))
vlan_intfs = []
Expand Down