-
Notifications
You must be signed in to change notification settings - Fork 55
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
LS1021ATSN: Could not enable to tag VLAN when using Mode 3(best_effort_vlan_filtering). #90
Comments
Can you please let me know if these 3 kernel patches solve your issue? From 2349286d74897403311c6d2cfb8e2ead6d85b364 Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Wed, 7 Apr 2021 19:01:55 +0300
Subject: [PATCH 1/3] net: dsa: sja1105: use the bridge pvid in
best_effort_vlan_filtering mode
The best-effort VLAN filtering mode is the sja1105 driver's attempt to
allow frame tagging towards the CPU with a unique VLAN ID corresponding
to the source port at the same time as allowing the bridge to freely
alter the VLAN table. It works by making the switch classify all untagged
ingress traffic to a secret pvid managed by net/dsa/tag_8021q.c.
Also, VLAN-tagged frames are retagged to another secret VLAN managed by
tag_8021q. Both these VLANs managed by tag_8021q are called "rx_vid".
The retagged rx_vid has some bits which encode a "sub-VLAN", and the
pvid-based rx_vid has those sub-VLAN bits set to zero. Software looks at
the rx_vid and knows what port and original VLAN the packet came from.
There is a huge oversight in the setup created by the sja1105 driver for
the best-effort VLAN filtering mode. That is:
ip link add br0 type bridge vlan_filtering 1
ip link set swp2 master br0
ip link set swp4 master br0
bridge vlan del dev swp4 vid 1
bridge vlan add dev swp4 vid 1 pvid
Then we send an untagged packet from swp2 and expect the switch to
forward it to swp4 and deliver it with VLAN 1 added. It is forwarded,
except the packet is egress-untagged.
This happens because of the way in which tag_8021q works (there is a
detailed picture in net/dsa/tag_8021q.c, above dsa_8021q_setup_port).
In the example above, the tag_8021q pvid of swp2 is 1026. This VLAN is
added to all other switch ports, to allow untagged traffic forwarding.
On all ports, the tag_8021q pvid of 1026 is installed as egress-untagged,
in order to hide the existence of DSA tag_8021q from the user.
This is fine, except when the real (bridge) pvid is egress-tagged, it
isn't. The user _wants_ to see this VLAN in the outside world, and we
can't really do that, because the sja1105 driver doesn't use that VLAN
but another one which the user knows nothing about.
As a side note, this only happens for untagged traffic on the ingress
port. If the packet arrives as pvid-tagged (i.e. tagged with VID 1)
on a port with tag_8021q, then the packet is classified to VLAN ID 1
(the bridge pvid) as opposed to the tag_8021q pvid. So we don't have
the same problem.
Consider the following more generic example:
Port | sw0p0 sw0p1 sw0p2 sw0p3 | sw1p0 sw1p1 sw1p2 sw1p3
=================+==========================+=============================
tag_8021q rx_vid | 1024 1025 1026 1027 | 1088 1089 1090 1091
Bridge VLAN | 1 1 2 1 | 3 2 2 1
Bridge flags | pvid pvid | pvid pvid
| untag untag |
VLAN 1024 is added to sw0p1, sw0p2, sw0p3, sw1p0, sw1p1, sw1p2, sw1p3
as untagged.
The following pattern emerges:
A VLAN which is pvid on any port in the bridging domain (therefore has a
tag_8021q rx_vid) and is egress-tagged on another (potentially the same)
port will leak the tag_8021q VLAN. Every egress-tagged bridge VLAN that
is a pvid on another port must have a retagging rule from the tag_8021q
rx_vid to the bridge VLAN.
So the data would indicate that at the very least, we should retag the
tag_8021q pvid back towards the original bridge pvid on the egress ports
where this bridge VLAN is installed as egress-tagged. We could do that,
except:
- We only have 32 VLAN retagging entries in the sja1105, and we do use
them for other purposes too.
- VLAN retagging works in hardware by making use of a special "loopback
port" which is limited to only 1Gbps of bandwidth. When using the
loopback port for traffic retagged towards the CPU that's fine because
the CPU port is gigabit anyway, but when we start involving it in the
autonomous forwarding data path we have a problem, because we'd
bottleneck it.
So we take a step back and think a bit more about the problem.
Due to the need to plug another hole - pvid-tagged traffic is not seen
with a tag_8021q rx_vid by software, but with the bridge pvid, say 1 -
sja1105_build_subvlans() already creates VLAN retagging entries towards
the CPU even for the bridge pvid, not just for tagged VLANs.
That is to say, even if we let the bridge pvid be the hardware's pvid in
best-effort VLAN filtering mode, untagged and pvid-tagged packets will
still arrive at the CPU as tagged with the tag_8021q rx_vid, because
they will both hit the same retagging rule.
But that actually means we don't _need_ the tag_8021q module to dictate
a pvid value for us. We can rely on retagging just fine, and let the
bridge dictate the pvid. This solves the problem in a much cleaner way:
because the packets in the autonomous data path are now classified to
the bridge pvid, the egress-tagged setting on the egress port works just
fine.
[ note that this means we can always rely on VLAN retagging towards the
CPU, and never on changing the port's pvid. And because the pvid is no
longer managed by tag_8021q, we can even go as far as enable
Independent VLAN Learning again. But I digress, that is an
optimization to make for net-next, this is just to fix a bug ]
The commit I'm blaming is the one which introduced the problem, but the
fix relies on a mechanism that was only added a few commits later:
3f01c91aab92 ("net: dsa: sja1105: implement VLAN retagging for dsa_8021q
sub-VLANs"). This is fine, since they all went into the same kernel
release (v5.8).
Fixes: 2cafa72e516f ("net: dsa: sja1105: add a new best_effort_vlan_filtering devlink parameter")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
drivers/net/dsa/sja1105/sja1105_main.c | 21 ++++++++++++++++++---
1 file changed, 18 insertions(+), 3 deletions(-)
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index d9c198ca0197..8b380ccd95cf 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -2240,10 +2240,10 @@ static int sja1105_commit_pvid(struct sja1105_private *priv)
struct list_head *vlan_list;
int rc = 0;
- if (priv->vlan_state == SJA1105_VLAN_FILTERING_FULL)
- vlan_list = &priv->bridge_vlans;
- else
+ if (priv->vlan_state == SJA1105_VLAN_UNAWARE)
vlan_list = &priv->dsa_8021q_vlans;
+ else
+ vlan_list = &priv->bridge_vlans;
list_for_each_entry(v, vlan_list, list) {
if (v->pvid) {
@@ -2290,6 +2290,21 @@ sja1105_build_dsa_8021q_vlans(struct sja1105_private *priv,
list_for_each_entry(v, &priv->dsa_8021q_vlans, list) {
int match = v->vid;
+ /* In best-effort VLAN filtering mode, the pvid of the port is
+ * no longer the tag_8021q rx_vid, but the bridge pvid is.
+ * The tag_8021q rx_vid is just used for retagging the bridge
+ * pvid towards the CPU. So let's install only the rx_vid
+ * values which are strictly required. This means that the
+ * rxvlan is still installed on the port on which tag_8021q
+ * thinks it must be pvid (the source port) - this is required
+ * by the retagging table - but not on the ports where this
+ * VLAN isn't a pvid (the destination ports).
+ */
+ if (priv->vlan_state == SJA1105_VLAN_BEST_EFFORT &&
+ vid_is_dsa_8021q_rxvlan(v->vid) &&
+ dsa_8021q_rx_subvlan(v->vid) == 0 && !v->pvid)
+ continue;
+
new_vlan[match].vlanid = v->vid;
new_vlan[match].vmemb_port |= BIT(v->port);
new_vlan[match].vlan_bc |= BIT(v->port);
--
2.25.1
2: From ecd4451d0d3e8b46951224dec0e3521a4fbafbe4 Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Wed, 7 Apr 2021 20:22:24 +0300
Subject: [PATCH 2/3] net: dsa: sja1105: use 4095 as the private VLAN for
untagged traffic
One thing became visible when writing the blamed commit, and that was
that STP and PTP frames injected by net/dsa/tag_sja1105.c using the
deferred xmit mechanism are always classified to the pvid of the CPU
port, regardless of whatever VLAN there might be in these packets.
So a decision needed to be taken regarding the mechanism through which
we should ensure that delivery of STP and PTP traffic is possible when
we are in a VLAN awareness mode that involves tag_8021q. This is because
tag_8021q is not concerned with managing the pvid of the CPU port, since
as far as tag_8021q is concerned, no traffic should be sent as untagged
from the CPU port. So we end up not actually having a pvid on the CPU
port if we only listen to tag_8021q, and unless we do something about it.
The decision taken at the time was to keep VLAN 1 in the list of
priv->dsa_8021q_vlans, and make it a pvid of the CPU port. This ensures
that STP and PTP frames can always be sent to the outside world.
However there is a problem. If we do the following while we are in
the best_effort_vlan_filtering=true mode:
ip link add br0 type bridge vlan_filtering 1
ip link set swp2 master br0
bridge vlan del dev swp2 vid 1
Then untagged and pvid-tagged frames should be dropped. But we observe
that they aren't, and this is because of the precaution we took that VID
1 is always installed on all ports.
So clearly VLAN 1 is not good for this purpose. What about VLAN 0?
Well, VLAN 0 is managed by the 8021q module, and that module wants to
ensure that 802.1p tagged frames are always received by a port, and are
always transmitted as VLAN-tagged (with VLAN ID 0). Whereas we want our
STP and PTP frames to be untagged if the stack sent them as untagged -
we don't want the driver to just decide out of the blue that it adds
VID 0 to some packets.
So what to do?
Well, there is one other VLAN that is reserved, and that is 4095:
$ ip link add link swp2 name swp2.4095 type vlan id 4095
Error: 8021q: Invalid VLAN id.
$ bridge vlan add dev swp2 vid 4095
Error: bridge: Vlan id is invalid.
After we made this change, VLAN 1 is indeed forwarded and/or dropped
according to the bridge VLAN table, there are no further alterations
done by the sja1105 driver.
Fixes: ec5ae61076d0 ("net: dsa: sja1105: save/restore VLANs using a delta commit method")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
drivers/net/dsa/sja1105/sja1105.h | 1 +
drivers/net/dsa/sja1105/sja1105_main.c | 21 +++++++++------------
2 files changed, 10 insertions(+), 12 deletions(-)
diff --git a/drivers/net/dsa/sja1105/sja1105.h b/drivers/net/dsa/sja1105/sja1105.h
index f9e87fb33da0..6957cb853a70 100644
--- a/drivers/net/dsa/sja1105/sja1105.h
+++ b/drivers/net/dsa/sja1105/sja1105.h
@@ -13,6 +13,7 @@
#include <linux/mutex.h>
#include "sja1105_static_config.h"
+#define SJA1105_DEFAULT_VLAN (VLAN_N_VID - 1)
#define SJA1105_NUM_PORTS 5
#define SJA1105_NUM_TC 8
#define SJA1105ET_FDB_BIN_SIZE 4
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 8b380ccd95cf..61133098f588 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -321,6 +321,13 @@ static int sja1105_init_l2_lookup_params(struct sja1105_private *priv)
return 0;
}
+/* Set up a default VLAN for untagged traffic injected from the CPU
+ * using management routes (e.g. STP, PTP) as opposed to tag_8021q.
+ * All DT-defined ports are members of this VLAN, and there are no
+ * restrictions on forwarding (since the CPU selects the destination).
+ * Frames from this VLAN will always be transmitted as untagged, and
+ * neither the bridge nor the 8021q module cannot create this VLAN ID.
+ */
static int sja1105_init_static_vlan(struct sja1105_private *priv)
{
struct sja1105_table *table;
@@ -330,17 +337,13 @@ static int sja1105_init_static_vlan(struct sja1105_private *priv)
.vmemb_port = 0,
.vlan_bc = 0,
.tag_port = 0,
- .vlanid = 1,
+ .vlanid = SJA1105_DEFAULT_VLAN,
};
struct dsa_switch *ds = priv->ds;
int port;
table = &priv->static_config.tables[BLK_IDX_VLAN_LOOKUP];
- /* The static VLAN table will only contain the initial pvid of 1.
- * All other VLANs are to be configured through dynamic entries,
- * and kept in the static configuration table as backing memory.
- */
if (table->entry_count) {
kfree(table->entries);
table->entry_count = 0;
@@ -353,9 +356,6 @@ static int sja1105_init_static_vlan(struct sja1105_private *priv)
table->entry_count = 1;
- /* VLAN 1: all DT-defined ports are members; no restrictions on
- * forwarding; always transmit as untagged.
- */
for (port = 0; port < ds->num_ports; port++) {
struct sja1105_bridge_vlan *v;
@@ -366,15 +366,12 @@ static int sja1105_init_static_vlan(struct sja1105_private *priv)
pvid.vlan_bc |= BIT(port);
pvid.tag_port &= ~BIT(port);
- /* Let traffic that don't need dsa_8021q (e.g. STP, PTP) be
- * transmitted as untagged.
- */
v = kzalloc(sizeof(*v), GFP_KERNEL);
if (!v)
return -ENOMEM;
v->port = port;
- v->vid = 1;
+ v->vid = SJA1105_DEFAULT_VLAN;
v->untagged = true;
if (dsa_is_cpu_port(ds, port))
v->pvid = true;
--
2.25.1
3: From 732fcff3ab3ecf9d473d0ade082fd9d373cf392a Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <vladimir.oltean@nxp.com>
Date: Wed, 7 Apr 2021 20:50:56 +0300
Subject: [PATCH 3/3] net: dsa: sja1105: update existing VLANs from the bridge
VLAN list
When running this sequence of operations:
ip link add br0 type bridge vlan_filtering 1
ip link set swp4 master br0
bridge vlan add dev swp4 vid 1
We observe the traffic sent on swp4 is still untagged, even though the
bridge has overwritten the existing VLAN entry:
port vlan ids
swp4 1 PVID
br0 1 PVID Egress Untagged
This happens because we didn't consider that the 'bridge vlan add'
command just overwrites VLANs like it's nothing. We treat the 'vid 1
pvid untagged' and the 'vid 1' as two separate VLANs, and the first
still has precedence when calling sja1105_build_vlan_table. Obviously
there is a disagreement regarding semantics, and we end up doing
something unexpected from the PoV of the bridge.
Let's actually consider an "existing VLAN" to be one which is on the
same port, and has the same VLAN ID, as one we already have, and update
it if it has different flags than we do.
The first blamed commit is the one introducing the bug, the second one
is the latest on top of which the bugfix still applies.
Fixes: ec5ae61076d0 ("net: dsa: sja1105: save/restore VLANs using a delta commit method")
Fixes: 5899ee367ab3 ("net: dsa: tag_8021q: add a context structure")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
---
drivers/net/dsa/sja1105/sja1105_main.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c
index 61133098f588..5e40ee14030a 100644
--- a/drivers/net/dsa/sja1105/sja1105_main.c
+++ b/drivers/net/dsa/sja1105/sja1105_main.c
@@ -2829,11 +2829,22 @@ static int sja1105_vlan_add_one(struct dsa_switch *ds, int port, u16 vid,
bool pvid = flags & BRIDGE_VLAN_INFO_PVID;
struct sja1105_bridge_vlan *v;
- list_for_each_entry(v, vlan_list, list)
- if (v->port == port && v->vid == vid &&
- v->untagged == untagged && v->pvid == pvid)
+ list_for_each_entry(v, vlan_list, list) {
+ if (v->port == port && v->vid == vid) {
/* Already added */
- return 0;
+ if (v->untagged == untagged && v->pvid == pvid)
+ /* Nothing changed */
+ return 0;
+
+ /* It's the same VLAN, but some of the flags changed
+ * and the user did not bother to delete it first.
+ * Update it and trigger sja1105_build_vlan_table.
+ */
+ v->untagged = untagged;
+ v->pvid = pvid;
+ return 1;
+ }
+ }
v = kzalloc(sizeof(*v), GFP_KERNEL);
if (!v) {
--
2.25.1
The third patch may not apply cleanly to the OpenIL kernel - it isn't critical though. Thanks. |
I will try, thanks so much. By the way, I met quite similar situation in LS1028ATSN. How could I solve that since LS1028ATSN has combined mode 2 and 3 together ? |
What do you mean exactly by "combined mode 2 and 3 together"? You mean that one switch operates in mode 2 and another in mode 3? |
Sorry for misunderstanding. I meant that I wanted to use LS1028ATSN to realize my expected behavior:
I found quite similar probelm as LS1021ATSN. I just could realize expected behavior 1 or 2, not both of them. Thanks! |
Understood. There is nothing specific to the board with this issue. I can reproduce it on the LS1028A-TSN too. |
So, how could I reproduce these three patches on LS1028A-TSN since there is no sja1105 driver on it? I couldn't find these files to change them. |
The kernel is located here: https://github.com/openil/linux
|
Hi, I don't think there is a way to solve the problem without limitations. Let me
When the switch ports are VLAN-unaware (either bridged or standalone), all
Additionally, there is one more possibility, which is to delete everything,
Apart from the limitations, there would be some advantages to the imprecise RX |
Hi, vladimiroltean, thanks a lot!
|
Hello, |
Enironment
Step to reproduce
Connect as Node 1(192.168.200.222)-->{swp4}Switch(192.168.200.23){swp2}-->Node 2(192.168.200.220)
Expected behavior:
Test Method:
Use ping IP, then observe on Wireshark v3.3.4
VLAN Config
Log
Actual Behavior
- Dose it mean that expected behavior 1 & 2 could not coexist, which means Mode 3 is not the expansion of Mode 2???
The text was updated successfully, but these errors were encountered: