
Port Mirroring

Petr Machata edited this page Jul 16, 2020 · 20 revisions
Table of Contents
  1. Port Mirroring
  2. Basic Configuration
  3. Flow-Based Mirroring
  4. Mirroring Buffer Drops
  5. Mirroring With VLAN Encapsulation
  6. Mirroring With GRE Encapsulation
    1. Tunnel Configuration
    2. Router Configuration
    3. Resolving Soft Devices in Underlay
    4. Example Configuration
  7. Resource Limitations
  8. Functional Limitations
  9. Further Resources

Port Mirroring

Port mirroring enables the mirroring of any packet going through a physical switch port (ingress or egress) to a different switch port.

mlxsw supports the following basic modes of mirroring:

  • Direct mirroring
  • Mirroring with VLAN encapsulation
  • Mirroring with GRE encapsulation

mlxsw supports several principal triggers of mirroring:

  • A matchall mirroring, where all packets ingressing or egressing a device are mirrored.
  • A flower mirroring, where only a matching subset of packets on ingress or egress is mirrored.
  • A buffer drop mirroring, where packets dropped as a result of shared buffer pressure are mirrored.

Note: Mirroring always takes place between two front-panel ports. Mirroring between a switch port and a management port is not supported.

Features by Version

Kernel Version  Features
4.8             SPAN mirroring triggered by tc matchall
4.16            Mirroring triggered by tc flower
4.17            GRE-encapsulated mirroring with front panel port in underlay
4.18            GRE-encapsulated mirroring with bridge, VLAN or LAG in underlay
                VLAN-encapsulated mirroring
5.9             Mirroring of early-dropped packets

Basic Configuration

Configuration of packet mirrors is done through TC filters, namely by attaching filters with action mirred egress mirror. See the linked section for details of what filters are, and how to add, remove and list them.

For a quick bootstrap, the following commands configure mirroring of all packets from port swp1 to port swp2:

# tc qdisc add dev swp1 clsact
# tc filter add dev swp1 ingress               \
        matchall skip_sw                       \
        action mirred egress mirror dev swp2

The first keyword, ingress, refers to the direction of the original traffic and could also be configured as egress. The second keyword, egress, refers to the queue where the mirrored traffic is put, and must always be egress.

The skip_sw flag indicates that mirroring should only take place in the hardware. As a result, packets that go through the slow path are mirrored only once, by the hardware, and not a second time by the kernel.
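The direction keywords described above can be illustrated with a short sketch. It assumes that the clsact qdisc from the previous example is already attached to swp1 and that the commands are run as root; the port names are the same illustrative ones used above.

```shell
# Mirror packets egressing swp1 (rather than ingressing) to swp2.
# Only the first direction keyword changes; the mirred action stays "egress".
tc filter add dev swp1 egress matchall skip_sw action mirred egress mirror dev swp2

# List the filters installed in each direction, with statistics.
# A rule that was accepted by the driver is reported with the in_hw flag.
tc -s filter show dev swp1 ingress
tc -s filter show dev swp1 egress
```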

Note: Mirroring can only be configured once on any combination of source port and direction (ingress or egress):

# tc filter add dev swp1 ingress \
    matchall skip_sw action mirred egress mirror dev swp2
# tc filter add dev swp1 ingress \
    matchall skip_sw action mirred egress mirror dev swp3
RTNETLINK answers: File exists
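To point an existing mirror at a different port, the old rule therefore has to be removed first. A minimal sketch, run as root, where the preference value 10 is an arbitrary choice made only so that the rule can be deleted precisely:

```shell
# Install the mirror with an explicit preference.
tc filter add dev swp1 ingress pref 10 matchall skip_sw action mirred egress mirror dev swp2

# Re-target the mirror: delete the existing rule, then add the new one.
tc filter del dev swp1 ingress pref 10
tc filter add dev swp1 ingress pref 10 matchall skip_sw action mirred egress mirror dev swp3
```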

Flow-Based Mirroring

Starting with kernel 4.16, it is possible to configure packet mirroring as a result of a flower match. For example, packets with a given destination IP can be mirrored from swp1 to swp2 as follows:

$ tc filter add dev swp1 ingress                      \
	protocol ip flower skip_sw dst_ip 192.168.0.4 \
	action mirred egress mirror dev swp2

For further details about flow-based matching, see ACLs.
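Other flower keys can be used in the same way. The following sketch, an alternative to the rule above, mirrors only TCP traffic destined to port 80; the port number is an arbitrary value chosen for illustration, and the clsact qdisc is assumed to be attached to swp1 already.

```shell
# Mirror only TCP packets with destination port 80 that ingress swp1.
tc filter add dev swp1 ingress protocol ip flower skip_sw \
        ip_proto tcp dst_port 80 \
        action mirred egress mirror dev swp2
```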

Mirroring Buffer Drops

On Spectrum-2 and above, mlxsw can configure mirroring of packets dropped due to buffer pressure. This feature is configured through filters attached to qevent blocks.
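As a rough sketch of what such a configuration can look like, the following binds the early_drop qevent of a RED qdisc to a shared block and attaches a mirroring filter to that block. The block number 10 and the RED parameters are arbitrary illustrative values, not recommendations. The hw_stats disabled option is included on the assumption that hardware counters are not available for rules on qevent blocks.

```shell
# Attach a RED qdisc to swp1 and bind its early_drop qevent to block 10.
tc qdisc add dev swp1 root handle 1: \
        red limit 500K avpkt 1K qevent early_drop block 10

# Mirror every packet that hits the qevent block to swp2.
# Assumption: counters are disabled explicitly for qevent rules.
tc filter add block 10 matchall skip_sw \
        action mirred egress mirror dev swp2 hw_stats disabled
```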

Mirroring With VLAN Encapsulation

mlxsw supports a mode of mirroring where traffic is mirrored not as it is, but encapsulated in a VLAN header. This is usually referred to as "RSPAN". To configure VLAN-encapsulated mirroring, direct mirrored traffic to a VLAN netdevice that is above a front panel port:

$ ip link add name swp2.10 link swp2 type vlan id 10
$ ip link set dev swp2.10 up
$ tc filter add dev swp1 ingress                \
        matchall skip_sw                        \
        action mirred egress mirror dev swp2.10

When the traffic being mirrored to a VLAN netdevice is itself VLAN-tagged, the resulting mirrored traffic will be double-tagged. The outer tag is, however, still marked with the 802.1q EtherType, not 802.1ad.
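One way to check the encapsulation is to capture on the monitoring host attached to swp2. This is only a sketch: eth0 is a hypothetical interface name on that host, and VLAN 10 matches the example above.

```shell
# Print link-level headers (-e) so that the VLAN tags are visible,
# and capture only frames tagged with VLAN 10.
tcpdump -n -e -i eth0 vlan 10
```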

Mirroring With GRE Encapsulation

This mode of mirroring is often called ERSPAN. Technically, ERSPAN refers to a protocol specified in an RFC draft. mlxsw supports offloading of mirroring to a gretap or ip6gretap netdevice, which is close to ERSPAN Type I, except that it always uses GRE protocol type 0x6558 (Transparent Ethernet Bridging) instead of the 0x88BE prescribed by the spec. ERSPAN Type II and III are not supported.

In the following, however, instead of the formally correct "mirror to a gretap or ip6gretap netdevice", we will simply say ERSPAN.

The theory of operation of gretap and ip6gretap netdevices is largely similar to GRE, described on this wiki in L3-Tunneling. The important differences are:

  • The encapsulated traffic is not L3, but L2.
  • Only the encapsulation path matters; decapsulation is not offloaded.
  • Permissible tunnel configuration is stricter.
  • The underlay is not offloaded as such. Instead, the packet path is resolved by mlxsw ahead of time and only the resolution is offloaded, if possible.

Tunnel Configuration

In order to offload ERSPAN, the following needs to hold about the tunnel netdevice:

  • It shall be a gretap or ip6gretap netdevice.
  • No flags whatsoever shall be configured. That includes GRE keys, checksums and sequence numbers, none of which is supported.
  • TTL shall be set to a particular value (i.e. not "inherit").
  • TOS shall be set to inherit.
  • Destination address shall be configured to a particular value (i.e. not "any").
  • There can be a bound device (i.e. what in L3-Tunneling is referred to as hierarchical configuration).
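The rules above can be sketched as the following tunnel definitions. All names and addresses are illustrative assumptions taken from documentation address ranges, and swp9 stands in for whatever device the tunnel would be bound to.

```shell
# IPv4 underlay: fixed TTL, inherited TOS, explicit remote, no GRE flags.
ip link add name gt4 type gretap \
        local 192.0.2.129 remote 192.0.2.130 \
        ttl 100 tos inherit

# IPv6 underlay: the same constraints applied to ip6gretap.
ip link add name gt6 type ip6gretap \
        local 2001:db8:2::1 remote 2001:db8:2::2 \
        ttl 100 tos inherit

# Optional bound device (hierarchical configuration).
ip link add name gt4b type gretap \
        local 198.51.100.1 remote 198.51.100.2 \
        ttl 100 tos inherit dev swp9

# Inspect the resulting tunnel parameters.
ip -d link show dev gt4
```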

Router Configuration

To offload ERSPAN, mlxsw needs to determine the path that the encapsulated packet would take, and resolve it to a particular front panel port, through which it will egress. This imposes the following limitations on what is supported in the underlay:

  • There shall be a unicast route for the tunnel remote address. That can be either a directly-attached network, or a next-hop route. It can be an ECMP route, but in that case it should be assumed that one path is chosen arbitrarily, may be kept for extended periods of time, and may change at any time. (I.e. the expected distribution of load will not happen.) The route pins the traffic to a particular L3 egress device.

  • There shall be a valid neighbor entry for the resolved next-hop IP address on the L3 egress device. If there is none, mlxsw will kick-start neighbor discovery or ARP as needed. There is no mirroring until the neighbor is resolved.
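Both conditions can be checked from the command line. In this sketch, 192.0.2.130 is assumed to be the tunnel remote address and swp9 the L3 egress device; substitute the values of the actual setup.

```shell
# Which route and L3 egress device does the tunnel remote resolve to?
ip route get 192.0.2.130

# Is there a valid neighbor entry for the next hop on that device?
ip neigh show dev swp9

# Optionally trigger neighbor resolution by hand.
ping -c 1 192.0.2.130
```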

Resolving Soft Devices in Underlay

The L3 egress device can be a front panel port, but it can also be a software device:

  1. It can be a VLAN device stacked directly on top of a physical port. In this case, the resulting GRE traffic is additionally VLAN-encapsulated.

  2. It can be a bridge device. In that case, the resolved MAC address is looked up in the FDB and must be found; flooding is not supported. That will generally be the case, because the FDB timeout is longer than the period of neighbor discovery or ARP, and this traffic will prime or refresh the FDB. However, if the FDB expiration time is short, mirroring will be unavailable during the time that the MAC address is not learned.

    For 802.1q bridges, the bridge PVID determines the VLAN of the traffic, as is usual. The "egress untagged" setting at the resolved port determines whether the GRE traffic will be additionally VLAN-tagged.

    For 802.1d bridges, the resolved port can be directly a front panel port, or another VLAN device, which again prompts VLAN-wrapping of the GRE traffic.

  3. It can be a VLAN device on top of an 802.1q bridge. In that case the VLAN device determines the VLAN of the traffic entering the bridge, instead of the bridge PVID (again, as is usual). See case 2 above for further details.

  4. In cases 2 and 3, a LAG device can sit directly above the physical port. In that case, the traffic egresses through an arbitrary front panel port that is enslaved to that LAG device and is up. The choice of slave to forward to can change arbitrarily.

All devices along the resolution path need to be up, including the tunnel device itself.

To summarize, the following table illustrates the offloadable mirroring scenarios that involve soft devices. Each line describes the path that the packet takes, from device to device.

In the table, "gretap" stands for either a gretap or an ip6gretap netdevice, "Br.1Q" and "Br.1D" stand for a bridge with and without VLAN filtering, respectively, and "SWP" stands for a front panel port. Parenthesized names refer to optional netdevices.

Mirror to                         Egress
VLAN                              SWP
gretap   VLAN                     SWP
gretap   (VLAN)  Br.1Q   (LAG)    SWP
gretap   Br.1D   (VLAN)  (LAG)    SWP
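As an illustration of the "gretap VLAN SWP" row, the following sketch puts the underlay address on a VLAN device stacked on a front panel port, so that the GRE traffic is additionally VLAN-encapsulated. The VLAN ID 555, the port name and the addresses are assumptions chosen for the example.

```shell
# VLAN device on top of a front panel port acts as the L3 egress device.
ip link add name swp9.555 link swp9 type vlan id 555
ip link set dev swp9 up
ip link set dev swp9.555 up
ip addr add dev swp9.555 192.0.2.129/28

# The tunnel remote is reachable through swp9.555.
ip link add name gt4 type gretap \
        local 192.0.2.129 remote 192.0.2.130 \
        ttl 100 tos inherit
ip link set dev gt4 up
```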

Example Configuration

Consider the following topology.

 +---------------------+                             +---------------------+
 | H1                  |                             |                  H2 |
 |     + swp5          |                             |         swp8 +      |
 |     | 192.0.2.1/28  |                             |  192.0.2.2/28 |     |
 +-----|---------------+                             +---------------|-----+
       |                                                             |
 +-----|-------------------------------------------------------------|-----+
 | SW  o--> mirror                                                   |     |
 | +---|-------------------------------------------------------------|---+ |
 | |   + swp6                     br0                          swp7 +    | |
 | +---------------------------------------------------------------------+ |
 |                                                                         |
 |     + swp9                                       + gt4 (gretap)         |
 |     | 192.0.2.129/28                             : loc=192.0.2.129      |
 |     |                                            : rem=192.0.2.130      |
 |     |                                            : ttl=100              |
 |     |                                            : tos=inherit          |
 |     |                                            :                      |
 +-----|--------------------------------------------:----------------------+
       |                                            :
 +-----|--------------------------------------------:----------------------+
 | H3  + swp10                                      + h3-gt4 (gretap)      |
 |       192.0.2.130/28                               loc=192.0.2.130      |
 |                                                    rem=192.0.2.129      |
 |                                                    ttl=100              |
 |                                                    tos=inherit          |
 |                                                                         |
 +-------------------------------------------------------------------------+

The following snippet configures the switch part of the topology:

$ ip link add name br0 type bridge vlan_filtering 1
$ ip link set dev swp6 master br0
$ ip link set dev swp7 master br0
$ ip link set dev swp6 up
$ ip link set dev swp7 up
$ ip link set dev br0 up

$ ip link set dev swp9 up
$ ip addr add dev swp9 192.0.2.129/28

$ ip link add name gt4 type gretap           \
	local 192.0.2.129 remote 192.0.2.130 \
	ttl 100 tos inherit
$ ip link set dev gt4 up

Now set up mirroring:

$ tc qdisc add dev swp6 clsact
$ tc filter add dev swp6 ingress  \
	matchall skip_sw          \
	action mirred egress mirror dev gt4
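The receiving side on H3 is not shown above. A sketch that follows the addresses and names in the diagram could look as follows; it assumes that H3 is a regular Linux host with tcpdump installed.

```shell
# On H3: configure the underlay address and the matching tunnel endpoint.
ip link set dev swp10 up
ip addr add dev swp10 192.0.2.130/28
ip link add name h3-gt4 type gretap \
        local 192.0.2.130 remote 192.0.2.129 \
        ttl 100 tos inherit
ip link set dev h3-gt4 up

# On H3: decapsulated mirrored frames show up on the tunnel device.
tcpdump -n -e -i h3-gt4

# On SW: check that the mirroring rule is reported with the in_hw flag.
tc -s filter show dev swp6 ingress
```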

Resource Limitations

A maximum of three mirroring agents is supported on Spectrum machines. Note that the limitation is related to where and how the traffic is mirrored.

E.g. a direct mirror to swp1 counts as one agent regardless of how many mirroring rules forward traffic to swp1. However if there is additionally a mirror to a gretap netdevice, that will take another agent even if it ends up forwarding to swp1 as well, because the mode of operation differs (the traffic is encapsulated).
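The counting rule can be sketched with the following commands. They assume that a clsact qdisc is attached to each source port and that gt4 is an offloadable gretap netdevice; all names are illustrative. The final command rests on the assumption that the running kernel exposes mirroring agents as a devlink resource, and the PCI address is a placeholder.

```shell
# Two direct mirrors to swp2 share a single mirroring agent.
tc filter add dev swp1 ingress matchall skip_sw action mirred egress mirror dev swp2
tc filter add dev swp3 ingress matchall skip_sw action mirred egress mirror dev swp2

# A mirror to a gretap netdevice takes another agent, even if the
# encapsulated traffic ends up egressing through swp2 as well.
tc filter add dev swp4 ingress matchall skip_sw action mirred egress mirror dev gt4

# Assumption: agent occupancy is visible among the devlink resources.
devlink resource show pci/0000:03:00.0
```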

Functional Limitations

Mirrored packets cannot be matched by TC filters configured on the netdevice to which the packets were mirrored.

Further Resources

  1. man tc
  2. man tc-matchall
  3. QoS in Linux with TC and Filters by Phil Sutter (part of iproute documentation)