-
Notifications
You must be signed in to change notification settings - Fork 15
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Report VXLAN-encaped packets to support end-host INT #275
Conversation
The author of this PR, osinstom, is not an activated member of this organization on Codecov. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would prefer to see more tests for VXLAN packets. If it's not too much to fit in this sprint, can you also add tests for:
- Basic routing of VXLAN
- Ingress and egress drop reports when using VXLAN
Pro tip: instead of creating new test classes, it might be easier to add new pkt types vxlan_udp
and vxlan_tcp
to existing tests ;)
Otherwise, it's ok to create a Jira task and do it in the next sprint.
(IP_VERSION_4, GtpuPresence.NONE): handle_ipv4; | ||
(IP_VERSION_4, GtpuPresence.GTPU_ONLY): strip_ipv4_udp_gtpu; | ||
(IP_VERSION_4, GtpuPresence.GTPU_WITH_PSC): strip_ipv4_udp_gtpu_psc; | ||
transition select(ip_ver, fabric_md.int_report_md.encap_presence) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we have a case to strip vxlan? VXLAN over Ethernet is possible when forwarding overlay packets through the spines.
p4src/include/control/int_parser.p4
Outdated
state strip_ipv4_udp_vxlan { | ||
packet.advance((IPV4_HDR_BYTES + UDP_HDR_BYTES + VXLAN_HDR_BYTES) * 8); | ||
// Skip Ethernet bytes. | ||
// The Ethernet header will be emitted as payload, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the ethernet you skip in this parser state will be emitted as a payload later.
But I think the current design might be correct.
Assume in the VxLAN is an IPv4 packet without VLAN, MPLS, or GTPU, so the packet will looks like this:
Ethernet | IPv4 | UDP | VxLAN | Ethernet | IPv4 | payload
In the parse_eth_hdr
and check_eth_type
state we extract ethernet header and store them:
IPv4 | UDP | VxLAN | Ethernet | IPv4 | payload
In this state, we first skip IPv4, UDP, and VxLAN:
Ethernet | IPv4 | payload
And we skip the ethernet header:
IPv4 | payload
And we jump to the handle_ipv4
state to get the internal IPv4 length value.
And the unparsed part will be treated as a payload and emit later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right that the inner Etnernet header will not be emitted as payload, but removed from the packet instead. However, the current design seems to be in line with how we handle GTP-encaped packets, but your comment makes me re-consider it.
For the GTP packets we assume that the inner Ethernet of the reported packet is copied from the outer Ethernet:
https://github.com/stratum/fabric-tna/blob/main/ptf/tests/ptf/fabric_test.py#L534
For the VXLAN packet I've followed the same approach:
Now, I'm wondering if this is a correct approach. I can understand why we do it for the GTP-encaped packets, because we don't have any inner Ethernet header in an inner packet. However, for the VXLAN packets we should probably keep the inner Ethernet header. What do you think @Yi-Tseng @ccascone ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the meantime, I've fixed comment for the current version.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure what's the best approach, but I believe our INT collector doesn't care about Ethernet. So I would say let's go with the simpler approach :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, the current version is the simpler one :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. @osinstom feel free to merge when ready (you now have write permissions).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM,
This PR provides modifications to fabric-tna that enable reporting VXLAN-encaped packets to support end-host INT. As some configurations of K8s uses VXLAN to encapsulate east-west traffic, we need to handle VXLAN in the fabric-tna to get the E2E visibility.