-
Notifications
You must be signed in to change notification settings - Fork 16
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Parse IPv6 extension headers #51
Conversation
A unit test is added, that currently fails, but high-lite the fragmented IPv6-UPD problem. |
The extension packets are assumed to be in the recommended order, i.e. a fragment header is the last one, if it exist. The IPPROTO_HOPOPTS, IPPROTO_ROUTING and IPPROTO_DSTOPTS extension headers are not tested. It is non-trivial to generate these header for testing. |
/cc @aojea |
The capture-chunk is still 128 byte. I think that the fragment case is the most important, and that doesn't require more |
There may be problems also with IPv4 fragments. I think they will be handled as un-fragmented packets, but the L4-header is missing, so garbage data will be used. I will test it and rename this PR to something like "Parse IPv6 extension headers and handle fragments", if necessary. I make this PR a "draft" until things are sorted out |
FYI https://github.com/uablrek/pcap2go is used to create packet test-data |
pkg/networkpolicy/packet_test.go
Outdated
} | ||
} | ||
|
||
var packetsUdpFragIPv6 = [][]byte{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add a comment explaining how this is generated so future us can create more of these payloads
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment added in packet_test.go, and pcap files added in test-data/
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
better add it under the same folder pkg/networkpolicy/test-data/
and add a small README.md doc in the folder explaining https://github.com/uablrek/pcap2go
Probably not, because when conntrack is used (which it always is in K8s), IPv4 packets are reassembled. (BTW the answer is very informative) |
Considering the above 👆, shall we assume that no IPv4 fragments are ever delivered to user-space, or should we be prepared for the very unlikely event? |
seems reasonable, but leave a comment with the link to that great explanation to document the reason |
I added checks for packets with incomplete headers and corrupted packets (header lengths too small). But the kernel will verify this and discard those packets. If the capture length is changed 128->1280, all those checks should be redundant. @aojea Do you think I should set capture length 1280, and remove all too-small and corrupted tests? (never mind the lint-errors, I'll fix them later) |
IPPROTO_DSTOPTS may appear again after IPPROTO_FRAGMENT in the recommended order. I will fix that when I find a way to test it. |
pkg/networkpolicy/packet_test.go
Outdated
// in the nfqueue | ||
const captureChunk = 128 | ||
|
||
func TestUdpFragmentIPv6(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"UDP" not "Udp". Golang style doesn't force acronyms to "titlecase" (though they can be all-lowercase if needed).
(Likewise everywhere.)
pkg/networkpolicy/packet_test.go
Outdated
}{ | ||
{ | ||
name: "UDP first fragment", | ||
input: packetsUdpFragIPv6[0][14:(14 + captureChunk)], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
14
should not be a magic number
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed
pkg/networkpolicy/packet_test.go
Outdated
} | ||
} | ||
|
||
var packetsUdpFragIPv6 = [][]byte{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like this would make more sense as three separate variables with useful names.
also, maybe split into L2 header, L3 header, payload, with some minimal comments?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree. If packet test data is generated from pcap, it should not be post-processed. That would make it an exquisite pain to create new test data (exactly what I wanted to avoid by using a capture).
On the other hand, if test-data is manufactured by hand, there should of course be comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I saved the original pcap files in test-data/
so people can analyze it with wireshark
for instance
}, | ||
}, | ||
{ | ||
name: "PSH, ACK", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should only be receiving packets for ctstate new
connections...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, but IMHO parsePacket()
should be more-or-less generic. It should not panic if some TCP packet arrives that is not a SYN for instance. I think "test too much" here should be ok.
TL;DR; SRv6 is a horrible edge use-case! In my quest for IPv6 packets with extension headers I revisited SRv6. I tested it after viewing an excellent presentation by Daniel Bernier, Bell Canada at KubeCon in Valencia 2022. With SRv6 there is a routing header, followed by another IPv6 header, and then the TCP header: This will probably never happen, but network-policies is in the forwarding plane, so it might be possible so see something like this. As I see it we have two choices:
The second can be "reject any packet with a routing header". |
I prefer to make the code more resilient than depending on the kernel to make the right thing There are some linter errors and some job failing that I do not know if is a flake, but this should be ready for review |
I made the PR "ready for review". IPv6 fragments are handled, but there are some "exotic" cases that should also be covered, like SRv6, and repeated destopts. Perhaps extension headers that are not in the recommended order, but I rather reject those. Also unit-tests for extension headers (except fragments) are missing. But since IPv6 fragment handling is most urgent, and may be considered a bug, the remaining stuff can be added in a later kind/feature PR. I'll be happy to continue working with that. |
pkg/networkpolicy/packet.go
Outdated
} | ||
|
||
var dataOffset int | ||
havePorts := true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when is this false? preparing for icmp?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed
pkg/networkpolicy/packet.go
Outdated
} else { | ||
// Not first fragment, we have no L4 header and the | ||
// payload begins after this header | ||
l4offset = -1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is protocol = nxtHeader
correct in this case? it will be syscall.IPPROTO_FRAGMENT
l4offset = -1
. does not seem to be handled later, should this return here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is protocol = nxtHeader correct in this case? it will be syscall.IPPROTO_FRAGMENT
It's intentional, but strictly not correct since there is an L4 protocol in the nxtHeader field. But there is no L4 header! The fragment data begins immediately after the fragment header. So to identify the packet as a fragment, syscall.IPPROTO_FRAGMENT is kept in the protocol
variable. I'll make a comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
l4offset = -1. does not seem to be handled later, should this return here?
It's probably a just-in-case left-over. I'll check it.
Updated after comments. I figured that we can return an empty The alternative is to return an empty I have prepared a commit that rejects IPPROTO_ROUTING headers, and handle the remaining case for headers in the recommended order: extra IPPROTO_DSTOPTS headers after the first fragment header. But it needs rebase after these updates. I also add a |
I am still working on tests, both unit-test and real traffic tests, with IPv6 extension headers. I have not figured out an easy way, but am planning to use a tap device that adds some IPv6 extension headers. I.e. packets are routed to a tap device, a user-space program add some headers, and send the packets back. I have done it before, and got quite far. IMO tap devices, same as nfqueue, are often overlooked for feature implementations and testing. A hyped alternative is eBPF, but that is much more complicated, and should be avoided unless performance is really needed. A good example of the power of tap devices is tayga NAT64 (rock-solid since 2011!) |
I discovered that lksctp (Linux kernel SCTP) doesn't fragment on IP-level (this was new to me). The MTU of the outgoing device is examined, and fragmentation is made on SCTP level. This means that IPv6 fragments will never be used for lksctp. I don't know about other sctp implementations, but I will only test lksctp. |
pkg/networkpolicy/controller.go
Outdated
c.nfq.SetVerdict(*a.PacketID, nfqueue.NfAccept) //nolint:errcheck | ||
return 0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dan and I were discussing offline, and we concluded that:
- network policies only care about L4 protocol if you set the Ports fields
- if not Port field is specified , is legit to assume that the user/admin can consider the network policy as "any" protocol (some from sftim about users wanting to control ICMP is spot on)
As conclusion, I think we set the packet.proto to empty , then srcPort and dstPort will be 0, and IIUIC the network policy logic will be able do the right thing.
This leaves the parsePacket()
errors to short or malformed packets (module bugs) than we can say we deny by default , as we don't expect at this point to receive a packet from the kernel that is malformed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you just remove this switch case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And just check for an empty v1.Protocol? What is the best if-condition:
if string(packet.proto) == ""
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(I couldn't figure that out, hence the switch)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh, not check at all , just pass the packet as is to the next stage
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
c.evaluatePacket(packet)
will evaluate the packet only with the IPs information
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pushed
pkg/networkpolicy/packet.go
Outdated
// This "should not happen" since only { tcp, udp, sctp } are | ||
// routed to the nfqueue | ||
return t, fmt.Errorf("unknown protocol %d", protocol) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are going to route any protocol here, so we can emit verdict on icmp per example, more on this in https://github.com/kubernetes-sigs/kube-network-policies/pull/51/files#r1681040192
3c6b0ad
to
59745d1
Compare
pkg/networkpolicy/packet.go
Outdated
if payloadOffset > len(b) { | ||
// We must have a complete L4 header, however payload may be zero-lengh | ||
return t, ErrorTooShort | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this invalid? why this should be an error ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, ok, if payloadOffset is equal to len(b) is a valid packet
pkg/networkpolicy/packet.go
Outdated
// Obtain the offset of the payload | ||
// TODO allow to filter by the payload | ||
if len(b) >= hdrlen+dataOffset { | ||
t.payload = b[hdrlen+dataOffset:] | ||
if payloadOffset > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can this happen? I don't know if I prefer the old check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
dataOffset is a tcp-header thing. It should be used to compute the hdrlen for tcp-headers only. I think there was a bug in the old code for sctp and udp payloads. I looked that the hdrlen was added twice, since it was included in dataOffset (but I didn't really check carefully, since I re-wrote it)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, payloadOffset must be >0 now since we return in the default: branch. I can remove the check.
And add parsePacket unit test
/lgtm Thanks |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: aojea, uablrek The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
IPv6 extension headers must be parsed, at least fragment headers.
Fixes: #15