-
Notifications
You must be signed in to change notification settings - Fork 91
misc service-proxy.md updates #460
misc service-proxy.md updates #460
Conversation
doc/service-proxy.md
Outdated
a warning if you try to DNAT-but-not-SNAT a packet whose source IP | ||
- Likewise, if the proxy supports localhost NodePort connections, | ||
and a host-network process tries to connect to a NodePort service | ||
via `127.0.0.1` or `::1`, it is necessary to SNAT the packet, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the route_localnet hack, no?
this doesn't work for IPv6 no matter what kubernetes/kubernetes#90236
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah! I noticed that this contradicted the comments in the kube-proxy source code but I was thinking this doc was right and the comment was wrong.
connections for a service on a particular node versus when the load | ||
balancer _notices_ that the proxy has stopped accepting new | ||
connections. During that time, if the load balancer sends a connection | ||
to a "bad" node, we drop the packets so that the client will think |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the client will eventually timeout and receive an error, since the tcp stack retries with the same tuple is not likely the intermediate devices will reroute hte packets to other node
tcpdump -i any -n tcp port 80
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
14:28:07.117111 eth0 Out IP 192.168.8.2.58830 > 10.96.61.91.80: Flags [S], seq 3360558320, win 64240, options [mss 1460,sackOK,TS val 2346934370 ecr 0,nop,wscale 7], length 0
14:28:08.119044 eth0 Out IP 192.168.8.2.58830 > 10.96.61.91.80: Flags [S], seq 3360558320, win 64240, options [mss 1460,sackOK,TS val 2346935372 ecr 0,nop,wscale 7], length 0
14:28:10.135037 eth0 Out IP 192.168.8.2.58830 > 10.96.61.91.80: Flags [S], seq 3360558320, win 64240, options [mss 1460,sackOK,TS val 2346937388 ecr 0,nop,wscale 7], length 0
14:28:14.203047 eth0 Out IP 192.168.8.2.58830 > 10.96.61.91.80: Flags [S], seq 3360558320, win 64240, options [mss 1460,sackOK,TS val 2346941456 ecr 0,nop,wscale 7], length 0
the reject will indicate something is wrong and the client will be able to retry sooner, in both cases AFAIK the client will have a network error
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The dropping behavior has been intentional since the beginning ("KEP", implementation). It's possible that we've been doing it wrong all along, I guess?
ad76d7d
to
a86c924
Compare
usable endpoint IPs (with the destination port changed to the | ||
corresponding `.spec.port[].targetPort` value, if that is set). If a | ||
service does not have any usable endpoints, then connections to any of | ||
the above destinations should be rejected (ie, actively refused with |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should the ICMP error always be a Type 3 - Destination Unreachable packet? Is that specified somewhere?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that is what iptables does by default with -j REJECT
and we carrying over with that , maybe we should indicate that it should reject the packet and leave that open, you can also use TCP_RST
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right for UDP I think we'd have to use ICMP but for TCP that makes sense
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ICMP type is not specified anywhere, but... none of the other ICMP types really imply "reject" so it's strongly implied.
(Hm... and looking at the ICMP code to errno conversion table it looks like if we returned ICMP_PORT_UNREACH
instead of the default ICMP_HOST_UNREACH
then the caller would get back "Connection refused" instead of "No route to host"... Maybe we should start doing that, and recommend that...)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ICMP_PORT_UNREACH is correct, you connect to a service IP:port and the port is not reachable because doesn't have endpoints
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is the iptables default for -j REJECT
, isn't it?
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: astoycos, danwinship The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/cc @aojea @tssurya @astoycos