Skip to content
This repository has been archived by the owner on Jul 16, 2024. It is now read-only.

misc service-proxy.md updates #460

Merged

Conversation

danwinship
Copy link
Contributor

/cc @aojea @tssurya @astoycos

@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Feb 9, 2023
a warning if you try to DNAT-but-not-SNAT a packet whose source IP
- Likewise, if the proxy supports localhost NodePort connections,
and a host-network process tries to connect to a NodePort service
via `127.0.0.1` or `::1`, it is necessary to SNAT the packet,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is the route_localnet hack, no?

this doesn't work for IPv6 no matter what kubernetes/kubernetes#90236

Copy link
Contributor Author

@danwinship danwinship Feb 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! I noticed that this contradicted the comments in the kube-proxy source code but I was thinking this doc was right and the comment was wrong.

connections for a service on a particular node versus when the load
balancer _notices_ that the proxy has stopped accepting new
connections. During that time, if the load balancer sends a connection
to a "bad" node, we drop the packets so that the client will think
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the client will eventually timeout and receive an error, since the tcp stack retries with the same tuple is not likely the intermediate devices will reroute hte packets to other node

tcpdump -i any -n tcp port 80
tcpdump: data link type LINUX_SLL2
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
14:28:07.117111 eth0  Out IP 192.168.8.2.58830 > 10.96.61.91.80: Flags [S], seq 3360558320, win 64240, options [mss 1460,sackOK,TS val 2346934370 ecr 0,nop,wscale 7], length 0
14:28:08.119044 eth0  Out IP 192.168.8.2.58830 > 10.96.61.91.80: Flags [S], seq 3360558320, win 64240, options [mss 1460,sackOK,TS val 2346935372 ecr 0,nop,wscale 7], length 0
14:28:10.135037 eth0  Out IP 192.168.8.2.58830 > 10.96.61.91.80: Flags [S], seq 3360558320, win 64240, options [mss 1460,sackOK,TS val 2346937388 ecr 0,nop,wscale 7], length 0
14:28:14.203047 eth0  Out IP 192.168.8.2.58830 > 10.96.61.91.80: Flags [S], seq 3360558320, win 64240, options [mss 1460,sackOK,TS val 2346941456 ecr 0,nop,wscale 7], length 0

the reject will indicate something is wrong and the client will be able to retry sooner, in both cases AFAIK the client will have a network error

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The dropping behavior has been intentional since the beginning ("KEP", implementation). It's possible that we've been doing it wrong all along, I guess?

usable endpoint IPs (with the destination port changed to the
corresponding `.spec.port[].targetPort` value, if that is set). If a
service does not have any usable endpoints, then connections to any of
the above destinations should be rejected (ie, actively refused with
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the ICMP error always be a Type 3 - Destination Unreachable packet? Is that specified somewhere?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that is what iptables does by default with -j REJECT and we carrying over with that , maybe we should indicate that it should reject the packet and leave that open, you can also use TCP_RST

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right for UDP I think we'd have to use ICMP but for TCP that makes sense

Copy link
Contributor Author

@danwinship danwinship Feb 10, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ICMP type is not specified anywhere, but... none of the other ICMP types really imply "reject" so it's strongly implied.

(Hm... and looking at the ICMP code to errno conversion table it looks like if we returned ICMP_PORT_UNREACH instead of the default ICMP_HOST_UNREACH then the caller would get back "Connection refused" instead of "No route to host"... Maybe we should start doing that, and recommend that...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to this

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ICMP_PORT_UNREACH is correct, you connect to a service IP:port and the port is not reachable because doesn't have endpoints

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is the iptables default for -j REJECT , isn't it?

@astoycos
Copy link
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Feb 13, 2023
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: astoycos, danwinship

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 13, 2023
@k8s-ci-robot k8s-ci-robot merged commit 1a5b551 into kubernetes-retired:master Feb 13, 2023
@danwinship danwinship deleted the proxy-doc-updates branch February 14, 2023 15:05
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants