conntrack lookup removal in ipt_GLBREDIRECT breaks with network namespaces #111

@jstangroome

The change to ipt_GLBREDIRECT implemented in PR #67 and discussed in issue #50 breaks deployments where the listening socket is in a different network namespace from the one where the -j GLBREDIRECT iptables rule is installed.

The observed behaviour is that GUE-encapsulated TCP SYN packets are accepted but all subsequent GUE packets for the same TCP session are then forwarded to the next-hop specified in the GUE private data, instead of being accepted locally.

Taking current master (commit 5387908) and reverting just the PR #67 merge commit 5e1edd0 (i.e. git revert -m1 5e1edd0) corrects the behaviour. The behaviour is also mitigated by configuring the GLB with only a single backend, since there is then no next-hop to forward to, but this is not very useful in practice.
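For anyone wanting to try the revert, the steps above can be sketched as the following script. It only uses the two commit hashes quoted in this issue. The run helper and the DRY_RUN variable are hypothetical conveniences, not part of the GLB repository, and the --no-edit flag is added only to skip the commit message editor. By default the script prints the commands instead of executing them.

```shell
#!/bin/sh
# Dry-run helper (hypothetical): prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

# Check out the master commit named in this issue, then revert the PR #67
# merge commit. -m1 selects the first parent as the mainline to revert against.
revert_pr67() {
    run git checkout 5387908 &&
    run git revert -m1 --no-edit 5e1edd0
}

revert_pr67
```

Run it with DRY_RUN=0 inside a clone of the repository to actually apply the revert. Rebuilding and reloading the module afterwards is not covered here.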

My assumption is that the inet_lookup_established call only considers ESTABLISHED sockets in the host network namespace, and that the now-deleted conntrack lookup code is no longer there to discover the conntrack entries created when the connection was directed to another network namespace.

One example where this occurs is a Kubernetes node with the ip fou tunnel and the GLBREDIRECT iptables rule configured in the host network namespace, while an nginx-ingress controller Pod listens on TCP ports 80 and 443 inside the Pod's network namespace. Traffic is routed from the host to the Pod via DNAT iptables rules added by the Kubernetes CNI. I expect the same behaviour can be reproduced without Kubernetes, such as with a Docker container's network namespace, or even just with ip netns add, ip netns exec and appropriate NAT rules.
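A minimal, untested sketch of the ip netns variant might look like the script below. Everything specific in it is an assumption made for illustration: the namespace name glbtest, the veth interface names, the 10.200.0.0/24 addresses, the GUE port, and the choice of the INPUT chain for the GLBREDIRECT rule. The run helper and DRY_RUN variable are also hypothetical. By default the script only prints the commands, so it can be reviewed without root.

```shell
#!/bin/sh
# Untested reproduction sketch. All names and addresses below are assumptions.
NS="${NS:-glbtest}"               # hypothetical namespace standing in for the Pod
HOST_IP="${HOST_IP:-10.200.0.1}"  # hypothetical veth address on the host side
NS_IP="${NS_IP:-10.200.0.2}"      # hypothetical veth address inside the namespace
GUE_PORT="${GUE_PORT:-19523}"     # assumed GUE port; match your GLB director

# Dry-run helper (hypothetical): prints each command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "+ $*"
    else
        "$@"
    fi
}

setup_repro() {
    # Create the namespace and connect it to the host with a veth pair.
    run ip netns add "$NS"
    run ip link add veth-host type veth peer name veth-ns
    run ip link set veth-ns netns "$NS"
    run ip addr add "$HOST_IP/24" dev veth-host
    run ip link set veth-host up
    run ip netns exec "$NS" ip addr add "$NS_IP/24" dev veth-ns
    run ip netns exec "$NS" ip link set veth-ns up
    run ip netns exec "$NS" ip link set lo up
    run ip netns exec "$NS" ip route add default via "$HOST_IP"

    # Host side: forward and DNAT port 80 into the namespace, as a CNI would.
    run sysctl -w net.ipv4.ip_forward=1
    run iptables -t nat -A PREROUTING -p tcp --dport 80 \
        -j DNAT --to-destination "$NS_IP:80"

    # Host side: GUE decapsulation plus the GLBREDIRECT rule.
    run ip fou add port "$GUE_PORT" gue
    run iptables -A INPUT -p udp --dport "$GUE_PORT" -j GLBREDIRECT
}

setup_repro
```

To complete the setup, a TCP listener on port 80 would need to be started inside the namespace with ip netns exec, and the host would need to be one of at least two backends behind a GLB director. If the assumption above holds, the SYN should be accepted and later packets of the session forwarded to the next hop.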

The problem was experienced on Ubuntu 18.04.5 with kernel 5.4.0-42-generic.

I have not confirmed this, but I suspect that configuring the fou tunnel and the GLBREDIRECT iptables rule inside the Pod network namespace would also resolve the fault. However, this is less maintainable in a Kubernetes ingress controller context.

Possible options to fix ipt_GLBREDIRECT:

  • Just revert PR #67 (Remove conntrack lookups).
  • Revert PR #67 and make the conntrack lookup removal optional: either a conditional compilation option, a module parameter set at module load, or an additional iptables argument for -j GLBREDIRECT.
  • Introduce a module or iptables parameter to specify the network namespace to use for inet_lookup_established calls (not sure if this is feasible, or even friendly to use).
  • Other??
