Bug: Gluetun refuses connection to containers passed through it (502 Gateway) #2371

Open
Firestorm7893 opened this issue Jul 27, 2024 · 9 comments
Labels
Closed: 👥 Duplicate Issue duplicates an existing issue

Comments

@Firestorm7893

Is this urgent?

No

Host OS

Fedora 40

CPU arch

x86_64

VPN service provider

ProtonVPN

What are you using to run the container

docker-compose

What is the version of Gluetun

Running version latest built on 2024-07-12T19:57:02.146Z (commit 9d50c23)

What's the problem 🤔

For some months I have been battling an issue with the pass-through Gluetun provides for the containers it protects through the VPN. After an unpredictable amount of time (could be days, could be hours), any container that publishes a port through Gluetun becomes unreachable and always returns a Bad Gateway error.

E.g. qbittorrent:54444 -> gluetun:54444 -> Bad Gateway.

Weirdly enough, Gluetun's API (the one on port 8000) is always reachable. I think the issue is with Gluetun because, when it happens, it also affects the flaresolverr container at the same time, which also uses the VPN.

Even weirder, the only fix is restarting qbittorrent and flaresolverr; restarting Gluetun doesn't fix it.

I searched GitHub for a similar issue but nothing else came up (unless I searched with the wrong terms).
Traefik is not the issue, since going through localhost directly still results in a gateway error.

Maybe I am missing something in my config?

Thanks in advance for any help. I can move this to the discussion if it's not a bug.

Share your logs (at least 10 lines)

Logs from traefik: 
2024-07-27T11:39:44Z DBG github.com/traefik/traefik/v3/pkg/server/service/proxy.go:100 > 502 Bad Gateway error="dial tcp 172.18.0.4:54444: connect: connection refused"
2024-07-27T11:39:44Z DBG github.com/traefik/traefik/v3/pkg/server/service/proxy.go:100 > 499 Client Closed Request error="context canceled"
2024-07-27T11:39:44Z DBG github.com/traefik/traefik/v3/pkg/server/service/proxy.go:100 > 502 Bad Gateway error="dial tcp 172.18.0.4:54444: connect: connection refused"

Logs from gluetun: 
========================================
========================================
=============== gluetun ================
========================================
=========== Made with ❤️ by ============
======= https://github.com/qdm12 =======
========================================
========================================
Running version latest built on 2024-07-12T19:57:02.146Z (commit 9d50c23)
🔧 Need help? https://github.com/qdm12/gluetun/discussions/new
🐛 Bug? https://github.com/qdm12/gluetun/issues/new
✨ New feature? https://github.com/qdm12/gluetun/issues/new
☕ Discussion? https://github.com/qdm12/gluetun/discussions/new
💻 Email? quentin.mcgaw@gmail.com
💰 Help me? https://www.paypal.me/qmcgaw https://github.com/sponsors/qdm12
2024-07-27T11:43:50Z WARN You are using the old environment variable OPENVPN_USER, please consider changing it to VPN_PORT_FORWARDING_USERNAME
2024-07-27T11:43:50Z WARN You are using the old environment variable OPENVPN_PASSWORD, please consider changing it to VPN_PORT_FORWARDING_PASSWORD
2024-07-27T11:43:50Z INFO [routing] default route found: interface eth0, gateway 172.18.0.1, assigned IP 172.18.0.4 and family v4
2024-07-27T11:43:50Z INFO [routing] local ethernet link found: eth0
2024-07-27T11:43:50Z INFO [routing] local ipnet found: 172.18.0.0/16
2024-07-27T11:43:50Z INFO [firewall] enabling...
2024-07-27T11:43:50Z INFO [firewall] enabled successfully
2024-07-27T11:43:51Z INFO [storage] merging by most recent 19425 hardcoded servers and 19572 servers read from /gluetun/servers.json
2024-07-27T11:43:51Z INFO [storage] Using protonvpn servers from file which are 605 days more recent
2024-07-27T11:43:51Z INFO Alpine version: 3.19.2
2024-07-27T11:43:51Z INFO OpenVPN 2.5 version: 2.5.10
2024-07-27T11:43:51Z INFO OpenVPN 2.6 version: 2.6.11
2024-07-27T11:43:51Z INFO Unbound version: 1.20.0
2024-07-27T11:43:51Z INFO IPtables version: v1.8.10
2024-07-27T11:43:51Z INFO Settings summary:
├── VPN settings:
|   ├── VPN provider settings:
|   |   ├── Name: protonvpn
|   |   ├── Server selection settings:
|   |   |   ├── VPN type: openvpn
|   |   |   ├── Countries: Germany
|   |   |   └── OpenVPN server selection settings:
|   |   |       └── Protocol: UDP
|   |   └── Automatic port forwarding settings:
|   |       ├── Redirection listening port: disabled
|   |       ├── Use port forwarding code for current provider
|   |       ├── Forwarded port file path: /tmp/gluetun/forwarded_port
|   |       └── Credentials:
|   |           ├── Username: KYKK50AdGPd8LOAu+pmp
|   |           └── Password: wo...dx
|   └── OpenVPN settings:
|       ├── OpenVPN version: 2.6
|       ├── User: [set]
|       ├── Password: wo...dx
|       ├── Network interface: tun0
|       ├── Run OpenVPN as: root
|       └── Verbosity level: 1
├── DNS settings:
|   ├── Keep existing nameserver(s): no
|   ├── DNS server address to use: 127.0.0.1
|   └── DNS over TLS settings:
|       ├── Enabled: yes
|       ├── Update period: every 24h0m0s
|       ├── Unbound settings:
|       |   ├── Authoritative servers:
|       |   |   └── cloudflare
|       |   ├── Caching: yes
|       |   ├── IPv6: no
|       |   ├── Verbosity level: 1
|       |   ├── Verbosity details level: 0
|       |   ├── Validation log level: 0
|       |   ├── System user: root
|       |   └── Allowed networks:
|       |       ├── 0.0.0.0/0
|       |       └── ::/0
|       └── DNS filtering settings:
|           ├── Block malicious: yes
|           ├── Block ads: no
|           ├── Block surveillance: no
|           └── Blocked IP networks:
|               ├── 127.0.0.1/8
|               ├── 10.0.0.0/8
|               ├── 172.16.0.0/12
|               ├── 192.168.0.0/16
|               ├── 169.254.0.0/16
|               ├── ::1/128
|               ├── fc00::/7
|               ├── fe80::/10
|               ├── ::ffff:127.0.0.1/104
|               ├── ::ffff:10.0.0.0/104
|               ├── ::ffff:169.254.0.0/112
|               ├── ::ffff:172.16.0.0/108
|               └── ::ffff:192.168.0.0/112
├── Firewall settings:
|   └── Enabled: yes
├── Log settings:
|   └── Log level: info
├── Health settings:
|   ├── Server listening address: 127.0.0.1:9999
|   ├── Target address: cloudflare.com:443
|   ├── Duration to wait after success: 5s
|   ├── Read header timeout: 100ms
|   ├── Read timeout: 500ms
|   └── VPN wait durations:
|       ├── Initial duration: 6s
|       └── Additional duration: 5s
├── Shadowsocks server settings:
|   └── Enabled: no
├── HTTP proxy settings:
|   └── Enabled: no
├── Control server settings:
|   ├── Listening address: :8000
|   └── Logging: yes
├── OS Alpine settings:
|   ├── Process UID: 1000
|   └── Process GID: 1000
├── Public IP settings:
|   ├── Fetching: every 12h0m0s
|   ├── IP file path: /tmp/gluetun/ip
|   └── Public IP data API: ipinfo
└── Version settings:
    └── Enabled: yes
2024-07-27T11:43:51Z INFO [routing] default route found: interface eth0, gateway 172.18.0.1, assigned IP 172.18.0.4 and family v4
2024-07-27T11:43:51Z INFO [routing] adding route for 0.0.0.0/0
2024-07-27T11:43:51Z INFO [firewall] setting allowed subnets...
2024-07-27T11:43:51Z INFO [routing] default route found: interface eth0, gateway 172.18.0.1, assigned IP 172.18.0.4 and family v4
2024-07-27T11:43:51Z INFO TUN device is not available: open /dev/net/tun: no such file or directory; creating it...
2024-07-27T11:43:51Z INFO [dns] using plaintext DNS at address 1.1.1.1
2024-07-27T11:43:51Z INFO [http server] http server listening on [::]:8000
2024-07-27T11:43:51Z INFO [healthcheck] listening on 127.0.0.1:9999
2024-07-27T11:43:51Z INFO [firewall] allowing VPN connection...
2024-07-27T11:43:51Z INFO [openvpn] OpenVPN 2.6.11 x86_64-alpine-linux-musl [SSL (OpenSSL)] [LZO] [LZ4] [EPOLL] [MH/PKTINFO] [AEAD]
2024-07-27T11:43:51Z INFO [openvpn] library versions: OpenSSL 3.1.6 4 Jun 2024, LZO 2.10
2024-07-27T11:43:51Z INFO [openvpn] TCP/UDP: Preserving recently used remote address: [AF_INET]185.159.156.29:1194
2024-07-27T11:43:51Z INFO [openvpn] UDPv4 link local: (not bound)
2024-07-27T11:43:51Z INFO [openvpn] UDPv4 link remote: [AF_INET]185.159.156.29:1194
2024-07-27T11:43:51Z INFO [openvpn] [node-de-24.protonvpn.net] Peer Connection Initiated with [AF_INET]185.159.156.29:1194
2024-07-27T11:43:53Z INFO [openvpn] setsockopt TCP_NODELAY=1 failed
2024-07-27T11:43:53Z INFO [openvpn] TUN/TAP device tun0 opened
2024-07-27T11:43:53Z INFO [openvpn] /sbin/ip link set dev tun0 up mtu 1500
2024-07-27T11:43:53Z INFO [openvpn] /sbin/ip link set dev tun0 up
2024-07-27T11:43:53Z INFO [openvpn] /sbin/ip addr add dev tun0 10.21.0.3/16
2024-07-27T11:43:53Z INFO [openvpn] UID set to nonrootuser
2024-07-27T11:43:53Z INFO [openvpn] Initialization Sequence Completed
2024-07-27T11:43:53Z INFO [dns] downloading DNS over TLS cryptographic files
2024-07-27T11:43:58Z INFO [healthcheck] healthy!
2024-07-27T11:44:00Z INFO [dns] downloading hostnames and IP block lists
2024-07-27T11:44:03Z INFO [dns] init module 0: validator
2024-07-27T11:44:03Z INFO [dns] init module 1: iterator
2024-07-27T11:44:04Z INFO [dns] start of service (unbound 1.20.0).
2024-07-27T11:44:04Z INFO [dns] generate keytag query _ta-4a5c-4f66-9728. NULL IN
2024-07-27T11:44:04Z INFO [dns] generate keytag query _ta-4a5c-4f66-9728. NULL IN
2024-07-27T11:44:04Z INFO [dns] ready
2024-07-27T11:44:05Z INFO [ip getter] Public IP address is 217.138.216.140 (Germany, Hesse, Frankfurt am Main)
2024-07-27T11:44:06Z INFO [vpn] You are running 4 commits behind the most recent latest
2024-07-27T11:44:06Z INFO [port forwarding] starting
2024-07-27T11:44:06Z INFO [port forwarding] gateway external IPv4 address is 217.138.216.140
2024-07-27T11:44:06Z INFO [port forwarding] port forwarded is 55140
2024-07-27T11:44:06Z INFO [firewall] setting allowed input port 55140 through interface tun0...
2024-07-27T11:44:06Z INFO [port forwarding] writing port file /tmp/gluetun/forwarded_port

Share your configuration

version: '3'

services:
  gluetun:
    image: qmcgaw/gluetun
    container_name: gluetun
    cap_add:
      - NET_ADMIN
    environment:
      - VPN_PORT_FORWARDING=on
      - VPN_SERVICE_PROVIDER=protonvpn
      - OPENVPN_USER=
      - OPENVPN_PASSWORD=
      - SERVER_COUNTRIES=Germany
    restart: always
    volumes:
      - /home/docker/network/gluetun:/gluetun
    networks:
      - newNetwork
    ports:
      - 8000:8000
      - 8191:8191
      - 54444:54444
    sysctls:
      - net.ipv6.conf.all.disable_ipv6=1
      - net.ipv4.conf.all.rp_filter=2
    labels:
      traefik.enable: true
      traefik.docker.network: newNetwork

      #services
      traefik.http.routers.qb.rule: Host(`########`)
      traefik.http.routers.qb.tls: true
      traefik.http.routers.qb.tls.certresolver: lets-encrypt
      traefik.http.routers.qb.service: qb  
      # appropriate header changes
      traefik.http.middlewares.qb-headers.headers.customrequestheaders.X-Frame-Options: SAMEORIGIN
      traefik.http.middlewares.qb-headers.headers.customrequestheaders.Referer: ''
      traefik.http.middlewares.qb-headers.headers.customrequestheaders.Origin: ''
      traefik.http.routers.qb.middlewares: qb-headers
      traefik.http.services.qb.loadbalancer.server.port: 54444
      traefik.http.services.qb.loadbalancer.passhostheader: false
     
      
  flaresolverr:
    image: ghcr.io/flaresolverr/flaresolverr:latest
    container_name: flaresolverr
    network_mode: "service:gluetun"
    environment:
      - LOG_LEVEL=${LOG_LEVEL:-info}
      - LOG_HTML=${LOG_HTML:-false}
      - CAPTCHA_SOLVER=${CAPTCHA_SOLVER:-none}
      - TZ=Europe/Rome
    restart: unless-stopped
    labels:
      - traefik.enable=false
      
  qbittorrent:
    image: lscr.io/linuxserver/qbittorrent:latest
    container_name: qbittorrent
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=Europe/Rome
      - WEBUI_PORT=54444
    network_mode: "service:gluetun"
    volumes:
      - /home/docker/services/qbittorrent/config:/config
      - /mnt/Vault/services/transmission/downloads:/downloads
      - /home/docker/services/qbittorrent/VueTorrent:/vuetorrent
    restart: unless-stopped
    labels:
      - traefik.enable=false
    depends_on:
      - gluetun

      
networks:
  newNetwork:
    external: true
Contributor

@qdm12 is more or less the only maintainer of this project and works on it in his free time.
Please:

@qdm12
Owner

qdm12 commented Jul 28, 2024

If you restart the Gluetun container only (do you have auto-updates like watchtower?), connected containers lose their connection and won't be reachable. Maybe this is what happens here?

@Firestorm7893
Author

I don't have watchtower or anything similar, but the container does restart itself after a crash. I'll check the logs to see whether the container gets restarted for any reason.

@Firestorm7893
Author

Firestorm7893 commented Jul 28, 2024

I tried looking for restarts with docker system events but nothing came up. Instead, logging just the container's events showed these two events happening every 2-3 seconds:

2024-07-28T18:23:39.501819784+02:00 container exec_start: /bin/sh -c /gluetun-entrypoint healthcheck
0aa1cb468c8def9ded0a82d1fc35f5fbfd2590b3022ecbb324eac372e29b2f1d (com.docker.compose.config-
hash=8e07dc888fbd33ef6f2a0d5f3a8e8695704ec265a37e5b629d5908942d151901, com.docker.compose.container-number=1, 
com.docker.compose.depends_on=, 
com.docker.compose.image=sha256:bd31c8cbe0ba219e7fb86d0f5e6725d774aa5a45b00a4baa397b2f5ac8de9e29, 
com.docker.compose.oneoff=False, com.docker.compose.project=internal, com.docker.compose.project.config_files=/data/
compose/43/docker-compose.yml, com.docker.compose.project.working_dir=/data/compose/43, 
com.docker.compose.replace=d6e0c60986400806053f733578debbb539676d23387109b9c9b8ed70a04abb63, 
com.docker.compose.service=gluetun, com.docker.compose.version=2.24.6, 
execID=7e541bed1fcc930360f9704a2033d71e83a9da7c088154753081d18a1f5b9103, image=qmcgaw/gluetun, name=gluetun,
 org.opencontainers.image.authors=quentin.mcgaw@gmail.com, org.opencontainers.image.created=2024-07-12T19:57:02.146Z, 
org.opencontainers.image.description=VPN client in a thin Docker container for multiple VPN providers, written in Go, and using 
OpenVPN or Wireguard, DNS over TLS, with a few proxy servers built-in., org.opencontainers.image.documentation=https://
github.com/qdm12/gluetun, org.opencontainers.image.licenses=MIT, 
org.opencontainers.image.revision=9d50c2353204a6d497b94fbfa96423c8bda5f529, 
org.opencontainers.image.source=https://github.com/qdm12/gluetun, org.opencontainers.image.title=gluetun, 
org.opencontainers.image.url=https://github.com/qdm12/gluetun, org.opencontainers.image.version=latest, traefik.docker.network=newNetwork, traefik.enable=true, 
traefik.http.middlewares.qb-headers.headers.customrequestheaders.Origin=,
 traefik.http.middlewares.qb-headers.headers.customrequestheaders.Referer=, 
traefik.http.middlewares.qb-headers.headers.customrequestheaders.X-Frame-Options=SAMEORIGIN, traefik.http.routers.qb.middlewares=qb-headers, traefik.http.routers.qb.rule=Host(`#######`),
 traefik.http.routers.qb.service=qb, traefik.http.routers.qb.tls=true, traefik.http.routers.qb.tls.certresolver=lets-encrypt, traefik.http.services.qb.loadbalancer.passhostheader=false, 
traefik.http.services.qb.loadbalancer.server.port=54444)

2024-07-28T18:23:39.600776235+02:00 container exec_die  
0aa1cb468c8def9ded0a82d1fc35f5fbfd2590b3022ecbb324eac372e29b2f1d
 (com.docker.compose.config-hash=8e07dc888fbd33ef6f2a0d5f3a8e8695704ec265a37e5b629d5908942d151901, 
com.docker.compose.container-number=1, com.docker.compose.depends_on=, com.docker.compose.image=sha256:bd31c8cbe0ba219e7fb86d0f5e6725d774aa5a45b00a4baa397b2f5ac8de9e29, 
com.docker.compose.oneoff=False, com.docker.compose.project=internal, 
com.docker.compose.project.config_files=/data/compose/43/docker-compose.yml, 
com.docker.compose.project.working_dir=/data/compose/43, com.docker.compose.replace=d6e0c60986400806053f733578debbb539676d23387109b9c9b8ed70a04abb63, 
com.docker.compose.service=gluetun, com.docker.compose.version=2.24.6, execID=7e541bed1fcc930360f9704a2033d71e83a9da7c088154753081d18a1f5b9103, exitCode=0, image=qmcgaw/gluetun, 
name=gluetun, org.opencontainers.image.authors=quentin.mcgaw@gmail.com, org.opencontainers.image.created=2024-07-12T19:57:02.146Z, 
org.opencontainers.image.description=VPN client in a thin Docker container for multiple VPN providers, written in Go, and using OpenVPN or Wireguard, DNS over TLS, with a few proxy servers built-in., 
org.opencontainers.image.documentation=https://github.com/qdm12/gluetun, org.opencontainers.image.licenses=MIT, 
org.opencontainers.image.revision=9d50c2353204a6d497b94fbfa96423c8bda5f529, 
org.opencontainers.image.source=https://github.com/qdm12/gluetun, org.opencontainers.image.title=gluetun, 
org.opencontainers.image.url=https://github.com/qdm12/gluetun, org.opencontainers.image.version=latest, 
traefik.docker.network=newNetwork, traefik.enable=true, 
traefik.http.middlewares.qb-headers.headers.customrequestheaders.Origin=, 
traefik.http.middlewares.qb-headers.headers.customrequestheaders.Referer=,
traefik.http.middlewares.qb-headers.headers.customrequestheaders.X-Frame-Options=SAMEORIGIN, 
traefik.http.routers.qb.middlewares=qb-headers, 
traefik.http.routers.qb.rule=Host(`########`), traefik.http.routers.qb.service=qb, traefik.http.routers.qb.tls=true, 
traefik.http.routers.qb.tls.certresolver=lets-encrypt, traefik.http.services.qb.loadbalancer.passhostheader=false, 
traefik.http.services.qb.loadbalancer.server.port=54444)

I have never really used the healthcheck system provided by Docker, but could it be that after some time it decides the container is unhealthy? I ask because the container exec_start: /bin/sh -c /gluetun-entrypoint healthcheck event is immediately followed by the container exec_die event.

@knaku

knaku commented Aug 6, 2024

I have also had these issues. I thought I had changed something that caused the issue, and I have tried to debug my config a bit. I have not verified it, but my impression is that the passed-through container becomes unavailable after a restart of the gluetun container. I've had some issues with the gluetun container being unhealthy; I thought that was because of an outdated servers.json, but that has been updated for some time now and the issue has persisted.

I changed the hardware and host in April, and I think the issues started in May or June, but I am not sure.

As a side note, for my part the health check might fail because of an old slow spinning disk on the host machine, which runs a VM with Gluetun and a couple of other containers - horrible IOPS.

@qdm12 qdm12 added the Closed: 👥 Duplicate Issue duplicates an existing issue label Aug 9, 2024
@Silversurfer79

Just wondering if there is a way to mark a container as dependent on another and, if there is an issue with a container, to force a restart of them in order of priority? Gluetun has frequent releases and has been updated a few times this week (no complaint, I'm glad about the development, and thanks). I always run the latest and use watchtower to check every night at 2am. This means that Gluetun gets recreated from the new image and then Transmission fails to "reconnect". Wondering how we could eliminate this? I know I could make watchtower check only once a week, but that doesn't fix Transmission's dependency on Gluetun.
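
One possible way to declare that dependency in Compose itself, as a sketch rather than a confirmed fix: `depends_on` accepts a long form with `condition: service_healthy` and, on Compose 2.17.0 or newer, `restart: true`. The service name `transmission` and its image are assumptions for illustration. Note that `restart: true` only takes effect when Compose itself restarts or recreates `gluetun`; it does not react to restarts triggered by watchtower or by the container runtime.

```yaml
# Sketch only: merge into an existing compose file.
# The transmission service name and image are assumptions, not taken from the thread.
services:
  gluetun:
    image: qmcgaw/gluetun
    cap_add:
      - NET_ADMIN
    restart: always

  transmission:
    image: lscr.io/linuxserver/transmission:latest  # assumed image
    network_mode: "service:gluetun"
    restart: unless-stopped
    depends_on:
      gluetun:
        condition: service_healthy  # start only after Gluetun's built-in healthcheck passes
        restart: true               # restart this service when Compose restarts gluetun (Compose >= 2.17.0)
```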

@epd5

epd5 commented Aug 26, 2024

To force a dependency for my P2P client I run this under my P2P client:

      healthcheck:
         test: ["CMD-SHELL", "nc -z -v localhost 9999 || exit 1"]
         interval: 30s
         timeout: 10s
         retries: 3
         start_period: 10s 

found here: #641 (comment)

The P2P client's healthcheck tests whether Gluetun's health port is open; if the check fails, the client is marked unhealthy and the willfarrell/autoheal container restarts it. Works well for me with minimal downtime.
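
For readers who have not used autoheal, a minimal sketch of the companion service might look like the following. It assumes the `AUTOHEAL_CONTAINER_LABEL=all` setting from the autoheal README, which makes it watch every container with a healthcheck; the exact options used in the setup above are not shown in the thread.

```yaml
# Sketch only: a separate service that restarts containers Docker marks as unhealthy.
services:
  autoheal:
    image: willfarrell/autoheal
    restart: always
    environment:
      # "all" watches every container; set a label name instead to opt in per container
      - AUTOHEAL_CONTAINER_LABEL=all
    volumes:
      # autoheal needs the Docker socket to inspect and restart containers
      - /var/run/docker.sock:/var/run/docker.sock
```

Mounting the Docker socket gives this container full control over the Docker daemon, so it is worth weighing that against the convenience.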

@Silversurfer79

That's very helpful, thanks. I'll give it a try and report back to the thread.

@nrgbistro

nrgbistro commented Sep 2, 2024

Tried this, it didn't work for me. When I restart my gluetun container I can no longer access my container behind gluetun. The container is still marked healthy even though it cannot be reached. I instead used ping -c 1 1.1.1.1 || exit 1 as my healthcheck command, which seems to work.
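
Written out as a compose fragment, the ping-based check could look roughly like this. The timing values are simply carried over from the earlier `nc` example and are an assumption, as is the `qbittorrent` service name; it also assumes `ping` is available inside the client image and that ICMP to 1.1.1.1 is allowed through the VPN.

```yaml
# Sketch only: fragment to merge into the service that sits behind Gluetun.
services:
  qbittorrent:
    network_mode: "service:gluetun"
    healthcheck:
      # Fails when the container can no longer reach the outside through Gluetun's network stack
      test: ["CMD-SHELL", "ping -c 1 1.1.1.1 || exit 1"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 10s
```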
