Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Port Forwarding Randomly Stops Working #69

Open
Tailslide opened this issue Mar 28, 2023 · 11 comments
Open

Port Forwarding Randomly Stops Working #69

Tailslide opened this issue Mar 28, 2023 · 11 comments

Comments

@Tailslide
Copy link

Tailslide commented Mar 28, 2023

I think this is a problem on PIA's end but it seems like after a week or two (but way before the PF is supposed to expire) the port forwarding stops working until I reset the container. I'll post a log next time I see this.

Currently trying this as a health check since I'm using qbittorrent:

        healthcheck:
            # consider it a health failure if qbittorrent port not open
            test: curl -s "http://127.0.0.1:8080/api/v2/transfer/info" | grep -q "connected"
            interval: 30s
            timeout: 10s
            retries: 3
@thrnz
Copy link
Owner

thrnz commented Mar 29, 2023

Logs of this happening would be really helpful. The container's port forwarding script sends a request to their API every 15 mins to keep the forwarded port alive, so if something goes wrong API-wise then it should be showing up in the container's logs.

If this has been happening regularly on a specific server then let me know and I'll spin up a container and see if I can catch it in action.

@Tailslide
Copy link
Author

Sure I'm running with location as:

        - LOC=swiss

Happened a few days ago, then a week before that, then two weeks before that.
Wish I had grabbed the logs before I recreated the stack.

@Maubil
Copy link

Maubil commented Apr 7, 2023

Same problem here. Here are my logs:

Fetching next-gen PIA server list
Verified OK
Verified server list
Registering public key with PIA endpoint; id: swiss, cn: zurich406, ip: 11.22.33.44
Generating /etc/wireguard/wg0.conf
Using custom DNS servers: 1.1.1.1,8.8.8.8
Port forwarding is available at this location
Successfully generated /etc/wireguard/wg0.conf
Thu Apr  6 12:59:37 UTC 2023: Bringing up WireGuard interface wg0
[#] ip link add wg0 type wireguard
[#] wg setconf wg0 /dev/fd/63
[#] ip -4 address add 10.1.2.3 dev wg0
[#] ip link set mtu 1420 up dev wg0
[#] resolvconf -a wg0 -m 0 -x
[#] wg set wg0 fwmark 51820
[#] ip -4 route add 0.0.0.0/0 dev wg0 table 51820
[#] ip -4 rule add not fwmark 51820 table 51820
[#] ip -4 rule add table main suppress_prefixlength 0
[#] iptables-restore -n

interface: wg0
  public key: <pub_key_hash>
  private key: (hidden)
  listening port: 54694
  fwmark: 0xaaaa

peer: <pub_key_hash>
  endpoint: 11.22.33.44:1337
  allowed ips: 0.0.0.0/0
  transfer: 0 B received, 148 B sent
  persistent keepalive: every 25 seconds

Thu Apr  6 12:59:37 UTC 2023: WireGuard successfully started
Thu Apr  6 12:59:37 UTC 2023: Allowing network access to 172.20.0.2/16 on eth0
Thu Apr  6 12:59:37 UTC 2023: Allowing network access to 172.24.0.3/16 on eth1
Thu Apr  6 12:59:37 UTC 2023: Firewall enabled: Blocking non-WireGuard traffic
Thu Apr  6 12:59:37 UTC 2023: Starting port forward script
Thu Apr  6 12:59:37 UTC 2023: Verifying API requests. CN: zurich406
Thu Apr  6 12:59:37 UTC 2023: Getting PF token
Thu Apr  6 12:59:43 UTC 2023: Obtained PF token. Expires at 2023-06-08T00:59:43.131317649Z
Thu Apr  6 12:59:43 UTC 2023: Server accepted PF bind
Thu Apr  6 12:59:43 UTC 2023: Forwarding on port 45263
Thu Apr  6 12:59:43 UTC 2023: Running /scripts/pf_success.sh
Thu Apr  6 12:59:43 UTC 2023: Allowing incoming traffic on port 45263
Thu Apr  6 12:59:43 UTC 2023: Rebind interval: 900 seconds
Thu Apr  6 12:59:43 UTC 2023: Port dumped to /pia-shared/port.dat
Thu Apr  6 12:59:43 UTC 2023: This script should remain running to keep the forwarded port alive
Thu Apr  6 12:59:43 UTC 2023: Press Ctrl+C to exit
curl: (28) Connection timed out after 15000 milliseconds
curl: (28) Connection timed out after 15001 milliseconds
curl: (28) Connection timed out after 15000 milliseconds
curl: (28) Connection timed out after 15001 milliseconds
curl: (28) Connection timed out after 15001 milliseconds
curl: (28) Connection timed out after 15000 milliseconds
Thu Apr  6 13:32:29 UTC 2023: bindPort error

Thu Apr  6 13:32:29 UTC 2023: Fatal error

@thrnz
Copy link
Owner

thrnz commented Apr 7, 2023

The port forwarding script does an API call every 15mins to keep the port alive. It looks like its timing out when this happens and then failing after several retries. Is the vpn connection itself still working when it happens? If the underlying connection itself is no longer working then there's not much that the script can be done to recover.

@Tailslide
Copy link
Author

Tailslide commented Apr 7, 2023

I am connected to the same server and I see the same timeouts even when everything is running successfully. For example mine has this error three times since my container started last week and port forwarding and the VPN are still working. I don't have the bindPort error (yet).

@thrnz
Copy link
Owner

thrnz commented Apr 7, 2023

The port forwarding script is set to retry each request 5 times before dying, so as long as those curl timeout messages aren't clumped together time-wise and followed by a bindPort error then it should still be working.

I think I might've managed to catch something similar happening on one of their swiss servers. When it happened it looked like the vpn connection itself had died and the wireguard endpoint had stopped responding altogether:

vpnpia-test-vpn-1  | 2023-04-06T08:44:47.348943123Z Thu Apr  6 08:44:47 UTC 2023: Rebind interval: 900 seconds
vpnpia-test-vpn-1  | 2023-04-06T08:44:47.351835483Z Thu Apr  6 08:44:47 UTC 2023: Port dumped to /pia-shared/port.dat
vpnpia-test-vpn-1  | 2023-04-06T08:44:47.354366486Z Thu Apr  6 08:44:47 UTC 2023: This script should remain running to keep the forwarded port alive
vpnpia-test-vpn-1  | 2023-04-06T08:44:47.356463019Z Thu Apr  6 08:44:47 UTC 2023: Press Ctrl+C to exit
vpnpia-test-vpn-1  | 2023-04-06T10:49:32.082462811Z curl: (28) Connection timed out after 15001 milliseconds
vpnpia-test-vpn-1  | 2023-04-06T11:57:03.682250506Z curl: (28) Connection timed out after 15001 milliseconds
vpnpia-test-vpn-1  | 2023-04-06T13:12:43.271096816Z curl: (28) Connection timed out after 15000 milliseconds
vpnpia-test-vpn-1  | 2023-04-06T13:13:13.284536222Z curl: (28) Connection timed out after 15000 milliseconds
vpnpia-test-vpn-1  | 2023-04-06T13:13:43.299309284Z curl: (28) Connection timed out after 15001 milliseconds
vpnpia-test-vpn-1  | 2023-04-06T13:14:13.300349527Z curl: (28) Connection timed out after 15000 milliseconds
vpnpia-test-vpn-1  | 2023-04-06T13:14:43.301106474Z curl: (28) Connection timed out after 15000 milliseconds
vpnpia-test-vpn-1  | 2023-04-06T13:15:13.306693351Z curl: (28) Connection timed out after 15001 milliseconds
vpnpia-test-vpn-1  | 2023-04-06T13:15:13.391668376Z Thu Apr  6 13:15:13 UTC 2023: bindPort error
vpnpia-test-vpn-1  | 2023-04-06T13:15:13.391703576Z
vpnpia-test-vpn-1  | 2023-04-06T13:15:13.391710277Z Thu Apr  6 13:15:13 UTC 2023: Fatal error
$ sudo docker compose exec vpn bash
3c35b6eda727:/scripts# ping google.com
ping: bad address 'google.com'
3c35b6eda727:scripts# wg | grep endpoint
  endpoint: 212.102.37.47:1337
3c35b6eda727:/scripts# ping  212.102.37.47
PING 212.102.37.47 (212.102.37.47): 56 data bytes
^C
--- 212.102.37.47 ping statistics ---
3 packets transmitted, 0 packets received, 100% packet loss
3c35b6eda727:/scripts# wg | grep handshake
  latest handshake: 1 hour, 42 minutes, 26 seconds ago

@thrnz
Copy link
Owner

thrnz commented Apr 7, 2023

My timestamps seems to match up with @Maubil and the container was also connected to zurich406, although with a different IP address endpoint. Perhaps some of their swiss based servers went down/were restarted or something at around the same time.

@Tailslide
Copy link
Author

Tailslide commented May 1, 2023

Ok happened again.. looks like it's getting "FATAL ERROR" in bind_port()
After this error in the logs the VPN is still connected but port forwarding is no longer working.
I guess some handling of this situation should happen like an option to exit the container.

Click me For Logs
Fetching next-gen PIA server list
Verified OK
Verified server list
Registering public key with PIA endpoint; id: swiss, cn: zurich406, ip: 156.146.62.XXX
Generating /etc/wireguard/wg0.conf
Using PIA DNS servers: 10.0.0.243,10.0.0.242
Port forwarding is available at this location
Successfully generated /etc/wireguard/wg0.conf
Sat Apr 29 09:30:18 UTC 2023: Bringing up WireGuard interface wg0
[#] ip link add wg0 type wireguard
RTNETLINK answers: Not supported
[!] Missing WireGuard kernel module. Falling back to slow userspace implementation.
[#] wireguard-go wg0
┌──────────────────────────────────────────────────────┐
│                                                      │
│   Running wireguard-go is not required because this  │
│   kernel has first class support for WireGuard. For  │
│   information on installing the kernel module,       │
│   please visit:                                      │
│         https://www.wireguard.com/install/           │
│                                                      │
└──────────────────────────────────────────────────────┘
[#] wg setconf wg0 /dev/fd/63
[#] ip -4 address add 10.16.198.21 dev wg0
[#] ip link set mtu 1420 up dev wg0
[#] resolvconf -a wg0 -m 0 -x
[#] wg set wg0 fwmark 51820
[#] ip -4 route add 0.0.0.0/0 dev wg0 table 51820
[#] ip -4 rule add not fwmark 51820 table 51820
[#] ip -4 rule add table main suppress_prefixlength 0
[#] iptables-restore -n
iptables-restore v1.8.8 (legacy): iptables-restore: unable to initialize table 'raw'
Error occurred at line: 1
Try `iptables-restore -h' or 'iptables-restore --help' for more information.
interface: wg0
  public key: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=
  private key: (hidden)
  listening port: 45523
  fwmark: 0xca6c
peer: XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX=
  endpoint: 156.146.62.XXX:1337
  allowed ips: 0.0.0.0/0
Sat Apr 29 09:30:18 UTC 2023: WireGuard successfully started
Sat Apr 29 09:30:18 UTC 2023: Allowing network access to 172.23.0.4/16 on eth0
Sat Apr 29 09:30:18 UTC 2023: Firewall enabled: Blocking non-WireGuard traffic
Sat Apr 29 09:30:18 UTC 2023: Starting port forward script
Sat Apr 29 09:30:18 UTC 2023: Verifying API requests. CN: zurich406
Sat Apr 29 09:30:18 UTC 2023: Getting PF token
Sat Apr 29 09:30:19 UTC 2023: Obtained PF token. Expires at 2023-06-30T21:30:18.974533474Z
Sat Apr 29 09:30:19 UTC 2023: Server accepted PF bind
Sat Apr 29 09:30:19 UTC 2023: Forwarding on port 24558
Sat Apr 29 09:30:19 UTC 2023: Running /scripts/pf_success.sh
Sat Apr 29 09:30:19 UTC 2023: Allowing incoming traffic on port 24558
Sat Apr 29 09:30:19 UTC 2023: Rebind interval: 900 seconds
Sat Apr 29 09:30:19 UTC 2023: Running user-defined script: /pia/port_script.sh
Sat Apr 29 09:30:19 UTC 2023: Port dumped to /pia-shared/port.dat
Sat Apr 29 09:30:19 UTC 2023: This script should remain running to keep the forwarded port alive
Setting qBittorrent port settings (24558)...
Sat Apr 29 09:30:19 UTC 2023: Press Ctrl+C to exit
Ok.Qbittorrent port updated successfully (24558)...
Sun Apr 30 12:04:00 UTC 2023: bindPort error
Sun Apr 30 12:04:00 UTC 2023: Curl error/s:
curl: (28) Connection timed out after 15000 milliseconds
curl: (28) Connection timed out after 15000 milliseconds
curl: (28) Connection timed out after 15001 milliseconds
curl: (28) Connection timed out after 15001 milliseconds
curl: (28) Connection timed out after 15001 milliseconds
curl: (28) Connection timed out after 15001 milliseconds
Sun Apr 30 12:04:00 UTC 2023: API response:
Sun Apr 30 12:04:00 UTC 2023: Fatal error

@thrnz
Copy link
Owner

thrnz commented May 2, 2023

If the connection itself is still working, then I wonder if it might be worth tweaking the curl timeout/retry options to be a bit more lenient before giving up in case its just a temporary issue with the port forwarding api. I'm not sure if restarting the port forwarding script would help, as the (unresponsive) port forwarding api endpoint would remain the same.

Having the container optionally exit if port forwarding fails could probably then be done as a final resort.

@Tailslide
Copy link
Author

Ok as an experiment I edited pf.sh and changed the retry to 3000 will see what happens.

@thrnz
Copy link
Owner

thrnz commented May 2, 2023

Temporary curl errors aren't shown in the log anymore - the odd timeout seems to be common enough and is generally fixed on the first retry. Any curl errors are only now printed once its given up altogether, so with 3000 retires it might take a while before anything shows up in the log.

I've bumped up the normal timeouts/retries a bit anyway. I figure the only downside would be that it would take longer before showing an error when something goes wrong. The upside being that its given more of a chance to recover before giving up.

I think it should've gone from about 3mins worth of retries to about 10mins now before giving up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants