
Bug: Issue restarting containers using network in other stack #11

Open
EzekialSA opened this issue Oct 20, 2021 · 26 comments

@EzekialSA

EzekialSA commented Oct 20, 2021

I'm trying to configure everything so that updates and availability are automated using Watchtower and deunhealth. I was testing what would happen if gluetun got an update (as you know, a restart breaks the containers connected to it). I get the following errors when stopping/restarting gluetun:

2021/10/20 12:17:18 INFO container qbittorrent (image ghcr.io/linuxserver/qbittorrent:latest) is unhealthy, restarting it...
2021/10/20 12:17:21 ERROR failed restarting container: Error response from daemon: Cannot restart container qbittorrent: No such container: 5bc959037ff8fceeca8dfae013347f64162fa759189421d224f07a31810f3aaf

I believe the hash in that error refers to the old gluetun container, so once gluetun is recreated the reference disappears and deunhealth doesn't know how to handle it.
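For what it's worth, here is roughly how that can be checked (a sketch using the Python Docker SDK, not anything deunhealth ships; it assumes the container names from the compose files below and that the daemon stored the container ID in HostConfig.NetworkMode, which the error above suggests):

```python
import docker  # pip install docker

client = docker.from_env()

# The "container:<id>" reference the daemon recorded when qbittorrent was created.
mode = client.containers.get("qbittorrent").attrs["HostConfig"]["NetworkMode"]
joined_id = mode.split(":", 1)[1]

# The ID of the gluetun container that exists right now.
current_id = client.containers.get("gluetun").id

# After gluetun is recreated these no longer match, which is why a plain
# restart of qbittorrent fails with "No such container: <old id>".
print("stale reference" if joined_id != current_id else "ok", joined_id, current_id)
```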

Not sure if it's worth noting, but I am using Portainer for stack management. Here are the config files for what I'm trying to do:

version: "2.1"
services:
  qbittorrent:
    image: ghcr.io/linuxserver/qbittorrent:latest
    container_name: qbittorrent
    labels:
      - com.centurylinklabs.watchtower.scope=WEEKDAYS
      - deunhealth.restart.on.unhealthy=true
    environment:
      - PUID=1000
      - PGID=1000
      - TZ=Europe/London
      - WEBUI_PORT=8095
      - UMASK=002
    healthcheck:
      test: "curl -sf -o /dev/null example.com || exit 1"
      interval: 1m
      timeout: 10s
      retries: 2
    restart: unless-stopped
    network_mode: "container:gluetun"
---
version: "3"
services:
  gluetun:
    image: qmcgaw/gluetun
    container_name: gluetun
    labels:
      - com.centurylinklabs.watchtower.scope=WEEKDAYS
      - deunhealth.restart.on.unhealthy=true
    cap_add:
      - NET_ADMIN
    ports:
      - 8888:8888/tcp # HTTP proxy
      - 8388:8388/tcp # Shadowsocks
      - 8388:8388/udp # Shadowsocks
      - 6881:6881/tcp
      - 6881:6881/udp
      - 8095:8095/tcp
    volumes:
      - /yes/config/gluetun:/gluetun
    environment:
      - VPNSP=nordvpn
      - REGION=United States
      - UPDATE_PERIOD=24h
    restart: unless-stopped
---
version: "3.7"
services:
  deunhealth:
    image: qmcgaw/deunhealth
    container_name: deunhealth
    labels:
      - com.centurylinklabs.watchtower.scope=WEEKDAYS
      - deunhealth.restart.on.unhealthy=true
    network_mode: "none"
    environment:
      - LOG_LEVEL=info
      - HEALTH_SERVER_ADDRESS=127.0.0.1:9999
      - TZ=America/New_York
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
---
version: "3"
services:
  watchtower:
    image: containrrr/watchtower
    container_name: watchtower
    labels:
      - com.centurylinklabs.watchtower.scope=WEEKDAYS
      - deunhealth.restart.on.unhealthy=true
    environment:
      - WATCHTOWER_INCLUDE_RESTARTING=true
      - WATCHTOWER_CLEANUP=true
      - WATCHTOWER_REVIVE_STOPPED=true
      - WATCHTOWER_ROLLING_RESTART=true
      - TZ=America/New_York
    command: --schedule "0 0 5 * * 1-5" --scope WEEKDAYS
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
      - /etc/docker/daemon.json:/config.json
    restart: always
@kubax

kubax commented Oct 20, 2021

I second that... that's exactly my problem.

I have disabled updates for gluetun to stop my containers from dangling without network.

If that is fixable, I would be very glad!

@qdm12
Owner

qdm12 commented Oct 20, 2021

That's really strange. So the container can no longer be found with its container ID?! I'll do some more testing.

Meanwhile, I'm almost done with a cascaded restart feature, which should restart containers labeled for it whenever a given container (like gluetun) starts.

@qdm12
Owner

qdm12 commented Oct 20, 2021

Ah got it. It's because the container ID it was relying on (gluetun) disappeared. Ugh, that's also going to be problematic for my cascaded restart feature... I think the (connected) container config needs to be patched somehow, before being restarted 🤔

@qdm12
Owner

qdm12 commented Oct 20, 2021

Ok, so after some research... there is no way to know what the 'vpn' container was, since we only have its ID and it no longer exists (the name is not accessible). I guess deunhealth could stop the connected container, but it wouldn't be able to start it again, so that's a bit pointless, sadly.

As for my cascaded restart feature, the idea is that you would put a label on the 'connected' containers indicating the container name of the 'vpn' container. That way, this becomes feasible. Writing out how it should work (also for myself):

  1. Stream events and monitor every container starting
  2. For every start event (e.g. vpn starting), get all containers labeled with the name of the container starting
  3. For each container found:
    • If it is NOT a connected container, just restart it
    • If it is a connected container:
      1. Inspect it and get its entire configuration
      2. Extract the expired ID from this config
      3. Use the container ID from the container starting and replace the expired ID with it in the config
      4. Stop the container
      5. Start a new container using the patched config

I have bits and pieces of it ready; I just need to wire everything up and try it out, but it should work fine.
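A rough sketch of those steps, just to make the idea concrete (Python Docker SDK for brevity rather than the Go implementation, and the `deunhealth.restart.on.network.container` label name is only a placeholder for whatever ends up being used):

```python
import docker  # pip install docker

client = docker.from_env()

# Placeholder label: a connected container would point at the name of its 'vpn' container.
LABEL = "deunhealth.restart.on.network.container"

def on_started(vpn):
    """Handle a 'start' event for a container such as gluetun."""
    for ctr in client.containers.list(all=True, filters={"label": f"{LABEL}={vpn.name}"}):
        mode = ctr.attrs["HostConfig"]["NetworkMode"]
        if not mode.startswith("container:"):
            ctr.restart()  # not container-connected: a plain restart is enough
            continue
        # Container-connected: its HostConfig still points at the expired ID,
        # so it has to be recreated rather than restarted.
        cfg = ctr.attrs["Config"]
        name = ctr.name
        ctr.stop()
        ctr.remove()
        client.containers.run(
            cfg["Image"],
            name=name,
            detach=True,
            environment=cfg.get("Env"),
            labels=cfg.get("Labels"),
            # The patched part: point at the new VPN container ID.
            network_mode=f"container:{vpn.id}",
        )
        # A real implementation would copy the rest of the config too
        # (volumes, healthcheck, restart policy, ...), not just these fields.
```

The key point is step 5 above: a container-connected container cannot simply be restarted, it has to be recreated with the new ID patched into its network mode.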

@qdm12
Owner

qdm12 commented Oct 24, 2021

So... the previous suggestion, let's call it A, won't work if a VPN container was already shut down or restarted, leaving existing containers disconnected, before deunhealth even started. The only solution I can think of that handles this, call it B, is to use labels for both the VPN container and the connected containers and not rely on container names: for example, a unique label ID on the 'vpn' container, reused on all of its connected containers.

I also came up with another solution, let's call it C, which is more complex to implement and relies only on container names (no labels), although it has the same problem mentioned above. Here's how it would work (notes to myself as well):

  1. When deunhealth starts, gather all containers that are connected to another container, extract each 'vpn' container ID, and find the corresponding container name for each of these IDs (assuming the VPN container is not gone yet)
  2. Stream events and monitor every start event:
    • Check if the started container is container-connected. If it is, extract the 'vpn' container ID ➡️ get its name and record it in an id<->name mapping
    • Check if the started container's name is one of the VPN names from our id<->name mapping. If it is, find all the now-disconnected containers still using the old ID (via our mapping), patch their configurations with the new ID, and stop & start them. Update the id<->name mapping.
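Sketching the state C would need (again just an illustration with the Python Docker SDK, not actual deunhealth code):

```python
import docker
from docker import errors

client = docker.from_env()

# Step 1: build the id -> name mapping for every 'vpn' container currently referenced.
vpn_names = {}
for ctr in client.containers.list(all=True):
    mode = ctr.attrs["HostConfig"]["NetworkMode"]
    if mode.startswith("container:"):
        vpn_id = mode.split(":", 1)[1]
        try:
            vpn_names[vpn_id] = client.containers.get(vpn_id).name
        except errors.NotFound:
            pass  # the VPN container is already gone: its name cannot be recovered

# Step 2: watch start events; either record a new id<->name pair, or, if the
# started container's name is a known VPN name, patch and recreate the
# containers still pointing at the old ID (as in the earlier sketch).
for event in client.events(decode=True, filters={"type": "container", "event": "start"}):
    started_name = event["Actor"]["Attributes"]["name"]
    started_id = event["id"]
    # ... update vpn_names and recreate stale containers here ...
```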

Solutions comparison

| Solution | Works on previously disconnected containers at start | Works without label for VPN container | Works without labels for VPN-connected containers | Does not need state |
|----------|------------------------------------------------------|----------------------------------------|----------------------------------------------------|---------------------|
| A        |                                                      | ✔️                                     |                                                    | ✔️                  |
| B        | ✔️                                                   |                                        |                                                    | ✔️                  |
| C        |                                                      | ✔️                                     | ✔️                                                 |                     |

Now, which solution do you prefer? 😄

I'm leaning towards B to have something that works, although it requires more user fiddling.
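To give a feel for B: the label names below are made up (whatever gets implemented may well differ), e.g. something like `deunhealth.network.id=vpn1` on gluetun and `deunhealth.network.connected.id=vpn1` on qbittorrent, and the lookup then boils down to:

```python
import docker

client = docker.from_env()

# Placeholder label names for solution B.
VPN_LABEL = "deunhealth.network.id"                   # set on the VPN container, e.g. =vpn1
CONNECTED_LABEL = "deunhealth.network.connected.id"   # set on each connected container, e.g. =vpn1

def connected_containers(vpn_container):
    """Find the containers tied to a VPN container via the shared label value."""
    network_id = vpn_container.labels.get(VPN_LABEL)
    if network_id is None:
        return []
    return client.containers.list(
        all=True, filters={"label": f"{CONNECTED_LABEL}={network_id}"}
    )
```

Because the match is on a label value rather than a container ID or name, it still works even if deunhealth starts after the VPN container was already recreated.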

@EzekialSA
Author

Personally I lean towards B as well. It involves more up-front config with labels, but it makes the link between containers explicit, forcing the user to declare it.

Solution A, automatically monitoring and tracking container information, isn't a terrific solution to me.

Solution C, relying on container names and tracked state, seems like too much effort, and could cause issues if someone has multiple stacks with overlapping container names across a cluster... bad practice, but it could cause a headache for someone down the line.

@kubax

kubax commented Oct 24, 2021

I pick B. I was elected to lead, not to read! (SCNR)

Labels would be perfectly fine for me.

It also sounds like a little less work on your side, with the labels implementation.

@oester

oester commented Nov 11, 2021

Another vote for option B.

@lennvilardi

+1 for option B. Do you know when it will be released?

@nlynzaad

+1 for option B

@qdm12
Owner

qdm12 commented Nov 28, 2021

I'm working on it right now! Hopefully we will have something today 😉

EDIT (2021-12-06): still working on it; it's a bit more convoluted than I expected, code-spaghetti-wise, but it's getting there!

@qdm12
Owner

qdm12 commented Nov 29, 2021

Note that if the 'network container' (aka the VPN) goes down and doesn't restart, there is no way to properly restart the connected containers, since the label won't be available anywhere, unfortunately. I will make the program log a warning if this happens.

@kubax

kubax commented Nov 29, 2021

I'm not sure if I got this right.

You are not able to restart the "child" containers if the VPN container killed itself and did not restart, right?

But if the container is updated and restarts without errors, that case can still be fixed with the intended patch?

@lennvilardi

In my case I just need the containers attached to the network container to be recreated when it is recreated by Watchtower. The network container is always up and running, but the other containers are orphaned and cannot be restarted.

@lennvilardi

Any ETA?

@ahmaddxb

Has this been implemented yet?

@sunbeam60

A little late to the party here, but definitely also prefer option B and I'm very excited about this feature.

(yes, my gluetun container got updated by watchtower last night and now the whole stack is down 😄 )

@qdm12
Owner

qdm12 commented May 1, 2022

Hello all, good news: I'm working on this again. Sorry for the immense delay in getting back to it.
I have some 'new' uncommitted code (from about 6 months ago, lol) that looks promising; I'm hoping for a solution B implementation soon! 👍

@Manfred73

Should this already be working in a current version combined with deunhealth?
I'm still using an older image of gluetun (v3.28.2), so it doesn't get automatically updated by watchtower.
When it does get updated, connectivity to the apps using gluetun is lost (#34).
Or should I keep updating gluetun manually for now?

@MajorLOL

Any update? :)

@STRAYKR

STRAYKR commented Aug 4, 2023

I guess Quentin hasn't had time to implement the deunhealth.restart.on.unhealthy=true label yet, or else it's a more difficult task than initially thought? It doesn't work for me yet.

The deunhealth log states 0 containers are monitored, despite several containers being tagged with deunhealth.restart.on.unhealthy=true:

2023/08/04 10:44:19 INFO Monitoring 0 containers to restart when becoming unhealthy

I turn my mini-PC media server off every evening, so I've been able to use a shell script that does a docker compose down && docker compose up -d two minutes after the server first boots up (Quentin recommends running something similar as a workaround). This fixes my stack... at least for some hours. Sometimes something breaks, and if that happens I just power it off and on again! Looking forward to a more robust solution :-)
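In case it helps anyone hitting the same "Monitoring 0 containers" message, a quick way to double-check that the label actually made it onto the containers (a small Python Docker SDK snippet, nothing deunhealth-specific):

```python
import docker  # pip install docker

client = docker.from_env()

# Containers that actually carry the label deunhealth looks for.
labelled = client.containers.list(
    all=True, filters={"label": "deunhealth.restart.on.unhealthy=true"}
)
print([c.name for c in labelled])  # empty output would explain "Monitoring 0 containers"
```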

@NaturallyAsh

@STRAYKR Is your deun container in the same yml as gluetun? That was my issue. Logs showed "Monitoring 0 containers" when I added the label to gluetun but deun was in its own yml. When I moved deun to the same yml compose as gluetun and qbittorrent, deun registered the labels and started monitoring the containers. I'm thinking, for my case, that the issue might've been that deun couldn't reach gluetun because it wasn't on the same network.

@nolimitech

Hello guys.
It still doesn't work.

2023/12/30 19:07:39 INFO container qbittorrent (image lscr.io/linuxserver/qbittorrent:latest) is unhealthy, restarting it...
2023/12/30 19:07:43 ERROR failed restarting container: Error response from daemon: Cannot restart container qbittorrent: No such container: 66cfe13371d1b10781c4a0649f96c8a82044f3852a2bbd77524c6f92b1902e35

2023/12/30 19:18:51 INFO container transmission (image lscr.io/linuxserver/transmission:latest) is unhealthy, restarting it...
2023/12/30 19:18:55 ERROR failed restarting container: Error response from daemon: Cannot restart container transmission: No such container: 72a8f02b433e0b443812be3a44171ece10b9cc6191b7d9bcba8fc6cdb012d125

@STRAYKR

STRAYKR commented Jan 1, 2024

@STRAYKR Is your deun container in the same yml as gluetun? That was my issue. Logs showed "Monitoring 0 containers" when I added the label to gluetun but deun was in its own yml. When I moved deun to the same yml compose as gluetun and qbittorrent, deun registered the labels and started monitoring the containers. I'm thinking, for my case, that the issue might've been that deun couldn't reach gluetun because it wasn't on the same network.

Hi @NaturallyAsh, sorry for the delayed response. Yes, all the config for deun and gluetun is in the same docker compose yml file; I only have the one compose file.

@web3dopamine

Hi guys, any update on this?

@jaredbrogan


Just chiming in to keep this issue at least somewhat active. 😄
