/etc/docker/daemon.json is read-only in HA OS 9.0 #2135

Closed
unclehack opened this issue Sep 16, 2022 · 27 comments
Labels
board/generic-x86-64 Generic x86-64 Boards (like Intel NUC) bug stale

Comments

@unclehack

Describe the issue you are experiencing

/etc/docker/daemon.json used to be writable before version 9.0. This facilitated the customisation of the Docker daemon startup flags.

This is no longer possible with HA OS 9.0. Please make this possible again. HA OS is already somewhat hostile when it comes to debugging.

The alternative is to fork HA OS and keep building new images. It doesn't really make sense to maintain a fork for one or two things which need to be modified. I could just migrate to the HA core container instead.

What operating system image do you use?

generic-x86-64 (Generic UEFI capable x86-64 systems)

What version of Home Assistant Operating System is installed?

9.0

Did you upgrade the Operating System?

Yes

Steps to reproduce the issue

  1. try to edit /etc/docker/daemon.json
  2. the editor says that the file is read-only
    ...

Anything in the Supervisor logs that might be useful for us?

n/a

Anything in the Host logs that might be useful for us?

n/a

System Health information

No response

Additional information

No response

@unclehack unclehack added the bug label Sep 16, 2022
@agners
Member

agners commented Sep 16, 2022

This change was made on purpose, see #2116. The reason this folder was writable is for Docker to store the key.json file. This is now redirected to a writable directory. I did not anticipate that people would alter the configuration of the Docker daemon.

What is your use case/what do you intend to change in daemon.json?

HA OS is already somewhat hostile when it comes to debugging.

It is completely unsuited for debugging, IMHO 😅 . But that is the nature of a largely stateless, read-only system. 🤷‍♂️

The alternative is to fork HA OS and keep building new images. It doesn't really make sense to maintain a fork for one or two things which need to be modified.

So you use this altered config not just for debugging then? I am really curious what kind of changes you need. Ideally we make them "unnecessary" or configurable through a flag.

I could just migrate to the HA core container instead.

I have a HAOS build directory which I keep rather current. A rebuild is usually quite quick, so for me altering a system service for debugging purposes isn't that big of a deal. But I understand that this is a bit of a hassle. Using a regular Debian & Supervised is probably the way to go if you need custom system service configurations which we can't ship by default.

@agners agners added the board/generic-x86-64 Generic x86-64 Boards (like Intel NUC) label Sep 16, 2022
@unclehack
Author

This change was made on purpose, see #2116. The reason this folder was writable is for Docker to store the key.json file. This is now redirected to a writable directory. I did not anticipate that people would alter the configuration of the Docker daemon.

What is your use case/what do you intend to change in daemon.json?

I add my own configuration options. It wouldn't make sense to force everyone to use these settings.

It's not obvious how this could be handled after the migration to this setup. The change itself makes sense. I also understand the limitations around Docker's daemon.json configuration and the absence of config.d directories with multiple config files.

Supplying multiple daemon.json config files doesn't seem to be possible. The one option would be to have a pre-start script which grabs daemon.json from two different locations and merges them into another one at /etc/docker/daemon.json. The options set in the second one would override the ones from the main daemon.json provided by HA OS. Of course, if the daemon doesn't start because of invalid configuration, that wouldn't be HA OS' problem.

HA OS is already somewhat hostile when it comes to debugging.

It is completely unsuited for debugging, IMHO 😅 . But that is the nature of a largely stateless, read-only system. 🤷‍♂️

The second Ethernet adapter froze on my live HA OS setup. I had to connect locally to the machine to figure out what's going on.

It's great that there's not much to manage for HA OS and there's one less system to look after. Not being able to connect to it to debug the same way as with a regular distribution is a limitation.

The alternative is to fork HA OS and keep building new images. It doesn't really make sense to maintain a fork for one or two things which need to be modified.

So you use this altered config not just for debugging then? I am really curious what kind of changes you need. Ideally we make them "unnecessary" or configurable through a flag.

Indeed, debugging is needed for my main HA instance. I've brought this up in the context of the daemon.json since it makes HA OS less configurable.

I could just migrate to the HA core container instead.

I have a HAOS build directory which I keep rather current. A rebuild is usually quite quick, so for me altering a system service for debugging purposes isn't that big of a deal. But I understand that this is a bit of a hassle. Using a regular Debian & Supervised is probably the way to go if you need custom system service configurations which we can't ship by default.

I've built my own testing image to learn how these images are built. The issue is with keeping up with HA OS releases.

Perhaps it might be a good idea to build something similar to HA OS with SSH access and a few other changes. Maybe that would bring back some decent improvements to HA OS.

@HangrilyJon

Can confirm this is a super inconvenient change. My ISP router is on 172.17.x.x with no access to change, so moving the docker network to something else is required. I'm still unsure why docker needs a /16 (65k addresses per network), so using these default settings for mostly home use is a waste and inconvenient.

Adding settings like below should be more than enough for most users, but most importantly, being able to update the address pool to something the user prefers, would be ideal.

{
  "bip": "172.17.64.1/24",
  "default-address-pools": [
    { "base": "172.17.64.0/18", "size": 24 }
  ]
}
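
To put numbers on the settings above: a /18 base split into /24 networks yields 2^(24-18) networks, each with 2^(32-24) - 2 usable host addresses, compared with the single /16 that Docker assigns to the default bridge. The following is only a sketch of that arithmetic; the variable names are illustrative.

```shell
#!/bin/bash
# Pool sizing for the proposed default-address-pools entry.
base_prefix=18   # from "base": "172.17.64.0/18"
subnet_size=24   # from "size": 24

# Number of /24 networks that fit into the /18 base
networks=$(( 1 << (subnet_size - base_prefix) ))
# Usable host addresses per /24 (network and broadcast address excluded)
hosts_per_network=$(( (1 << (32 - subnet_size)) - 2 ))
# Usable host addresses in a single /16 such as the default bridge network
hosts_default=$(( (1 << (32 - 16)) - 2 ))

echo "networks in pool:      $networks"
echo "hosts per /24 network: $hosts_per_network"
echo "hosts in a /16:        $hosts_default"
```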

I'm now at an impasse and won't be able to do connectivity checks to make sure my internet is up

@tristansgray

Can confirm this is a super inconvenient change. My ISP router is on 172.17.x.x with no access to change, so moving the docker network to something else is required. I'm still unsure why docker needs a /16 (65k addresses per network), so using these default settings for mostly home use is a waste and inconvenient.

Adding settings like below should be more than enough for most users, but most importantly, being able to update the address pool to something the user prefers, would be ideal.

{
  "bip": "172.17.64.1/24",
  "default-address-pools": [
    { "base": "172.17.64.0/18", "size": 24 }
  ]
}

I'm now at an impasse and won't be able to do connectivity checks to make sure my internet is up

Seconding this, the ability to change the default docker network would be extremely helpful.

@unclehack
Author

jq -s '.[0] * .[1]' base-daemon.json local.json is able to merge the json configuration properly. It even overrides existing configuration options.

We can use this command as a pre-start script for the systemd unit. I'll give it a shot.
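
As an illustration of what the merge command does, here is a small sketch that runs it on two sample files. The file contents are made-up examples, not the actual HA OS defaults, and the script assumes jq is installed.

```shell
#!/bin/bash
# Demonstrates jq's recursive object merge on two sample files.
# File contents are illustrative assumptions, not HA OS defaults.
workdir=$(mktemp -d)

cat > "$workdir/base-daemon.json" <<'EOF'
{ "log-driver": "journald", "log-opts": { "tag": "base" }, "bip": "172.17.0.1/16" }
EOF

cat > "$workdir/local.json" <<'EOF'
{ "bip": "172.17.64.1/24", "log-opts": { "labels": "local" } }
EOF

# -s slurps both files into one array; '*' merges objects recursively,
# with values from the right-hand side (local.json) taking precedence.
# -c prints compact output and -S sorts keys so the result is stable.
merged=$(jq -c -S -s '.[0] * .[1]' "$workdir/base-daemon.json" "$workdir/local.json")
echo "$merged"

rm -rf "$workdir"
```

Nested objects such as log-opts are merged key by key, while scalar values and arrays from the second file replace those of the first. An array like default-address-pools would therefore be overridden as a whole rather than appended to.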

The plan is to try this:

  1. put the read only config file in /etc/default/docker/daemon.json
  2. bind mount /etc/docker/
  3. put our custom daemon.local.json there
  4. the startup script merges the two JSON files into one file at /etc/docker/daemon.json

Another thing to keep in mind is that we can check if we already have an up to date /etc/docker/daemon.json with the same content. That makes sure we don't write to the disk when that's not necessary.

@unclehack
Author

@agners: This should work. You'll probably want to modify this.
implement-docker-local-conf.txt

The initial build of the images takes some time.

@unclehack
Author

@agners: Did you take a look at the posted patch? Is there something you want to have changed in that patch?

@agners
Member

agners commented Oct 24, 2022

Perhaps it might be a good idea to build something similar to HA OS with SSH access and a few other changes. Maybe that would bring back some decent improvements to HA OS.

There is SSH access, see SSH access to the host.

Back to the topic:

What is your use case/what do you intend to change in daemon.json?

I add my own configuration options. It wouldn't make sense to force everyone to use these settings.

HAOS has lots of hard coded configuration and decisions, and that is the intention: Limit the amount of variability, so it is manageable to get reliable updates etc. If maximum flexibility is our intention, we could just offer a pre-built Debian image with Supervisor pre-installed...

There is always a trade-off. With OS 9.0 we decided to standardize on daemon.json, since we did not see a good use case to customize it, and at the same time it has a lot of options which can break the setup in subtle ways, potentially hard to debug as well.

Did you take a look at the posted patch? Is there something you want to have changed in that patch?

I like the merging aspect of the implementation. What isn't super nice is that we need to copy the file on every boot (this has the potential of infamous 0 byte files when power cutting at the wrong point).

That said, I'd like to understand the use cases we are trying to solve here, and would rather bring "native" support for these use cases.

@HangrilyJon @tristansgray bring up a good point: The hard coded network configuration.

Now we have actually two networks: docker0 and hassio.

The docker0 network is 172.17.0.1/16, and the hassio is 172.30.32.1/23.

To run Home Assistant Supervisor, and any of the plug-ins and add-ons the docker0 is actually not required. So we could consider just downright dropping that.

Is the hassio network in a problematic range for you? Maybe this should be configurable.
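
To check whether a particular LAN address collides with either of the two ranges, a small helper along these lines can be used. This is only a sketch: the function names and the sample LAN address are hypothetical, and only the two CIDR ranges come from this thread.

```shell
#!/bin/bash
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
	# Split the address on dots into the positional parameters
	set -- $(echo "$1" | tr '.' ' ')
	echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
}

# Succeeds (exit 0) when address $1 lies inside CIDR block $2.
ip_in_cidr() {
	local addr net prefix mask
	addr=$(ip_to_int "$1")
	net=$(ip_to_int "${2%/*}")
	prefix=${2#*/}
	mask=$(( (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF ))
	[ $(( addr & mask )) -eq $(( net & mask )) ]
}

# Ranges quoted in this thread; the LAN address is a made-up example.
lan_ip=172.17.5.10
for range in 172.17.0.1/16 172.30.32.1/23; do
	if ip_in_cidr "$lan_ip" "$range"; then
		echo "$lan_ip overlaps $range"
	else
		echo "$lan_ip is outside $range"
	fi
done
```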

@tristansgray

@HangrilyJon @tristansgray bring up a good point: The hard coded network configuration.

Now we have actually two networks: docker0 and hassio.

The docker0 network is 172.17.0.1/16, and the hassio is 172.30.32.1/23.

To run Home Assistant Supervisor, and any of the plug-ins and add-ons the docker0 is actually not required. So we could consider just downright dropping that.

Is the hassio network in a problematic range for you? Maybe this should be configurable.

In my case, it's the docker0 range. However, allowing modification of the hassio network would be useful as well.

@unclehack
Author

unclehack commented Oct 24, 2022

Perhaps it might be a good idea to build something similar to HA OS with SSH access and a few other changes. Maybe that would bring back some decent improvements to HA OS.

There is SSH access, see SSH access to the host.

Thanks. I'll try that again. It didn't work for me the last time I tried it.

Back to the topic:

What is your use case/what do you intend to change in daemon.json?

I add my own configuration options. It wouldn't make sense to force everyone to use these settings.

HAOS has lots of hard coded configuration and decisions, and that is the intention: Limit the amount of variability, so it is manageable to get reliable updates etc. If maximum flexibility is our intention, we could just offer a pre-built Debian image with Supervisor pre-installed...

There is always a trade-off. With OS 9.0 we decided to standardize on daemon.json, since we did not see a good use case to customize it, and at the same time it has a lot of options which can break the setup in subtle ways, potentially hard to debug as well.

This is most likely valid for lots of things, including writing YAML by hand for Home Assistant. It can be broken in many ways. In my case, I don't expect the Home Assistant developers and the community to provide support for different settings. It's just that it doesn't make sense to have hardcoded and read-only settings because someone will break their system. It can also be perceived as saying that users need to not be allowed to make changes because they'll break something. This is a Linux system. People do expect to be able to make changes within reason.

Did you take a look at the posted patch? Is there something you want to have changed in that patch?

I like the merging aspect of the implementation. What isn't super nice is that we need to copy the file on every boot (this has the potential of infamous 0 byte files when power cutting at the wrong point).

The file isn't copied on every boot. This is the script:

#!/bin/bash

DEFAULT_CONFIG=/etc/default/docker/daemon.json
LOCAL_CONFIG=/etc/docker/daemon.local.json
TARGET_CONFIG=/etc/docker/daemon.json
TEMP_CONFIG=/tmp/daemon.json

# merge the local overrides into the default config;
# use the default as-is if the local config doesn't exist
if [[ -f "$LOCAL_CONFIG" ]]; then
	jq -s '.[0] * .[1]' "$DEFAULT_CONFIG" "$LOCAL_CONFIG" >"$TEMP_CONFIG"
else
	TEMP_CONFIG=$DEFAULT_CONFIG
fi

# copy the generated or the default config only if the current config is different
if ! cmp -s "$TEMP_CONFIG" "$TARGET_CONFIG"; then
	cp "$TEMP_CONFIG" "$TARGET_CONFIG"
fi

If there's no local config, the default is copied. Otherwise a new daemon config is generated and stored in /tmp.
The existing target config file and the new temporary config file are compared. If the comparison of the files fails (missing target config, different files), the new config is copied to the destination.

Only the temporary file is generated on every startup. Does that address your concern? Is the fact that we copy the temporary file to /tmp a concern?
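
One common way to address the zero-byte-file concern is to stage the new content in a temporary file in the same directory as the target and then rename it over the target, since a rename within one filesystem is atomic on POSIX systems. The sketch below is not part of the posted patch: `install_atomically` is a hypothetical helper, and the demo runs in a scratch directory rather than /etc/docker.

```shell
#!/bin/bash
# install_atomically SRC DEST (hypothetical helper, not part of the posted patch)
# Replaces DEST with the content of SRC only if the two differ.
install_atomically() {
	local src=$1 dest=$2 tmp
	# Nothing to do when the target already has the desired content
	if cmp -s "$src" "$dest"; then
		return 0
	fi
	# Stage the copy next to the target so the final rename stays
	# on one filesystem
	tmp=$(mktemp "$(dirname "$dest")/.daemon.json.XXXXXX") || return 1
	if cp "$src" "$tmp" && chmod 0644 "$tmp" && sync; then
		mv -f "$tmp" "$dest"
	else
		rm -f "$tmp"
		return 1
	fi
}

# Demo in a scratch directory instead of /etc/docker
demo=$(mktemp -d)
echo '{"bip": "172.17.64.1/24"}' > "$demo/new.json"
install_atomically "$demo/new.json" "$demo/daemon.json"
result=$(cat "$demo/daemon.json")
echo "$result"
rm -rf "$demo"
```

With this approach a power cut leaves either the old file or the complete new one in place. Full durability would additionally depend on how the underlying filesystem flushes directory updates, which this sketch does not cover.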

I've attached the modified version of the patch which replaces diff with cmp. diff isn't present in the target image.
implement-docker-local-conf-v2.txt

I'd like to make an improvement to make the bind mount read-only in /etc/docker and operate directly on the files in /mnt/overlay/etc/docker.

That said, I'd like to understand the use cases we are trying to solve here, and would rather bring "native" support for these use cases.

Regarding the use case, Home Assistant Operating System is still a Linux system. Is it unreasonable to expect it to allow some configuration changes? I expect to be able to customize the running Docker daemon's configuration as I can on every other distribution. HA OS isn't the right solution for Home Assistant if the answer is no. That'd be a shame as it could save some time with maintaining yet another system.

@HangrilyJon @tristansgray bring up a good point: The hard coded network configuration.

Now we have actually two networks: docker0 and hassio.

The docker0 network is 172.17.0.1/16, and the hassio is 172.30.32.1/23.

To run Home Assistant Supervisor, and any of the plug-ins and add-ons the docker0 is actually not required. So we could consider just downright dropping that.

Is the hassio network in a problematic range for you? Maybe this should be configurable.

@agners
Member

agners commented Oct 24, 2022

It's just that it doesn't make sense to have hardcoded and read-only settings because someone will break their system.

Why does that not make sense? It does a lot to me, we use a read-only root file system to avoid any corruption, intentional or unintentional (mis-)configuration, and generally make the system more predictable and with that solid. We push that as much as possible, without taking away necessary features.

Regarding the use case, Home Assistant Operating System is still a Linux system.

We remove a lot of features of a traditional distribution to make it a largely stateless, single-purpose operating system with a fixed set of packages "pre-installed", distributed in a monolithic image. Compared to a general-purpose Linux operating system, large parts are missing, e.g. there is no package manager. Is it "Linux"? I guess that depends on the definition 🤷‍♂️ Is Android Linux? 🤷‍♂️ IMHO, it's a purpose-built OS using the Linux kernel and other OSS packages as its building blocks. Arch Linux is a proper Linux 😅

In the end, the question is what the exact definition of "without taking away necessary features" is. It seems that in your opinion we are overstepping that line by making /etc/docker/daemon.json read-only...

Is it unreasonable to expect it to allow some configuration changes?

No, that is not unreasonable. However, for HAOS, we try to limit the amount of configuration options to make the system manageable for us. For the set of core features we want to provide forward compatibility (e.g. network and host name configuration). We try to make those settings as much as possible accessible through the frontend as well.

The limited feature set also allows us to advance the development of the system. E.g. there is the idea to move away from Docker entirely for HAOS and use containerd (or even runc) only. Docker "hides" some nice features available in containerd (like dynamic device permission updates for containers for hot-plugging). And it would make the distribution leaner. These are just ideas at this point, and I'm not sure if that will ever materialize. But "supporting" a custom daemon.json would definitely hinder such an effort. With a read-only daemon.json, we know exactly what users have out there, and a migration becomes possible.

And that is also why I'd like to know what features are used in daemon.json, to better understand what we need to support. From this thread, I know that out of 71k people who are on 9.x (and opt in to diagnostics), we have three users speaking up. Two users are mainly interested in customizing the IP address range of docker0, and for user @unclehack I am still not sure what the use case is 😅

@unclehack
Author

It's just that it doesn't make sense to have hardcoded and read-only settings because someone will break their system.

Why does that not make sense? It does a lot to me, we use a read-only root file system to avoid any corruption, intentional or unintentional (mis-)configuration, and generally make the system more predictable and with that solid. We push that as much as possible, without taking away necessary features.

Regarding the use case, Home Assistant Operating System is still a Linux system.

We remove a lot of features of a traditional distribution to make it a largely stateless, single-purpose operating system with a fixed set of packages "pre-installed", distributed in a monolithic image. Compared to a general-purpose Linux operating system, large parts are missing, e.g. there is no package manager. Is it "Linux"? I guess that depends on the definition 🤷‍♂️ Is Android Linux? 🤷‍♂️ IMHO, it's a purpose-built OS using the Linux kernel and other OSS packages as its building blocks. Arch Linux is a proper Linux 😅

In the end, the question is what the exact definition of "without taking away necessary features" is. It seems that in your opinion we are overstepping that line by making /etc/docker/daemon.json read-only...

Is it unreasonable to expect it to allow some configuration changes?

No, that is not unreasonable. However, for HAOS, we try to limit the amount of configuration options to make the system manageable for us. For the set of core features we want to provide forward compatibility (e.g. network and host name configuration). We try to make those settings as much as possible accessible through the frontend as well.

The limited feature set also allows us to advance the development of the system. E.g. there is the idea to move away from Docker entirely for HAOS and use containerd (or even runc) only. Docker "hides" some nice features available in containerd (like dynamic device permission updates for containers for hot-plugging). And it would make the distribution leaner. These are just ideas at this point, and I'm not sure if that will ever materialize. But "supporting" a custom daemon.json would definitely hinder such an effort. With a read-only daemon.json, we know exactly what users have out there, and a migration becomes possible.

And that is also why I'd like to know what features are used in daemon.json, to better understand what we need to support. From this thread, I know that out of 71k people who are on 9.x (and opt in to diagnostics), we have three users speaking up. Two users are mainly interested in customizing the IP address range of docker0, and for user @unclehack I am still not sure what the use case is 😅

Private registries and registry mirrors are two such settings. I don't think they break your assumptions regarding the supported configuration. I keep it if I break it. The HA OS images can even be patched by hand if you prefer not to support this at all.

@agners
Member

agners commented Oct 24, 2022

Private registries and registry mirrors are two such settings. I don't think they break your assumptions regarding the supported configuration. I keep it if I break it. The HA OS images can even be patched by hand if you prefer not to support this at all.

Private registries are possible, in the add-on store page (top right corner).

There is no support for registry mirrors afaik.

That said, we are using ghcr.io mostly these days, afaik, the registry-mirrors only applies to Docker Hub.
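
For reference, the Docker daemon reads mirrors from the `registry-mirrors` key of daemon.json. Under the merge scheme proposed earlier in this thread (which is a proposal, not something HA OS ships), a local override could be written roughly as follows. The mirror URL is a placeholder and the scratch directory stands in for /etc/docker; both are assumptions for illustration.

```shell
#!/bin/bash
# Writes a hypothetical local override file for the proposed merge scheme.
conf_dir=$(mktemp -d)   # stand-in for /etc/docker

cat > "$conf_dir/daemon.local.json" <<'EOF'
{
  "registry-mirrors": ["https://mirror.example.com"]
}
EOF

override=$(cat "$conf_dir/daemon.local.json")
echo "$override"
rm -rf "$conf_dir"
```

As noted above, such a mirror would only affect pulls from Docker Hub, so it would not change how images hosted on ghcr.io are fetched.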

@unclehack
Author

@agners: I don't see a way to open a simple ticket which isn't a bug report. The following patch would be helpful for HA OS:

diff --git a/buildroot-external/kernel/docker.config b/buildroot-external/kernel/docker.config
index 40b07857..4c133bbc 100644
--- a/buildroot-external/kernel/docker.config
+++ b/buildroot-external/kernel/docker.config
@@ -3,6 +3,7 @@ CONFIG_CFQ_GROUP_IOSCHED=y
 CONFIG_CFS_BANDWIDTH=y
 CONFIG_FAIR_GROUP_SCHED=y
 CONFIG_NET_SCHED=y
+CONFIG_NET_SCH_FQ_CODEL=m
 # CONFIG_RT_GROUP_SCHED is not set

 CONFIG_CGROUPS=y

You can try it out since you've mentioned you build HA OS often. It'd help to have it in HA OS.
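
To verify how an option ends up in a kernel config fragment or a generated .config, a check along these lines can help. This is a sketch only: `kconfig_state` is a hypothetical helper, and the sample file just mirrors the lines of the patch above.

```shell
#!/bin/bash
# kconfig_state FILE OPTION (hypothetical helper)
# Prints y, m, or n depending on how OPTION is set in a kernel config file.
kconfig_state() {
	local file=$1 option=$2 line
	# Take the last assignment, since later lines win in a fragment
	line=$(grep -E "^${option}=" "$file" | tail -n 1)
	case "${line#*=}" in
		y|m) echo "${line#*=}" ;;
		*)   echo n ;;
	esac
}

# Sample fragment mirroring the patched docker.config
sample=$(mktemp)
cat > "$sample" <<'EOF'
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_FQ_CODEL=m
# CONFIG_RT_GROUP_SCHED is not set
EOF

for opt in CONFIG_NET_SCHED CONFIG_NET_SCH_FQ_CODEL CONFIG_RT_GROUP_SCHED; do
	echo "$opt=$(kconfig_state "$sample" "$opt")"
done
rm -f "$sample"
```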

HA OS uses an unsupported version of Go from its fork of buildroot: https://github.com/home-assistant/buildroot/blob/2022.02.x-haos/package/go/go.mk#L7
The upstream version of buildroot uses a newer version of Go: https://github.com/buildroot/buildroot/blob/master/package/go/go.mk#L7

The currently supported versions of Go are 1.18 and 1.19.

I've put together a patch script to patch Docker's daemon.json config file:

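#!/bin/bash

set -euo pipefail

# unpack the squashfs image stored on the given partition
function unpack() {
	unsquashfs -d hasqfs "$1"
}

# repack the modified tree into a new squashfs image
function pack() {
	mksquashfs hasqfs haos.sqfs -comp lz4 -Xhc -b 131072
}

# merge the extra options into the image's daemon.json
# (named patch_config so it doesn't shadow the patch command)
function patch_config() {
	echo '{ADD OPTIONS AS VALID JSON HERE}' > conf.json
	jq -s '.[0] * .[1]' hasqfs/etc/docker/daemon.json conf.json > daemon.json
	cp daemon.json hasqfs/etc/docker/daemon.json
	rm daemon.json conf.json
}

# write the new image back to the partition
function copytodisk() {
	dd if=haos.sqfs of="$1" bs=1024KB
}

function patch_two() {
	unpack "$1"
	patch_config
	pack
	copytodisk "$1"
	rm -rf haos.sqfs hasqfs
}

patch_two /dev/nvme0n1p3

patch_two /dev/nvme0n1p5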

Warning: use this only if you understand what needs to be modified to make this work. Otherwise, an unbootable HA OS and data loss are guaranteed.

@agners
Member

agners commented Oct 26, 2022

@agners: I don't see a way to open a simple ticket which isn't a bug report. The following patch would be helpful for HA OS:

Feature requests should be submitted through the HA forum or Discussions here on GitHub.

HA OS uses an unsupported version of Go from its fork of buildroot: https://github.com/home-assistant/buildroot/blob/2022.02.x-haos/package/go/go.mk#L7

We inherit this from the current LTS version of Buildroot, see https://github.com/buildroot/buildroot/blob/2022.02.x/package/go/go.mk#L7. Probably best is to complain in the Buildroot ML about this 😅

I don't feel it's worth our time to backport a newer Go version to Buildroot 2022.02, hence the Go version in the 9.x release series will remain what it is today.

I intend to move to the master branch of Buildroot soon, ultimately the intent is to use Buildroot 2023.02 LTS in OS 10.

@jfossheim-skyfritt

Any hope for a workaround on this issue? A dirty hack is also OK.

We are planning several instances of Home Assistant and use the 172.17.0.0/16 range internally, so without the ability to change the Docker IP range, Home Assistant is basically useless 😳

@agners
Member

agners commented Nov 22, 2022

Going back to a modifiable daemon.json is the wrong solution, and also kind of a pain at this point.

What we discussed is to limit the scope of the default network to 172.17.234.0/24 for now. Would that help in your case?

@jfossheim-skyfritt

We use the entire 172.17.0.0/16 range across many locations, and .234 is also in use. But it would be less catastrophic: only one location would be unable to access HA instances.

Isn’t it possible just to have some config/override parameter for the docker bridge?

@agners
Member

agners commented Nov 22, 2022

It is not only the Docker bridge, it's also the custom network which uses a 172.17 prefix.

Ideally we make both configurable, but that is a bit more involved.

@jfossheim-skyfritt

jfossheim-skyfritt commented Nov 22, 2022 via email

@jfossheim-skyfritt

I have skimmed quickly through this guide:

https://community.home-assistant.io/t/installing-home-assistant-supervised-on-debian-11/200253

Will we get something that is functionally equivalent to the virtual machine if we set up HA this way?

I assume there won't be any read-only restrictions or enforcement of 172.17 prefixes there?

It would be very cumbersome to deploy and maintain all instances this way, but to get some initial testing done over the next couple of months it would probably be doable.

@jfossheim-skyfritt

jfossheim-skyfritt commented Nov 23, 2022

We have now installed and deployed several Debian VMs in the "Supervised" setup mentioned in the documentation, using the guide above, so this helps us for now. They can be both scripted and cloned, but pre-made Hassio-OS VMs would be the preferred choice.

So I still cannot emphasize enough how important we consider a solution for this issue.

@felipecrs

felipecrs commented Jan 17, 2023

My use case for this is to be able to configure the shm-size for Frigate, and since it can't be configured per/for add-ons, I'd need to configure a global shm size for the whole system.

Refs blakeblackshear/frigate#5123
Refs home-assistant/supervisor#1807

But well, it would be perfect if it could be configured per add-on. :)

@felipecrs

Actually, never mind! It seems that by default all add-ons have access to the whole memory under /dev/shm.

Refs home-assistant/supervisor#2499

@agners
Member

agners commented Jan 25, 2023

HAOS 9.5 uses two smaller subnets in the 172.30.x.x range (see #2246). This should help at least some folks who have a Class B range overlap with their router/ISP.

@yiqian987

yiqian987 commented Mar 7, 2023

In China, ghcr.io cannot be accessed, so I cannot update through the update button normally. I need to configure a Docker registry mirror so that docker pull can download the latest image.

@github-actions

github-actions bot commented Jun 5, 2023

There hasn't been any activity on this issue recently. To keep our backlog manageable we have to clean old issues, as many of them have already been resolved with the latest updates.
Please make sure to update to the latest Home Assistant OS version and check if that solves the issue. Let us know if that works for you by adding a comment 👍
This issue has now been marked as stale and will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the stale label Jun 5, 2023
@github-actions github-actions bot closed this as not planned Jun 12, 2023

7 participants