Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] default service name must be docker not dockerd #273

Closed
hatifnatt opened this issue Mar 5, 2021 · 15 comments · Fixed by #277
Closed

[BUG] default service name must be docker not dockerd #273

hatifnatt opened this issue Mar 5, 2021 · 15 comments · Fixed by #277

Comments

@hatifnatt
Copy link
Contributor

Your setup

Formula commit hash / release tag

https://github.com/saltstack-formulas/docker-formula/releases/tag/v1.1.1

Versions reports (master & minion)

master
# salt --versions-report
Salt Version:
          Salt: 3002.2

Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.5.3
     docker-py: Not Installed
         gitdb: 2.0.0
     gitpython: 2.1.1
        Jinja2: 2.9.4
       libgit2: Not Installed
      M2Crypto: Not Installed
          Mako: Not Installed
       msgpack: 0.6.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: Not Installed
  pycryptodome: 3.6.1
        pygit2: Not Installed
        Python: 3.5.3 (default, Jul  9 2020, 13:00:10)
  python-gnupg: Not Installed
        PyYAML: 3.12
         PyZMQ: 17.1.2
         smmap: 2.0.1
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.2.1

System Versions:
          dist: debian 9 stretch
        locale: UTF-8
       machine: x86_64
       release: 4.9.0-13-amd64
        system: Linux
       version: Debian GNU/Linux 9 stretch
minion
# salt-minion --version-report
Salt Version:
          Salt: 3002.5

Dependency Versions:
          cffi: Not Installed
      cherrypy: Not Installed
      dateutil: 2.6.1
     docker-py: 4.4.4
         gitdb: Not Installed
     gitpython: Not Installed
        Jinja2: 2.10.1
       libgit2: Not Installed
      M2Crypto: 0.35.2
          Mako: Not Installed
       msgpack: 0.6.2
  msgpack-pure: Not Installed
  mysql-python: Not Installed
     pycparser: Not Installed
      pycrypto: Not Installed
  pycryptodome: Not Installed
        pygit2: Not Installed
        Python: 3.6.8 (default, Aug 24 2020, 17:57:11)
  python-gnupg: Not Installed
        PyYAML: 3.12
         PyZMQ: 19.0.0
         smmap: Not Installed
       timelib: Not Installed
       Tornado: 4.5.3
           ZMQ: 4.3.4

System Versions:
          dist: centos 8 n/a
        locale: UTF-8
       machine: x86_64
       release: 4.18.0-240.1.1.el8_3.x86_64
        system: Linux
       version: CentOS Linux 8 n/a

Pillar / config used

docker:
  pkg:
    docker:
      use_upstream: repo

Bug details

Describe the bug

Default service name in current official docker packages for CentOS / Debian is docker not dockerd but formula default is dockerd.

name: docker-ce
service:
name: dockerd
env: null

On CentOS default service status after package installed is Loaded: loaded (/usr/lib/systemd/system/docker.service; disabled; vendor preset: disabled)
As a result service is not started, not enabled after formula applied.

On Debian 10 default service status after package installed is Loaded: loaded (/lib/systemd/system/docker.service; enabled; vendor preset: enabled)
Therefore this issue can be missed while using formula with Debian.

Also although service dockerd is clearly missing formula fails silently due unnecessary onlyif condition

- onlyif: systemctl list-unit-files | grep {{ d.pkg.docker.service.name }} >/dev/null 2>&1

It is not clear why this condition is added in general. It's only masking issues like this.

Steps to reproduce the bug

On CentOS 8

# install docker with formula using upstream repository
#  docker:pkg:docker:use_upstream: repo
$ salt 'minion-centos8' state.sls docker
# check service
$ salt 'minion-centos8' service.missing docker
minion-centos8:
    False
$ salt 'minion-centos8' service.missing dockerd
minion-centos8:
    True

Expected behaviour

After formula is applied to the minion, docker service is running and operational.

Attempts to fix the bug

Issue can be fixed locally by adding correct service name to pillars

docker:
  pkg:
    docker:
      use_upstream: repo
      service:
        name: docker

But defaults must be fixed too.

Additional context

I think this change can be considered as BREAKING. After this change duplicate service will be created on systems where archive method is used. Therefor I'm creating issue to discuss.

@hatifnatt hatifnatt added the bug label Mar 5, 2021
@myii
Copy link
Member

myii commented Mar 7, 2021

Thanks for the report, @hatifnatt. Just a useful update to share here -- I've reintroduced the CI in GitHub Actions, which includes running all 3 suites on CentOS 8:

The output there should help define the specific issues. For example, two of the suites manage the service and confirms that it is currently using dockerd:

[DEBUG   ] Rendered data from file: /tmp/kitchen/var/cache/salt/minion/files/base/docker/software/service/running.sls:
# -*- coding: utf-8 -*-
# vim: ft=sls

include:
  - docker.software.archive.install
  - docker.software.config.file

docker-software-service-running-unmasked:
  service.unmasked:
    - name: dockerd
    - onlyif: systemctl list-unit-files | grep dockerd >/dev/null 2>&1
    - require_in:
      - service: docker-software-service-running-docker

docker-software-service-running-docker:
  service.running:
    - name: dockerd
    - enable: True
    - onlyif: systemctl list-unit-files | grep dockerd >/dev/null 2>&1

docker-software-service-running-docker-fail-notify:
  test.show_notification:
    - text: |
 * Rebooting your host is recommended!
 In certain circumstances the docker service will not start.
 Your kernel is missing some modules, or not in ideal state.
 See https://github.com/moby/moby/blob/master/contrib/check-config.sh
 * Rebooting your host is recommended!
    - onfail:
      - service: docker-software-service-running-docker
  service.enabled:
    - onlyif: True
    - name: dockerd
    - onfail:
      - service: docker-software-service-running-docker
[DEBUG   ] Rendered data from file: /tmp/kitchen/var/cache/salt/minion/files/base/docker/software/service/running.sls:
# -*- coding: utf-8 -*-
# vim: ft=sls

include:
  - docker.software.package.install
  - docker.software.config.file

docker-software-service-running-unmasked:
  service.unmasked:
    - name: dockerd
    - onlyif: systemctl list-unit-files | grep dockerd >/dev/null 2>&1
    - require_in:
      - service: docker-software-service-running-docker
    - require:
      - sls: docker.software.config.file

docker-software-service-running-docker:
  service.running:
    - name: dockerd
    - require:
      - sls: docker.software.config.file
    - enable: True
    - onlyif: systemctl list-unit-files | grep dockerd >/dev/null 2>&1

docker-software-service-running-docker-fail-notify:
  test.show_notification:
    - text: |
 * Rebooting your host is recommended!
 In certain circumstances the docker service will not start.
 Your kernel is missing some modules, or not in ideal state.
 See https://github.com/moby/moby/blob/master/contrib/check-config.sh
 * Rebooting your host is recommended!
    - onfail:
      - service: docker-software-service-running-docker
  service.enabled:
    - onlyif: True
    - name: dockerd
    - onfail:
      - service: docker-software-service-running-docker

@noelmcloughlin
Copy link
Member

noelmcloughlin commented Mar 10, 2021

Ref: #256 (comment) the unless/onlyif condition was added to workaround fact service.running did not work in old Travis CI/CD.

Based on that PR and Travis output the correct value for service name is unclear to me...

  • Windows/MacOS service name: OK or irrelevant
  • Archive service name: OK? dockerd
  • Package [Redhat (Centos, Amazon, Fedora), Debian (Ubuntu, Debian) and Other (Arch, Suse)] service name: NOK? docker?

Here is the older CI/CD. @myii does your testing include this matrix?

package Debian
https://travis-ci.com/github/saltstack-formulas/docker-formula/jobs/440075134#L2783

archive ubuntu18
https://travis-ci.com/github/saltstack-formulas/docker-formula/jobs/440075135#L2959

package ubuntu18
https://travis-ci.com/github/saltstack-formulas/docker-formula/jobs/440075137#L2794

archive centos8
https://travis-ci.com/github/saltstack-formulas/docker-formula/jobs/440075138#L3325

package centos8
https://travis-ci.com/github/saltstack-formulas/docker-formula/jobs/440075139#L2641

package fedora31
https://travis-ci.com/github/saltstack-formulas/docker-formula/jobs/440075140#L2639

archive opensuse
https://travis-ci.com/github/saltstack-formulas/docker-formula/jobs/440075141#L2764

package opensuse
https://travis-ci.com/github/saltstack-formulas/docker-formula/jobs/440075142#L2328

archive ami2
https://travis-ci.com/github/saltstack-formulas/docker-formula/jobs/440075143#L2850

archive archlinux
https://travis-ci.com/github/saltstack-formulas/docker-formula/jobs/440075144#L2947

package archlinux
https://travis-ci.com/github/saltstack-formulas/docker-formula/jobs/440075145#L2497

@noelmcloughlin
Copy link
Member

I think I'm okay with #277 in general.

@noelmcloughlin
Copy link
Member

noelmcloughlin commented Mar 10, 2021

I reviewed travis again and it looks like package=docker and archive=dockerd

Following is output from archive format travis jobs..

       [INFO    ] Executing command 'systemctl list-unit-files | grep dockerd >/dev/null 2>&1' in directory '/home/kitchen'

       [DEBUG   ] Last command return code: 0

       [INFO    ] Executing command ['systemctl', 'is-active', 'dockerd.service'] in directory '/home/kitchen'

       [DEBUG   ] stdout: inactive

       [DEBUG   ] retcode: 3

       [INFO    ] Executing command ['systemctl', 'is-enabled', 'dockerd.service'] in directory '/home/kitchen'

       [DEBUG   ] stdout: disabled

@myii
Copy link
Member

myii commented Mar 10, 2021

Here is the older CI/CD. @myii does your testing include this matrix?

@noelmcloughlin Yep, it contains all of those and more! Some of the jobs are running all 3 suites in one go.

Just running a test here with the service tests re-enabled:

@noelmcloughlin
Copy link
Member

Looks like docker service handling is not trivial.

@myii
Copy link
Member

myii commented Mar 10, 2021

Looks like docker service handling is not trivial.

@noelmcloughlin It seems to be only the package suite that's failing. Just going to figure out the right service names...

@myii
Copy link
Member

myii commented Mar 10, 2021

@noelmcloughlin Some progress, the service name using the package suite seems to be docker but it's trying to start dockerd instead:

    docker-compose archive: should be installed
       File /usr/local/docker-compose-latest/bin is expected to exist
       File /usr/local/docker-compose-latest/bin is expected to be directory
       File /usr/local/docker-compose-latest/bin type is expected to eq :directory
       File /usr/local/docker-compose-latest/bin/docker-compose is expected to exist
       File /usr/local/docker-compose-latest/bin/docker-compose is expected to be file
       File /usr/local/docker-compose-latest/bin/docker-compose is expected not to be directory
       File /usr/local/docker-compose-latest/bin/docker-compose mode is expected to cmp == "0755"
       File /usr/local/docker-compose-latest/bin/docker-compose type is expected to eq :file
       File /usr/local/bin/docker-compose is expected to be symlink
       File /usr/local/bin/docker-compose is expected to be file
       File /usr/local/bin/docker-compose is expected not to be directory
  ×  Docker service: should be running and enabled (1 failed)
       Service docker is expected to be installed
       Service docker is expected to be enabled
     ×  Service docker is expected to be running
     expected that `Service docker` is running
    Docker packages: should be installed
       System Package python3-docker is expected to be installed
       System Package gnupg-agent is expected to be installed
       System Package software-properties-common is expected to be installed
       System Package docker-ce is expected to be installed
    Docker configuration: should match desired lines
       File /etc/default/docker is expected to be file
       File /etc/default/docker owner is expected to eq "root"
       File /etc/default/docker group is expected to eq "root"
       File /etc/default/docker mode is expected to cmp == "0640"
       File /etc/default/docker content is expected to include "DOCKER_OPTS=\"-s btrfs --dns 8.8.8.8\""
       File /etc/default/docker content is expected to include "export http_proxy=\"http://172.17.42.1:3128\""

When I use kitchen login and then start the docker service, all is working fine:

    docker-compose archive: should be installed
       File /usr/local/docker-compose-latest/bin is expected to exist
       File /usr/local/docker-compose-latest/bin is expected to be directory
       File /usr/local/docker-compose-latest/bin type is expected to eq :directory
       File /usr/local/docker-compose-latest/bin/docker-compose is expected to exist
       File /usr/local/docker-compose-latest/bin/docker-compose is expected to be file
       File /usr/local/docker-compose-latest/bin/docker-compose is expected not to be directory
       File /usr/local/docker-compose-latest/bin/docker-compose mode is expected to cmp == "0755"
       File /usr/local/docker-compose-latest/bin/docker-compose type is expected to eq :file
       File /usr/local/bin/docker-compose is expected to be symlink
       File /usr/local/bin/docker-compose is expected to be file
       File /usr/local/bin/docker-compose is expected not to be directory
    Docker service: should be running and enabled
       Service docker is expected to be installed
       Service docker is expected to be enabled
       Service docker is expected to be running
    Docker packages: should be installed
       System Package python3-docker is expected to be installed
       System Package gnupg-agent is expected to be installed
       System Package software-properties-common is expected to be installed
       System Package docker-ce is expected to be installed
    Docker configuration: should match desired lines
       File /etc/default/docker is expected to be file
       File /etc/default/docker owner is expected to eq "root"
       File /etc/default/docker group is expected to eq "root"
       File /etc/default/docker mode is expected to cmp == "0640"
       File /etc/default/docker content is expected to include "DOCKER_OPTS=\"-s btrfs --dns 8.8.8.8\""
       File /etc/default/docker content is expected to include "export http_proxy=\"http://172.17.42.1:3128\""

So just going to need to identify where the service names aren't being used correctly. But perhaps @hatifnatt's PR will render that all moot. Perhaps I should try a build on top of that.

@noelmcloughlin
Copy link
Member

Sure. Thanks @myii that is a super impressive CI/CD you have built. Archive jobs are passing.....

  ✔  Docker service: should be running and enabled
     ✔  Service dockerd is expected to be installed
     ✔  Service dockerd is expected to be enabled
     ✔  Service dockerd is expected to be running

@myii
Copy link
Member

myii commented Mar 10, 2021

So just going to need to identify where the service names aren't being used correctly. But perhaps @hatifnatt's PR will render that all moot. Perhaps I should try a build on top of that.

Looking pretty good so far: #277 (comment). Will continue the discussion there.

@hatifnatt
Copy link
Contributor Author

Ref: #256 (comment) the unless/onlyif condition was added to workaround fact service.running did not work in old Travis CI/CD.

Based on that PR and Travis output the correct value for service name is unclear to me...

* Windows/MacOS service name: OK or irrelevant

* **Archive** service name: OK? `dockerd`

* **Package** [Redhat (Centos, Amazon, Fedora), Debian (Ubuntu, Debian) and Other (Arch, Suse)] service name: NOK?  `docker`?

Archive service name is dockerd because it's defined as dockerd in defaults

name: dockerd
and then this value is used when service file is created
- name: {{ d.dir.service }}/{{ d.pkg.docker.service.name }}.service
and as binary name
start: {{ d.pkg.docker.path }}/{{ d.pkg.docker.service.name }}

I think dockerd have been chosen because binary name is dockerd. But to unify name with Package install method I think it's worth to change service name to docker.

@myii
Copy link
Member

myii commented Mar 12, 2021

I think dockerd have been chosen because binary name is dockerd. But to unify name with Package install method I think it's worth to change service name to docker.

Since we already have a breaking change in #277, we should include this there as well (if we agree that should be done). @hatifnatt Note that semantic-release can handle BREAKING CHANGE messages from multiple commits, so you don't have to amend your existing commit to add this in. A question, though:

  • What should happen to any existing dockerd service? Wouldn't that have to be removed as well?

@noelmcloughlin What are your thoughts on this?

@noelmcloughlin
Copy link
Member

noelmcloughlin commented Mar 13, 2021

I think dockerd have been chosen because binary name is dockerd.
@noelmcloughlin What are your thoughts on this

I think using {{ d.pkg.docker.service.name }} for both service name and binary name is the root cause. Trying to be clever and the YAML optimization causes the bug. I suggest following:

  • New default variable docker.pkg.docker.binary.name=docker and use that for systemd.
  • Set default variable docker.pkg.docker.service.name="docker" so package format is correct.

@hatifnatt
Copy link
Contributor Author

* New default variable `docker.pkg.docker.binary.name=docker` and use that for systemd.

* Set default variable `docker.pkg.docker.service.name="docker"` so package format is correct.

Actually binary name for Docker daemon is dockerd no matter is it archive or package installation. Although I'm not sure about binary names for other OS like for FreeBSD, I doubt very much that it differs from the standard dockerd. I don't think we need additional variable for binary name. We don't manage service file in case of package installation therefore we don't need to bother about binary name, and for archive method binary name will always be the same.

@saltstack-formulas-travis

🎉 This issue has been resolved in version 2.0.0 🎉

The release is available on GitHub release

Your semantic-release bot 📦🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants