Docker service not starting in centos 7.9 #1161

Closed
2 tasks done
tisc0 opened this issue Dec 9, 2020 · 12 comments

@tisc0

tisc0 commented Dec 9, 2020

  • [x] This is a bug report
  • [ ] This is a feature request
  • [x] I searched existing issues before opening this one

Hello there,
Not sure you are the ones managing the RPM repositories, but I think there's a problem with a version that is not supposed to be in the stable repo (20.10.0-3, since I can't find it in edge, only in test)
Centos 7 stable

That would have been fine with us if it weren't breaking our post-install playbooks, which have been hanging systematically since this morning at the step that enables and starts the Docker service with systemd.

Expected behavior

Enable + start the daemon smoothly

Actual behavior

Playbook stuck

TASK [gitlab-runner : Install docker-CE] ***************************************
changed: [localhost] => (item=docker-ce)

TASK [gitlab-runner : Enable docker service] ***********************************
^C
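
A hang like this can at least be bounded in automation. The sketch below is one possible approach, assuming GNU coreutils `timeout` is available on the host; `run_bounded` is a hypothetical helper name and is not part of the playbook in this report.

```shell
#!/bin/sh
# Run a command with an upper bound on its runtime, so an automated
# provisioning step fails fast instead of hanging indefinitely.
# Usage: run_bounded SECONDS COMMAND [ARGS...]
run_bounded() {
    limit="$1"
    shift
    status=0
    # GNU coreutils `timeout` exits with status 124 when the limit is hit.
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$status"
}

# Hypothetical use on the affected host (run as root):
#   run_bounded 120 systemctl start docker.service
```

The 120-second limit is an arbitrary example value. A non-zero return lets the calling script or playbook fail the task explicitly rather than wait for a manual ^C.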

Steps to reproduce the behavior

  • CentOS 7, fully updated
  • Ansible playbook:
  - name: Add Docker-CE repository
    copy:
      src: files/docker-ce.repo
      dest: /etc/yum.repos.d/docker-ce.repo
      owner: root
      group: root
      mode: 0600
      backup: yes
    notify: yum-clean-all
    tags: install-dockerce

  - name: yum-clean-all
    command: yum clean all
    args:
      warn: no
    tags: install-dockerce

  - name: Install docker-CE
    package:
      name: "{{ item }}"
      state: present
    with_items:
      #- 'docker-ce'
      - 'docker-ce'
    tags: install-dockerce

  - name: Enable docker service
    service:
      name: docker.service
      enabled: yes
      state: started
    tags: install-dockerce

Output of docker --version:

[root@runnerx-docker-trzp ~]# docker --version
Docker version 20.10.0, build 7287ab3

[root@runnerx-docker-trzp ~]# rpm -qa | grep docker
docker-ce-rootless-extras-20.10.0-3.el7.x86_64
docker-ce-cli-20.10.0-3.el7.x86_64
docker-ce-20.10.0-3.el7.x86_64

Output of docker info:

[root@runnerx-docker-trzp ~]# docker info
Client:
 Context:    default
 Debug Mode: false
 Plugins:
  app: Docker App (Docker Inc., v0.9.1-beta3)
  buildx: Build with BuildKit (Docker Inc., v0.4.2-docker)

Server:
 Containers: 0
  Running: 0
  Paused: 0
  Stopped: 0
 Images: 5
 Server Version: 20.10.0
 Storage Driver: overlay2
  Backing Filesystem: xfs
  Supports d_type: true
  Native Overlay Diff: true
 Logging Driver: json-file
 Cgroup Driver: cgroupfs
 Cgroup Version: 1
 Plugins:
  Volume: local
  Network: bridge host ipvlan macvlan null overlay
  Log: awslogs fluentd gcplogs gelf journald json-file local logentries splunk syslog
 Swarm: inactive
 Runtimes: io.containerd.runc.v2 io.containerd.runtime.v1.linux runc
 Default Runtime: runc
 Init Binary: docker-init
 containerd version: 269548fa27e0089a8b8278fc4fc781d7f65a939b
 runc version: ff819c7e9184c13b7c2607fe6c30ae19403a7aff
 init version: de40ad0
 Security Options:
  seccomp
   Profile: default
 Kernel Version: 3.10.0-1127.13.1.el7.x86_64
 Operating System: CentOS Linux 7 (Core)
 OSType: linux
 Architecture: x86_64
 CPUs: 8
 Total Memory: 15.51GiB
 Name: runnerx-docker-trzp
 ID: CVTG:55YA:7626:ZRHJ:B7QF:VJXL:OS6M:KJ5U:VZRP:727A:G76Z:WBIE
 Docker Root Dir: /var/lib/docker
 Debug Mode: false
 Registry: https://index.docker.io/v1/
 Labels:
 Experimental: false
 Insecure Registries:
  127.0.0.0/8
 Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.)

Hosted on GCP preemptible VMs

@thaJeztah
Member

Not sure I understand your question;

> Not sure you are the ones managing the RPM repositories, but I think there's a problem with a version that is not supposed to be in the stable repo (20.10.0-3, since I can't find it in edge, only in test)

Edge has been deprecated since docker 18.06 (but to help migrating users, packages were still pushed to that as well) see #1159 (comment)

Docker 20.10 has been released to stable; that's intentional. (The test channel gets "beta", "rc", and "stable" packages.)

@tisc0
Author

tisc0 commented Dec 9, 2020

> Not sure I understand your question;

So just note that there's a problem with 20.10.
Pinning the version in the playbook to the latest 19.x release brought things back to normal.
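
For readers who want to try the same pin outside Ansible, here is a minimal shell sketch. It assumes the packages come from the Docker CE yum repository and that available versions are listed with `yum list docker-ce --showduplicates`. The helper name `docker_pkg_spec` is made up for illustration, and the exact 19.x version is left as a placeholder because the thread does not name it.

```shell
#!/bin/sh
# Turn a yum version column such as "3:20.10.0-3.el7" into the
# "<package>-<version>" form that yum accepts as an install argument.
# Usage: docker_pkg_spec PACKAGE FULL_VERSION
docker_pkg_spec() {
    pkg="$1"
    full="$2"
    ver="${full#*:}"   # drop the epoch ("3:") if present
    ver="${ver%%-*}"   # drop the release suffix ("-3.el7") if present
    printf '%s-%s\n' "$pkg" "$ver"
}

# Hypothetical use (run as root); replace <19.x> with a version from the list:
#   yum list docker-ce --showduplicates | sort -r
#   yum install -y "$(docker_pkg_spec docker-ce <19.x>)" \
#                  "$(docker_pkg_spec docker-ce-cli <19.x>)"
```

Pinning both `docker-ce` and `docker-ce-cli` keeps the daemon and client on the same release; whether other packages also need a pin is not covered by this thread.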

tisc0 changed the title from "RPMs packages not supposed to be in Stable ?" to "Docker service not starting in centos 7.9" on Dec 9, 2020
@cpuguy83
Collaborator

cpuguy83 commented Dec 9, 2020

Can you get the logs?

journalctl -u docker
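
When the journal alone turns out to be empty, a slightly broader capture may help. The following is only a sketch: `collect_unit_diagnostics` is a hypothetical helper, and the output file names are arbitrary. It uses `journalctl -u`, `systemctl status`, and `systemctl cat`, and records failures instead of stopping on them.

```shell
#!/bin/sh
# Collect basic diagnostics for a systemd unit into a directory.
# Usage: collect_unit_diagnostics UNIT OUTPUT_DIR
collect_unit_diagnostics() {
    unit="$1"
    out="$2"
    mkdir -p "$out" || return 1
    # Each command may fail (for example on a host without systemd);
    # stderr is captured too, so a failure is recorded rather than lost.
    journalctl -u "$unit" --no-pager >"$out/journal.txt" 2>&1 || true
    systemctl status "$unit" --no-pager >"$out/status.txt" 2>&1 || true
    systemctl cat "$unit" >"$out/unit.txt" 2>&1 || true
    echo "$out"
}

# Hypothetical use (run as root):
#   collect_unit_diagnostics docker /tmp/docker-diag
```

The unit file dump is included because a start that hangs without logging can come from unit ordering rather than from the daemon itself.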

@thaJeztah
Member

I think this may be the same issue as described in moby/moby#41767.

@tisc0
Author

tisc0 commented Dec 10, 2020

> Can you get the logs?
>
> journalctl -u docker

There are no logs, since nothing happens (and sorry, I didn't keep any non-working instance).
But it really sounds like the same problem as in moby/moby#41767.

While testing, I was sometimes able to get the daemon started with just yum reinstall docker-ce*, and more reliably with yum remove && yum install. Of course, that is not an acceptable workaround for runners that start automatically every morning.

@Chaffelson

I also hit this issue with a fresh install via a userdata script on CentOS 7.9 EC2 instances: docker 3:20.10.0-3.el7 fails to start or log anything, while pinning to 19.03 allows docker to start and the install to continue.

This script version fails: Chaffelson/whoville@d644dbd#diff-85ce9455752e7166b94949bb8f50aaf6818a1165b3916230b6b56e1014978f85
This script version is fine: Chaffelson/whoville@607ffde#diff-85ce9455752e7166b94949bb8f50aaf6818a1165b3916230b6b56e1014978f85

@thaJeztah
Member

Docker 20.10.1 was released, which contains an updated systemd unit that probably fixes this issue; if someone could try if the issue is resolved, that would be appreciated 👍
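
To check whether a host already has that release, one option is to compare the installed version against 20.10.1 and then look at the ordering the unit declares. This is a sketch that assumes GNU `sort -V` is available; `version_ge` is a hypothetical helper, and the thread does not say which part of the unit changed.

```shell
#!/bin/sh
# Return success when version $1 is greater than or equal to version $2.
# Relies on GNU `sort -V` (version sort): the smaller version sorts first.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical use on the affected host:
#   installed="$(rpm -q --queryformat '%{VERSION}\n' docker-ce)"
#   if version_ge "$installed" 20.10.1; then
#       systemctl show docker.service -p After -p Wants
#   else
#       echo "docker-ce $installed predates 20.10.1"
#   fi
```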

@SecT0uch

SecT0uch commented Dec 15, 2020

> Docker 20.10.1 was released, which contains an updated systemd unit that probably fixes this issue; if someone could try if the issue is resolved, that would be appreciated 👍

Not fixed here:

$ docker --version
Docker version 20.10.1, build 831ebea

EDIT: When I try to remove the packages, yum hangs at "Running transaction".
EDIT2: I could uninstall successfully after killing the dockerd process first.

@thaJeztah
Member

Does it work after a reinstall? Curious what's causing it (it's a bit hard to debug if there are no logs that indicate what's failing).

@tisc0
Author

tisc0 commented Dec 18, 2020

Hi @thaJeztah
We've tested a deployment this morning with the latest version and everything is back to normal for us.

Reminder: our runners are spawned every morning, with an Ansible post-install playbook that installs docker-ce.

Thank you very much!

@SecT0uch

SecT0uch commented Dec 18, 2020

> Does it work after a reinstall? Curious what's causing it (it's a bit hard to debug if there are no logs that indicate what's failing).

Finally updated today, I confirm everything is running well again after:

  • uninstalling,
  • installing 19.03.*
  • upgrading to 20.10.*
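
The three steps above could be scripted roughly as follows. This is a hedged sketch, not a confirmed fix: `run` and `reinstall_via_1903` are hypothetical helpers, the `19.03.*` glob is assumed to be accepted by yum as a package spec, and the initial `pkill` reflects the earlier report that a leftover dockerd process blocked removal. By default the script only prints the commands.

```shell
#!/bin/sh
# Print each command instead of running it unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Remove the packages, install the 19.03 series, then upgrade.
reinstall_via_1903() {
    # pkill returns non-zero when nothing matched, which is fine here.
    run pkill -x dockerd || true
    run yum remove -y docker-ce docker-ce-cli &&
    run yum install -y 'docker-ce-19.03.*' 'docker-ce-cli-19.03.*' &&
    run yum upgrade -y docker-ce docker-ce-cli &&
    run systemctl enable docker.service &&
    run systemctl start docker.service
}

# Review the plan first, then run it for real as root:
#   reinstall_via_1903
#   DRY_RUN=0 reinstall_via_1903
```

Whether the intermediate 19.03 install is actually required, or a plain remove and install of 20.10.1 would do, is not established in this thread.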

Thanks :)

@sam-thibault

Glad it's working!
