wait_for_connection gives errors #655

Closed
mlaurense opened this issue Oct 14, 2019 · 7 comments · Fixed by #710

Comments

@mlaurense

mlaurense commented Oct 14, 2019

I'm using Ansible + Mitogen for automated server updates on Ubuntu. These tasks check whether a reboot is necessary after the apt upgrade task and, if so, reboot and wait for the server to come back:

  - name: Check if reboot is required
    stat:
      path: /var/run/reboot-required
    register: reboot_required

  - name: Reboot server
    shell: sleep 2 && shutdown -r now "Ansible updates triggered"
    async: 1
    poll: 0
    when: reboot_required.stat.exists

  - name: Wait 300 seconds for server to become available
    wait_for_connection:
      delay: 30
      timeout: 300
    when: reboot_required.stat.exists

This can be manually reproduced by just creating a /var/run/reboot-required file.

I recently updated to the latest Ansible (2.8.5) and Mitogen (master branch). Previously, the wait_for_connection task would only trigger a warning ([WARNING]: Reset is not implemented for this connection), but now the following error is shown every second from the moment the reboot is complete:

ERROR! [mux  23628] 12:00:35.305036 E mitogen.service: Pool(1c50, size=32, th='mitogen.Pool.1c50.24'): while invoking u'reset' of u'ansible_mitogen.services.ContextService'
Traceback (most recent call last):
  File "/home/epartment/epaflex/mitogen-master/mitogen/service.py", line 621, in _on_service_call
    return invoker.invoke(method_name, kwargs, msg)
  File "/home/epartment/epaflex/mitogen-master/mitogen/service.py", line 307, in invoke
    response = self._invoke(method_name, kwargs, msg)
  File "/home/epartment/epaflex/mitogen-master/mitogen/service.py", line 293, in _invoke
    ret = method(**kwargs)
  File "/home/epartment/epaflex/mitogen-master/ansible_mitogen/services.py", line 186, in reset
    mitogen.core.listen(context, 'disconnect', l.put)
  File "/home/epartment/epaflex/mitogen-master/mitogen/core.py", line 431, in listen
    _signals(obj, name).append(func)
  File "/home/epartment/epaflex/mitogen-master/mitogen/core.py", line 421, in _signals
    obj.__dict__
AttributeError: 'NoneType' object has no attribute '__dict__'

When disabling mitogen, the playbook runs fine.
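
The traceback boils down to registering a 'disconnect' listener on an object that turned out to be None. Here is a minimal sketch of that failure mode. It assumes simplified stand-ins for listen and _signals, modeled only on the two source lines visible in the traceback (this is not Mitogen's actual code), and safe_listen is a hypothetical guard added for illustration, not necessarily how the project fixed it:

```python
def _signals(obj, name):
    # Stand-in: keep per-object listener lists inside the instance __dict__.
    # Accessing obj.__dict__ is what fails when obj is None.
    return obj.__dict__.setdefault('_signals', {}).setdefault(name, [])


def listen(obj, name, func):
    # Stand-in: register func as a listener for signal `name` on obj.
    _signals(obj, name).append(func)


def safe_listen(obj, name, func):
    # Hypothetical guard: skip registration when there is no object to
    # listen on (for example, a context lookup that returned None).
    if obj is None:
        return False
    listen(obj, name, func)
    return True


class Context(object):
    """Hypothetical placeholder for a connection context."""


def demo():
    results = {}
    try:
        listen(None, 'disconnect', print)
    except AttributeError as exc:
        # Same class of error as in the traceback.
        results['error'] = str(exc)
    results['guarded_none'] = safe_listen(None, 'disconnect', print)
    ctx = Context()
    results['guarded_ctx'] = safe_listen(ctx, 'disconnect', print)
    results['listeners'] = len(ctx.__dict__['_signals']['disconnect'])
    return results


if __name__ == '__main__':
    print(demo())
```

The unguarded call raises AttributeError because None has no instance dictionary, while the guarded call simply reports that nothing was registered.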

@davidklaftenegger

I can reproduce the issue with any use of wait_for_connection.

As a workaround, you can specify strategy: linear in plays that use this particular module. To still benefit from mitogen for the other tasks, it might be necessary to split this one task out into a separate play within the playbook.

@ubermug

ubermug commented Apr 14, 2020

I have also experienced this issue with the wait_for_connection module. Using @davidklaftenegger's suggestion of switching back to linear worked for me, but I would love to see this fixed so that I can use the mitogen_linear strategy everywhere. I'm using ansible 2.7.0.

@davidklaftenegger

As this is still an issue and I don't know how long it will remain one, here's a more elaborate explanation of the workaround for everyone finding this bug through Google:

My example playbook is this:

- name: example play with wait_for_connection
  hosts: all
  remote_user: root
  become: yes           
  any_errors_fatal: true
  tasks:
          - name: some tasks
            debug:
                    msg: "here some tasks or roles are applied before rebooting"
          - name: wait for machines to boot
            wait_for_connection:                   
                    timeout: 900
            register: boot_wait_time
          - debug:
                  msg: "{{ inventory_hostname }} needed to wait {{ boot_wait_time.elapsed }} seconds"
            when: boot_wait_time.elapsed > 3
          - name: some tasks
            debug:
                    msg: "here some tasks or roles are applied after rebooting"

As this fails with mitogen_linear set as the default strategy in my .ansible.cfg, I will instead write it like this:

- name: example play before wait_for_connection
  hosts: all
  remote_user: root
  become: yes           
  any_errors_fatal: true
  tasks:
          - name: some tasks
            debug:
                    msg: "here some tasks or roles are applied before rebooting"

- name: wait for machines to boot
  hosts: all
  remote_user: root
  become: yes
  any_errors_fatal: true
  strategy: linear # mitogen bug: https://github.com/dw/mitogen/issues/655
  tasks:             
          - name: wait for machines to boot
            wait_for_connection:                   
                    timeout: 900
            register: boot_wait_time
          - debug:
                  msg: "{{ inventory_hostname }} needed to wait {{ boot_wait_time.elapsed }} seconds"
            when: boot_wait_time.elapsed > 3

- name: example play after wait_for_connection
  hosts: all
  remote_user: root
  become: yes           
  any_errors_fatal: true
  tasks:
          - name: some tasks
            debug:
                    msg: "here some tasks or roles are applied after rebooting"

This will use mitogen for everything but wait_for_connection, but is obviously not as nice to read or write down.

zhongwm added a commit to zhongwm/mitogen that referenced this issue Apr 15, 2020
@s1113950
Collaborator

I took a look at this today in #710. I fixed the __dict__ error you encountered, but I'm now running into an issue where wait_for_connection loops its ping test over and over waiting for success, but never succeeds. That is because of:

mitogen/mitogen/service.py(92)get_or_create_pool()
-> recv=mitogen.core.Dispatcher._service_recv,
(Epdb) n
AttributeError: 'NoneType' object has no attribute 'add_handler'

coming from Mitogen's NewStylePlanner's should_fork() method.

I'm out of time to look at this today but I hope to finish it tomorrow!
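
For readers unfamiliar with the behaviour described here (retry a ping until it succeeds or a timeout expires), the loop can be sketched generically. This is an illustrative sketch only, not Ansible's or Mitogen's implementation: wait_for, probe, and the clock/pause hooks are hypothetical names, and the delay/sleep/timeout parameters are only loosely borrowed from the module's options.

```python
import time


def wait_for(probe, timeout, delay=0.0, sleep=1.0,
             clock=time.monotonic, pause=time.sleep):
    """Call probe() until it stops raising or `timeout` seconds pass.

    Returns the number of attempts made; raises TimeoutError otherwise.
    """
    if delay:
        pause(delay)  # give the host time to actually go down first
    deadline = clock() + timeout
    attempts = 0
    while True:
        attempts += 1
        try:
            probe()
            return attempts
        except Exception as exc:
            # Any failure counts as "not reachable yet". If the probe is
            # itself broken (as in the bug discussed here), every attempt
            # fails and the loop only ends at the deadline.
            if clock() >= deadline:
                raise TimeoutError(
                    'gave up after %d attempts: %s' % (attempts, exc))
            pause(sleep)


if __name__ == '__main__':
    state = {'calls': 0}

    def flaky_ping():
        # Hypothetical probe: fails twice, then succeeds.
        state['calls'] += 1
        if state['calls'] < 3:
            raise ConnectionError('host not reachable yet')

    print(wait_for(flaky_ping, timeout=5, sleep=0.01))
```

The sketch also shows why a probe that always raises looks like an endless ping loop: nothing distinguishes a host that is still rebooting from a probe that can never succeed.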

@s1113950
Collaborator

This should be fixed in #710. It passed tests on my side. Let me know if it works for you @davidklaftenegger @mlaurense @ubermug. My tests use Ansible 2.8.8, and I ran them on an Ubuntu 18.04 container. I'm going to add the test block from this ticket description to Mitogen so the error doesn't happen again!

@s1113950
Collaborator

s1113950 commented Apr 30, 2020

I integrated the test scenario mentioned in the ticket description into #710 (had to use docker though because testing shutdown is a bit hard on a localhost machine 😅 ). Once all tests pass/PR is approved I'll merge and this issue should be fixed 👍

@vaisov

vaisov commented Oct 5, 2021

Still getting this issue with 2.9-tools. The error:

ERROR! [mux  62] 11:30:12.827456 E mitogen.service: Pool(2c70, size=32, th='mitogen.Pool.2c70.3'): while invoking 'reset' of 'ansible_mitogen.services.ContextService'
Traceback (most recent call last):
  File "/usr/lib/python3.8/site-packages/mitogen/service.py", line 621, in _on_service_call
    return invoker.invoke(method_name, kwargs, msg)
  File "/usr/lib/python3.8/site-packages/mitogen/service.py", line 307, in invoke
    response = self._invoke(method_name, kwargs, msg)
  File "/usr/lib/python3.8/site-packages/mitogen/service.py", line 293, in _invoke
    ret = method(**kwargs)
  File "/usr/lib/python3.8/site-packages/ansible_mitogen/services.py", line 186, in reset
    mitogen.core.listen(context, 'disconnect', l.put)
  File "/usr/lib/python3.8/site-packages/mitogen/core.py", line 431, in listen
    _signals(obj, name).append(func)
  File "/usr/lib/python3.8/site-packages/mitogen/core.py", line 421, in _signals
    obj.__dict__
AttributeError: 'NoneType' object has no attribute '__dict__'

Ansible task:

- name: Reboot after kernel update
  reboot:
    reboot_timeout: 3600
  when: kernel_ml_install.changed
