Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

2018.3 regression - unable to create Caller in a runner #46905

Closed
golmaal opened this issue Apr 5, 2018 · 34 comments
Closed

2018.3 regression - unable to create Caller in a runner #46905

golmaal opened this issue Apr 5, 2018 · 34 comments
Assignees
Labels
Bug broken, incorrect, or confusing behavior Core relates to code central or existential to Salt severity-high 2nd top severity, seen by most users, causes major problems

Comments

@golmaal
Copy link
Contributor

golmaal commented Apr 5, 2018

Description of Issue/Question

Runners which create salt.client.Caller throw exception listed below.
This stops us from upgrading to 2018.3 release.

Steps to Reproduce Issue

On a CentOS 7.4 minimally installed VM, do the following -

# curl -o bootstrap-salt.sh -L https://bootstrap.saltstack.com
# sudo sh bootstrap-salt.sh -M git 2018.3

Make sure that following returns true -

[root@base2 ~]# salt-call test.ping
local:
    True

Now create a test runner with the following -
cat /srv/_runners/test_runner.py

# -*- coding: utf-8 -*-
'''
test_runner.py - implementation of test runner api
Custom salt runner module
'''

import logging
import salt.client

__virtualname__ = 'test_runner'

#pylint: disable=bad-indentation, line-too-long, invalid-name

def __virtual__():
  return __virtualname__

def _get_logger():
  '''
  Get current logger.
  '''
  logger = logging.getLogger(__name__)
  return logger

def ping():
  '''
  Verify presence and installation of the module
  CLI Example:
     salt-run test_runner.ping
  '''
  local = salt.client.Caller()
  _get_logger().info('%s pinged', __name__)
  return True

Running (after configuring runners etc.) the salt-run test_runner.ping returns the following exception -

2018-04-05 12:41:06,259 [tornado.application:611 ][ERROR   ][9823] Exception in callback <functools.partial object at 0x2800ec0>
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 591, in _run_callback
    ret = callback()
  File "/usr/lib64/python2.7/site-packages/tornado/stack_context.py", line 342, in wrapped
    raise_exc_info(exc)
  File "/usr/lib64/python2.7/site-packages/tornado/stack_context.py", line 313, in wrapped
    ret = fn(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 597, in <lambda>
    self.add_future(ret, lambda f: f.result())
  File "/usr/lib64/python2.7/site-packages/tornado/concurrent.py", line 214, in result
    raise_exc_info(self._exc_info)
  File "/usr/lib64/python2.7/site-packages/tornado/gen.py", line 230, in wrapper
    yielded = next(result)
  File "/usr/lib/python2.7/site-packages/salt/crypt.py", line 595, in _authenticate
    io_loop=self.io_loop)
  File "/usr/lib/python2.7/site-packages/salt/transport/client.py", line 108, in factory
    return salt.transport.zeromq.AsyncZeroMQReqChannel(opts, **kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 133, in __new__
    obj.__singleton_init__(opts, **kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 197, in __singleton_init__
    kwargs={'io_loop': self._io_loop})
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 927, in __init__
    super(AsyncReqMessageClientPool, self).__init__(AsyncReqMessageClient, opts, args=args, kwargs=kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/__init__.py", line 71, in __init__
    self.message_clients = [tgt(*args, **kwargs) for _ in range(sock_pool_size)]
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 974, in __init__
    self._init_socket()
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 1028, in _init_socket
    self.stream = zmq.eventloop.zmqstream.ZMQStream(self.socket, io_loop=self.io_loop)
  File "/usr/lib64/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 114, in __init__
    self._init_io_state()
  File "/usr/lib64/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 535, in _init_io_state
    self.io_loop.add_handler(self.socket, self._handle_events, self._state)
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 703, in add_handler
    self._impl.register(fd, events | self.ERROR)
TypeError: argument must be an int, or have a fileno() method.

If I manually update the pyzmq using pip to 17.0.0 version, this exception disappears.

Observation - By default repo.saltstack.com is supplying python-zmq version 15.3, should this be updated?

Versions Report

Salt Version:
           Salt: 2018.3.0-n/a-a064a3e

Dependency Versions:
           cffi: Not Installed
       cherrypy: Not Installed
       dateutil: Not Installed
      docker-py: Not Installed
          gitdb: Not Installed
      gitpython: Not Installed
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: Not Installed
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.1
   mysql-python: Not Installed
      pycparser: Not Installed
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: Not Installed
         Python: 2.7.5 (default, Aug  4 2017, 00:39:18)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: Not Installed
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4

System Versions:
           dist: centos 7.4.1708 Core
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-693.17.1.el7.x86_64
         system: Linux
        version: CentOS Linux 7.4.1708 Core
@garethgreenaway garethgreenaway added this to the Blocked milestone Apr 5, 2018
@garethgreenaway garethgreenaway added the Pending-Discussion The issue or pull request needs more discussion before it can be closed or merged label Apr 5, 2018
@garethgreenaway
Copy link
Contributor

@dmurphy18 @rallytime @Ch3LL thoughts?

@rallytime
Copy link
Contributor

We should be supported both the new 17.0 version and the older versions of pyzmq, so we need to get this fixed. @garethgreenaway Were you able to reproduce this? Looks like @golmaal is working off of the 2018.3 branch.

@isbm was the one who updated the support for 17.0 somewhat recently in preparation for the 2018.3.0 release. @isbm Perhaps you know where the fix for this might be since you were in this code most recently.

@isbm
Copy link
Contributor

isbm commented Apr 6, 2018

PyZMQ 15... OK, I have to look at this then. I will be anyway putting all that together for 2017.7

@binocvlar
Copy link
Contributor

Overview

It looks like we're also striking this bug that @golmaal reported (and note that we're also using a custom runner which makes use of salt.client.Caller).
This bug was induced by upgrading from 2017.7.5 to 2018.3.0 on one of our salt master nodes.

Traceback

# salt-run state.event
[ERROR   ] Exception in callback <functools.partial object at 0x3e901b0>
Traceback (most recent call last):
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 591, in _run_callback
    ret = callback()
  File "/usr/lib64/python2.7/site-packages/tornado/stack_context.py", line 274, in null_wrapper
    return fn(*args, **kwargs)
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 597, in <lambda>
    self.add_future(ret, lambda f: f.result())
  File "/usr/lib64/python2.7/site-packages/tornado/concurrent.py", line 214, in result
    raise_exc_info(self._exc_info)
  File "/usr/lib64/python2.7/site-packages/tornado/gen.py", line 230, in wrapper
    yielded = next(result)
  File "/usr/lib/python2.7/site-packages/salt/crypt.py", line 596, in _authenticate
    io_loop=self.io_loop)
  File "/usr/lib/python2.7/site-packages/salt/transport/client.py", line 108, in factory
    return salt.transport.zeromq.AsyncZeroMQReqChannel(opts, **kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 132, in __new__
    obj.__singleton_init__(opts, **kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 196, in __singleton_init__
    kwargs={'io_loop': self._io_loop})
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 929, in __init__
    super(AsyncReqMessageClientPool, self).__init__(AsyncReqMessageClient, opts, args=args, kwargs=kwargs)
  File "/usr/lib/python2.7/site-packages/salt/transport/__init__.py", line 71, in __init__
    self.message_clients = [tgt(*args, **kwargs) for _ in range(sock_pool_size)]
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 976, in __init__
    self._init_socket()
  File "/usr/lib/python2.7/site-packages/salt/transport/zeromq.py", line 1030, in _init_socket
    self.stream = zmq.eventloop.zmqstream.ZMQStream(self.socket, io_loop=self.io_loop)
  File "/usr/lib64/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 114, in __init__
    self._init_io_state()
  File "/usr/lib64/python2.7/site-packages/zmq/eventloop/zmqstream.py", line 535, in _init_io_state
    self.io_loop.add_handler(self.socket, self._handle_events, self._state)
  File "/usr/lib64/python2.7/site-packages/tornado/ioloop.py", line 703, in add_handler
    self._impl.register(fd, events | self.ERROR)
TypeError: argument must be an int, or have a fileno() method.

Versions Report

Salt Version:                                                                                                                                                                                                                                                                                                                
           Salt: 2018.3.0
 
Dependency Versions:
           cffi: 1.6.0
       cherrypy: Not Installed
       dateutil: Not Installed
      docker-py: Not Installed
          gitdb: 0.6.4
      gitpython: 1.0.1
          ioflo: Not Installed
         Jinja2: 2.7.2
        libgit2: 0.24.6
        libnacl: Not Installed
       M2Crypto: Not Installed
           Mako: Not Installed
   msgpack-pure: Not Installed
 msgpack-python: 0.5.1
   mysql-python: Not Installed
      pycparser: 2.14
       pycrypto: 2.6.1
   pycryptodome: Not Installed
         pygit2: 0.24.2
         Python: 2.7.5 (default, Aug  4 2017, 00:39:18)
   python-gnupg: Not Installed
         PyYAML: 3.11
          PyZMQ: 15.3.0
           RAET: Not Installed
          smmap: 0.9.0
        timelib: Not Installed
        Tornado: 4.2.1
            ZMQ: 4.1.4
 
System Versions:
           dist: centos 7.4.1708 Core
         locale: UTF-8
        machine: x86_64
        release: 3.10.0-693.21.1.el7.x86_64
         system: Linux
        version: CentOS Linux 7.4.1708 Core

@isbm
Copy link
Contributor

isbm commented Apr 12, 2018

Yuck... Sorry for this, guys. It was pretty heavily tested with 14 and then 17. Part of this because at SUSE we had 14 and then upgraded it to the latest in upcoming SLE 15 and Leap. And so I had naїve trust to ZMQ's claim it is different only after 16, but till 16 it is same as 13. But seems I have shouldn't trust that... Also strange that this popped up only now. No Beta testers around?

If you are unable to wait, just upgrade your PyZMQ to 17 and libzmq (or upgrade to Leap over CentOS, then it will work flawlessly out of the box — Salt is default there now, hehe 😛 ). It has anyway quite dramatic bugfixes to ZMQ you want to have.

Jokes aside, in the meanwhile I will try to fix this for you.

@rallytime
Copy link
Contributor

Thanks @isbm!

@gtmanfred gtmanfred added severity-critical top severity, seen by most users, serious issues ZRELEASED - 2018.3.1 and removed Pending-Discussion The issue or pull request needs more discussion before it can be closed or merged labels May 8, 2018
@gtmanfred gtmanfred modified the milestones: Blocked, Approved May 8, 2018
@gtmanfred gtmanfred added severity-high 2nd top severity, seen by most users, causes major problems P1 Priority 1 Core relates to code central or existential to Salt TEAM Core labels May 8, 2018
@isbm
Copy link
Contributor

isbm commented May 14, 2018

Some update: I cannot reproduce this anymore in develop branch. Since I am on virtenv, I have my configs elsewhere, so I modified the thing a bit:

import logging
import salt.client
import salt.config
from salt.client import Caller

__opts__ = salt.config.minion_config('/home/isbm/sandbox/salt/etc/minion')
__opts__['file_client'] = 'local'

def __virtual__():
    return __virtualname__

def _get_logger():
    '''
    Get current logger.
    '''
    logger = logging.getLogger(__name__)
    return logger

def foo():
    Caller(mopts=__opts__)
    _get_logger().info('%s pinged', __name__)
    return True

Call:

$ salt -V | grep PyZMQ
          PyZMQ: 15.0.0

$ salt-run --version
salt-run 2018.3.0-386-g8f3f314 (Oxygen)

$ salt-run test_runner.foo
True

Is this problem is still relevant?

@rallytime rallytime removed ZRELEASED - 2018.3.1 severity-critical top severity, seen by most users, serious issues labels May 14, 2018
@isbm
Copy link
Contributor

isbm commented May 14, 2018

@binocvlar BTW, the whole way of doing this sort of things in runner is wrong. You should not import modules directly, but use utils, as per documentation.

@isbm
Copy link
Contributor

isbm commented May 14, 2018

Needless to say, there was couple of PRs regards to ZMQ from @DmitryKuzmenko and myself, so probably this was fixed in between. It just looks like totally irrelevant to the issue, because ZMQ registrar was touched indeed. And in some cases it probably was wrongly registering itself with the specific ZMQ version, and thus could not further process the salt-run which itself initialises all that stuff. I remember the latter PRs was exactly addressing this sort of type of issues.

I will try to look at other branches later.

@binocvlar
Copy link
Contributor

@binocvlar BTW, the whole way of doing this sort of things in runner is wrong. You should not import modules directly, but use utils, as per documentation.

@isbm, when you write "the whole way of doing this sort of things in runner is wrong", are you saying that @golmaal's specific example is wrong? Or that ever using import salt.client within a runner is wrong?

For context, what I've done is as follows:

  1. Setup a reactor, which triggers a runner
  2. When the runner is triggered, it transforms some input data, and then passes that data into a call to a statefile.

Are you suggesting that instead of running the statefile via the runner, I instead run a custom utility module instead?

I've read over the documentation for runners and utils, and can't see anything that seems to suggest that what I'm doing is in error. Can you please elaborate?

@isbm
Copy link
Contributor

isbm commented May 22, 2018

@binocvlar well, I meant that you normally don't do import foo in runners, but using it as properly documented. 😉 On the other hand, seems like you're one of many people stumbling upon this, since naturally it is indeed just to import modules as usual. But in this case.

@isbm
Copy link
Contributor

isbm commented May 22, 2018

As of the issue, do you still getting the problem? Since I cannot reproduce it anymore (unless my environment has something hidden I am missing and I am blind seeing it).

@binocvlar
Copy link
Contributor

I meant that you normally don't do import foo in runners, but using it as properly documented

@isbm - there is an import statement in the very simple demo runner that's listed on the Salt Runners documentation page:

# Import salt modules
import salt.client

def up():
    '''
    Print a list of all of the minions that are up
    '''
    client = salt.client.LocalClient(__opts__['conf_file'])
    minions = client.cmd('*', 'test.ping', timeout=1)
    for minion in sorted(minions):
        print minion

Additionally, I just sampled five runners from https://github.com/saltstack/salt/tree/develop/salt/runners , and they all contain import statements.

As I mentioned earlier:

I've read over the documentation for runners and utils, and can't see anything that seems to suggest that what I'm doing is in error.

With regard to this question:

As of the issue, do you still getting the problem?

You mentioned earlier that something seems to have been fixed within the develop branch. When I performed my testing, I was using the official 2018.3.0 package. I'll probably wait for an updated package prior to testing again.

@gtmanfred
Copy link
Contributor

We import salt.client into runners all the time.

https://github.com/saltstack/salt/blob/2017.7/salt/runners/salt.py#L39

The clients are not exposed through utils.

@gtmanfred
Copy link
Contributor

@binocvlar we made some changes in after that package has been released, can you try using the 2018.3.1 branch and install from git to test and see if the new release is going to fix your issues?

@dawidmalina
Copy link

I have quite similar problem. I see a lot of errors like this:

2018-11-29 16:27:58,655 [py.warnings      :156 ][WARNING ][12991] /usr/lib/python2.7/dist-packages/zmq/eventloop/ioloop.py:211: RuntimeWarning: IOLoop.current expected instance of <class 'zmq.eventloop.ioloop.ZMQIOLoop'>, got <tornado.platform.epoll.EPollIOLoop object at
0x7fe7f1712c50>
  ioloop.IOLoop.instance() is IOLoop.instance(), "tornado IOLoop already initialized"

Upgrading pyzmq is not an option as python-zmq is provided by ubuntu and this package is required by salt-minion and salt-master:

Collecting pyzmq
  Using cached https://files.pythonhosted.org/packages/ab/00/57fec02fdd27aaa38f8d570a1736a6aa9534d242690a89b55eeee902a757/pyzmq-17.1.2-cp27-cp27mu-manylinux1_x86_64.whl
Installing collected packages: pyzmq
  Found existing installation: pyzmq 16.0.2
Cannot uninstall 'pyzmq'. It is a distutils installed project and thus we cannot accurately determine which files belong to it which would lead to only a partial uninstall.
The following packages have unmet dependencies:
 salt-master : Depends: python-zmq (>= 14.4.0) but it is not going to be installed
 salt-minion : Depends: python-zmq (>= 14.4.0) but it is not going to be installed

I'm using 2018.3.3 with tcp transport protocol.

@cmclaughlin
Copy link

I found another variation of this problem... and the workaround listed above doesn't seem to help.

I have several runners that create caller clients. Here's a simple example:

# my_file_roots/my_state_dir/_runnners/charles_test.py

def test():
    caller_client = salt.client.Caller()
    pillar_key = 'salt'
    pillar = caller_client.cmd('pillar.get', pillar_key, pillarenv='base')
    return pillar

After my recent upgrade to 2018.3.4, I made sure to have pyzmq==17.1.0 as mentioned here.

Runners which create caller clients work for me after pinning pyzmq when I run them via salt-run, but calling them over the http API (rest_cherrypy) brings me back to the IOLoop is already running error..

[root@ops-saltmastertest-1c001 salt]# salt-run charles_test.test
[WARNING ] The 'environment' minion config option has been renamed to 'saltenv'. Using cmclaughlin as the 'saltenv' config value.
[ERROR   ] Broken CPE_NAME format in /etc/os-release!
[WARNING ] The 'environment' minion config option has been renamed to 'saltenv'. Using cmclaughlin as the 'saltenv' config value.
[ERROR   ] Broken CPE_NAME format in /etc/os-release!
minion:
    ----------
    master_tries:
        -1
    mine_functions:
        ----------
        network.interface_ip:
            - eth0
    pillarenv_from_saltenv:
        True
restart_via_at:
    True
[INFO    ] Runner completed: 20190520203558135448

# Same runner over http...
[root@ops-saltmastertest-1c001 salt]# pepper --client runner charles_test.test
{
    "return": [
        "Exception occurred in runner charles_test.test: Traceback (most recent call last):\n  File \"/usr/lib/python2.7/site-packages/salt/client/mixins.py\", line 387, in _low\n    data['return'] = self.functions[fun](*args, **kwargs)\n  File \"/salt/extension_modules/runners/charles_test.py\", line 10, in test\n    caller_client = salt.client.Caller()\n  File \"/usr/lib/python2.7/site-packages/salt/client/__init__.py\", line 2114, in __init__\n    self.sminion = salt.minion.SMinion(self.opts)\n  File \"/usr/lib/python2.7/site-packages/salt/minion.py\", line 806, in __init__\n    lambda: self.eval_master(self.opts, failed=True)\n  File \"/usr/lib64/python2.7/site-packages/tornado/ioloop.py\", line 439, in run_sync\n    self.start()\n  File \"/usr/lib64/python2.7/site-packages/zmq/eventloop/ioloop.py\", line 162, in start\n  File \"/usr/lib64/python2.7/site-packages/tornado/ioloop.py\", line 730, in start\n    raise RuntimeError(\"IOLoop is already running\")\nRuntimeError: IOLoop is already running\n"
    ]
}

Should this be considered a new issue?

@sagetherage sagetherage added Magnesium Mg release after Na prior to Al and removed TEAM Core labels May 23, 2020
@sagetherage sagetherage removed the P1 Priority 1 label Jun 3, 2020
@sagetherage sagetherage modified the milestones: Approved, Magnesium Jul 14, 2020
@sagetherage sagetherage removed the Magnesium Mg release after Na prior to Al label Oct 8, 2020
@sagetherage sagetherage modified the milestones: Magnesium, Approved Oct 8, 2020
@sagetherage sagetherage added Bug broken, incorrect, or confusing behavior Phosphorus v3005.0 Release code name and version labels Jun 18, 2021
@sagetherage sagetherage modified the milestones: Approved, Phosphorus Jun 18, 2021
@Ch3LL Ch3LL removed the Phosphorus v3005.0 Release code name and version label Mar 30, 2022
@dmurphy18
Copy link
Contributor

@golmaal Given the long passage of time and 2018.3.0 and Python world has moved to Python 3.
Can you you retry this with the latest release of Salt 3006 and see if you still have an issue.

Or if you are not interested in this issue anymore, please consider closing the issue.

Thanks.

@dmurphy18 dmurphy18 self-assigned this Apr 26, 2023
@anilsil anilsil removed this from the Chlorine v3007.0 milestone May 11, 2023
@dmurphy18
Copy link
Contributor

@golmaal Closing this issue since no response to query about it. Please feel free to reopen with a reply to my question concerning Python 3 and the latest version of Salt if the problem still occurs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug broken, incorrect, or confusing behavior Core relates to code central or existential to Salt severity-high 2nd top severity, seen by most users, causes major problems
Projects
None yet
Development

No branches or pull requests